152 37 33MB
English Pages [227] Year 2022
GARP FRM Financial Risk Manager
Market Risk Measurement and Management
Pearson
Excerpts taken from: O ption s, Futures, and O th er D erivatives, Tenth Edition by John C. Hull Copyright © 2017, 2015, 2012, 2009, 2006, 2003, 2000 by Pearson Education, Inc. New York, New York 10013 Copyright © 2022, 2021 Global Association of Risk Professionals All rights reserved. This copyright covers material written expressly for this volum e by the editor/s as well as the com pilation itself. It does not cover the individual selections herein that first appeared elsew here. Permission to reprint these has been obtained by Pearson Learning Solutions for this edition only. Further reproduction by any means, electronic or m echanical, including photocopying and recording, or by any information storage or retrieval system , must be arranged with the individual copyright holders noted.
Grateful acknowledgment is made to the following sources for permission to reprint material copyrighted or controlled by them: "Estim ating M arket Risk M easures," "N on-Param etric A p p ro aches," and "Param etric Approaches (III): Extrem e Value" by Kevin Dowd, reprinted from M easuring M arket Risk, Second Edition (2005), by permission of John W iley & Sons, Inc. All rights reserved. Used under license from John W iley & Sons, Inc. "B ack Testing VaR" and "VaR M apping," by Philippe Jo rio n, reprinted from Value at Risk: The N ew Benchm ark for M anaging Financial Risk, Third Edition (2007), by permission of The M cG raw Hill Com panies. "M essages from the A cadem ic Literature on Risk M easurem ent for the Trading Book," W orking Paper No. 19 January 2011, reprinted by permission of the Basel Com m ittee on Banking Supervision. "Som e Correlation Basics: Definitions, A pplications, and Term inology," "Em pirical Properties of Correlation: How Do Correlations Behave in the Real W orld?," and "Financial Correlation M odeling— Bottom Up A p p ro aches," by G unter Meissner, reprinted from Correlation Risk M o d elin g and M anagem ent, Second Edition (2019), by permission of Risk Books/InfoPro Digital Services, Ltd. "Em pirical Approaches to Risk M etrics and H ed g es," "Th e Science of Term Structure M odels," "The Evolution of Short Rates and the Shape of the Term Structure," "The A rt of Term Structure M odels: D rift," and "The A rt of Term Structure M odels: Volatility and D istribution," by Bruce Tuckman and Angel Serrat, reprinted from F ixe d Incom e Secu rities: Tools fo r Today's M arkets, Third Edition (2012), by permission of John W iley & Sons, Inc. All rights reserved. Used under license from John W iley & Sons, Inc. "Fundam ental Review of the Trading Book," by John C . Hull, reprinted from Risk M anagem ent and Financial Institutions, Fifth Edition (2018), by permission of John W iley & Sons, Inc. All rights reserved. Used under license from John W iley & Sons, Inc. Learning O bjectives provided by the Global Association of Risk Professionals. All tradem arks, service marks, registered tradem arks, and registered service marks are the property of their respective owners and are used herein for identification purposes only. Pearson Education, Inc., 330 Hudson Street, New York, New York 10013 A Pearson Education Com pany w w w .pearsoned.com Printed in the United States of Am erica
ScoutAutomatedPrintCode 00011693-00000005 / A 1 03000278803 EEB/M B
Pearson
ISBN 10: 0137686617 ISBN 13: 9780137686612
Chapter 1
Estimating Market Risk Measures
1.1 Data
1.6 The Core Issues:An Overview
1 2
Profit/Loss Data
2
Loss/Profit Data
2
A rithm etic Return Data
2
G eom etric Return Data
2
1.7 Appendix
13 13
Prelim inary Data Analysis
13
Plotting the Data and Evaluating Sum m ary Statistics
14
Q Q Plots
14
Chapter 2
Non-Parametric Approaches
17
1.2 Estimating Historical Simulation VaR
3
1.3 Estimating Parametric VaR
4
Estim ating VaR with Norm ally Distributed Profits/Losses
2.1 Compiling Historical Simulation Data
18
4
Estim ating VaR with Norm ally Distributed A rithm etic Returns
2.2 Estimation of Historical Simulation VaR and ES
19
5
Basic Historical Simulation
19
Estim ating Lognormal VaR
6
Bootstrapped Historical Simulation
19
1.4 Estimating Coherent Risk Measures 7
Historical Simulation Using Non-param etric Density Estim ation
19
Estim ating Curves and Surfaces for VaR and ES
21
Estim ating Expected Shortfall
7
Estim ating Coherent Risk M easures
8
1.5 Estimating the Standard Errors of Risk Measure Estimators
10
Standard Errors of Q uantile Estim ators
10
Standard Errors in Estim ators of Coherent Risk M easures
12
2.3 Estimating Confidence Intervals for Historical Simulation VaR and ES 21 An O rder Statistics Approach to the Estim ation of Confidence Intervals for HS VaR and ES
22
in
A Bootstrap Approach to the Estimation of Confidence Intervals for HS VaR and ES
2.4 Weighted Historical Simulation
23
Age-w eighted Historical Simulation
24
Volatility-weighted Historical Simulation
25
Correlation-w eighted Historical Simulation
26
Filtered Historical Simulation
26
2.5 Advantages and Disadvantages of Non-Parametric Methods
28
A dvantages
28
D isadvantages
28
Conclusions
29
Appendix 1
29
Estim ating Risk M easures with O rder Statistics
29
Using O rder Statistics to Estim ate Confidence Intervals for VaR
29
Conclusions
30
Appendix 2
31
The Bootstrap
31
Lim itations of Convention Sampling Approaches
31
The Bootstrap and Its Im plem entation
31
Standard Errors of Bootstrap Estim ators
33
Tim e Dependency and the Bootstrap
34
Chapter 3
Parametric Approaches (II): Extreme Value
3.1 Generalised Extreme-Value Theory Theory A Short-Cut
EV Method
Estim ation of EV Param eters
iv
22
■ Contents
35
3.2 The Peaks-Over-Threshold Approach: The Generalised Pareto Distribution
43
Theory
43
Estim ation
45
G E V vs PO T
45
3.3 Refinements to EV Approaches
46
Conditional EV
46
Dealing with Dependent (or Non-iid) Data
46
M ultivariate EV T
47
3.4 Conclusions
47
Chapter 4
Backtesting VaR
4.1 Setup for Backtesting
49 50
An Exam ple
50
Which Return?
50
4.2 Model Backtesting with Exceptions
51
Model Verification Based on Failure Rates
51
The Basel Rules
54
Conditional C overage M odels
55
Extensions
56
4.3 Applications
56
4.4 Conclusions
57
Chapter 5
VaR Mapping
5.1 Mapping for Risk Measurement
59 60
W hy M apping?
60
36
Mapping as a Solution to Data Problems
60
39
The Mapping Process
61
39
General and Specific Risk
62
36
5.2 Mapping Fixed-Income Portfolios
63
6.4 Risk Measures
82
O verview
82
M apping Approaches
63
VaR
82
Stress Test
64
Expected Shortfall
84
Benchm arking
64
Spectral Risk M easures
85
O ther Risk M easures
86
Conclusions
86
5.3 Mapping Linear Derivatives
66
Forward Contracts
66
Com m odity Forw ards
67
Forward Rate Agreem ents
68
O verview
87
Interest-Rate Swaps
69
Incorporating Stress Testing into M arket-Risk Modelling
87
Stressed VaR
88
Conclusions
89
5.4 Mapping Options
70
5.5 Conclusions
72
Chapter 6
Messages from the Academic Literature on Risk Management for the Trading 73 Book
6.1 Introduction 6.2 Selected Lessons on VaR Implementation
74 74
O verview
74
Tim e Horizon for Regulatory VaR
74
Time-Varying Volatility in VaR
76
Backtesting VaR Models
77
Conclusions
78
6.3 Incorporating Liquidity 1 1 tJ
6.5 Stress Testing Practices for Market Risk
87
6.6 Unified Versus Compartmentalised 89 Risk Measurement O verview
89
Aggregation of Risk: Diversification versus Com pounding Effects
90
Papers Using the "Bottom -U p" Approach
91
Papers Using the "Top-Down" Approach
94
Conclusions
95
6.7 Risk Management and Value-at-Risk in a Systemic Context
95
O verview
95
Interm ediation, Leverage and Value-at-Risk: Em pirical Evidence
96
W hat Has All This to Do with VaR-Based Regulation?
97
Conclusions
98
Annex
103
78
O verview
78
Exogenous Liquidity
79
Endogenous Liquidity: Motivation
79
Endogenous Liquidity and M arket Risk for Trading Portfolios
80
Adjusting the VaR Tim e Horizon to Account for Liquidity Risk
81
Conclusions
81
Chapter 7
Correlation Basics: Definitions. Applications, and Terminology 105
7.1 A Short History of Correlation
Contents
106
■ v
7.2 What Are Financial Correlations? 106 7.3 What Is Financial Correlation Risk? 7.4 Motivation: Correlations and Correlation Risk Are Everywhere in Finance
106
108
Investm ents and Correlation
108
7.5 Trading and Correlation
109
8.2 Do Equity Correlations Exhibit Mean Reversion?
128
How Can We Q uantify Mean Reversion?
128
8.3 Do Equity Correlations Exhibit Autocorrelation?
129 130
Risk M anagem ent and Correlation
112
The Global Financial C rises 2007 to 2009 and Correlation
8.4 How Are Equity Correlations Distributed?
113
Regulation and Correlation
116
8.5 Is Equity Correlation Volatility an Indicator for Future Recessions? 130
7.6 How Does Correlation Risk Fit into the Broader Picture of Risks in Finance? 116 Correlation Risk and M arket Risk
117
Correlation Risk and C redit Risk
117
7.7 Correlation Risk and Systemic Risk
119
7.8 Correlation Risk and Concentration Risk
119
7.9 A Word on Terminology
121
Summary
121
Appendix A1
122
8.6 Properties of Bond Correlations and Default Probability Correlations 131 Summary
131
Questions
132
Chapter 9
Financial Correlation Modeling—Bottom-Up Approaches 133
9.1 Copula Correlations
134
D ependence and Correlation
122
Exam ple A 1: Statistical Independence
122
The Gaussian Copula
Correlation
122
Independence and Uncorrelatedness
122
Simulating the Correlated Default Tim e for M ultiple A ssets 137
Appendix A2 Questions
Chapter 8
■ Contents
134
123
On Percentage and Logarithm ic Changes 123
vi
8.1 How Do Equity Correlations Behave in a Recession, Normal Economic Period or Strong Expansion? 126
124
Empirical Properties of Correlation: How Do Correlations Behave in the Real World? 125
Chapter 10
Empirical Approaches to Risk Metrics and Hedging 139
10.1 Single-Variable Regression-Based Hedging 140 Least-Squares Regression Analysis
141
The Regression Hedge
142
11.6 Option-Adjusted Spread
162
The Stability of Regression Coefficients over Tim e
143
11.7 Profit and Loss Attribution with an OAS
162
144
11.8 Reducing the Time Step
163
10.3 Level Versus Change Regressions
146
11.9 Fixed Income Versus Equity Derivatives
164
10.4 Principal Components Analysis
146
10.2 Two-Variable Regression-Based Hedging
O verview
146
PCA s for USD Swap Rates
147
Hedging with P C A and an Application to Butterfly W eights
149
Principal Com ponent Analysis of EUR, GBP, and JP Y Swap Rates
150
The Shape of PCs over Tim e
150
Appendix A
151
The Least-Squares Hedge Minimizes the Variance of the P&L of the Hedged Position
Appendix B
152
Constructing Principal Com ponents from Three Rates
Chapter 11
151
The Science of Term Structure Models
152
Chapter 12
167
12.1 Introduction
168
12.2 Expectations
168
12.3 Volatility and Convexity
169
12.4 Risk Premium
171
Chapter 13 155
The Evolution of Short Rates and the Shape of the Term Structure
The Art of Term Structure Models: Drift 175
11.1 Rate and Price Trees
156
13.1 Model 1: Normally Distributed Rates and No Drift 176
11.2 Arbitrage Pricing of Derivatives
157
13.2 Model 2: Drift and Risk Premium
178
11.3 Risk-Neutral Pricing
158
11.4 Arbitrage Pricing in a Multi-Period Setting
13.3 The Ho-Lee Model: Time-Dependent Drift
179
159
13.4 Desirability of Fitting to the Term Structure
180
13.5 The Vasicek Model: Mean Reversion
181
11.5 Example: Pricing a Constant-Maturity Treasury Swap
161
Contents
■ vii
Chapter 14
The Art of Term Structure Models: Volatility and Distribution 187
14.1 Time-Dependent Volatility: Model 3 14.2 The Cox-Ingersoll-Ross and Lognormal Models: Volatility as a Function of the Short Rate
188
189
14.3 Tree for the Original Salomon Brothers Model 190 14.4 The Black-Karasinski Model: A Lognormal Model with Mean Reversion 191 14.5 Appendix Closed-Form Solutions for Spot Rates
Chapter 15
191 191
Volatility Smiles 193
15.1 Why the Volatility Smile Is the Same for Calls and Puts 194 15.2 Foreign Currency Options Em pirical Results
195
Reasons for the Smile in Foreign Currency O ptions
196
15.3 Equity Options The Reason for the Smile in Equity O ptions
15.4 Alternative Ways of Characterizing the Volatility Smile
viii
195
■ Contents
196 197
198
15.5 The Volatility Term Structure and Volatility Surfaces
198
15.6 Minimum Variance Delta
199
15.7 The Role of the Model
199
15.8 When a Single Large Jump Is Anticipated
199
Summary
200
Appendix
201
Determ ined Implied Risk-Neutral Distributions from Volatility Sm iles
Chapter 16
Fundamental Review of the Trading Book
201
203
16.1 Background
204
16.2 Standardized Approach
205
Term Structures
206
Curvature Risk Charge
206
Default Risk Charge
207
Residual Risk Add-On
207
A Sim plified Approach
207
16.3 Internal Models Approach
207
Back-Testing
208
Profit and Loss A ttribution
209
Credit Risk
209
Securitizations
209
16.4 Trading Book vs. Banking Book
209
Summary
210
Index
211
On behalf of our Board of Trustees, G A RP's staff, and particu
The FRM program addresses the financial risks faced by both
larly its certification and educational program s team s, I would
non-financial firm s and those in the highly interconnected and
like to thank you for your interest in and support of our Financial
sophisticated financial services industry, because its coverage is
Risk M anager (FRM®) program .
not static, but vibrant and forward looking.
The past couple of years have been difficult due to C O V ID -19.
The FRM curriculum is regularly reviewed by an oversight com
And in that regard, our sincere sym pathies go out to anyone
m ittee of highly qualified and experienced risk-m anagem ent
who was ill or suffered a loss due to the pandem ic.
professionals from around the globe. These professionals con
The FRM program also experienced many virus-related chal
sist of senior bank and consulting practitioners, governm ent
lenges. Because we always place candidate safety first, we cancelled the May 2020 FRM exam offering and deferred all can didates to October, while reserving an optional date in January 2021 for candidates not able to sit for the examination in October. A change like this has never happened before. Ultimately, we were able to offer the FRM exam to all 2020 registered candidates who wanted to sit for it during the year and were not constrained by COVID-related restrictions, which was most of our registrants. Since its inception in 1997, the FRM program has been the
regulators, asset m anagers, insurance risk professionals, and academ ics. Their mission is to ensure the FRM program remains current and its content addresses not only standard credit and m arket risk issues, but also em erging issues and trends, ensur ing FRM candidates are aware of what is or is expected to be im portant in the near future. W e're com m itted to offering a pro gram that reflects the dynam ic and sophisticated nature of the risk-m anagem ent profession and those who are making it a career.
global industry benchm ark for risk-m anagem ent professionals
We wish you the very best as you study for the FRM exam s, and
wanting to dem onstrate objectively their knowledge of financial
in your career as a risk-m anagem ent professional.
risk-m anagem ent concepts and approaches. Having FRM hold
Yours truly,
ers on staff also tells com panies' that their risk-m anagem ent professionals have achieved a dem onstrated and globally adopted level of expertise. In a world where risks are becom ing more com plex daily due to any number of technological, capital, governm ental, geopoliti cal, or other factors, com panies know that FRM holders possess the skills to understand and adapt to a dynam ic and rapidly
Richard A postolik
changing financial environm ent.
President & C E O
Preface
■ ix
Chairperson Michelle McCarthy Beck Form er G A R P Board M em ber
Members Richard Apostolik
Dr. Attilio Meucci, CFA
President and C E O , Global Association of Risk Professionals
Founder, ARPM
Richard Brandt
Dr. Victor Ng
MD O perational Risk M anagem ent, Citigroup
MD Head of Risk Architecture, Goldm an Sachs
Julian Chen, FRM, SVP
Dr. Matthew Pritsker
FRM Program Manager, Global Association of Risk Professionals
Senior Financial Econom ist and Policy Advisor / Supervision,
Dr. Christopher Donohue, MD
Regulation, and Credit, Federal Reserve Bank of Boston
G A R P Benchm arking Initiative, Global Association of Risk
Dr. Samantha Roberts, FRM
Professionals
SVP Balance Sheet Analytics & M odeling, PN C Bank
Donald Edgar, FRM
Dr. Til Schuermann
MD Risk & Q uantitative Analysis, BlackRock
Partner, O liver W yman
Herve Geny
Nick Strange
Form er Group Head of Internal A udit, London Stock Exchange
Senior Technical Advisor, O perational Risk & Resilience,
Group and form er C R O , ICAP
Supervisory Risk Specialists, Prudential Regulation Authority,
Keith Isaac, FRM VP Capital M arkets Risk M anagem ent, TD Bank Group
William May, SVP Global Head of Certifications and Educational Program s, Global Association of Risk Professionals
x
■
FRM® Committee
Bank of England
Dr. Sverrir Porvaldsson, FRM Senior Q uant, SEB
Estimating Market Risk Measures
An Introduction and Overview Learning Objectives A fter com pleting this reading you should be able to: Estim ate VaR using a historical simulation approach.
Estim ate risk measures by estim ating quantiles.
Estim ate VaR using a param etric approach for both normal
Evaluate estim ators of risk measures by estim ating their
and lognormal return distributions.
standard errors.
Estim ate the expected shortfall given profit and loss (P/L)
Interpret quantile-quantile (Q Q ) plots to identify the
or return data.
characteristics of a distribution.
D escribe coherent risk m easures.
E x c e rp t is C h a p ter 3 o f Measuring M arket Risk, S e co n d Edition, by Kevin D ow d.
1
Loss/Profit Data
This chapter provides a brief introduction and overview of the main issues in m arket risk m easurem ent. O ur main
W hen estim ating VaR and ES, it is som etim es more convenient
concerns are: •
Prelim inary data issues: How to deal with data in profit/loss form , rate-of-return form , and so on.
•
to deal with data in loss/profit (L/P) form . L/P data are a simple transform ation of P/L data: L /P t — ~ P / L t
Basic m eth ods o f VaR estim ation: How to estim ate simple
(1.4)
VaRs, and how VaR estim ation depends on assum ptions
L/P observations assign a positive value to losses and a negative
about data distributions.
value to profits, and we will call these L/P data 'losses' for short.
•
How to estim ate coherent risk m easures.
Dealing with losses is som etim es a little more convenient for risk
•
How to gauge the precision of our risk measure estim ators by estim ating their standard errors.
•
O verview : An overview of the different approaches to market risk m easurem ent, and of how they fit together.
m easurem ent purposes because the risk measures are them selves denom inated in loss term s.
Arithmetic Return Data Data can also come in the form of arithm etic (or simple) returns.
We begin with the data issues.
The arithm etic return rt is defined as:
Pt + Dt
1.1 DATA
Pt-
Profit/Loss Data
-
1
(1.5)
which is the same as the P/L over period t divided by the value of the asset at the end of t — 1.
O ur data can com e in various form s. Perhaps the sim p lest is in term s of profit/loss (or P/L). The P/L g enerated by an
assumption will seldom be appropriate over long periods because
as the value of the asset (or portfolio) at the end of t
interim income is usually reinvested. Hence, arithmetic returns
plus any interim paym ents D t m inus the asset value at
should not be used when we are concerned with long horizons.
the end of t — 1: P/ Lt — Pt + D t - Pt_!
In using arithmetic returns, we implicitly assume that the interim payment Dt does not earn any return of its own. However, this
asset (or portfolio) over the period t, P /L t, can be defined
(1.1)
If data are in P/L form , positive values indicate profits and nega
Geometric Return Data Returns can also be expressed in geom etric (or compound)
tive values indicate losses. If we wish to be strictly correct, we should evaluate all paym ents from the same point of tim e (i.e., we should take account of the tim e value of money). We can do so in one of two ways. The first way is to take the present value of P /L t evaluated at the end of the previous period, t — 1: P resen t Value (P /L t) =
P t + Dt Pt- 1
Pt- 1 1
form . The geom etric return Rt is Pt +
p
A
p t- 1 )
( 1.6 )
The geom etric return implicitly assumes that interim payments are continuously reinvested. The geom etric return is often more
(Pt + D t)
- Pt_-|
(1.2)
ensures that the asset price (or portfolio value) can never become negative regardless of how negative the returns might be. With
where d is the discount rate and w e assum e for convenience that Dt is paid at the end of t. The alternative is to take the for ward value of P /L t evaluated at the end of period t: Forw ard Value (P / L t) = Pt + Dt - {1 + d)Pt_-,
economically meaningful than the arithmetic return, because it
arithmetic returns, on the other hand, a very low realized return— or a high loss— implies that the asset value Pt can become nega tive, and a negative asset price seldom makes economic sense.1
(1.3)
The geom etric return is also more convenient. For exam ple,
which involves com pounding Pt_-| by d. Th e d ifferen ces
if we are dealing with foreign currency positions, geom etric
betw een th ese values depend on the discount rate d, and
returns will give us results that are independent of the reference
will be sm all if the p erio d s th em selves are short. W e will ignore th ese d ifferen ces to sim plify the d iscussio n, but they can m ake a d ifferen ce in p ractice w hen dealing with longer p erio d s.
2
■
1 This is mainly a point of principle rather than practice. In practice, any distribution we fit to returns is only likely to be an approximation, and many distributions are ill-suited to extreme returns anyway.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
returns are 1 + rt = e xp (Rt) => rt — e xp (Rt) — 1 = exp(0.05) — 1 = 0.0513. In both cases the arithmetic return is close to, but a little higher than, the geometric return— and this makes intuitive sense when one considers that the geometric return compounds at a faster rate.
1.2 ESTIMATING HISTORICAL SIMULATION V a R The sim plest way to estim ate VaR is by means of historical simulation (HS). The HS approach estim ates VaR by means of ordered loss observations. Suppose we have 1000 loss observations and are interested in the VaR at the 95% confidence level. Since the confidence level
Arithmetic Return
Figure 1.1
im plies a 5% tail, we know that there are 50 observations in the
Geometric and arithmetic returns.
tail, and we can take the VaR to be the 51st highest loss observation.2
currency. Sim ilarly, if we are dealing with multiple periods, the
W e can estim ate the VaR on a sp read sh eet by ordering
geom etric return over those periods is the sum of the one-
our data and reading off the 51st largest observation from
period geom etric returns. Arithm etic returns have neither of
the sp read sh eet. W e can also estim ate it more directly
these convenient properties.
by using the 'Larg e' com m and in E xce l, which gives us
The relationship of the two types of return can be seen by
the kth largest value in an array. Thus, if our data are an
rewriting Equation (1.6) (using a Taylor's series expansion for the natural log) as:
array called 'L o ss_d a ta ', then our VaR is given by the Excel com m and 'La rg e (L o ss_d a ta ,5 1 )'. If w e are using MATLA B , w e first order the loss/profit data using the 'Sort()'
n(1 + rt) = rt
(1.7)
from which we can see that Rt ~ rt provided that returns are 'sm all'. This conclusion is illustrated by Figure 1.1, which plots the geom etric return Rt against its arithm etic counterpart rt. The difference between the two returns is negligible when both returns are sm all, but the difference grows as the returns get bigger— which is to be expected, as the geom etric return is a log function of the arithm etic return. Since we would expect returns to be low over short periods and higher over longer periods, the difference betw een the two types of return is
com m and (i.e ., by typing 'Loss_data = Sort(Loss_data)'); and then derive the VaR by typing in 'Lo ss_d a ta (5 1 )' at the com m and line. More generally, if we have n observations, and our confi dence level is a, we would want the (1 — a), n + 1th highest observation, and we would use the com m ands 'Large(Loss_data,(1 — a/pha)*n + 1)' using Excel, or 'Loss_data((1 — alpha)*n + 1)' using M A TLA B , provided in the latter case that our 'Loss_data' array is already sorted into ordered observations.3
negligible over short periods but potentially substantial over longer ones. And since the geom etric return takes account of earnings on interim incom e, and the arithm etic return does not, we should always use the geom etric return if we are dealing with returns over longer periods.
Example 1.1
Arithmetic and Geom etric Returns
If arithmetic returns rt over some period are 0.05, Equation (1.7) tells us that the corresponding geometric returns are Rt — ln(1 + rt) — ln(1.05) = 0.0488. Similarly, if geometric returns Rt are 0.05, Equation (1.7) implies that arithmetic
2 In theory, the VaR is the quantile that demarcates the tail region from the non-tail region, where the size of the tail is determined by the confidence level, but with finite samples there is a certain level of arbitrariness in how the ordered observations relate to the VaR itself— that is, do we take the VaR to be the 50th observation, the 51st obser vation, or some combination of them? However, this is just an issue of approximation, and taking the VaR to be the 51st highest observation is not unreasonable. 3 We can also estimate HS VaR using percentile functions such as the 'Percentile' function in Excel or the 'pretile' function in MATLAB. However, such functions are less transparent (i.e., it is not obvious to the reader how the percentiles are calculated), and the Excel percentile function can be unreliable.
Chapter 1 Estimating Market Risk Measures
■ 3
An exam p le of an HS VaR is given in Figure 1.2. This figure
In p ractice, it is often helpful to obtain HS VaR estim ates from
shows the histogram of 1000 hypothetical loss o b serva
a cum ulative histogram , or em pirical cum ulative frequency
tions and the 95% VaR. Th e figure is g enerated using the
function. This is a plot of the ordered loss observations against
'hsvarfig ure' com m and in the MMR Toolbox. The VaR is
their em pirical cum ulative freq uency (e .g ., so if there are n
1.704 and sep arates the top 5% from the bottom 95% of
observations in to tal, the em pirical cum ulative freq uency of
loss o bservations.
the rth such ordered observation is i/n). The em pirical cum u lative freq uency function of our earlier data set is shown in Figure 1.3. The em pirical freq uency function m akes it very easy to obtain the VaR: we sim ply move up the cum ulative freq uency axis to w here the cum ulative freq uency equals our confidence level, draw a horizontal line along to the curve, and then draw a vertical line down to the x-axis, which gives us our VaR.
1.3 ESTIMATING PARAMETRIC V a R W e can also estim ate VaR using p aram etric ap p roaches, the distinguishing feature of which is that they require us to exp licitly sp ecify the statistical distribution from which our data observations are draw n. W e can also think of p aram etric app roaches as fitting curves through the data Loss (+)/profit (-)
Figure 1.2
Historical simulation VaR.
Note: Based on 1000 random numbers drawn from a standard normal L/P distribution, and estimated with 'hsvarfigure' function.
and then reading off the VaR from the fitted curve. In making use of a param etric approach, we therefore need to take account of both the statistical distribution and the type of data to which it applies.
Estimating VaR with Normally Distributed Profits/Losses Suppose that we wish to estim ate VaR under the assumption that P/L is normally distributed. In this case our VaR at the confi dence level a is: aVaR — —/JLp/i + crp/Lza
(1-8)
w here za is the standard normal variate corresponding to a, and /xP/L and a P/L are the mean and standard deviation of P/L. Thus, za is the value of the standard normal variate such that a of the probability density mass lies to its left, and 1 — a of the probability density mass lies to its right. For exam p le, if our confidence level is 95% , za — z095 will be 1.645. Loss (+)/profit (-)
Figure 1.3 Historical simulation via an empirical cumulative frequency function. Note: Based on the same data as Figure 1.2.
4
■
In practice, /xP/L and a P/L would be unknown, and we would have to estim ate VaR based on estim ates of these param eters. Our VaR estim ate, aVaRe, would then be: aVaRe - —mP/L + s P/Lza
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
(1.9)
where mP/L and sP/L are estim ates of the mean and standard deviation of P/L. Figure 1.4 shows the 95% VaR for a norm ally distributed P/L with mean 0 and standard deviation 1. Since the data are in P/L form , the VaR is indicated by the negative of the cut off point betw een the low er 5% and the upper 95% of P/L observations. The actual VaR is the negative of —1.645, and is therefore 1.645. If we are working with normally distributed U P data, then i i L/P — —/Jip/L and a/_/p = a P//_, and it im m ediately follows that: aVaR — /jl i_/p + crL/pZa
(1.10a)
aVaRe — nnL/P + s L/Pza
(1.10b)
Figure 1.5 illustrates the corresponding VaR. This figure gives the sam e inform ation as Figure 1.4, but is a little more straightforw ard to in terp ret because the VaR is defined in units of losses (or 'lost m oney') rather than P/L. In this case,
Profit (+)/loss (-)
Fiqure 1.4 VaR with standard normally distributed profit/loss data. Note: Obtained from Equation (1.9) with /jl P/L = 0 and crP/L = 1. Esti mated with the 'normalvarfigure' function.
the VaR is given by the point on the x-axis that cuts off the top 5% of the pdf m ass from the bottom 95% of pdf m ass. If we prefer to w ork with the cum ulative density function, the VaR is the x-value that co rresp onds to a cd f value of 95% . Eith er w ay, the VaR is again 1.645, as w e w ould (hopefully) e xp e ct.
Example 1.2
VaR with Normal P/L
If P/L over some period is normally distributed with mean 10 and standard deviation 20, then (by Equation (1.8)) the 95% VaR is - 1 0 + 20z0.95 = - 1 0 + 20 X 1.645 = 22.9. The corresponding 99% VaR is —10 + 20 z q 99 — —10 + 20 X 2.326 = 36.52.
Estimating VaR with Normally Distributed Arithmetic Returns
) ______ 1 -5 -4
--- - 1______ I______ I______ I __I--- --- - ----- I______ -3 -2 -1 0 1 2 3 4 5 Loss (+)/profit (-)
We can also estim ate VaR making assum ptions about
Fiaure 1.5 data.
VaR with normally distributed loss/profit
returns rather than P/L. Suppose then that w e assum e that
Note: Obtained from Equation (1,10a) with /jl L/P = 0 and aL/P = 1.
arithm etic returns are normally distributed with mean /ir and standard deviation ay. To derive the VaR, we begin by obtaining the critical value of rt, r*, such that the probabil
Since the actual return rt is the loss/profit divided by the earlier
ity that rt exceeds r* is equal to our confidence level a. r* is
asset value, Pt_ 1f it follows that:
therefore: r* = Hr ~ o-rza
( 1. 11)
( 1. 12)
Chapter 1 Estimating Market Risk Measures
■ 5
Substituting r* for rt then gives us the relationship betw een r* and the VaR: (1.13)
BOX 1.1 THE LOGNORMAL DISTRIBUTION
Equation (1.14) will give us equivalent answers to our earlier VaR
A random variate X is said to be lognorm ally d istrib uted if the natural log of X is norm ally d istrib u ted . The lognorm al distribution can be sp ecified in term s of the mean and standard deviation of In X. Call these p aram eters fi and cr. The lognorm al is often also rep re sented in term s of m and cr, w here m is the m edian of x, and m — exp(ju).
equations. For exam ple, if we set a — 0.95, f ir — 0, a r — 1 and
The pdf of X can be w ritten:
Substituting Equation (1.11) into Equation (1.13) and rearrang ing then gives us the VaR itself: aVaR - - { f i r - crrz JP t_-|
(1.14)
Pt_i = 1, which correspond to our earlier illustrative P/L and U P
1
param eter assum ptions, aVaR is 1.645: the three approaches give the same results, because all three sets of underlying assum ptions are equivalent.
Example 1.3
VaR with Normally Distributed Arithmetic
Returns Suppose arithm etic returns rt over some period are distributed as normal with mean 0.1 and standard deviation 0.25, and we have a portfolio currently worth 1. Then (by Equation (1.14)) the 95% VaR is -0 .1 + 0.25 X 1.645 = 0.331, and the 99% VaR is
\
1 /log(x) -
^X = ~U\/2tt 6XP{ ~2 \
J
J
for x > 0. Thus, the lognorm al pdf is only defined for positive values of x and is skew ed to the right as in Figure 1.6. Let co — exp(cr2) for convenience. The mean and variance of the lognormal can be written as: mean — mexp( In P* = R* + In Pt_-|
non-negative, and its distribution is skew ed with a long right-hand tail.
=> P* = Pt- 1 exp[P*] = Pt_'i exp [jl l r - a Rza]
Since the VaR is a loss, and since the loss is the difference
=> aVaR - Pt_-| - P* = Pt—1(1 - exp[/uLR - crRz J )
between Pt (which is random) and Pt_-| (which we can take here as given), then the VaR itself has the same distribution as Pt. Normally distributed geom etric returns imply that the VaR is
If we proceed as we did earlier with the arithm etic return, probability that R > R* is a:
■
normally distributed geom etric returns.
the standardised (but typically unrealistic) assum ptions that
we begin by deriving the critical value of R, R*, such that the
6
This gives us the lognormal VaR, which is consistent with
The lognormal VaR is illustrated in Figure 1.7, based on
lognormally distributed.
R* = HR - ( T Rza
(1.16)
juR — 0,
ctr
— 1, and Pt_[p) is specified by the user. The ES gives all tail-loss quantiles an equal w eight, and other quantiles a w eight of 0. Thus the ES is a sp e cial case of M^ obtained by setting a
which is 2.0250.
Table 1.1 Tail VaRs
(1.17)
0
high value of n and carry out the calculations on a sp read sh eet or using ap p rop riate so ftw are. How ever, to show the p ro ce
I (p)qpd p
True value
2.0630
Note: VaRs estimated assuming the mean and standard deviation of losses are 0 and 1.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
The more general coherent risk m easure,
involves a poten
to give accurate results. Table 1.4 reports som e alternative
tially more sophisticated weighting function (p). W e can there
estim ates obtained using this procedure with increasing val
fore estim ate any of these m easures by replacing the equal
ues of n. These results show that the estim ated risk m easure
w eights in the 'average VaR' algorithm with the 0(p) weights
rises with n, and gradually converges to a value in the region
appropriate to the risk measure being estim ated.
of about 1.854. The estim ates in this table indicate that we
To show how this might be done, suppose we have the exp o nential weighting function:
may need a considerably larger value of n than we did earlier to get results of the sam e level of accuracy. Even so, a good com puter should still be able to produce accurate estim ates of
p—(1—p)/y
spectral risk m easures fairly quickly.
=
( 1 1 9 )
W hen estim ating ES or more general coherent risk m easures in
and we believe that w e can represent the degree of our
practice, it also helps to have som e guidance on how to choose
risk-aversion by setting y = 0.05. To illustrate the procedure
the value of n. G ranted that the estim ate does eventually con
manually, we continue to assume that losses are standard
verge to the true value as n gets large, one useful approach is to
normally distributed and w e set n = 10 (i.e., we divide the
start with som e small value of n, and then double n repeatedly
com plete losses density function into 10 equal-probability
until we feel the estim ates have settled down sufficiently. Each
slices). W ith n = 10, we have n — 1 = 9 (i.e., n — 1) loss quan
tim e we do so, we halve the width of the discrete slices, and
tiles or VaRs spanning confidence levels from 0.1 to 0.90. These
we can monitor how this 'halving' process affects our estim ates.
VaRs are shown in the second column of Table 1.3, and vary
This suggests that we look at the 'halving error' s n given by:
from —1.2816 (for the 10% VaR) to 1.2816 (for the 90% VaR). The third column shows the (p) w eights corresponding to each confidence level, and the fourth column shows the products of each VaR and corresponding w eight. O ur estim ated exponential spectral risk measure is the 0(p)-w eighted average of the VaRs,
s n = M (n) - M (n/2)
(1.20)
w here M(n) is our estim ated risk m easure based on n slices. W e stop doubling n when s n falls below som e to lerance level that indicates an accep tab le level of accuracy. The process is
and is therefore equal to 0.4228. A s when estim ating the ES earlier, when using this type of routine in practice we would w ant a value of n large enough
Estimating Exponential Spectral Risk Table 1.3 Measure as a Weighted Average of VaRs Confidence Level (or)
aVaR
Weight (f>{a)
Table 1.4 Estimates of Exponential Spectral Coherent Risk Measure as a Function of the Number of Tail Slices Number of Tail Slices
Estimate of Exponential Spectral Risk Measure
10
0.4227
{a) are given by the exponential function (Equation (1.19)) with y = 0.05.
Chapter 1 Estimating Market Risk Measures
■ 9
Table 1.5
Estimated Risk Measures and Halving Errors
Number of Tail Slices 100
Estimated Spectral Risk Measure 1.5853
Thus, the key to estim ating any co herent risk m easure is to be able to estim ate quantiles or VaRs: the co h eren t risk m easures can then be obtained as ap p ro p riate ly w eig hted
Halving Error
averag es of q u an tiles. From a p ractical point of view , this is e xtre m e ly helpful as all the building blocks that go into
0.2114
quantile or VaR estim ation— d atab ases, calculation rou tin es, e tc .— are exactly w hat w e need for the estim ation of
200
1.7074
0.1221
400
1.7751
0.0678
800
1.8120
0.0368
1600
1.8317
0.0197
3200
1.8422
0.0105
6400
1.8477
0.0055
12 800
1.8506
0.0029
25 600
1.8521
0.0015
51 200
1.8529
0.0008
Note: VaRs estimated assuming the mean and standard deviation of losses are 0 and 1, using the 'normalvar' function in the MMR Toolbox. The weights 4>[a) are given by the exponential function (Equation (1.19)) with y — 0.05.
co h eren t risk m easures as w ell. If an institution already has a VaR eng ine, then that engine needs only sm all ad ju st m ents to produce estim ates of co h eren t risk m easures: in d eed , in many cases, all th at needs changing is the last few lines of code in a long data processing system . Th e costs of sw itching from VaR to more so p h isticated risk m easures are th erefo re very low.
1.5 ESTIMATING THE STANDARD ERRORS O F RISK M EASURE ESTIMATORS We should always bear in mind that any risk measure estim ates that we produce are just that— estim ates. We never know the true value of any risk m easure, and an estim ate is only as good
shown in Table 1.5. Starting from an arb itrary value of 100, we
as its precision: if a risk measure is very im precisely estim ated,
rep eated ly double n (so it becom es 200, 400, 800, e tc.). A s we
then the estim ator is virtually w orthless, because its imprecision
do so, the estim ated risk m easure gradually co nverg es, and
tells us that true value could be almost anything; on the other
the halving error gradually falls. So, for exam p le, for n = 6400,
hand, if we know that an estim ator is fairly precise, we can be
the estim ated risk m easure is 1.8477, and the halving error is
confident that the true value is fairly close to the estim ate, and
0 .0 0 5 5 . If w e double n to 12,800, the estim ated risk m easure
the estim ator has some value. Hence, we should always seek to
becom es 1.8506, and the halving error falls to 0 .0 0 2 9 ,
supplem ent any risk estim ates we produce with som e indicator
and so on.
of their precision. This is a fundam ental principle of good risk
However, this 'weighted average quantile' procedure is rather
m easurem ent practice.
crude, and (bearing in mind that the risk measure (Equation (1.17))
We can evaluate the precision of estim ators of risk m easures by
involves an integral) we can in principle expect to get substantial
means of their standard errors, or (generally better) by produc
improvements in accuracy if we resorted to more 'respectable'
ing confidence intervals for them . In this chapter we focus on
numerical integration or quadrature methods. This said, the crude
the more basic indicator, the standard error of a risk measure
'weighted average quantile' method actually seems to perform
estimator.
well for spectral exponential risk measures when compared against some of these alternatives, so one is not necessarily better off with the more sophisticated m ethods.5
Standard Errors of Quantile Estimators W e first co nsid er the standard errors of quantile (or VaR) estim ato rs. Follow ing Kendall and S tu a rt,6 sup po se w e have a distribution (or cum ulative density) function F(x), which
5 There is an interesting reason for this: the spectral weights give the highest loss the highest weight, whereas the quadrature methods such as the trapezoidal and Simpson's rules involve algorithms in which the two most extreme quantiles have their weights specifically cut, and this undermines the accuracy of the algorithm relative to the crude approach. However, there are ways round these sorts of problems, and in principle versions of the sophisticated approaches should give better results.
10
■
m ight be a param etric distribution function or an em pirical
6 Kendall and Stuart (1972), pp. 251-252.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
distribution function (i.e ., a cum ulative histogram ) estim ated from real data. Its co rresponding density or relative-frequency function is f(x). Sup p o se also that w e have a sam ple of size n, and w e se le ct a bin w idth h. Let d F be the p ro b ab ility that (k — 1) o b servatio n s fall below som e value q — h/2, th at one observation falls in the range q ± h/2, and that (n — k) o b ser vations are g reater than q + h/2. d F is proportional to {F(q )}k- l f(q)dq{1 - H q )}n~k
(1.21)
This gives us the frequency function for the quantile q not exceeded by a proportion k/n of our sam ple, i.e ., the 100(k/n)th percentile. If this proportion is p, Kendall and Stuart show that Equation (1.21) is approxim ately equal to p np(1 — p)n(1~p) for large values of n. If e is a very small increm ent to p, then p np( 1 - p)n(1" p) « (p + £)np(1 - p - £)n(1“ p)
(1.22)
Taking logs and expanding, Equation (1.22) is itself
Note: Based on random samples of size n drawn from a standard normal distribution. The bin width h is set to 0.1.
approxim ately
(p + £)np(1 - p - £)n(1" p) «
(1.23)
2p(1 - p)
which im plies that the distribution function d F is approxim ately proportional to
make a difference to our estim ates— and also on the bin width h, which is essentially arbitrary.
—ns2
exp
\
(1.24)
2p(1 - P ) J
The standard error can be used to construct confidence inter vals around our quantile estim ates in the usual textb o o k w ay.
Integrating this out, dF —
In addition, the quantile standard error depends on the prob ability density function f(.)— so the choice of density function can
For exam p le, a 90% confidence interval for a quantile q is
1 V 2 7 rV p (1 - p)/n
exp
—ns‘ 2p(1 - p)
ds
(1.25)
which tells us that £ is normally distributed in the limit with vari ance p(1 — p )/n. However, we know that de — d F(q ) = f(q)dq,
given by [q — 1.645 se(q), q + 1.645 se(q)] 1.645
Vp(1 - p)/n . . Kq)
Vp(1 - p)/n
. q + 1.645 —
(1.27)
Kq)
so the variance of q is variq)
P( 1 ~ P) n[f(q)]2
(1.26)
This gives us an approxim ate expression for the variance, and hence its square root, the standard error, of a quantile estim ator q. This expression shows that the quantile standard error depends on p, the sam ple size n and the pdf value f(q). The way in which the (normal) quantile standard errors depend on these param eters is apparent from Figure 1.9. This shows that:
Example 1.6
Obtaining VaR Confidence Intervals Using
Quantile Standard Errors Suppose we wish to estim ate the 90% confidence interval for a 95% VaR estim ated on a sam ple of size of n = 1000 to be drawn from a standard normal distribution, based on an assum ed bin width h — 0.1. W e know that the 95% VaR of a standard normal is 1.645. W e take this to be q in Equation (1.27), and w e know that q falls in the bin spanning 1.645 ± 0.1/2 = [1.595, 1.695].
•
The standard error falls as the sam ple size n rises.
•
The standard error rises as the probabilities becom e more
is also equal to p, and the probability of profit or a loss less
extrem e and we move further into the tail— hence, the more
than 1.595 is 0 .9 4 4 6 . Hence f(q), the probability mass in the
extrem e the quantile, the less precise its estimator.
q range, is 1 — 0.0450 — 0.9446 = 0.0104. W e now plug the
The probability of a loss exceed ing 1.695 is 0.0 4 5 , and this
Chapter 1 Estimating Market Risk Measures
■
11
relevant values into Equation (1.27) to obtain the 90% confi
of VaR and ES estim ators for Levy distributions with varying a
dence interval for the VaR:
stability param eters. Their results suggested that VaR and ES
1.645 - 1.645 1.645 + 1.645
V 0. 045(1
-
0 . 045)/1000
' 0.0104 V 0 . 045(1 - 0 . 045)/1000
0.0104 = [0.6081,2.6819]
This is a wide confidence interval, especially when com pared to the O S and bootstrap confidence intervals. The confidence interval narrows if we take a w ider bin width, so suppose that we now repeat the exercise using a bin width h — 0.2, which is probably as wide as we can reasonably go with these data, q now falls into the range 1.645 ± 0.2/2 = [1.545, 1.745]. p, the prob ability of a loss exceeding 1.745, is 0.0405, and the prob ability of profit or a loss less than 1.545 is 0.9388. Hence f[q) — 1 — 0.0405 — 0.9388 = 0.0207. Plugging these values into Equation (1.27) now gives us a new estim ate of the 90% confi dence interval: 1.645 - 1.645 1.645 + 1.645
Vo.0405(1
- 0.0405 )/1000
0.0207
Vo.0405(1
- 0.0405)/1000
0.0207 = [1.1496, 2.1404]
This is still a rather wide confidence interval. This exam ple illustrates that although we can use quantile stan dard errors to estim ate VaR confidence intervals, the intervals can be wide and also sensitive to the arbitrary choice of bin width. The quantile-standard-error approach is easy to implement and has some plausibility with large sample sizes. However, it also has weaknesses relative to other methods of assessing the precision of quantile (or VaR) estimators— it relies on asymptotic theory and requires large sample sizes; it can produce imprecise estimators,
estim ators had com parable standard errors for near-normal Levy distributions, but the ES estim ators had much bigger standard errors for particularly heavy-tailed distributions. They explained this finding by saying that as tails becam e heavier, ES estim a tors becam e more prone to the effects of large but infrequent losses. This finding suggests the depressing conclusion that the presence of heavy tails might make ES estim ators in general less accurate than VaR estim ators. Fortunately, there are grounds to think that such a conclusion might be overly pessim istic. For exam ple, Inui and Kijima (2003) present alternative results showing that the application of a Richardson extrapolation method can produce ES estim ators that are both unbiased and have com parable standard errors to VaR estim ators.7 A cerbi (2004) also looked at this issue and, although he confirm ed that tail heaviness did increase the stan dard errors of ES estim ators relative to VaR estim ators, he con cluded that the relative accuracies of VaR and ES estim ators were roughly com parable in em pirically realistic ranges. However, the standard error of any estim ator of a coherent risk measure will vary from one situation to another, and the best practical advice is to get into the habit of always estim ating the standard error w henever one estim ates the risk measure itself. Estim ating the standard error of an estim ator of a coherent risk measure is also relatively straightforw ard. O ne way to do so starts from recognition that a coherent risk measure is an L-estim ator (i.e., a w eighted average of order statistics), and L-estim ators are asym ptotically normal. If we take N discrete points in the density function, then as N gets large the variance of the estim ator of the coherent risk measure (Equation (1.17)) is approxim ately
cr(M ^) —> / 4>{p)4>{q)---------------------- d p d q * N J ^ WV (F -1(p))f(F-1(q)) P W pMF(x)W>(F(y))F(x)(1 - F(y))dxdy
(1.28)
x< y
or wide confidence intervals; it depends on the arbitrary choice of bin width; and the symmetric confidence intervals it produces are
and this can be com puted numerically using a suitable num eri
misleading for extrem e quantiles whose 'true' confidence inter
cal integration procedure. W here the risk measure is the ES, the
vals are asymmetric reflecting the increasing sparsity of extrem e
standard error becom es
observations as we move further out into the tail.
F-\a)F~\a) .( E S - ) - ^
Standard Errors in Estimators of Coherent Risk Measures
/
j[m in(F(x), R y)) - F(x)F(y)]dxdy (1.29) 0
0
and used in conjunction with a suitable numerical integration m ethod, this gives good estim ates even for relatively low values
We now consider standard errors in estim ators of coherent risk m easures. O ne of the first studies to exam ine this issue (Yamai and Yoshiba (2001b) did so by investigating the relative accuracy
12
■
7 See Inui and Kijima (2003).
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
of N .8 If we wish to obtain confidence intervals for our risk m ea
portfolio level is more limiting, but easier, while working at the
sure estim ators, we can make use of the asym ptotic normality of
position level gives us much more flexibility, but can involve
these estim ators to apply textb o o k form ulas (e.g ., such as
much more work.
Equation (1.27)) based on the estim ated standard errors and centred around a 'good' best estim ate of the risk m easure.
Which m eth od ? Having chosen our risk measure and level of analysis, we then choose a suitable estim ation m ethod.
An alternative approach to the estim ation of standard errors for
To decide on this, we would usually think in term s of the classic
estim ators of coherent risk measures is to apply a bootstrap: we
'VaR trinity':
bootstrap a large num ber of estim ators from the given distribu
•
Non-param etric m ethods
•
Param etric methods
ple of bootstrapped estim ators. Even better, we can also use
•
Monte Carlo simulation methods
a bootstrapped sam ple of estim ators to estim ate a confidence
Each of these involves som e com plex issues.
tion function (which might be param etric or non-param etric, e .g ., historical); and we estim ate the standard error of the sam
interval for our risk measure.
1.6 THE CO R E ISSUES: AN O VERVIEW Before proceeding to more detailed issues, it might be help ful to pause for a moment to take an overview of the structure,
1.7 APPEN D IX Preliminary Data Analysis W hen confronted with a new data set, we should n ever proceed
as it w ere, of the subject m atter itself. This is very useful, as it
straight to estim ation without som e prelim inary analysis to get
gives the reader a mental fram e of reference within which the
to know our data. Prelim inary data analysis is useful because it
'detailed' material that follows can be placed. Essentially, there
gives us a feel for our data, and because it can highlight prob
are three core issues, and all the material that follows can be
lems with our data set. Rem em ber that we never really know
related to these. They also have a natural sequence, so w e can
where our data come from , so we should always be a little wary
think of them as providing a roadmap that leads us to where we
of any new data set, regardless of how reputable the source
want to be. Which risk m easure? The first and most im portant is to choose the type of risk m easure: do we want to estim ate VaR, ES, etc.? This is logically the first issue, because w e need to know what we are trying to estim ate before we start thinking about how we are going to estim ate it.
might appear to be. For exam ple, how do you know that a clerk hasn't made a m istake som ewhere along the line in copying the data and, say, put a decim al point in the wrong place? The answer is that you don't, and never can. Even the most repu table data providers provide data with errors in them , however careful they are. Everyone who has ever done any em pirical work will have encountered such problem s at some tim e or
Which level? The second issue is the level of analysis. Do we
other: the bottom line is that real data must always be viewed
wish to estim ate our risk measure at the level of the portfo
with a certain amount of suspicion.
lio as a whole or at the level of the individual positions in it? The form er would involve us taking the portfolio as our basic unit of analysis (i.e., we take the portfolio to have a specified com position, which is taken as given for the purposes of our
Such prelim inary analysis should consist of at least the first two and preferably all three of the following steps: •
The first and by far the most im portant step is to eyeball the
analysis), and this will lead to a univariate stochastic analysis.
data to see if they 'look right'— or, more to the point, we
The alternative is to work from the position level, and this has
should eyeball the data to see if anything looks w rong. Does
the advantage of allowing us to accom m odate changes in the
the pattern of observations look right? Do any observations
portfolio com position within the analysis itself. The disadvan
stand out as questionable? And so on. The interocular trauma
tage is that we then need a m ultivariate stochastic fram ew ork,
test is the most im portant test ever invented and also the
and this is considerably more difficult to handle: we have to
easiest to carry out, and we should always perform it on any
get to grips with the problem s posed by variance-covariance
new data set.
m atrices, copulas, and so on, all of which are avoided if we work at the portfolio level. There is thus a trade-off: working at the
•
W e should plot our data on a histogram and estim ate the relevant summary statistics (i.e., mean, standard deviation, skew ness, kurtosis, etc.). In risk m easurem ent, we are par ticularly interested in any non-normal features of our data:
8 See Acerbi (2004, pp. 200-201).
skew ness, excess kurtosis, out-liers in our data, and the like.
Chapter 1 Estimating Market Risk Measures
■
13
We should therefore be on the lookout for any evidence of
of the distribution, which might indicate fat tails on at least
non-normality, and w e should take any such evidence into
the left-hand side. In fact, these particular observations are
account when considering w hether to fit any param etric dis
drawn from a Student-t distribution with 5 degrees of freedom ,
tribution to the data.
so in this case we know that the underlying true distribution
Having done this initial analysis, we should consider what
is unim odal, sym m etric and heavy tailed. However, we would
•
kind of distribution might fit our data, and there are a num ber of useful diagnostic tools available for this purpose, the most popular of which are Q Q plots— plots of em pirical quantiles against their theoretical equivalents.
not know this in a situation with 'real' data, and it is precisely because we do not know the distributions of real-world data sets that prelim inary analysis is so im portant. Som e sum m ary statistics for this data set are shown in Table 1.6. The sam ple mean (—0.099) and the sam ple m ode dif
Plotting the Data and Evaluating Summary Statistics
fer som ew hat (-0 .0 3 0 ), but this difference is small relative to
To get to know our data, we should first obtain their histogram
(3.985) is a little bigger than norm al. The sam ple minimum
and see what might stand out. Do the data look normal, or
(—4.660) and the sam ple maximum (3.010) are also not sym m et
the sam ple standard deviation (1.363). However, the sam ple skew (—0.503) is som ew hat negative and the sam ple kurtosis
non-normal? Do they show one pronounced peak, or more than
ric about the sam ple mean or m ode, which is further evidence
one? Do they seem to be skew ed? Do they have fat tails or thin
of asym m etry. If we encountered these results with 'real' data,
tails? Are there outliers? And so on.
we would be concerned about possible skew ness and kurtosis.
As an exam ple, Figure 1.10 shows a histogram of 100 random observations. In practice, we would usually wish to w ork with considerably longer data sets, but a data set this small helps to highlight the uncertainties one often encounters in practice.
However, in this hypothetical case we know that the sam ple skew ness is m erely a product of sam ple variation, because we happen to know that the data are drawn from a sym m etric distribution.
These observations show a dom inant peak in the centre, which
Depending on the context, we might also seriously consider
suggests that they are probably drawn from a unimodal distri
carrying out some formal tests. For exam ple, we might test
bution. On the other hand, there may be a negative skew, and
w hether the sam ple param eters (mean, standard deviation, etc.)
there are som e large outlying observations on the extrem e left
are consistent with what we might exp ect under a null hypoth esis (e.g ., such as normality). The underlying principle is very sim ple: since we n ever know the true distribution in practice, all we ever have to w ork with are estim ates based on the sam ple at hand; it therefore behoves us to make the best use of the data we have, and to extract as much information as possible from them .
QQ Plots Having done our initial analysis, it is often good practice to ask what distribution might fit our data, and a very useful device for identifying the distribution of our data is a quantile-quantile or Q Q plot— a plot of the quantiles of the em pirical distribu tion against those of some specified distribution. The shape of the Q Q plot tells us a lot about how the em pirical distribution com pares to the specified one. In particular, if the Q Q plot is linear, then the specified distribution fits the data, and we have identified the distribution to which our data belong. In addi
Fiaure 1.10
A histogram.
Note: Data are 100 observations randomly drawn from a Student-t with 5 degrees of freedom.
14
■
tion, departures of the Q Q from linearity in the tails can tell us w hether the tails of our em pirical distribution are fatter, or thinner, than the tails of the reference distribution to which it is being com pared.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
Table 1.6
6
Summary Statistics
Parameter
Value
4
Mean
-0 .0 9 9
Mode
-0 .0 3 0 •
Standard deviation Skewness Kurtosis
1.363 -0 .5 0 3 3.985
Minimum
-4 .6 6 0
Maximum
3.010
Num ber of observations
100
Note: Data are the same observations shown in Figure 1.10.
To illustrate, Figure 1.11 shows a Q Q plot for a data sam ple drawn from a normal distribution, com pared to a reference distribution that is also norm al. The Q Q plot is obviously close to linear: the central mass observations fit a linear Q Q plot very closely, and the extrem e tail observations som ew hat
u. The usefulness of the MEF stems from the fact that each distribution has its own distinc tive MEF. A comparison of the empirical MEF with the theoretical MEF associated with some specified distribution function therefore gives us an indication of whether the chosen distribution fits the tails of our empirical distribution. However, the results of MEF plots need to be interpreted with some care, because data observations become more scarce as X gets larger. For more on these and how they can be used, see Embrechts et. al. (1997, Chapters 3.4 and 6.2).
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
Non-Parametric Approaches Learning Objectives A fter com pleting this reading you should be able to: A p p ly the bootstrap historical simulation approach to
Com pare and contrast the age-w eighted, the volatility-
estim ate coherent risk m easures.
w eighted, the correlation-w eighted, and the filtered
D escribe historical simulation using non-param etric density estim ation.
historical simulation approaches. Identify advantages and disadvantages of non-param etric estim ation m ethods.
E x c e rp t is C h apter 4 o f M easuring M a rket Risk, S e co n d Edition, by Kevin D ow d.
17
This chapter looks at som e of the most popular approaches
for HS VaR and ES. Then we will address w eighted HS— how
to the estim ation of risk m easures— the non-param etric
we might w eight our data to capture the effects of observation
approaches, which seek to estim ate risk m easures without
age and changing m arket conditions. These m ethods introduce
making strong assum ptions about the relevant (e.g ., P/L)
param etric form ulas (such as G A R C H volatility forecasting
distribution. The essence of these approaches is that we try to
equations) into the picture, and in so doing convert hitherto
let the P/L data speak for them selves as much as possible, and
non-param etric m ethods into what are best described as semi-
use the recent em pirical (or in som e cases simulated) distribu
param etric m ethods. Such m ethods are very useful because
tion of P/L— not som e assumed theoretical distribution— to
they allow us to retain the broad HS fram ew ork while also
estim ate our risk m easures. All non-param etric approaches are
taking account of ways in which we think that the risks we face
based on the underlying assumption that the near future will
over the foreseeable horizon period might differ from those
be sufficiently like the recent past that we can use the data
in our sam ple period. Finally we review the main advantages
from the recent past to forecast risks over the near future— and
and disadvantages of non-param etric and sem i-param etric
this assumption may or may not be valid in any given context.
approaches, and offer some conclusions.
In deciding w hether to use any non-param etric approach, we must make a judgm ent about the extent to which data from the recent past are likely to give us a good guide about the risks we face over the horizon period we are concerned with. To keep the discussion as clear as possible, we will focus on the estim ation of non-param etric VaR and ES. However, the m ethods discussed here extend very naturally to the estimation of coherent and other risk measures as well. These can be estim ated using an 'average quantile' approach along the lines discussed in C hapter 1: w e would select our weighting function 4>{p), decide on the num ber of probability 'slices' n to take, estim ate the associated quantiles, and take the w eighted aver age using an appropriate numerical algorithm (see Chapter 1).1 We can then obtain standard errors or confidence intervals for our risk measures using suitably modified form s. In this chapter we begin by discussing how to assem ble the P/L data to be used for estim ating risk m easures. We then discuss the most popular non-param etric approach-historical simulation (HS). Loosely speaking, HS is a histogram -based approach: it is
2.1 COM PILING HISTORICAL SIMULATION DATA The first task is to assem ble a suitable P/L series for our portfo lio, and this requires a set of historical P/L or return observations on the positions in our current portfolio. These P/Ls or returns will be measured over a particular frequency (e.g ., a day), and we want a reasonably large set of historical P/L or return obser vations over the recent past. Suppose we have a portfolio of n assets, and for each asset / we have the observed return for each of T subperiods (e.g ., daily subperiods) in our historical sam ple period. If Ri t is the (possibly m apped) return on asset / in subperiod t, and if w( is the amount currently invested in asset /, then the historically sim ulated portfolio P/L over the subperiod t is: P /L, =
(=1
(2-1)
conceptually sim ple, easy to im plem ent, very w idely used, and
Equation (2.1) gives us a historically sim ulated P/L series for our
has a fairly good historical record. We focus on the estim ation of
current portfolio, and is the basis of HS VaR and ES. This series
VaR and ES, but as explained in the previous chapter, more gen
will not generally be the same as the P/L actually earned on our
eral coherent risk measures can be estim ated using appropri
portfolio— because our portfolio may have changed in com posi
ately w eighted averages of any non-param etric VaR estim ates.
tion over tim e or be subject to mapping approxim ations, and so
We then discuss refinem ents to basic HS using bootstrap and
on. Instead, the historical simulation P/L is the P/L we w ould
kernel m ethods, and the estim ation of VaR or ES curves and sur
have earned on our current portfolio had w e held it throughout
faces. W e will discuss how we can estim ate confidence intervals
the historical sam ple period.2 As an aside, the fact that multiple positions collapse into
A
Nonetheless, there is an important caveat. This method was explained in Chapter 1 in an implicit context where the risk measurer could choose n, and this is sometimes not possible in a non-parametric context. For example, a risk measurer might be working with an n deter mined by the HS data set, and even where he/she has some freedom to select n, their range of choice might be limited by the data avail able. Such constraints can limit the degree of accuracy of any resulting estimated risk measures. However, a good solution to such problems is to increase the sample size by bootstrapping from the sample data. (The bootstrap is discussed further in Appendix 2 to this chapter).
18
■
one single HS P/L as given by Equation (2.1) im plies that it is
2 To be more precise, the historical simulation P/L is the P/L we would have earned over the sample period had we rearranged the portfolio at the end of each trading day to ensure that the amount left invested in each asset was the same as at the end of the previous trading day: we take out our profits, or make up for our losses, to keep the w,- constant from one end-of-day to the next.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
70
very easy for non-param etric m ethods to accom m odate high dim ensions— unlike the case for som e param etric m ethods. W ith non-param etric m ethods, there are no
60 -
problem s dealing with variance-covariance m atrices, curses of dim ensionality, and the like. This means that
50 -
non-param etric m ethods will often be the most natural choice for high-dimension problem s. C
D D"
95% VaR = 1.704
40 -
CD
2.2 ESTIMATION O F HISTORICAL SIMULATION V a R AND ES
CD
20
Basic Historical Simulation
10
Having obtained our historical simulation P/L data, we
-
-
can estim ate VaR by plotting the P/L (or L/P) on a simple 0 -4
histogram and then reading off the VaR from the histo gram. To illustrate, suppose we have 1000 observations in
Figure 2.1
Figure 2.1. If these were daily data, this sam ple size would to a year. If we take our confidence level to be 95% , our
rh .m - 3
-
2
-
1
0
1
2
rTTTn
Loss (+)/Profit (-)
our HS P/L series and we plot the L/P histogram shown in be equivalent to four years' daily data at 250 trading days
95% ES = 2.196
30 -
Basic historical simulation VaR and ES.
Note: This figure and associated VaR and ES estimates are obtained using the 'hsesfigure' function.
VaR is given by the x-value that cuts off the upper 5% of very high losses from the rest of the distribution. Given
bootstrap is very intuitive and easy to apply. A bootstrapped
1000 observations, we can take this value (i.e., our VaR) to be
estim ate will often be more accurate than a 'raw' sam ple esti
the 51 st highest loss value, or 1.704.3 The ES is then the aver
m ate, and bootstraps are also useful for gauging the precision
age of the 50 highest losses, or 2.196.
of our estim ates. To apply the bootstrap, we create a large num
The imprecision of these estim ates should be obvious when we consider that the sam ple data set was drawn from a standard normal distribution. In this case the 'true' underlying VaR and ES are 1.645 and 2.063, and Figure 2.1 should (ideally) be normal. O f course, this imprecision underlines the need to work with arge sam ple sizes where practically feasible.
Bootstrapped Historical Simulation O ne simple but powerful im provem ent over basic HS is to estim ate VaR and ES from bootstrapped data. As explained in A p p end ix 2 to this chapter, a bootstrap procedure involves resampling from our existing data set with replacem ent. The
3 We can also estimate the HS VaR more directly (i.e., without bothering with the histogram) by using a spreadsheet function that gives us the 51st highest loss value (e.g., the 'Large' command in Excel), or we can sort our losses data with highest losses ranked first, and then obtain the VaR as the 51st observation in our sorted loss data. We could also take VaR to be any point between the 50th and 51st largest losses (e.g., such as their mid-point), but with a reasonable sample size (as here) there will seldom be much difference between these losses anyway. For conve nience, we will adhere throughout to this convention of taking the VaR to be the highest loss observation outside the tail.
ber of new sam ples, each observation of which is obtained by drawing at random from our original sam ple and replacing the observation after it has been drawn. Each new 'resam pled' sam ple gives us a new VaR estim ate, and we can take our 'best' esti mate to be the mean of these resam ple-based estim ates. The same approach can also be used to produce resam ple-based ES estim ates— each one of which would be the average of the losses in each resam ple exceeding the resam ple VaR— and our 'best' ES estim ate would be the mean of these estim ates. In our particular case, if we take 1000 resam ples, then our best VaR and ES estim ates are (because of bootstrap sampling variation) about 1.669 and 2.114— and the fact that these are much closer to the known true values than our earlier basic HS estim ates suggests that bootstraps estim ates might be more accurate.
Historical Simulation Using Non-parametric Density Estimation A nother potential im provem ent over basic HS som etim es suggested is to make use of non param etric density estim a tion. To appreciate what this involves, w e must recognise that basic HS does not make the best use of the information we have. It also has the practical draw back that it only allows us to
Chapter 2 Non-Parametric Approaches
■
19
estim ate VaRs at discrete confidence intervals determ ined by
(a) Original histogram
(b) Surrogate density function
the size of our data set. For exam ple, if we have 100 HS P/L observations, basic HS allows us to estim ate VaR at the 95% confidence level, but not the VaR at the 95.1 % confidence level. The VaR at the 95% confidence level is given by the sixth largest loss, but the VaR at the 95.1% confidence level is a problem because there is no corresponding loss observation to go with it. W e know that it should be greater than the sixth largest loss (or the 95% VaR), and sm aller than the fifth largest loss (or the 96% VaR), but with only 100 observations there is no observation that corresponds to any VaR whose confi dence level involves a fraction of 1%. With n observations, basic HS only allows us to estim ate the VaRs associated with, at best, n different confidence levels.
Fiaure 2.2
Histograms and surrogate density functions.
Non-param etric density estim ation offers a potential solution to both these problem s. The idea is to treat our data as if they were drawings from some unspecified or unknown em pirical distribution function. This approach also encourages us to con front potentially im portant decisions about the width of bins and where bins should be centred, and these decisions can som e tim es make a difference to our results. Besides using a histo gram, w e can also represent our data using naive estim ators or, more generally, kernels, and the literature tells us that kernels are (or ought to be) superior. So, having assem bled our 'raw' HS data, we need to make decisions on the widths of bins and where they should be centred, and w hether to use a histogram , a naive estimator, or som e form of kernel. If we make good deci sions on these issues, we can hope to get better estim ates of VaR and ES (and more general coherent measures). Non-parametric density estimation also allows us to estimate VaRs and ESs for any confidence levels we like and so avoid con straints imposed by the size of our data set. In effect, it enables us to draw lines through points on or near the edges of the 'bars' of a histogram. We can then treat the areas under these lines as a surrogate pdf, and so proceed to estim ate VaRs for arbitrary con fidence levels. The idea is illustrated in Figure 2.2. The left-hand side of this figure shows three bars from a histogram (or naive estimator) close up. Assuming that the height of the histogram (or naive estimator) measures relative frequency, then one option is to treat the histogram itself as a pdf. Unfortunately, the result ing pdf would be a strange one—just look at the corners of each bar— and it makes more sense to approxim ate the pdf by drawing
our data set. Each possible confidence level would correspond to its own tail sim ilar to the shaded area shown in Figure 2.2(b), and we can then use a suitable calculation method to estim ate the VaR (e.g ., w e can carry out the calculations on a spreadsheet or, more easily, by using a purpose-built function such as the 'hsvar' function in the MMR Toolbox).4 O f course, drawing straight lines through the mid-points of the tops of histogram bars is not the best we can do: we could draw smooth curves that m eet up nicely, and so on. This is exactly the point of nonparam etric density estim ation, the purpose of which is to give us some guidance on how 'best' to draw lines through the data points we have. Such m ethods are also straightforward to apply if we have suitable software. Some empirical evidence by Butler and Schachter (1998) using real trading portfolios suggests that kernel-type methods produce VaR estimates that are a little different to those we would obtain under basic HS. However, their work also suggests that the differ ent types of kernel methods produce quite similar VaR estimates, although to the extent that there are differences among them, they also found that the 'best' kernels were the adaptive Epanechinikov and adaptive Gaussian ones. To investigate these issues myself, I applied four standard kernel estimators— based on normal, box, triangular and Epanechinikov kernels— to the test data used in earlier examples, and found that each of these gave the same VaR estimate of 1.735. In this case, these different kernels produced the same VaR estimate, which is a little higher (and, curiously,
lines through the upper parts of the histogram. A sim ple way to do this is to draw in straight lines connecting the mid-points at the top of each histogram bar, as illustrated in the figure. O nce we draw these lines, we can forget about the histogram bars and treat the area under the lines as if it were a pdf. Treating the area under the lines as a pdf then enables us to estim ate VaRs at any confidence level, regardless of the size of
20
■
4 The actual programming is a little tedious, but the gist of it is that if the confidence level is such that the VaR falls between two loss obser vations, then we take the VaR to be a weighted average of these two observations. The weights are chosen so that a vertical line drawn through the VaR demarcates the area under the 'curve' in the correct proportions, with a to one side and 1 — a to the other. The details can be seen in the coding for the 'hsvar' and related functions.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
a little less accurate) than the basic HS VaR estimate of 1.704 obtained earlier. Other results not reported here suggest that the different kernels can give somewhat different estimates with smaller samples, but again suggest that the exact kernel specification does not make a great deal of difference. So although kernel m ethods are better in theory, they do not necessarily pro duce much better estim ates in practice. There are also practical reasons why we might prefer sim pler non-param etric density estim ation m ethods over kernel ones. Although the kernel m ethods are theoretically better, crude m ethods like drawing straight-line 'curves' through the tops of histogram s are more transparent and easier to check. W e should also not forget that our results are subject to a number of sources of error (e.g ., due to errors in P/L data, mapping approxim a tions, and so on), so there is a natural limit to how much real fineness we can actually achieve.
Confidence level
Fiaure 2.3
Plots of HS VaR and ES against confidence level.
Note: Obtained using the 'hsvaresplot2D_cl' function and the same hypothetical P/L data used in Figure 2.1.
Estimating Curves and Surfaces for VaR and ES It is straightforward to produce plots of VaR or ES against the con fidence level. For exam ple, our earlier hypothetical P/L data yields the curves of VaR and ES against the confidence level shown in Figure 2.3. Note that the VaR curve is fairly unsteady, as it directly reflects the randomness of individual loss observations, but the ES curve is smoother, because each ES is an average of tail losses.
soon find that we don't have enough data. To illustrate, if we have 1000 observations of daily P/L, corresponding to four years' worth of data at 250 trading days a year, then w e have 1000 P/L observations if we use a daily holding period. If we have a w eekly holding period, with five days to a w eek, each w eekly P/L will be the sum of five daily P/Ls, and we end up with only 200 observations of w eekly P/L; if we have a monthly hold ing period, we have only 50 observations of monthly P/L; and so on. Given our initial data, the num ber of effective observations
It is more difficult constructing curves that show how non-para
rapidly falls as the holding period rises, and the size of the data
metric VaR or ES changes with the holding period. The methods
set im poses a major constraint on how large the holding period
discussed so far enable us to estimate the VaR or ES at a single
can practically be. In any case, even if we had a very long run of
holding period equal to the frequency period over which our
data, the older observations might have very little relevance for
data are observed (e.g., they give us VaR or ES for a daily holding
current m arket conditions.
period if P/L is measured daily). In theory, we can then estimate VaRs or ESs for any other holding periods we wish by construct ing a HS P/L series whose frequency matches our desired hold ing period: if we wanted to estimate VaR over a weekly holding period, say, we could construct a weekly P/L series and estimate the VaR from that. There is, in short, no theoretical problem as such with estimating HS VaR or ES over any holding period we like.
2.3 ESTIMATING C O N FID EN C E INTERVALS FOR HISTORICAL SIMULATION V a R AND ES The m ethods considered so far are good for giving point esti
However, there is a major practical problem : as the holding
mates of VaR or ES, but they don't give us any indication of the
period rises, the number of observations rapidly falls, and we
precision of these estim ates or any indication of VaR or ES
Chapter 2 Non-Parametric Approaches
■ 21
confidence intervals. However, there are
200
m ethods to get around this limitation and produce confidence intervals for our
180 -
risk estim ates.5 160 -
An Order Statistics Approach to the Estimation of Confidence Intervals for HS VaR and ES One of the most promising methods is
140 120
Upper bound of 90% confidence interval = 1.797
Lower bound of 90% confidence interval = 1.552
-
u c
CD
3 100 O" Cl)
-
80 -
to apply the theory of order statistics, explained in Appendix 1 to this chapter.
60 -
This approach gives us, not just a VaR (or ES) estim ate, but a complete VaR
40 -
(or ES) distribution function from which we can read off the VaR (or ES) confidence
20
-
interval. (The central tendency param eters (mean, mode, median) also give us
0
alternative point estimates of our VaR or ES, if we want them.) This approach is (relatively) easy to programme and very general in its application. Applied to our earlier P/L data, the OS
1.5
1.55
1.6
1.65
1.7
1.75
1.8
1.85
1.9
VaR
Fiqure 2.4
Bootstrapped VaR.
Note: Results obtained using the 'bootstrapvarfigure' function with 1000 resamples, and the same hypothetical data as in earlier figures.
approach gives us estim ates (obtained using the 'hsvarpdfperc' function) of the 5% and 95% points of the 95% VaR distribution function— that is, the bounds of the 90% confidence interval for our VaR— of 1.552 and 1.797. This tells us we can be 90% confident that the 'true' VaR lies in the range [1.552, 1.797].
A Bootstrap Approach to the Estimation of Confidence Intervals for HS VaR and ES W e can also estim ate confidence intervals using a b o o t strap app ro ach: we produce a b o o tstrap p ed histogram of resam ple-based VaR (or ES) estim ates, and then read the
The corresponding points of the ES distribution function can be
co nfidence interval from the quantiles of this histogram .
obtained (using the 'hsesdfperc' function) by mapping from the
For exam p le, if w e take 1000 b o o tstrap p ed sam ples from our
VaR to the ES: we take a point on the VaR distribution function,
P/L data set, estim ate the 95% VaR of each, and then plot
and estim ate the corresponding percentile point on the ES
them , we g et the histogram shown in Figure 2 .4 . Using the
distribution function. Doing this gives us an estim ated 90%
basic p ercentile interval approach outlined in A p p e n d ix 2
confidence interval of [2.021, 2.224].6
to this chapter, the 90% confidence interval fo r our VaR is [1 .5 5 4 , 1.797]. The sim ulated histogram is surprisingly disjointed , although the bootstrap seem s to give a relatively
5 In addition to the methods considered in this section, we can also estimate confidence intervals for VaR using estimates of the quantile standard errors. However, as made clear there, such confidence intervals are subject to a number of problems, and the methods suggested here are usually preferable. 6 Naturally, the order statistics approach can be combined with more sophisticated non-parametric density estimation approaches. Instead of applying the OS theory to the histogram or naive estimator, we could apply it to a more sophisticated kernel estimator, and thereby extract more information from our data. This approach has some merit and is developed in detail by Butler and Schachter (1998).
22
■
robust estim ate of the co nfidence interval if w e keep repeating the e xercise. W e can also use the b o o tstrap to estim ate ESs in much the sam e w ay: for each new resam pled data set, w e estim ate the VaR, and then estim ate the ES as the averag e of losses in e xce ss of VaR. Doing this a large num ber of tim es gives us a large num ber of ES estim ates, and w e can plot them in the sam e w ay as the VaR estim ates. Th e histogram of b o o tstrap p ed ES values is shown in Figure 2 .5 , and is b etter
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
2.4 W EIGHTED HISTORICAL SIMULATION
100
90 80 -
Upper bound of 90% confidence interval = 2.271
Lower bound of 90% confidence interval = 1.986
70 -
O ne of the most im portant features of traditional HS is the way it w eights past observations. Recall that Rl t is the return on asset / in period t, and we are im ple
60 > u* c o 3 50 CT 0)
menting HS using the past n observa tions. An observation Rj t_j will therefore belong to our data set if j takes any of the values 1 , . . . , t — n, where j is the
40 -
age of the observation (e.g ., so j — 1 indicates that the observation is 1 day
30 -
old, and so on). If we construct a new HS 20
P/L series, P /L t, each day, our observa
-
tion Ri t_j will first affect P / L t, then affect 10
P /L t+ , and so on, and finally affect 7
-
P / L i+n: our return observation will affect rz 1.9
0
1.8
2.1
2.2
2.3
2.4
ETL
Figure 2.5
2.5
each of the next n observations in our P/L series. A lso, other things (e.g ., posi tion weights) being equal, Rjt-j will affect
Bootstrapped ES.
each P/L in exactly the sam e way. But
Note: Results obtained using the 'bootstrapesfigure' function with 1000 resamples, and the same hypothetical data as in earlier figures.
after n periods have passed, Ri t-j will fall out of the data set used to calculate the current HS P/L series, and will thereafter
Table 2.1 90% Confidence Intervals for Non-parametric VaR and ES Approach
Lower bound
have no effect on P/L. In short, our HS P/L series is constructed in a way that gives any observation the
Upper bound 95% VaR
O rder statistics Bootstrap
1.552 1.554
no w eight (i.e., a zero weight) if it is older than that. This weighting structure has a number of problem s. One
1.797
problem is that it is hard to justify giving each observation in our
1.797
sam ple period the sam e w eight, regardless of age, market
95% ES O rder statistics
2.021
2.224
Bootstrap
1.986
2.271
Note; Bootstrap estimates based on 1000 resamples.
behaved than the VaR histogram in the last figure because the ES is an averag e of tail VaRs. Th e 90% co nfid ence interval for o u r E S is [1 .9 8 6 , 2.271]. It is also interesting to com pare the VaR and ES confidence intervals obtained by the tw o m ethods. Th ese are sum m arised in Table 2.1, and we can see that the O S and bootstrap approaches give very sim ilar results. This suggests that either approach is likely to be a reasonable one to use in p ractice.
sam e w eight on P/L provided it is less than n periods old, and
volatility, or anything else. A good exam ple of the difficulties this can create is given by Shimko et. al. (1998). It is well known that natural gas prices are usually more volatile in the winter than in the summer, so a raw HS approach that incorporates both sum m er and w inter observations will tend to average the sum m er and w inter observations together. As a result, treating all observations as having equal w eight will tend to underesti mate true risks in the winter, and overestim ate them in the sum m er .7 The equal-weight approach can also make risk estim ates unresponsive to major events. For instance, a stock m arket crash
7 If we have data that show seasonal volatility changes, a solution— suggested by Shimko et. al. (1998)— is to weight the data to reflect seasonal volatility (e.g., so winter observations get more weight, if we are estimating a VaR in winter).
Chapter 2 Non-Parametric Approaches
■ 23
might have no effect on VaRs excep t at a very high confidence
historical sim ulation' and can be regarded as sem i-param etric
level, so we could have a situation where everyone might agree
m ethods because they com bine features of both param etric and
that risk had suddenly increased, and yet that increase in risk
non-param etric m ethods.
would be missed by most HS VaR estim ates. The increase in risk would only show up later in VaR estim ates if the stock market continued to fall in subsequent days— a case of the stable door closing only well after the horse had long since bolted. That
Age-weighted Historical Simulation O ne such approach is to w eight the relative im portance, of our
said, the increase in risk w ould show up in ES estim ates just
observations by their age, as suggested by Boudoukh, Richard
after the first shock occurred— which is, incidentally, a good
son and W hitelaw (BRW : 1998). Instead of treating each obser
exam ple of how ES can be a more inform ative risk measure
vation for asset / as having the same implied probability as any
than the VaR .8* The equal-weight structure also presum es that each observa tion in the sam ple period is equally likely and independent of the others over tim e. However, this 'iid' assumption is unrealistic because it is well known that volatilities vary over tim e, and that periods of high and low volatility tend to be clustered together. The natural gas exam ple just considered is a good case in point. It is also hard to justify why an observation should have a w eight that suddenly goes to zero when it reaches age n. W hy is it that an observation of age n — 1 is regarded as having a lot of value (and, indeed, the same value as any more recent observation),
other (i.e., 1 /n), we could w eight their probabilities to discount the older observations in favour of newer ones. Thus, if w(1) is the probability w eight given to an observation 1 day old, then w(2 ), the probability given to an observation 2 days old, could be Aw(1); w(3) could be A2w(1); and so on. The A term is between
0 and 1 , and reflects the exponential rate of decay in the w eight or value given to an observation as it ages: a A close to 1 indi cates a slow rate of decay, and a A far away from 1 indicates a high rate of decay. w ( 1 ) is set so that the sum of the weights is 1, and this is achieved if we set w(1) = (1 — A)/(1 — An). The w eight given to an observation / days old is therefore: AM (1 - A) r
but an observation of age n is regarded as having no value at
-( 0 = - r ^
all? Even old observations usually have som e information con
(
2 . 2)
tent, and giving them zero value tends to violate the old statisti
and this corresponds to the w eight of 1 /n given to any in-sample
cal adage that we should never throw information away.
observation under basic HS.
This weighting structure also creates the potential for ghost
O ur core inform ation— the inform ation inputted to the HS
effects— we can have a VaR that is unduly high (or low) because
estim ation p ro cess— is the paired set of P/L values and asso
of a small cluster of high loss observations, or even just a single
ciated p ro b ab ility w eig h ts. To im plem ent ag e-w eig hting ,
high loss, and the measured VaR will continue to be high (or low)
w e m erely replace the old equal w eig h ts 1 /n with the age-
until n days or so have passed and the observation has fallen out of the sam ple period. A t that point, the VaR will fall again, but
d ep en d en t w eig h ts w(i) given by (2.4). For exam p le , if we are using a sp re a d sh e e t, w e can ord er our P/L o b servatio n s
the fall in VaR is only a ghost effect created by the weighting
in one colum n, put th e ir w eig h ts w(i) in the next colum n, and
structure and the length of sam ple period used.
go down that colum n until w e reach our desired p ercen tile.
We now address various ways in which we might 'adjust' our data to overcom e som e of these problem s and take account of ways in which current m arket conditions might differ from those in our sam ple. These fall under the broad heading of 'w eighted
O ur VaR is then the negative of the co rresponding value in the first colum n. And if our d esired p ercen tile falls betw een tw o p e rce n tile s, w e can take our VaR to be the (negative of the) in terp o lated value of the co rresp ond ing first-colum n o b servatio n s. This age-weighted approach has four major attractions. First,
8 However, both VaR and ES suffer from a related problem. As Pritsker (2001, p. 5) points out, HS fails to take account of useful information from the upper tail of the P/L distribution. If the stock experiences a series of large falls, then a position that was long the market would experience large losses that should show up, albeit later, in HS risk estimates. However, a position that was short the market would experi ence a series of large profits, and risk estimates at the usual confidence levels would be completely unresponsive. Once again, we could have a situation where risk had clearly increased—because the fall in the market signifies increased volatility, and therefore a significant chance of losses due to large rises in the stock market—and yet our risk estimates had failed to pick up this increase in risk.
24
■
it provides a nice generalisation of traditional HS, because we can regard traditional HS as a special case with zero decay, or A —> 1. If HS is like driving along a road looking only at the rear view mirror, then traditional equal-weighted HS is only safe if the road is straight, and the age-weighted approach is safe if the road bends gently. Second, a suitable choice of A can make the VaR (or ES) esti mates more responsive to large loss observations: a large loss event will receive a higher w eight than under traditional HS, and
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
the resulting next-day VaR would be higher than it would oth
exam ple, if the current volatility in a m arket is 1.5% a day, and it
erwise have been. This not only means that age-weighted VaR
was only 1 % a day a month ago, then data a month old under
estim ates are more responsive to large losses, but also makes
state the changes we can exp ect to see tom orrow, and this
them better at handling clusters of large losses.
suggests that historical returns would underestim ate tom orrow's
Third, age-weighting helps to reduce distortions caused by events that are unlikely to recur, and helps to reduce ghost effects. As an observation ages, its probability w eight gradually falls and its influence dim inishes gradually over tim e. Further more, when it finally falls out of the sam ple period, its w eight will fall from Anw(1) to zero, instead of from 1 /n to zero. Since
risks; on the other hand, if last month's volatility was 2 % a day, month-old data will overstate the changes we can exp ect tom or row, and historical returns would overestim ate tom orrow's risks. We therefore adjust the historical returns to reflect how volatility tom orrow is believed to have changed from its past values. Suppose we are interested in forecasting VaR for day T. Let
Anw(1) is less than 1/n for any reasonable values of A and n, then
rti be the historical return in asset / on day t in our historical
the shock— the ghost effect— will be less than it would be under
sam ple, ort j be the historical G A R C H (or EW M A) forecast of the
equal-weighted HS.
volatility of the return on asset / for day t, made at the end of
Finally, we can also modify age-weighting in a way that makes our risk estim ates more efficient and effectively elim inates any remaining ghost effects. Since age-weighting allows the im pact
day t — 1 , and crTi be our most recent forecast of the volatility of asset /. W e then replace the returns in our data set, rt i, with volatility-adjusted returns, given by:
of past extrem e events to decline as past events recede in tim e, it gives us the option of letting our sam ple size grow over tim e. (Why can't we do this under equal-weighted HS? Because we would be stuck with ancient observations whose information content was assumed never to date.) Age-w eighting allows us to let our sam ple period grow with each new observation, so we never throw potentially valuable information away. This would improve efficiency and elim inate ghost effects, because there would no longer be any 'jum ps' in our sam ple resulting from old observations being thrown away. However, age-weighting also reduces the effective sam ple size, other things being equal, and a sequence of major profits or losses can produce major distortions in its implied risk profile. In
Actual returns in any period ta re therefore increased (or decreased), depending on w hether the current forecast of vola tility is greater (or less than) the estim ated volatility for period t. We now calculate the HS P/L using Equation (2.3) instead of the original data set rt/, and then proceed to estim ate HS VaRs or ESs in the traditional way (i.e., with equal w eights, e tc .).10* The HW approach has a number of advantages relative to the traditional equal-weighted and/or the BRW age-weighted approaches: •
addition, Pritsker shows that even with age-weighting, VaR
and the age-weighted approach treats volatility changes in a
estim ates can still be insufficiently responsive to changes in
rather arbitrary and restrictive way.
underlying risk .9 Furtherm ore, there is the disturbing point that the BRW approach is ad hoc, and that excep t for the special
It takes account of volatility changes in a natural and direct way, w hereas equal-weighted HS ignores volatility changes
•
case where A = 1 w e cannot point to any asset-return process
It produces risk estim ates that are appropriately sensitive to current volatility estim ates, and so enables us to incorpo
for which the BRW approach is theoretically correct.
rate information from G A R C H forecasts into HS VaR and ES estim ation.
Volatility-weighted Historical Simulation
•
It allows us to obtain VaR and ES estim ates that can exceed the maximum loss in our historical data set: in periods of high
We can also w eight our data by volatility. The basic idea— sug
volatility, historical returns are scaled upwards, and the HS
gested by Hull and W hite (HW; 1998b)— is to update return
P/L series used in the HW procedure will have values that
information to take account of recent changes in volatility. For
exceed actual historical losses. This is a major advantage over traditional HS, which prevents the VaR or ES from being any
9 If VaR is estimated at the confidence level a, the probability of an HS estimate of VaR rising on any given day is equal to the probability of a loss in excess of VaR, which is of course 1 — a. However, if we assume a standard GARCH(1,1) process and volatility is at its long-run mean value, then Pritsker's proposition 2 shows that the probability that HSVaR should increase is about 32% (Pritsker (2001, pp. 7-9)). In other words, most of the time HS VaR estimates should increase (i.e., when risk rises), they fail to.
bigger than the losses in our historical data set. •
Em pirical evidence presented by HW indicates that their approach produces superior VaR estim ates to the BRW one.
10 Naturally, volatility weighting presupposes that one has estimates of the current and past volatilities to work with.
Chapter 2 Non-Parametric Approaches
■ 25
The HW approach is also capable of various extensions. For
major generalisation of the HW approach, because it gives us a
instance, we can com bine it with the age-weighted approach if
weighting system that takes account of correlations as well as
we wished to increase the sensitivity of risk estim ates to large
volatilities.
losses, and to reduce the potential for distortions and ghost effects. W e can also com bine the HW approach with O S or bootstrap m ethods to estim ate confidence intervals for our VaR
Example 2.1
Correlation-weighted HS
or ES— that is, w e would w ork with order statistics or resam ple
Suppose we have only two positions in our portfolio, so m — 2.
with replacem ent from the HW -adjusted P/L, rather than from
The historical correlation between our two positions is 0.3, and
the traditional HS P/L.
we wish to adjust our historical returns R to reflect a current correlation of 0.9.
Correlation-weighted Historical Simulation
If 3jj is the /, jth elem ent of the 2 x 2 m atrix A, then applying the Choleski decom position tells us that
We can also adjust our historical returns to reflect changes
3n = 1 ,
between historical and current correlations. Correlation-w eight
a -12 — 0 ,
a2i = p,
322 ~ V 1 — p2
ing is a little more involved than volatility-weighting. To see the
where p — 0.3. The m atrix A is sim ilar except for having p — 0.9.
principles involved, suppose for the sake of argum ent that we
Standard matrix theory also tells us that
have already made any volatility-based adjustm ents to our HS
a 22/ — a 12
returns along Hull-White lines, but also wish to adjust those
a 21/a 11 _
returns to reflect changes in correlations .11
Substituting these into Equation (2.5), w e find that
To make the discussion concrete, we have m positions
R = AA 1 R =
and our (perhaps volatility adjusted) 1 X m vector of his torical returns R for som e period t reflects an m X m
HR 1 V i - 0.32 V i - 0.32,0 -0.3,1 _o.9, v T - 0.92_ 1 , 0
variance-covariance m atrix X . X in turn can be decom posed into the product a C a T, where a is an m X m diagonal m atrix of volatilities (i.e., so the /th elem ent of a is the ith volatility 07 and the off-diagonal elem ents are zero), crT is its transpose, and C is
1 V i - 0 .3 2
the m X m m atrix of historical correlations. R therefore reflects an historical correlation m atrix C, and we wish to adjust R so
R
Vi
- 0 .3 2
0.9 Vi - 0 .3 2
0.3 Vi - 0.92, V i
1, 0
that they becom e R reflecting a current correlation m atrix C. Now suppose for convenience that both correlation m atrices are
0.7629, 0.4569
positive definite. This means that each correlation m atrix has an m X m 'm atrix square root', A and A respectively, given by a Choleski decom position (which also im plies that they are easy to obtain). W e can now write R and R as m atrix products of the rel evant Choleski m atrices and an uncorrelated noise process s:
Filtered Historical Simulation Another promising approach is filtered historical simulation (FH S ).12*This is a form of sem i-param etric bootstrap which aims
R = Ae
(2.4a)
to com bine the benefits of HS with the power and flexibility of
R = Ae
(2.4b)
conditional volatility m odels such as G A R C H . It does so by boot
We then invert Equation (2.4a) to obtain e = A 1R, and substi tute this into (Equation 2.4b) to obtain the correlation-adjusted series R that we are seeking:
R = A A -1R
strapping returns within a conditional volatility (e.g ., G A R C H ) fram ew ork, where the bootstrap preserves the non-param etric nature of HS, and the volatility model gives us a sophisticated treatm ent of volatility.
(2.5)
The returns adjusted in this way will then have the currently
Suppose we wish to use FHS to estim ate the VaR of a single asset portfolio over a 1-day holding period. The first step in
prevailing correlation m atrix C and, more generally, the
FHS is to fit, say, a G A R C H model to our portfolio-return data.
currently prevailing covariance m atrix S .T h is approach is a
We want a model that is rich enough to accom m odate the key
11 The correlation adjustment discussed here is based on a suggestion by Duffie and Pan (1997).
12 This approach is suggested in Barone-Adesi et. al. (1998), Barone-Adesi et. al. (1999), Barone-Adesi aud Giannopoulos (2000) and in other papers by some of the same authors.
26
■
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
features of our data, and Barone-Adesi and colleagues recom
keep any correlation structure present in the raw returns. The
mend an asym m etric G A R C H , or A G A R C H , m odel. This not only
bootstrap thus maintains existing correlations, without our hav
accom m odates conditionally changing volatility, volatility clus
ing to specify an explicit m ultivariate pdf for asset returns.
tering, and so on, but also allows positive and negative returns to have differential im pacts on volatility, a phenom enon known as the leverage effect. The A G A R C H postulates that portfolio returns obey the following process:
(x)
+ a(et_i + y )2 + /3cr2_i
we have a longer holding period, we would first take a draw ing and use Equation (2.8) to get a return for tom orrow ; we would then use this drawing to update our volatility forecast
rt — /jl + s t of =
The other obvious extension is to a longer holding period. If
(2 . 6 a) (2 . 6 b)
The daily return in Equation (2.6a) is the sum of a mean daily return (which can often be neglected in volatility estim ation) and a random error s t. The volatility in Equation (2.6b) is the sum of a constant and term s reflecting last period's 'surprise' and last period's volatility, plus an additional term y that allows for the surprise to have an asym m etric effect on volatility, depending on w hether the surprise term is positive or negative. The second step is to use the model to forecast volatility for each of the days in a sam ple period. These volatility forecasts are then divided into the realised returns to produce a set of standardised returns. These standardised returns should be independently and identically distributed (iid), and therefore be suitable for HS.
for the day after tom orrow, and take a fresh drawing to deter mine the return for that day; and we would carry on in the same manner— taking a drawing, updating our volatility forecasts, taking another drawing for the next period, and so on— until we had reached the end of our holding period. A t that point we would have enough information to produce a single simulated P/L observation; and we would repeat the process as many tim es as we wished in order to produce the histogram of sim u lated P/L observations from which we can estim ate our VaR. FHS has a number of attractions: (i) It enables us to com bine the non-param etric attractions of HS with a sophisticated (e.g ., G A R C H ) treatm ent of volatility, and so take account of changing m arket volatility conditions, (ii) It is fast, even for large portfo lios. (iii) As with the earlier HW approach, FHS allows us to get VaR and ES estim ates that can exceed the maximum historical loss in our data set. (iv) It maintains the correlation structure in
Assuming a 1-day VaR holding period, the third stage involves
our return data without relying on knowledge of the variance-
bootstrapping from our data set of standardised returns: we take
covariance m atrix or the conditional distribution of asset returns,
a large number of drawings from this data set, which we now
(v) It can be modified to take account of autocorrelation or past
treat as a sample, replacing each one after it has been drawn,
cross-correlations in asset returns, (vi) It can be modified to pro
and multiply each random drawing by the A G A R C H forecast of
duce estim ates of VaR or ES confidence intervals by combining
tomorrow's volatility. If we take M drawings, we therefore get M
it with an O S or bootstrap approach to confidence interval esti
simulated returns, each of which reflects current market conditions
m ation .14 (vii) There is evidence that FHS works w e ll .15
because it is scaled by today's forecast of tomorrow's volatility. Finally, each of these sim ulated returns gives us a possible end-of-tomorrow portfolio value, and a corresponding possible loss, and we take the VaR to be the loss corresponding to our chosen confidence level .13 We can easily modify this procedure to encom pass the obvious com plications of a multi asset portfolio or a longer holding period. If we have a multi-asset portfolio, we would fit a m ultivariate G A R C H (or A G A R C H ) to the set or vector of asset returns, and we would standardise this vector of asset returns. The bootstrap would then select, not just a standardised portfolio return for som e chosen past (daily) period, but the standardised vector of asset returns for the chosen past period. This is im portant because it means that our sim ulations would
13 The FHS approach can also be extended easily to allow for the esti mation of ES as well as VaR. For more on how this might be done, see Giannopoulos and Tunaru (2004).
14 The OS approach would require a set of paired P/L and associated probability observations, so we could apply this to FHS by using a P/L series that had been through the FHS filter. The bootstrap is even easier, since FHS already makes use of a bootstrap. If we want 8 bootstrapped estimates of VaR, we could produce, say, 100*8 or 1000*8 bootstrapped P/L values; each set of 100 (or 1000) P/L series would give us one HS VaR estimate, and the histogram of M such estimates would enable us to infer the bounds of the VaR confidence interval. 15 Barone-Adesi and Giannopoulos (2000), p. 17. However, FHS does have problems. In his thorough simulation study of FHS, Pritsker (2001, pp. 22-24) comes to the tentative conclusions that FHS VaR might not pay enough attention to extreme observations or time-varying correla tions, and Barone-Adesi and Giannopoulos (2000, p. 18) largely accept these points. A partial response to the first point would be to use ES instead of VaR as our preferred risk measure, and the natural response to the second concern is to develop FHS with a more sophisticated past cross-correlation structure. Pritsker (2001, p. 22) also presents simula tion results that suggest that FHS-VaR tends to underestimate 'true' VaR over a 10-day holding period by about 10%, but this finding conflicts with results reported by Barone-Adesi et. al. (2000) based on real data. The evidence on FHS is thus mixed.
Chapter 2 Non-Parametric Approaches
■ 27
2.5 ADVANTAGES AND DISADVANTAGES O F NON-PARAMETRIC METHODS
on the historical data s e t .16 Th ere are various other related problem s: •
If our data period was unusually quiet, non-param etric m ethods will often produce VaR or ES estim ates that are
Advantages
too low for the risks we are actually facing; and if our data
In draw ing our discussion to a clo se , it is perhaps a
ES estim ates that are too high.
good idea to sum m arise the main ad van tag es and d isad van tag e s of non-param etric ap p ro ach es. The ad van tag es
period was unusually volatile, they will often produce VaR or •
include: • •
if there is a perm anent change in exchange rate risk, it will
Non-param etric approaches are intuitive and conceptually
usually take tim e for the HS VaR or ES estim ates to reflect
sim ple.
the new exchange rate risk. Sim ilarly, such approaches are som etim es slow to reflect major events, such as the increases
Since they do not depend on param etric assum ptions about
in risk associated with sudden m arket turbulence.
P/L, they can accom m odate fat tails, skew ness, and any other non-normal features that can cause problem s for param etric
•
•
estim ates even though we don't expect them to recur.
They can in theory accom m odate any type of position, including derivatives positions.
•
shadow effects.
that HS works quite well em pirically, although formal em piri •
•
actually occur, in our sam ple period.
spreadsheet. N on-param etric m ethods are free of the operational prob lems to which param etric m ethods are subject when applied to high-dim ensional problem s: no need for covariance m atrices, no curses of dim ensionality, etc. •
They use data that are (often) readily available, either from public sources (e.g ., Bloom berg) or from in-house data sets (e.g ., collected as a by-product of marking positions to market).
•
In general, non-param etric estim ates of VaR or ES make no allowance for plausible events that might occur, but did not
They are (in varying degrees, fairly) easy to im plem ent on a
•
Most (if not all) non-param etric m ethods are subject (to a greater or lesser extent) to the phenom enon of ghost or
There is a w idespread perception among risk practitioners cal evidence on this issue is inevitably m ixed.
If our data set incorporates extrem e losses that are unlikely to recur, these losses can dom inate non-param etric risk
approaches. •
Non-param etric approaches can have difficulty handling shifts that take place during our sam ple period. For exam ple,
•
Non-param etric estim ates of VaR and ES are to a greater or lesser extent constrained by the largest loss in our historical data set. In the sim pler versions of HS, we cannot extrapolate from the largest historical loss to anything larger that might conceivably occur in the future. More sophisticated versions of HS can relax this constraint, but even so, the fact remains that non-param etric estim ates of VaR or ES are still con strained by the largest loss in a way that param etric estim ates are not. This means that such m ethods are not well suited to
They provide results that are easy to report and com m uni
handling extrem es, particularly with small- or medium-sized
cate to senior m anagers and interested outsiders (e.g ., bank
sam ples.
supervisors or rating agencies). •
It is easy to produce confidence intervals for non-param etric VaR and ES.
•
N on-param etric ap p ro ach es are cap ab le of co n sid
able refinem ents. For exam ple, we can am eliorate volatility, m arket turbulence, correlation and other problem s by semiparam etric adjustm ents, and we can am eliorate ghost effects
erab le refinem ent and potential im provem ent if we
by age-weighting our data and allowing our sam ple size to rise
com bine them with p aram etric 'add-ons' to m ake
over tim e.
them sem i-p aram etric: such refinem ents include a g e w eig hting (as in BRW ), vo latility-w eig htin g (as in HW and FH S), and co rrelatio n -w eig h tin g .
Disadvantages Perhaps th e ir b ig g est potential w eakn ess is that their results are very (and in m ost cases, co m p letely) d ep en d en t
28
However, we can often am eliorate these problem s by suit
■
There can also be problem s associated with the length of the sam ple w indow period. W e need a reasonably long w indow
16 There can also be problems getting the data set. We need time series data on all current positions, and such data are not always available (e.g., if the positions are in emerging markets). We also have to ensure that data are reliable, compatible, and delivered to the risk estimation system on a timely basis.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
to have a sam ple size large enough to get risk estim ates of accep tab le precision, and as a broad rule of thum b, most exp erts believe that w e usually need at least a couple of year's
Using Order Statistics to Estimate Confidence Intervals for VaR
worth of daily observations (i.e ., 500 observations, at 250
If we have a sam ple of n P/L observations, we can regard
trading days to the year), and often m ore. On the other hand,
each observation as giving an estim ate of VaR at an implied
a very long w indow can also create its own problem s. The
confidence level. For exam ple, if n = 1000, we might take the
longer the w indow : •
the greater the problem s with aged data;
•
the longer the period over which results will be distorted by unlikely-to-recur past events, and the longer we will have to wait for ghost effects to disappear;
•
the more the news in current m arket observations is likely to be drowned out by older observations— and the less responsive will be our risk estim ates to current m arket conditions; and
•
the greater the potential for data-collection problem s. This is a particular concern with new or em erging m arket instru m ents, where long runs of historical data don't exist and are not necessarily easy to proxy.
95% VaR as the negative of the 51 st sm allest P/L observation, we might take the 99% VaR as the negative of the 11th sm all est, and so on. We therefore take the a VaR to be equal to the negative of the rth lowest observation, where r is equal to 100(1 — a) + 1. More generally, with n observations, we take the VaR as equal to the negative of the rth lowest observation, where r — n( 1 — a) + 1 . The rth order statistic is the rth lowest (or, alternatively, highest) in a sam ple of n observations, and the theory of order statis tics is well established in the statistical literature. Suppose our observations x 1# x 2, . . . , x n come from som e known distribution (or cum ulative density) function F(x), with rth order statistic x (r). Now suppose that x (1) < x (2) ^ . . .
^ x (n). The probability that
j of our n observations do not exceed a fixed value x must obey the following binomial distribution:
CO N CLUSIO N S
P r { j o b serva tion s < x j —
{ P (x )}7{ 1 — F ( x ) } n_;
(2.7)
Non-param etric m ethods are w idely used and in many respects
It follows that the probability that at least r observations in the
highly attractive approaches to the estim ation of financial risk
sam ple do not exceed x is also a binomial:
m easures. They have a reasonable track record and are often superior to param etric approaches based on sim plistic assum p tions such as normality. They are also capable of considerable refinem ent to deal with som e of the w eaknesses of more basic non-param etric approaches. As a general rule, they work fairly well if m arket conditions remain reasonably stable, and are capable of considerable refinem ent. However, they have their limitations and it is often a good idea to supplem ent them with other approaches. W herever possible, we should also com ple ment non-param etric m ethods with stress testing to gauge our vulnerability to 'what if' events. We should never rely on nonparam etric m ethods alone.
GrCx) =
- R x)}" '
( 2 . 8)
G r(x) is therefore the distribution function of our order statistic and, hence, of our quantile or VaR .17 This VaR distribution function provides us with estim ates of our VaR and of its associated confidence intervals. The median (i.e., 50 percentile) of the estim ated VaR distribution function gives us a natural 'best' estim ate of our VaR, and estim ates of the lower and upper percentiles of the VaR distribution function give us estim ates of the bounds of our VaR confidence interval. This is useful, because the calculations are accurate and easy to carry out on a spreadsheet. Equation (2.8) is also very general and
A PPEN D IX 1
gives us confidence intervals for any distribution function F(x),
Estimating Risk Measures with Order Statistics
To use this approach, all we need to do is specify F(x) (as nor
The theory of order statistics is very useful for risk m easure m ent because it gives us a practical and accurate m eans of estim ating the distribution function for a risk m easure— and
param etric (normal, t, etc.) or em pirical.
mal, t, etc.), set our param eter values, and use Equation (2.8) to estim ate our VaR distribution function. To illustrate, suppose w e w ant to apply the order-statistics (OS) approach to estim ate the distribution function of a standard
this is useful because it enables us to estim ate confidence intervals for them .
17 See, e.g., Kendall and Stuart (1973), p. 348, or Reiss (1989), p. 20.
Chapter 2 Non-Parametric Approaches
■ 29
normal VaR. W e then assum e that F(x) is standard normal
equivalent— the t-distribution function, the Gum bel distribu
and use Equation (2.8) to estim ate three key param eters of
tion function, and so on. We can also use the same approach to
the VaR distribution: the median or 50 percentile of the esti
estim ate the confidence intervals for an em pirical distribution
m ated VaR distribution, which can be interpreted as an O S
function (i.e., for historical simulation VaR), where F(x) is some
estim ate of normal VaR; and the 5 and 95 percentiles of the
em pirical distribution function.
estim ated VaR distribution, which can be interpreted as the O S estim ates of the bounds of the 90% confidence interval for standard normal VaR.
Conclusions
Some illustrative estim ates for the 95% VaR are given in Table 2.2.
The O S approach provides an ideal method for estim ating the
To facilitate comparison, the table also shows the estim ates of
confidence intervals for our VaRs and ESs. In particular, the O S
standard normal VaR based on the conventional normal VaR
approach is:
formula as explained in Chapter 1. The main results are: •
•
The confidence interval— the gap between the 5 and 95
ric or non-param etric VaR or ES.
percentiles— is quite wide for low values of n, but narrows as n gets larger. • •
Com pletely general, in that it can be applied to any param et
•
Reasonable even for relatively small sam ples, because it is not based on asym ptotic theory— although it is also the case
As n rises, the median of the estim ated VaR distribution
that estim ates based on small sam ples will also be less accu
converges to the conventional estim ate.
rate, precisely because the sam ples are small.
The confidence interval is (in this case, a little) w ider for more extrem e VaR confidence levels than it is for the more central ones.
•
Easy to im plem ent in practice.
The O S approach is also su p erio r to co nfid ence-in terval estim ation m ethods based on estim ates of quantile sta n
The same approach can also be used to estim ate the per
dard errors (see C h a p te r 1), because it does not rely on
centiles of other VaR distribution functions. If we wish to
asym p to tic theo ry and/or force estim ated co nfid ence
estim ate the percentiles of a non-normal param etric VaR, we
intervals to be sym m etric (which can be a problem for
replace the normal distribution function F(x) by the non-normal
extrem e VaRs and ESs).
Table 2.2
Order Statistics Estimates of Standard Normal 95% VaRs and Associated Confidence Intervals (a) As n varies
No. of observations
100
500
1000
5000
10 000
Lower bound of confidence interval
1.267
1.482
1.531
1.595
1.610
Median of VaR distribution
1.585
1.632
1.639
1.644
1.644
Standard estim ate of VaR
1.645
1.645
1.645
1.645
1.645
Upper bound of confidence interval
1.936
1.791
1.750
1.693
1.679
6 .0%
4.2%
W idth of interval/m edian
42.2%
18.9%
13.4%
(b) As VaR confidence level varies (with n = 500) VaR confidence level
0.90
0.95
0.99
Low er bound of confidence interval
1.151
1.482
2.035
Median of VaR distribution
1.274
1.632
2.279
Standard estim ate of VaR
1.282
1.645
2.326
Upper bound of confidence interval
1.402
1.791
2.560
W idth of interval/m edian of interval
19.7%
18.9%
23.0%
Notes: The confidence interval is specified at a 90% level of confidence, and the lower and upper bounds of the confidence interval are estimated as the 5 and 95 percentiles of the estimated VaR distribution (Equation (2.8)).
30
■
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
A PPEN D IX 2
distribution are unknown— and, more likely than not, so too is the distribution itself. We are interested in a particular parameter
The Bootstrap
6, where 0 might be a mean, variance (or standard deviation),
The bootstrap is a simple and useful method for assessing
estimate 0 using a suitable sample estimator— so if 0 is the mean,
uncertainty in estim ation procedures. Its distinctive feature
our estimator 0 would be the sample mean, if 0 is the variance, our
quantile, or some other parameter. The obvious approach is to
/V
/V
is that it replaces mathem atical or statistical analysis with
estimator 6 would be based on some sample variance, and so on.
sim ulation-based resam pling from a given data set. It therefore
Obtaining an estimator for 0 is therefore straightforward, but how
provides a means of assessing the accuracy of param eter esti
do we obtain a confidence interval for it?
mators without having to resort to strong param etric assum p tions or closed-form confidence-interval form ulas. The roots of the bootstrap go back a couple of centuries, but the idea only took off in the last three decades after it was developed and popularised by the w ork of Bradley Efron. It was Efron, too, who first gave it its name, which refers to the phrase 'to pull oneself up by one's bootstraps'. The bootstrap is a form of statistical 'trick', and is therefore very aptly named. The main purpose of the bootstrap is to assess the accuracy of
To estim ate confidence intervals for 0 using traditional closedform approaches requires us to resort to statistical theory, and the theory available is of limited use. For exam ple, suppose we wish to obtain a confidence interval for a variance. If we assume that the underlying distribution is normal, then we know that (n — 1 ) z° +
z° + za
, a 2 = [z° +
1 - a(z° + z j
Z° + Z - Q 1
(2 . 12)
1 - a(z° + z ^ J .
If the param eters a and z° are zero, this B C a confidence interval will coincide with the earlier percentile interval. However, in general, they will not be 0, and we can think of the B C a method as correcting the end-points of the confidence interval. The param eter a refers to the rate of change of the standard error of 9 with respect to the true param eter 9, and it can be regarded as a correction for skew ness. This param eter can be estim ated from the follow ing, which would be based on an initial bootstrap or jackknife exercise:
the original sam ple. O nce we have our resam ple, we use it to
(2.13)
estim ate the param eter we are interested in. This gives us a resam ple estim ate of the param eter. We then repeat the 'resam pling' process again and again, and obtain a set of B resam ple
The param eter z° can be estim ated as the standard normal
param eter estim ates. This set of B resam ple estim ates can also
inverse of the proportion of bootstrap replications that is less
be regarded as a bootstrapped sam ple of param eter estim ates.
than the original estim ate 9. The B C a method is therefore
We can then use the bootstrapped sam ple to estim ate a con fidence interval for our param eter 9. For exam ple, if each resaA p
mple i gives us a resam ple estim ator 9 (i) we might construct a /v D
sim ulated density function from the distribution of our 9 b[i) val ues and infer the confidence intervals from its percentile points. If our confidence interval spans the central 1 — 2a of the prob ability mass, then it is given by: C o n fid en ce Interval — [9B,
(2.11)
AQ
where 9% is the a quantile of the distribution of bootstrapped A
q
9 (i) values. This 'percentile interval' approach is very easy to apply and does not rely on any param etric theory, asym ptotic or otherwise.
A
(relatively) straightforward to im plem ent, and it has the theoreti cal advantages over the percentile interval approach of being both more accurate and of being transform ation-respecting, the latter property meaning that if we take a transform ation of 9 (e.g ., if 9 is a variance, we might wish to take its square root to obtain the standard deviation), then the B C a method will auto matically correct the end-points of the confidence interval of the transform ed param eter .23 We can also use a bootstrapped sam ple of param eter estim ates to provide an alternative point estim ator of a
/V
param eter that is often superior to the raw sam ple estim ator 9. G iven that there are B resam ple estim ators, we can take our A p
bootstrapped point estim ator 9 as the sam ple mean of our
N onetheless, this basic percentile interval approach is limited
B § B (/) valu e s :24
itself, particularly if param eter estim ators are biased. It is there fore often better to use more refined percentile approaches, and perhaps the best of these is the bias-corrected and
=
° i=
1
Relatedly, we can also use a bootstrap to estim ate the bias in an estimator. The bias is the difference between the expectation
21 This application of the bootstrap can be described as a non-parametric one because we bootstrap from a given data sample. The bootstrap can also be implemented parametrically, where we bootstrap from the assumed distribution. When used in parametric mode, the bootstrap provides more accurate answers than textbook formulas usually do, and it can provide answers to problems for which no textbook formulas exist. The bootstrap can also be implemented semi-parametrically and a good example of this is the FRS approach. 22 In practice, it might be possible to choose the value of n, but we will assume for the sake of argument that n is given.
32
■
of an estim ator and the quantity estim ated (i.e., the bias equals
23 For more on BCa and other refinements to the percentile interval approach, see Efron and Tibshirani (1993, Chapters 14 and 22) or Davison and Hinkley (1997, Chapter 5). 24 This basic bootstrap estimation method can also be supplemented by variance-reduction methods (e.g., importance sampling) to improve accuracy at a given computational cost. See Efron and Tibshirani (1993, Chapter 23) or Davison and Hinkley (1997, Chapter 9).
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
E[d ] — d)), and can be estim ated by plugging Equation (2.14)
these results are lim ited, because Equation (2.19) only applies to
and a basic sam ple estim ator 0 into the bias equation:
the mean and Equation (2.20) presupposes normality as well.
/V
/V
Bias — E[0] — 0=> Estim a ted Bias — 0 B — 0
(2.15)
We can use an estim ate of bias for various purposes (e.g ., to correct a biased estimator, to correct prediction errors, etc.). However, the bias can have a (relatively) large standard error. In such cases, correcting for the bias is not always a good idea, because the bias-corrected estim ate can have a larger standard error than the unadjusted, biased, estim ate. The program s to com pute bootstrap statistics are easy to write and the most obvious price of the bootstrap, increased com pu
We therefore face two related questions: (a) how we can esti mate var(sB) in general? and (b) how can we choose B to achieve a given level of accuracy in our estim ate of sB? O ne approach to these problem s is to apply brute force: we can estim ate var(sB) using a jackknife-after-bootstrap (in which we first bootstrap the data and then estim ate var(sB) by jackknifing from the bootstrapped data), or by using a double bootstrap (in which we estim ate a sam ple of bootstrapped sB values and then estim ate their variance). W e can then experim ent with different values of B to determ ine the values of these param eters needed to bring
tation, is no longer a serious problem .25
var(sB ) down to an acceptable level.
Standard Errors of Bootstrap Estimators
to choose B), a more elegant approach is the follow ing, sug
Naturally, bootstrap estim ates are them selves subject to error.
our 'ideal' the value of sB associated with an infinite number of
If we are more concerned about the second problem (i.e., how
Typically, bootstrap estim ates have little bias, but they often have substantial variance. The latter com es from basic sampling variability (i.e., the fact that we have a sam ple of size n drawn from our population, rather than the population itself) and from resampling variability (i.e., the fact that we take only B bootstrap
gested by Andrew s and Buchinsky (1997). Suppose we take as resam ples, i.e ., s x . Let r be a target probability that is close to
1 , and let b o u n d be a chosen bound on the percentage devia tion of scrB from s^. W e want to choose B = B(bound, t ) such that the probability that
resam ples rather than an infinite number of them ). The esti-
Pr 100
/V
mated standard error for 0, sB, can be obtained from:
1 B
A
where 6
Q
P
A
Q
\ 1/2 - « e)2 J
(2.16)
SB — S oo /V
SB
m2f m 4 _
(2.18)
_ 4 B \ rri2
where m,- is the rth moment of the bootstrap distribution of the
m 4/m 2 - rh2 4n‘
+
a2 2nB
+
(T2{m4/m 2 - 3) 4n2B
d
2 n:
1+s
the kurtosis
p
lem, we replace
k
with a consistent estim ator of k , and this leads
method to determ ine B: •
W e initially assum e that
k
— 3, and plug this into
Equation (2.22) to obtain a prelim inary value of B, denoted by B0l where
f 5000^ \
(2.19)
If the distribution is normal, this further reduces to var(sB) -
k,
2 . 22)
of the distribution of 6 , is unknown. To get around this prob
0B(i). In the case where 0 is the mean, Equation (2.18) reduces to: var(sB) -
(
Andrew s and Buchinsky to suggest the following three-step
/V / /V
var(sB) = var[rr?2 2] + E
:
(2 . 2 1 )
= r
However, this formula is not operational because A
rearranged as:
< bound
2500( k - 1)x? B « ------ ------ bound
However, sB is itself variable, and the variance of sB is:
Following Efron and Tibshirani (1993, Chapter 19), this can be
t
If B is large, then the required number of resam ples is
^
(2.17)
is within the desired bound is /V
approxim ately
— (1 / B ) 2 /=10 (/). sB is of course also easy to estim ate. var(sB) = var[ E(se)] + Etvar(sB)]
s B
A
\ b o u n d 2J
(2.23)
and where int(a) refers to the sm allest integer greater than or ( 2 .20)
equal to a. •
W e sim ulate B 0 resam ples, and estim ate the sam ple kurtosis A p
of the bootstrapped 0 values,
We can then set B to reduce var(sB) to a desired level, and so achieve a target level of accuracy in our estim ate of sB. However,
•
A
k
.
W e take the desired num ber of bootstrap resam ples as equal to m ax(B0, B i), where
25 An example of the bootstrap approach applied to VaR is given ear lier in this chapter discussing the bootstrap point estimator and boot strapped confidence intervals for VaR.
2500( k - 1)*? By - ------------ 7 bound2
Chapter 2 Non-Parametric Approaches
(2.24)
■ 33
•
This method does not directly tell us what the variance of s B
G A R C H procedure). W e can then bootstrap from the residu
might be, but we already know how to estim ate this in any
als, which should be independent. However, this solution
case. Instead, this method gives us som ething more useful: it
requires us to identify the underlying stochastic model and
tells us how to set B to achieve a target level of precision in
estim ate its param eters, and this expo ses us to model and
our bootstrap estim ators, and (unlike Equations (2.19) and
param eter risk. A Q
(2 . 20 )) it applies for any param eter u and applies however 6 is distributed .26
•
An alternative is to use a block approach: we divide sam ple data into non-overlapping blocks of equal length, and select a block at random. However, this approach can 'whiten' the data (as the joint observations spanning different blocks are
Time Dependency and the Bootstrap
taken to be independent), which can underm ine our results.
Perhaps the main limitation of the bootstrap is that standard
On the other hand, there are also various m ethods of dealing
bootstrap procedures presuppose that observations are inde
with this problem (e.g ., making block lengths stochastic, etc.)
pendent over tim e, and they can be unreliable if this assumption
but these refinem ents also make the block approach more
does not hold. Fortunately, there are various ways in which we
difficult to im plem ent.
can modify bootstraps to allow for such dependence: •
If we are prepared to m ake param etric assum ptions, we can model the dep endence param etrically (e .g ., using a
•
A third solution is to m odify the p ro b ab ilities with which individual o b servatio n s are chosen. Instead of assum ing th at each observation is chosen with the sam e p ro b ab ility, w e can m ake the p ro b ab ilities of selectio n d e p e n d e n t on the tim e indices of recently se lecte d o b servatio n s: so, for
A /
This three-step method can also be improved and extended. For example, it can be improved by correcting for bias in the kurtosis estima tor, and a similar (although more involved) three-step method can be used to achieve given levels of accuracy in estimates of confidence intervals as well. For more on these refinements, see Andrews and Buchinsky (1997).
34
■
exam p le, if the sam ple data are in chronological order and observation / has ju st been chosen, then observation / + 1 is more likely to be chosen next than m ost other o b servatio n s.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
Parametric Approaches (II): Extreme Value Learning Objectives A fter com pleting this reading you should be able to: Explain the im portance and challenges of extrem e values in risk m anagem ent. D escribe extrem e value theory (EVT) and its use in risk m anagem ent.
Evaluate the tradeoffs involved in setting the threshold level when applying the generalized Pareto (GP) distribution. Explain the m ultivariate EVT for risk m anagem ent.
D escribe the peaks-over-threshold (PO T) approach. Com pare and contrast the generalized extrem e value and P O T approaches to estim ating extrem e risks.
E x c e rp t is C h a p ter 7 o f Measuring M arket Risk, S e co n d Edition, by Kevin D ow d.
35
There are many problem s in risk m anagem ent that deal with
to these problem s .1 EVT focuses on the distinctiveness of extrem e
extrem e events— events that are unlikely to occur, but can be
values and makes as much use as possible of what theory has to
very costly when they do. These events are often referred to
offer. Not surprisingly, EVT is quite different from the more fam il
as low-probability, high-impact events, and they include large
iar 'central tendency' statistics that most of us have grown up
m arket falls, the failures of major institutions, the outbreak of
with. The underlying reason for this is that central tendency statis
financial crises and natural catastrophes. Given the im portance
tics are governed by central limit theorem s, but central limit theo
of such events, the estim ation of extrem e risk measures is a key
rems do not apply to extrem es. Instead, extrem es are governed,
concern for risk m anagers.
appropriately enough, by extreme-value theorem s. EVT uses
However, to estim ate such risks we have to confront a d if ficult problem : extrem e events are rare by definition, so we have relatively few extrem e observations on which to base our estim ates. Estim ates of extrem e risks must therefore be very uncertain, and this uncertainty is especially pronounced if we are interested in extrem e risks not only within the range
these theorem s to tell us what distributions we should (and should not!) fit to our extrem es data, and also guides us on how we should estimate the param eters involved. These EV distribu tions are quite different from the more familiar distributions of central tendency statistics. Their parameters are also different, and the estimation of these parameters is more difficult.
of observed data, but w ell b e y o n d it— as m ight be the case if
This chapter provides an overview of EV theory, and of how it
we w ere interested in the risks associated with events more
can be used to estimate measures of financial risk. We will focus
extrem e than any in our historical data set (e .g ., an unprec
mainly on the VaR (and to a lesser extent, the ES) to keep the dis
edented stock m arket fall).
cussion brief, but the approaches considered here extend natu
Practitioners can only respond by relying on assum ptions to make up for lack of data. Unfortunately, the assum ptions they make are often questionable. Typically, a distribution is selected arbitrarily, and then fitted to the whole data set. However, this means that the fitted distribution will tend to accom m odate the
rally to the estimation of other coherent risk measures as well. The chapter itself is divided into four sections. The first two discuss the two main branches of univariate EV theory, the next discusses some extensions to, including m ultivariate EVT, and the last concludes.
more central observations, because there are so many of them , rather than the extrem e observations, which are much sparser. Hence, this type of approach is often good if w e are interested in the central part of the distribution, but is ill-suited to handling extrem es. W hen dealing with e xtre m e s, w e need an app roach that com es to term s with the basic problem posed by extrem e-
3.1 G EN ERA LISED EXTREM E-VALUE TH EO RY Theory Suppose we have a random loss variable X, and we assume to
value estim atio n: that the estim ation of the risks asso ciated
begin with that X is independent and identically distributed (iid)
with low -frequency events with lim ited data is inevitab ly
from som e unknown distribution F(x) = Prob(X < x). We wish to
p ro b lem atic, and th at th ese d ifficu lties increase as the events
estim ate the extrem e risks (e.g ., extrem e VaR) associated with
co ncerned becom e rarer. Such problem s are not unique to
the distribution of X. Clearly, this poses a problem because we
risk m anagem ent, but also occur in other d iscip lin es as w ell.
don't know what F(x) actually is.
Th e standard exam p le is hydrology, w here en g in eers have long stru g g led with the question of how high d ikes, sea w alls and sim ilar b arriers should be to contain the p ro b ab ilities of flo o d s w ithin reaso n ab le lim its. T h e y have had to do so with even less data than financial risk p ractitio n ers usually
This is where EVT comes to our rescue. Consider a sample of size n drawn from F(x), and let the maximum of this sample be M n.12 If n is large, we can regard M n as an extrem e value. Under rela tively general conditions, the celebrated Fisher-Tippett theorem
have, and th eir quantile estim ates— the flood w ate r levels they w ere contending w ith— w ere also typ ically w ell out of the range of th e ir sam ple data. So they have had to grapp le with co m p arab le problem s to those faced by insurers and risk m anagers, but have had to do so with even less data and p o ten tially much more at stake. The result of their efforts is extrem e-value theory (EVT)— a branch of applied statistics that is tailor-made
36
■
1 The literature on EVT is vast. However, some standard book references on EVT and its finance applications are Embrechts et. al. (1997), Reiss and Thomas (1997) and Beirlant et. al. (2004). There is also a plethora of good articles on the subject, e.g., Bassi et. al. (1998), Longin (1996, 1999), Danielsson and de Vries (1997a,b), McNeil (1998), McNeil and Saladin (1997), Cotter (2001,2004), and many others. 2 The same theory also works for extremes that are the minima rather than the maxima of a (large) sample: to apply the theory to minima extremes, we simply apply the maxima extremes results but multiply our data by - 1 .
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
(1928) then tells us that as n gets large, the distribution of
O bserve that most of the probability mass is located between
extrem es (i.e., M n) converges to the following generalised
x values of —2 and + 6 . More generally, this means most of the
extreme-value (GEV) distribution:
probability mass will lie between x values of /jl — 2 a and \x + 6 a . To obtain the quantiles associated with the G E V distribution, we set the left-hand side of Equation (3.1) to p, take logs of both
£ * 0 (3.1) exp
exp
sides of Equation (3.1) and rearrange to get: /
£ = 0
-1 / f
V
£ # 0 (3.2)
if
where x satisfies the condition 1 + £(x — ju)/a > 0 .3 This distri bution has three param eters. The first two are ju, the location
£ = 0
param eter of the limiting distribution, which is a measure of the central tendency of Mn, and cr, the scale param eter of the limit ing distribution, which is a measure of the dispersion of Mn. These are related to, but distinct from , the more fam iliar mean and standard deviation, and we will return to these presently. The third param eter, £, the tail index, gives an indication of the shape (or heaviness) of the tail of the limiting distribution.
V We then unravel the x-values to get the quantiles associated with any chosen (cumulative) probability p :5 x = n -
(hrecnet, £ > u
1 - (-ln (p )) H
x — fjL — a ln[ —ln(p)]
(3.3a)
(Gum bel, £ = 0)
(3.3b)
The G E V Equation (3.1) has three special cases: •
If £ > 0, the G E V becom es the Frechet distribution. This case
Example 3.1
Gumbel quantiles
applies where the tail of F(x) obeys a pow er function and is
For the standardised G um bel, the 5% quantile is
therefore heavy (e .g ., as would be the case if F(x) were a Levy
—In[ —ln(0.05)] = —1.0972 and the 95% quantile is
distribution, a t-distribution, a Pareto distribution, etc.). This
—In[ —ln(0.95)] = 2.9702.
case is particularly useful for financial returns because they are typically heavy-tailed, and we often find that estim ates of •
•
£ for financial return data are positive but less than 0.35.
Example 3.2
Frechet quantiles
If £ = 0, the G E V becom es the Gum bel distribution, corre
For the standardised Frechet with £ = 0.2, the 5% quantile is
sponding to the case where F(x) has exponential tails. These
—(1/0.2)[1 - (—ln(0.05))-° 2] = -0 .9 8 5 1 and the 95% quantile
are relatively light tails such as those we would get with nor
is —(1/0.2)[1 - (—ln(0.95))-0-2] = 4.0564. For £ = 0.3, the 5%
mal or lognormal distributions.
quantile is —(1/0.3)[1 — (—ln(0.05))~° 3] = —0.9349 and the
If £ < 0, the G E V becom es the W eibull distribution, corre
95% quantile is —(1/0.3)[1 - (- ln (0 .9 5 )r°- 3] = 4.7924. Thus,
sponding to the case where F(x) has lighter than normal tails.
Frechet quantiles are sensitive to the value of the tail index £,
However, the W eibull distribution is not particularly useful for
and tend to rise with £. Conversely, as £ —» 0, the Frechet quan
modelling financial returns, because few em pirical financial
tiles tend to their Gum bel equivalents.
returns series are so light-tailed .4
We need to rem em ber that the probabilities in Equations
The standardised (i.e., /jl — 0, a — 1) Frechet and Gum bel prob
(3.1)-(3.3) refer to the probabilities associated with the extrem e
ability density functions are illustrated in Figure 3.1. Both are
loss distribution, not to those associated with the distribution
skew ed to the right, but the Frechet is more skew ed than the
of the 'parent' loss distribution from which the extrem e losses
Gum bel and has a noticeably longer right-hand tail. This means
are drawn. For exam ple, a 5th percentile in Equation (3.3) is the
that the Frechet has considerably higher probabilities of produc
cut-off point between the lowest 5% of extrem e (high) losses
ing very large X-values.
3 See, e.g., Embrechts et. al. (1997), p. 316. 4 We can also explain these three cases in terms of domains of attrac tion. Extremes drawn from Levy or t-distributions fall in the domain of attraction of the Frechet distribution, and so obey a Frechet distribution as n gets large; extremes drawn from normal and lognormal distribu tions fall in the domain of attraction of the Gumbel, and obey the Gumbel as n gets large, and so on.
5 We can obtain estimates of EV VaR over longer time periods by using appropriately scaled parameters, bearing in mind that the mean scales proportionately with the holding period h, the standard deviation scales with the square root of h, and (subject to certain conditions) the tail index does not scale at all. In general, we find that the VaR scales with a parameter k (i.e., so VaR(h) = VaR{ 1)(h)K, where h is the holding period), and empirical evidence reported by Hauksson et. al. (2001, p. 93) sug gests an average value for k of about 0.45. The square-root scaling rule (i.e., k — 0.5) is therefore usually inappropriate for EV distributions.
Chapter 3 Parametric Approaches (II): Extreme Value
■ 37
X 10-3
and the highest 95% of extrem e (high) losses; it is not the 5th
estim ate the relevant VaRs. O f course, the VaR form ulas given by
percentile point of the parent distribution. The 5th percentile of
Equation (3.5) are meant only for extrem ely high confidence lev
the extrem e loss distribution is therefore on the /eft-hand side of
els, and w e cannot expect them to provide accurate estim ates
the distribution of extrem e losses (because it is a small extrem e
for VaRs at low confidence levels.
loss), but on the right-hand tail of the original loss distribution (because it is an extrem e loss). To see the connection between the probabilities associated with the distribution of M n and those associated with the distribution of X, we now let Mn* be some extrem e threshold value. It then follows that: Pr[Mn < M*] = p = {Pr[X < M*n]}" = [a]"
(3.4)
where a is the VaR confidence level associated with the thresh
Example 3.3
Gumbel VaR
For the standardised Gum bel and n = 100, the 99.5% VaR is —ln[—100 X ln(0.995)] = 0.6906, and the 99.9% VaR is —Inf—100 X ln(0.999)] = 2.3021.
Example 3.4
Frechet VaR
VaR — /in — —^[1 — (—n In(a))- ^1] (Frechet, £ > 0)
(3.5a)
0.2 and n — 100, the 99.5% (1/0.2)[1 - (-100 X ln(0.995)r02] = 0.7406 and the 99.9%VaR is —(1/0.2)[1 - (-100 X ln(0.999)r02] = 2.9237. For £ = 0.3, the 99.5%VaR is —(1/0.3)[1 - (-100 X ln(0.995)r03]
VaR — ixn — crn Inf—n In (a)] (Gum bel, £ = 0)
(3.5b)
= 3.3165.
old Mn*. To obtain the a VaR, we now use Equation (3.4) to sub stitute [a]n for p in Equation (3.3), and this gives us:
(Since n is now explicit, we have also subscripted the param e
For the standardised Frechet with £ =
VaR is —
= 0.7674 and the 99.9% VaR is —(1/0.3)[1 - (- 1 0 0 X ln (0.99 9)r 03
These results tell us that EV-VaRs (and, by im plication, other
ters with n to make explicit that in practice these would refer to
EV risk measures) are sensitive to the value of the tail index £n,
the param eters associated with maxima drawn from sam ples of
which highlights the im portance of getting a good estim ate of
size n. This helps to avoid errors with the limiting VaRs as n gets
£n when applying EVT. This applies even if we use a G um bel,
large.) Given values for the extrem e-loss distribution param
because we should use the Gum bel only if we think £n is insigni
eters /xn, crn and (where needed) £n, Equation (3.5) allows us to
ficantly different from zero.
38
■
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
Example 3.5
where k(x) varies slowly with x. For exam ple, if w e assume
R e a lis tic F re c h e t V aR
for convenience that k(x) is approxim ately constant, then
Suppose w e wish to estim ate Frech et VaRs with more real
Equation (3.6) becom es:
istic p aram eters. For US stock m arkets, som e fairly plau
F(x) « kx-1/£
sible param eters are fx — 2%, cr — 0.7% and £ = 0.3% . If
(3.7)
Now consider two probabilities, a first, 'in-sam ple' probability
we put these into our Frech et VaR form ula Equation (3.5a) and retain the earlier n value, the estim ated 99.5% VaR
P i n - s a m
p l e
#and a second, sm aller and typically out-of-sample
probability
(in %) is 2 - (0.7/0.3)[1 - (- 1 0 0 X ln (0 .9 9 5 )r03] = 2.537,
p
0 U t. 0 f .s a m p l e -
Equation (3.7) im plies:
and the estim ated 99.9% VaR (in %) is 2 — (0.7/0.3)
_
[1 — (—100 X ln(0.999))_a3] = 4.322. For the next trading
P i n - s a m
day, these estim ates tell us that w e can be 99.5% confident
P
of not m aking a loss in excess of 2.537% of the value of our portfolio, and so on.
o u t - o f - s a m
jL, - 1/£
p l e
p l e
™
k
J
i n - s a m
x
ana
p l e
o u t - o f - s a m
(3.8)
p l e
which in turn im plies: P i n - s a m
It is also interesting to note that had we assum ed a Gum bel P
(i.e., £ = 0) we would have estim ated these VaRs (again
p l e
o u t - o f - s a m
*
p l e
*
i n - s a m
p l e
o u t - o f - s a m
in %) to be 2 — 0.7 X ln[ —100 X ln(0.995)] = 2.483 and
(
2 —0.7 X ln[—100 X ln(0.999)] = 3.612. These are lower than
\
^
p l e
P i n - s a m
p l e
P o u t - o f - s a m
the Frechet VaRs, which underlines the im portance of getting
y
(3.9)
I
p l e
/
This allows us to estim ate one quantile (denoted here as
the £ right.
X
o u t - o f - s a m
p l e
) based on a known in-sample quantile x /n_samp/e, a (which is known
Flow do w e choose betw een the G um b el and the Frech et?
known out-of-sample probability
Th ere are various w ays w e can d ecid e which EV distribution
because it com es directly from our VaR confidence level), and an
to use:
unknown in-sample probability P\n.samp\e.
•
If we are confident that w e can identify the parent loss distri
The latter can easily be proxied by its em pirical counterpart, t/n,
bution, we can choose the EV distribution in whose domain
where n is the sam ple size and t the num ber of observations
of attraction the parent distribution resides. For exam ple, if
higher than x,n.samp/e. Using this proxy then gives us:
p
0 u t - o f - s a m p i e
we are confident that the original distribution is a t, then we would choose the Frechet distribution because the t belongs in the domain of attraction of the Frechet. In plain English, we choose the EV distribution to which the extrem es from
p l e
* i n - s a m
nP
o u t - o f - s a m
p l e
\
^
(3.10)
p l e y
To use this approach, we take an arbitrarily chosen in-sample
W e could te st the significance of the tail index, and we m ight choose the G um bel if the tail index w as insignificant and the Frech et o therw ise. H ow ever, this leaves us open to the danger that w e m ight incorrectly conclude that £ is
0 , and this could lead us to underestim ate our extrem e risk m easures. •
o u t - o f - s a m
which is easy to estim ate using readily available information.
the parent distribution will tend. •
(
X
Given the dangers of model risk and bearing in mind that the
quantile, x /n.samp/e, and determ ine its counterpart em pirical prob ability, t/n. We then determ ine our out-of-sample probability from our chosen confidence level, estim ate our tail index using a suitable m ethod, and our out-of-sample quantile estim ator im m ediately follows from Equation (3.10 ).6
estim ated risk measure increases with the tail index, a safer
Estimation of EV Parameters
option is always to choose the Frechet.
To estim ate EV risk m easures, we need to estim ate the relevant EV param eters fx, a and, in the case of the Frechet, the tail index £, so we can insert their values into our quantile form ulas
A Short-Cut EV Method There are also short-cut ways to estim ate VaR (or ES) using EV theory. These are based on the idea that if £ > 0, the tail of an extrem e loss distribution follows a power-law tim es a slowly varying function: F(x) = k(x)x_1/^
(3.6)
6 An alternative short-cut is suggested by Diebold et. al. (2000). They suggest that we take logs of Equation (3.7) and estimate the log-transformed relationship using regression methods. However, their method is still relatively untried, and its reliability is doubtful because there is no easy way to ensure that the regression procedure will produce a 'sensible' estimate of the tail index.
Chapter 3 Parametric Approaches (II): Extreme Value
■ 39
(i.e., Equation (3.5)). We can obtain estim ators using maximum
above.) In the case where £
likelihood (ML) m ethods, regression m ethods, moment-based or
together give
0, Equations (3.1) and (3.13)
sem i-param etric m ethods. T-— “ I + m
ML Estimation Methods
e xp [-(1 + U M „ - /*„)/ 0. The ML approach has som e attractive p ro p erties (e .g ., it is sta tisti cally well g ro und ed , p aram eter estim ato rs are co nsistent and asym p to tically norm al if £n > —1 / 2 , w e can easily te st hyp otheses about p aram eters using likelihood ratio sta tis tics, e tc .). H ow ever, it also lacks closed-form solutions for the p aram eters, so the M L approach requires the use of an ap p ro p riate num erical solution m ethod. This requires suitable so ftw are, and there is the d ang er th at M L estim ato rs m ight not be robust. In ad d itio n , b ecause the underlying th eo ry is asym p to tic, th ere is also the potential for problem s arising from sm allness of sam p les.
These are typically used to estimate the tail index £, and the most popular of these is the Hill estimator. This estimator is directly applied to the ordered parent loss observations. Denoting these from highest to lowest by X 1; X 2, . . . , X n, the Hill
is:
2 > X i - l n X k +1
k 1=1
(3.17)
where k, the tail threshold used to estim ate the Hill estimator, has to be chosen in an appropriate way. The Hill estim ator is the average of the k most extrem e (i.e., tail) observations, minus the k + 1th observation, or the one next to the tail. The Hill estim a tor is known to be consistent and asym ptotically normally dis tributed, but its properties in finite sam ples are not well understood, and there are concerns in the literature about its
Regression Methods
small-sample properties and its sensitivity to the choice of
An easier method to apply is a regression method due to
threshold k. However, these (and other) reservations notwith
Gum bel (1958). 7 To see how the method works, we begin by
standing, many EV T practitioners regard the Hill estim ator as
ordering our sam ple of M'n values from lowest to highest, so
being as good as any other .8
Mn 1
< ...
< Mn - Because these are order statistics, it
follows that, for large n: E [H [M'n)]
1 + m
The main problem in practice is choosing a cut-off value for k. We know that our tail index estim ates can be sensitive to the
H(M')
1 + m
(3.13)
choice of k, but theory gives us little guidance on what the value of k should be. A suggestion often given to this problem is that
where H(M'n) is the cumulative density function of maxima, and we drop all redundant scripts for convenience. (See Equation (3.1)
7 See Gumbel (1958), pp. 226, 260, 296.
40
■
8 An alternative is the Pickands estimator (see, e.g., Bassi et. al. (1998), p. 125 or Longin (1996), p. 389). This estimator does not require a posi tive tail index (unlike the Hill estimator) and is asymptotically normal and weakly consistent under reasonable conditions, but is otherwise less effi cient than the Hill estimator.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
BOX 3.1 MOMENT-BASED ESTIMATORS OF EV PARAMETERS An alternative approach is to estim ate EV param eters using em pirical m om ents. Let m,- be the /th em pirical moment of our extrem es data set. Assum ing £ ^ 0, w e can adapt Em brechts et. al. (1997, pp. 293-295) to show that:
£ = m, + r(1)
-5 -
10
-
1997.5 1998 1998.5 1999 1999.5 2000
Fiqure 4.4
Bank VaR and trading profits.
ple failing to acco u n t fo r d iv e rsific a tion e ffe c ts. Yet ano ther exp lan atio n is th at capital req u irem en ts are cu rren tly not b ind ing . The am ount of eco n o m ic cap ital U .S . banks cu rren tly hold is in
p re fe r to rep o rt high VaR num bers to avoid the p o ssib ility of reg u lato ry intrusio n. S till, th ese p ractice s im poverish the inform ational co n ten t of VaR num bers.
e xce ss of th e ir reg u lato ry ca p ita l. A s a result, banks may
4.4 CO N CLU SIO N S 4 Including fees increases the P&L, reducing the number of violations. Using hypothetical income, as currently prescribed in the European Union, could reduce this effect. Jaschke, Stahl, and Stehle (2003) com pare the VaRs for 13 German banks and find that VaR measures are, on average, less conservative than for U.S. banks. Even so, VaR forecasts are still too high.
Model verification is an integral com ponent of the risk m anage ment process. Backtesting VaR numbers provides valuable fe e d back to users about the accuracy of their m odels. The procedure also can be used to search for possible im provem ents.
Chapter 4 Backtesting VaR
■ 57
BOX 4.2 NO EXCEPTIONS The C E O of a large bank receives a daily report of the bank's VaR and P&L. W henever there is an exception, the C E O calls in the risk officer for an explanation. Initially, the risk officer explained that a 99 percent VaR con fidence level implies an average of 2 to 3 exceptions per year. The C E O is never quite satisfied, however. Later, tired of going "upstairs," the risk officer simply increases the confidence level to cut down on the number of exceptions. Annual reports suggest that this is frequently the case. Financial institutions routinely produce plots of P&L that show no violation of their 99 percent confidence VaR over long periods, proclaiming that this supports their risk model.
Verification tests usually are based on "excep tio n " counts, defined as the number of exceedences of the VaR m easure. The goal is to check if this count is in line with the selected VaR confidence level. The method also can be modified to pick up bunching of deviations. Backtesting involves balancing two types of errors: rejecting a correct model versus accepting an incorrect m odel. Ideally, one would want a fram ew ork that has very high power, or high prob ability of rejecting an incorrect m odel. The problem is that the power of exception-based tests is low. The current fram ew ork could be improved by choosing a lower VaR confidence level or by increasing the number of data observations. Adding to these statistical difficulties, w e have to recognize other practical problem s. Trading portfolios do change over the horizon. M odels do evolve o v e rtim e as risk m anagers improve
Due thought should be given to the choice of VaR quantitative param eters for backtesting purposes. First, the horizon should
their risk modeling techniques. All this may cause further struc tural instability.
be as short as possible in order to increase the number of obser
D espite all these issues, backtesting has becom e a central
vations and to m itigate the effect of changes in the portfolio
com ponent of risk m anagem ent system s. The m ethodology
com position. Second, the confidence level should not be too
allow s risk m anagers to im prove their m odels constantly.
high because this decreases the effectiveness, or power, of the
Perhaps most im portant, backtesting should ensure that risk
statistical tests.
m odels do not go astray.
58
■
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
VaR Mapping Learning Objectives A fter com pleting this reading you should be able to: Explain the principles underlying VaR mapping and
D escribe how mapping of risk factors can support stress
describe the mapping process.
testing.
Explain how the mapping process captures general and
Explain how VaR can be com puted and used relative to a
specific risks.
perform ance benchm ark.
Differentiate among the three m ethods of mapping
D escribe the method of mapping forw ards, forward rate
portfolios of fixed income securities.
agreem ents, interest rate sw aps, and options.
Sum m arize how to map a fixed income portfolio into positions of standard instrum ents.
E x c e rp t is C h a p ter 17 o f Value at Risk: The New Benchm ark for Managing Financial Risk, Third Ed itio n , by Philippe Jo rio n .
59
The second [principle], to divide each of the difficulties under exam ination into as many parts as possible, and as might be necessary for its adequate solution. — Rene D escartes
W hichever value-at-risk (VaR) method is used, the risk m easure ment process needs to sim plify the portfolio by m apping the positions on the selected risk factors. M apping is the process by which the current values of the portfolio positions are replaced by exposures on the risk factors. Mapping arises because of the fundam ental nature of VaR, which is portfolio m easurem ent at the highest level. As a result, this is usually a very large-scale aggregation problem . It would be too com plex and tim e-consum ing to model all positions indi vidually as risk factors. Furtherm ore, this is unnecessary because many positions are driven by the same set of risk factors and can be aggregated into a small set of exposures without loss of risk inform ation. O nce a portfolio has been m apped on the risk factors, any of the three VaR m ethods can be used to build the distribution of profits and losses. This chapter illustrates the mapping process for major financial instrum ents. First we review the basic principles behind m ap ping for VaR. W e then proceed to illustrate cases where instru ments are broken down into their constituent com ponents. We will see that the mapping process is instructive because it reveals useful insights into the risk drivers of derivatives. The next sections deal with fixed-incom e securities and linear deriva tives. We cover the most im portant instrum ents, forward con tracts, forward rate agreem ents, and interest-rate sw aps. Then we describe nonlinear derivatives, or options.
BOX 5.1 WHY MAPPING? " J .R Morgan Chase's VaR calculation is highly granular, com prising more than 2.1 million positions and 240,000 pricing series (e.g ., securities prices, interest rates, foreign exchange rates)." (Annual report, 2004)
forw ard contracts. The positions may differ owing to different m aturities and delivery prices. It is unnecessary, however, to model all these positions individually. Basically, the positions are exposed to a single m ajor risk factor, which is the dollar/ euro spot exchange rate. Thus they could be sum m arized by a single aggregate exposure on this risk factor. Such ag g reg a tion, of course, is not appropriate for the pricing of the port folio. For risk m easurem ent purposes, however, it is perfectly accep tab le. This is why risk m anagem ent m ethods can differ from pricing m ethods. Mapping is also the only solution when the characteristics of the instrum ent change over tim e. The risk profile of bonds, for instance, changes as they age. O ne cannot use the history of prices on a bond directly. Instead, the bond must be m apped on yields that best represent its current profile. Sim ilarly, the risk profile of options changes very quickly. O ptions must be m apped on their primary risk factors. M apping provides a way to tackle these practical problem s.
Mapping as a Solution to Data Problems Mapping is also required in many common situations. O ften a com plete history of all securities may not exist or may not be relevant. Consider a mutual fund with a strategy of investing in initial p u b lic offerin gs (IPOs) of common stock. By definition, these stocks have no history. They certainly cannot be ignored
5.1 MAPPING FOR RISK M EASUREM EN T
in the risk system , however. The risk m anager would have to replace these positions by exposures on sim ilar risk factors already in the system .
Why Mapping?
Another common problem with global m arkets is the tim e at
The essence of VaR is aggregation at the highest level. This gen
or mutual funds invested in international stocks. As much as
which prices are recorded. Consider, for instance, a portfolio
erally involves a very large number of positions, including bonds,
15 hours can elapse from the tim e the m arket closes in Tokyo
stocks, currencies, com m odities, and their derivatives. As a result,
at 1:00
it would be impractical to consider each position separately (see
United States at 4:00
Box 5.1). Too many com putations would be required, and the
close ignore intervening information and are said to be stale.
tim e needed to measure risk would slow to a crawl.
This led to the mutual-fund scandal of 2003, which is described
Fortunately, m apping provides a shortcut. Many positions
a
.m .
EST (3:00
P .M . P.M .
in Jap an) to the tim e it closes in the As a result, prices from the Tokyo
in Box 5.2.
can be sim plified to a sm aller num ber of positions on a
For risk m anagers, stale prices cause problem s. Because returns
set of elem entary, or prim itive, risk factors. Consider, for
are not synchronous, daily correlations across m arkets are too
instance, a trader's desk with thousands of open dollar/euro
low, which will affect the m easurem ent of portfolio risk.
60
■
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
BOX 5.2 MARKET TIMING AND STALE PRICES In Septem ber 2003, New York Attorney G eneral Eliot Spitzer accused a number of investm ent com panies of allowing m arket tim ing into their funds. M arket timing is a short-term trading strategy of buying and selling the same funds. Consider, for exam ple, our portfolio of Jap anese and U.S. stocks, for which prices are set in different time zones. The problem is that U.S. investors can trade up to the close of the U.S. market. M arket tim ers could take advantage of this discrepancy by rapid trading. For instance, if the U.S. mar ket moves up following good news, it is likely the Japanese market will move up as well the following day. M arket timers would buy the fund at the stale price and resell it the next day.
O ne possible solution is mapping. For instance,
Such trading, however, creates transactions costs that are borne by the other investors in the fund. A s a result, fund com panies usually state in their prospectus that this practice is not allow ed. In practice, Eliot Spitzer found out that many mutual-fund com panies had encouraged m arket tim ers, which he argued was fraudulent. Eventually, a number of funds settled by paying more than USD 2 billion. This practice can be stopped in a number of ways. Many mutual funds now im pose short-term redem ption fees, which make m arket timing uneconom ical. Alternatively, the cutoff tim e for placing trades can be moved earlier.
Instruments
prices at the close of the U.S. market can be esti mated from a regression of Japanese returns on U.S. returns and using the forecast value conditional on the latest U.S. information. Alternatively, correlations can be measured from returns taken over longer time intervals, such as weekly. In practice, the risk manager needs to make sure that the data-collection process will lead to meaningful risk estim ates.
The Mapping Process Figure 5.1 illustrates a sim ple mapping process, where six instrum ents are m apped on three risk factors. The first step in the analysis is marking all positions to m arket in current dollars or w hatever reference currency is used. The m arket value for each instrum ent then is allocated to the three risk factors.
current m arket value is not fully allocated to the risk factors, it
Table 5.1 shows that the first instrum ent has a m arket value of
must mean that the rem ainder is allocated to cash, which is not
Vh which is allocated to three exposures, x 11# x12, and x-i3. If the
a risk factor because it has no risk.
Table 5.1
Mapping Exposures E x p o s u re on R isk F a c to r M a rk e t V alu e
1
2
3
Instrument 1
Vi
*11
x 12
x 13
Instrument 2
v2
X 21
x 22
x 23
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
Instrument 6
V6
*6 1
x 62
x 63
Total portfolio
V
Y 6 *1 = 2/= iXn
Y 6 X2 = 2/=lXj2
Y 6 x3 = 2 /=i Xi3
Chapter 5 VaR Mapping
■ 61
N ext, the system allocates the position for instrum ent 2 and
These exposures are aggregated across all the stocks in the
so on. A t the end of the process, positions are sum m ed for
portfolio. This gives
each risk factor. For the first risk factor, the dollar exposure is x-i = X /= iX n. This creates a vector x of three exposures that can be fed into the risk m easurem ent system . Mapping can be of two kinds. The first provides an exact alloca tion of exposures on the risk factors. This is obtained for deriva tives, for instance, when the price is an exact function of the risk factors. As w e shall see in the rest of this chapter, the partial derivatives of the price function generate analytical measures of exposures on the risk factors. A lternatively, exposures may have to be estim ated. This occurs, for instance, when a stock is replaced by a position in the stock index. The exposure then is estim ated by the slope coefficient from a regression of the stock return on the index return.
N
Pp =
2
(=1
W/A
( 5 -3 )
If the portfolio value is W, the mapping on the index is x = W(3p. N ext, we decom pose the variance Rp in Equation (5.2) and find N V(RP) = (f?p)V(Rm) + 2 w ? c r e2i 1=1
(5.4)
The first com ponent is the general m arket risk. The second com ponent is the aggregate of specific risk for the entire portfolio. This decom position shows that with more detail on the prim itive or general-m arket risk factors, there will be less specific risk for a fixed amount of total risk V{Rp). As another exam ple, consider a corporate bond portfolio. Bond positions describe the distribution of money flows over tim e by
General and Specific Risk
their am ount, tim ing, and credit quality of the issuer. This cre
This brings us to the issue of the choice of the set of prim itive risk factors. This choice should reflect the trade-off betw een
ates a continuum of risk factors, going from overnight to long m aturities for various credit risks.
better quality of the approxim ation and faster p ro cess
In practice, we have to restrict the number of risk factors to a
ing. More factors lead to tig h ter risk m easurem ent but also
small set. For som e portfolios, one risk factor may be sufficient.
require more tim e devoted to the m odeling process and risk
For others, 15 m aturities may be necessary. For portfolios with
com putation.
options, we need to model m ovem ents not only in yields but also in their implied volatilities.
The choice of prim itive risk factors also influences the size of specific risks. S p e cific risk can be defined as risk that is due to
O ur prim itive risk factors could be m ovem ents in a set of J gov
issuer-specific price m ovem ents, after accounting for general
ernm ent bond yields Zj and in a set of K credit spreads s^ sorted
m arket factors. Hence the definition of specific risk depends on
by credit rating. W e model the m ovem ent in each corporate
that of general m arket risk. The Basel rules have a separate
bond yield dy( by a m ovem ent in z at the closest maturity and in
charge for specific risk .1
s for the same credit rating. The remaining com ponent is er
To illustrate this decom position, consider a portfolio of N stocks.
The m ovem ent in value W th e n is
We are mapping each stock on a position in the stock market index, which is our prim itive risk factor. The return on a stock Ri is regressed on the return on the stock m arket index Rm, that is, R-, = atj + (3jRm + e(-
(5.1)
which gives the exposure /3,-. In what follows, ignore a, which does not contribute to risk. We assume that the specific risk owing to e is not correlated across stocks or with the m arket. The relative w eight of each stock in the portfolio is given by w(.
Rp =
N
N
2WiRi = i=1 2w&iRm + /=12Wi€i
/=1
K N + 2 DVBP kd sk + 2 )D V B P ide, k= 1 /=1
(5.5)
where DVBP is the total dollar value of a basis point for the asso ciated risk factor. The values for DVBPj then represent the sum mation of the DVBP across all individual bonds for each maturity. This leads to a total risk decom position of
Thus the portfolio return is N
N J d W — 2 DVBP id y i= 2 DVBPjdz/ ;=1 j= 1
(5 -2)
N V[dW) — general risk + ^ D V B P fV (d e ()
(5.6)
/=1
A greater number of general risk factors should create less residual risk. Even so, we need to ascertain the size of the second, specific 1 Typically, the charge is 4 percent of the position value for equities and unrated debt, assuming that the banks' models do not incorporate spe cific risks.
62
■
risk term. In practice, there may not be sufficient history to measure the specific risk of individual bonds, which is why it is often assumed that all issuers within the same risk class have the same risk.
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
5.2 MAPPING FIXED-IN COM E PORTFOLIOS
The portfolio has an average m aturity of 3 years and a
Mapping Approaches
zero-coupon rate.
duration of 2.733 years. The table lays out the present value of all portfolio cash flow s discounted at the appropriate
Principal mapping considers the timing of redem ption pay
O nce the risk factors have been selected , the question is how
ments only. Since the average maturity of this portfolio
to map the portfolio positions into exposures on these risk
is 3 years, the VaR can be found from the risk of a 3-year
factors. W e can distinguish three m apping system s for fixed-
maturity, which is 1.484 percent from Table 5.3. VaR then is
incom e portfolios: principal, duration, and cash flow s. W ith
USD 200 X 1.484/100 = USD 2.97 million. The only positive
prin cipal m apping, one risk facto r is chosen that corresponds to the average portfolio m aturity. W ith duration m apping, one risk facto r is chosen that corresponds to the portfolio dura
aspect of this method is its sim plicity. This approach overstates the true risk because it ignores intervening coupon paym ents. The next step in precision is duration m apping. W e replace
tion. W ith cash-flow m apping, the portfolio cash flow s are
the portfolio by a zero-coupon bond with maturity equal
grouped into m aturity buckets. M apping should preserve the m arket value of the position. Ideally, it also should preserve its m arket risk.
to the duration of the portfolio, which is 2.733 years. Table 5.3 shows VaRs of 0.987 and 1.484 for these maturi ties, respectively. Using a linear interpolation, we find a risk of
A s an exam p le, Table 5.2 d escrib es a two-bond portfolio
0.987 + (1.484 - 0.987) X (2.733 - 2) = 1.351 percent for this
consisting of a USD 100 million 5-year 6 percent issue and
hypothetical zero. With a USD 200 million portfolio, the duration-
a USD 100 million 1-year 4 percent issue. Both issues are
based VaR is USD 200 X 1.351/100 = USD 2.70 million, slightly
selling at par, im plying a m arket value of USD 200 m illion.
less than before.
Table 5.2
Mapping for a Bond Portfolio (USD millions) M ap p in g (PV)
C a sh F lo w s Term (Year)
5-Year
1-Year
S p o t R ate
USD 6
USD 104
4.000%
0.00
0.00
USD 105.77
2
USD 6
0
4.618%
0.00
0.00
USD 5.48
—
—
—
USD 200.00
—
3
USD 6
0
5.192%
USD 200.00
0.00
USD 5.15
4
USD 6
0
5.716%
0.00
0.00
USD 4.80
5
USD 106
0
6. 1 1 2 %
0.00
0.00
USD 78.79
USD 200.00
USD 200.00
USD 200.00
Total
Table 5.3 Term (Year)
Loss
C a sh F lo w
1
2.733
Total
D u ratio n
P rin cip al
Computing VaR from Change in Prices of Zeroes C a sh F lo w s
O ld Z e ro V alu e
O ld P V o f F lo w s
R isk (%)
N e w Z e ro V alu e
N e w P V o f F lo w s
1
USD 110
0.9615
USD 105.77
0.4696
0.9570
USD 105.27
2
USD 6
0.9136
USD 5.48
0.9868
0.9046
USD 5.43
3
USD 6
0.8591
USD 5.15
1.4841
0.8463
USD 5.08
4
USD 6
0.8006
USD 4.80
1.9714
0.7848
USD 4.71
5
USD 106
0.7433
USD 78.79
2.4261
0.7252
USD 76.88
USD 200.00
USD 197.37 USD 2.63
Chapter 5 VaR Mapping
■ 63
Finally, the cash-flow mapping method consists of grouping
measures, (USD 2.70 — USD 2.57 million), USD 70,000 is due to
all cash flows on term -structure "vertices" that correspond to
differences in yield volatility, and (USD 2.70 — USD 2.63 million),
m aturities for which volatilities are provided. Each cash flow
USD 60,000 is due to imperfect correlations. The last column pres
is represented by the present value of the cash paym ent, dis
ents the component VaR using computations as explained earlier.
counted at the appropriate zero-coupon rate.
Stress Test
The diversified VaR is com puted as VaR = a V x 'X x = V (x X V )'R (x X V)
(5.7)
Table 5.3 presents another approach to VaR that is directly derived from m ovem ents in the value of zeroes. This is an exam
where V = acr is the vector of VaR for zero-coupon bond
ple of stress testing.
returns, and R is the correlation matrix. Table 5.4 shows how to com pute the portfolio VaR using cash flow m apping. The second column reports the cash flows x from Table 5.2. Note that the current value of USD 200 million is fully allocated to the five risk factors. The third column presents the product of these cash flows with the risk of each vertex x X V, which represents the individual VaRs. With perfect correlation across all zeros, the VaR of the portfolio is N Undiversified VaR = ^ |x,-| V, /= 1
Assum e that all zeroes are perfectly correlated. Then we could decrease all zeroes' values by their VaR. For instance, the 1-year zero is worth 0.9615. Given the VaR in Table 5.3 of 0.4696, a 95 percent probability move would be for the zero to fall to 0.9615 X (1 — 0.4696/100) = 0.9570. If all zeroes are perfectly correlated, they should all fall by their respective VaR. This gen erates a new distribution of present-value factors that can be used to price the portfolio. Table 5.3 shows that the new value is USD 197.37 million, which is exactly USD 2.63 million below the original value. This num ber is exactly the same as the undiversi
which is USD 2.63 million. This number is close to the VaR obtained
fied VaR just com puted. The two approaches illustrate the link between computing VaR
from the duration approximation, which was USD 2.70 million.
through matrix multiplication and through movements in underly
The right side of the table presents the correlation m atrix of zeroes for m aturities ranging from 1 to 5 years. To obtain the portfolio VaR, we prem ultiply and postm ultiply the m atrix by the dollar amounts (xV) at each vertex. Taking the square root, we find a diversified VaR measure of USD 2.57 million.
ing prices. Com puting VaR through matrix multiplication is much more direct, however, and more appropriate because it allows nonperfect correlations across different sectors of the yield curve.
Benchmarking
Note that this is slightly less than the duration VaR of USD 2.70 million. This difference is due to two factors. First, risk measures
N ext, we provide a practical fixed-incom e com pute VaR in
are not perfectly linear with maturity, as we have seen in a previ
relative term s, that is, relative to a perform ance benchm ark.
ous section. Second, correlations are below unity, which reduces
Table 5.5 presents the cash-flow decom position of the
risk even further. Thus, of the USD 130,000 difference in these
J.P. Morgan U.S. bond index, which has a duration of 4.62 years.
Table 5.4
Computing the VaR of a USD 200 Million Bond Portfolio (Monthly VaR at 95 Percent Level)
Term (Year)
P V C a sh F lo w s
Individual V aR
C o rre la tio n M a trix R
X
x X V
1Y
2Y
C o m p o n e n t V aR 3Y
4Y
1
USD 105.77
0.4966
1
2
USD 5.48
0.0540
0.897
1
3
USD 5.15
0.0765
0.886
0.991
1
4
USD 4.80
0.0947
0.866
0.976
0.994
1
5
USD 78.79
1.9115
0.855
0.966
0.988
0.998
USD 200.00
2.6335
Total Undiversified VaR
5Y
USD 0.45 USD 0.05 USD 0.08
■
USD 0.09
1
USD 1.90
USD 2.63
Diversified VaR
64
xA VaR
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
USD 2.57
Table 5.5
Benchmarking a USD 100 Million Bond Index (Monthly Tracking Error VaR at 95 Percent Level) Position: Portfolio
Vertex
Risk (%)
Position: Index (USD)
1 (USD)
2 (USD)
3 (USD)
4 (USD)
5 (USD)
< 1m
0.022
1.05
0.0
0.0
0.0
0.0
84.8
3m
0.065
1.35
0.0
0.0
0.0
0.0
0.0
6m
0.163
2.49
0.0
0.0
0.0
0.0
0.0
1Y
0.470
13.96
0.0
0.0
0.0
59.8
0.0
2Y
0.987
24.83
0.0
0.0
62.6
0.0
0.0
3Y
1.484
15.40
0.0
59.5
0.0
0.0
0.0
4Y
1.971
11.57
38.0
0.0
0.0
0.0
0.0
5Y
2.426
7.62
62.0
0.0
0.0
0.0
0.0
7Y
3.192
6.43
0.0
40.5
0.0
0.0
0.0
9Y
3.913
4.51
0.0
0.0
37.4
0.0
0.0
10Y
4.250
3.34
0.0
0.0
0.0
40.2
0.0
15Y
6.234
3.00
0.0
0.0
0.0
0.0
0.0
20Y
8.146
3.15
0.0
0.0
0.0
0.0
0.0
30Y
11.119
1.31
0.0
0.0
0.0
0.0
15.2
100.00
100.0
100.0
100.0
100.0
100.0
Total Duration
4.62
4.62
4.62
4.62
4.62
4.62
Absolute VaR
USD 1.99
USD 2.25
USD 2.16
USD 2.04
USD 1.94
USD 1.71
Tracking error VaR
USD 0.00
USD 0.43
USD 0.29
USD 0.16
USD 0.20
USD 0.81
Assum e that we are trying to benchm ark a portfolio of
USD 1.99 million absolute risk of the index. The remaining track
USD 100 million. O ver a monthly horizon, the VaR of the index
ing error is due to nonparallel moves in the term structure.
at the 95 percent confidence level is USD 1.99 million. This is about equivalent to the risk of a 4-year note.
Relative to the original index, the tracking error can be m ea sured in term s of variance reduction, similar to an R2 in a regres
N ext, we try to match the index with two bonds. The rightm ost
sion. The variance im provem ent is
columns in the table display the positions of two-bond portfo lios with duration m atched to that of the index. Since no zero-
1 _ (
t
^ )
= 95‘4 P ercent
coupon has a maturity of exactly 4.62 years, the closest portfolio consists of two positions, each in a 4- and a 5-year zero. The
which is in line with the explanatory power of the first factor in
respective w eights for this portfolio are USD 38 million and
the variance decom position.
USD 62 million.
Next, we explore the effect of altering the composition of the
Define the new vector of positions for this portfolio as x and for
tracking portfolio. Portfolio 2 widens the bracket of cash flows
the index as x0. The VaR of the deviation relative to the bench mark is Tracking Error VaR = a V ( x — x 0) '2 ( x — x 0)
(5.8)
A fter perform ing the necessary calculations, we find that the
in years 3 and 7. The TE-VaR is USD 0.29 million, which is an improvement over the previous number. Next, portfolio 3 has positions in years 2 and 9. This comes the closest to approximat ing the cash-flow positions in the index, which has the greatest weight on the 2-year vertex. The TE-VaR is reduced further to
tracking error VaR (TE-VaR) of this duration-hedged portfolio
USD 0.16 million. Portfolio 4 has positions in years 1 and 10. Now
is USD 0.43 million. Thus the maximum deviation between the
the TE-VaR increases to USD 0.20 million. This mistracking is even
index and the portfolio is at most USD 0.43 million under normal
more pronounced for a portfolio consisting of 1 -month bills and
m arket conditions. This potential shortfall is much less than the
30-year zeroes, for which the TE-VaR increases to USD 0.81 million.
Chapter 5 VaR Mapping
■ 65
Am ong the portfolios considered here, the lowest tracking error
consider the fact that investors have two alternatives that are
is obtained with portfolio 3. Note that the absolute risk of these
econom ically equivalent: (1) Buy e-yr units of the asset at the
portfolios is lowest for portfolio 5. As correlations decrease
price S t and hold for one period, or (2) enter a forward contract
for more distant m aturities, we should exp ect that a duration-
to buy one unit of the asset in one period. Under alternative
m atched portfolio should have the lowest absolute risk for the
1 , the investm ent will grow, with reinvestm ent of dividend, to
com bination of most distant m aturities, such as a barbell port
exactly one unit of the asset after one period. Under alterna
folio of cash and a 30-year zero. However, minimizing absolute
tive 2 , the contract costs ft upfront, and we need to set aside
m arket risk is not the sam e as minimizing relative m arket risk. This exam ple dem onstrates that duration hedging only provides a first approxim ation to interest-rate risk m anagem ent. If the goal is to minimize tracking error relative to an index, it is essen tial to use a fine decom position of the index by maturity.
enough cash to pay K in the future, which is Ke~n . A fter 1 year, the two alternatives lead to the same position, one unit of the asset. Therefore, their initial cost must be identical. This leads to the following valuation formula for outstanding forward contracts: ft =
5.3 MAPPING LINEAR DERIVATIVES
(5.9)
Note that we can repeat the preceding reasoning to find the current forward rate F t that would set the value of the contract to zero. Setting K — Ft and ft — 0 in Equation (5.9), we have
Forward Contracts Forward and futures contracts are the sim plest types of deriva tives. Since their value is linear in the underlying spot rates,
F t = (Ste~yr)erT
ft = F te~rr - K e -n = (Ft - K)e~n
Assum e, for instance, that we are dealing with a forward con tract on a foreign currency. The basic valuation formula can be
(5.10)
This allows us to rewrite Equation (5.9) as
their risk can be constructed easily from basic building blocks.
(5.11)
In other words, the current value of the forward contract is the present value of the difference between the current forward rate
derived from an arbitrage argum ent.
and the locked-in delivery rate. If we are long a forward contract
To establish notations, define
with contracted rate K, we can liquidate the contract by entering a
S t — spot price of one unit of the underlying cash asset
new contract to sell at the current rate Ft. This will lock in a profit of (Ft — K ), which we need to discount to the present time to find ft.
K — contracted forward price
Let us exam ine the risk of a 1-year forward contract to purchase
r = dom estic risk-free rate
100 million euros in exchange for USD 130.086 million. Table
y = income flow on the asset
5.6 displays pricing information for the contract (current spot, forw ard, and interest rates), risk, and correlations. The first step
r — time to maturity. W hen the asset is a foreign currency, y represents the foreign risk-free rate r*. We will use these two notations interchange ably. For convenience, we assume that all rates are com
is to find the m arket value of the contract. W e can use Equa tion (5.9), accounting for the fact that the quoted interest rates are discretely com pounded, as ft — USD 1.2877
pounded continuously. We seek to find the current value of a forward contract ft to buy one unit of foreign currency at K after tim e r . To do this, we
Table 5.6
- Ke~n
1 1 - USD 1.3009 (1 + 3.3304/100) (1 + 2.2810/100)
= USD 1.2589 - USD 1.2589 = 0
Risk and Correlations for Forward Contract Risk Factors (Monthly VaR at 95 Percent Level) Correlations
Risk Factor EUR spot
Price or Rate USD 1.2877
VaR (%)
EUR Spot
4.5381
1
0.1289
1
Long EUR bill
2.2810%
0.1396
0.1289
Short USD bill
3.3304%
0 .212 1
0.0400
EUR forward
66
■
EUR 1Y
-0 .0 5 8 3
USD 1.3009
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
USD 1Y 0.0400 -0 .0 5 8 3
1
Thus the initial value of the contract is zero. This value, however, may change, creating m arket risk.
This shows that the forward position can be separated into three cash flows: (1) a long spot position in EUR,
Am ong the three sources of risk, the volatility of the spot con tract is the highest by far, with a 4.54 percent VaR (correspond ing to 1.65 standard deviations over a month for a 95 percent confidence level). This is much greater than the 0.14 percent VaR for the EUR 1-year bill or even the 0.21 percent VaR for the USD bill. Thus most of the risk of the forward contract is driven
worth EUR 100 million = USD 130.09 million in a year, or (Se-r*T) = USD 125.89 million now, (2) a long position in a EUR investm ent, also worth USD 125.89 million now, and (3) a short position in a USD investm ent, worth USD 130.09 million in a year, or (Ke~rT) — USD 125.89 million now. Thus a position in the forward contract has three building blocks: Long forward contract = long foreign currency spot +
by the cash EUR position. But risk is also affected by correlations. The positive correlation of 0.13 betw een the EUR spot and bill positions indicates that
long foreign currency bill + short U .S.dollar bill Considering only the spot position, the VaR is USD 125.89 mil
when the EUR goes up in value against the dollar, the value of a
lion tim es the risk of 4.538 percent, which is USD 5.713 million.
1-year EUR investm ent is likely to appreciate. Therefore, higher
To com pute the diversified VaR, we use the risk m atrix from the
values of the EUR are associated with lower EUR interest rates.
data in Table 5.7 and pre- and postm ultiply by the vector of positions (PV of flows column in the table). The total VaR for the
This positive correlation increases the risk of the combined position. On the other hand, the position is also short a 1-year USD bill, which is correlated with the other two legs of the trans action. The issue is, what will be the net effect on the risk of the
VaR provides an exact answer to this question, which is dis played in Table 5.7. But first we have to com pute the positions x on each of the three building blocks of the contract. By taking the partial derivative of Equation (5.9) with respect to the risk factors, w e have
volatility dom inates the volatility of 1 -year bonds.
term currency sw aps, which are equivalent to portfolios of forward contracts. For instance, a 10-year contract to pay dol lars and receive euros is equivalent to a series of 10 forward contracts to exchange a set amount of dollars into marks. To com pute the VaR, the contract must be broken down into a currency-risk com ponent and a string of USD and EUR fixed-
df
df .
df .
dS
dr*
dr
income com ponents. As before, the total VaR will be driven pri
d f — — d S H----- dr* H---- dr
marily by the currency com ponent.
= e~r*Td S - Se~r\ d r * + K e ^ r d r
(5.12)
Here, the building blocks consist of the spot rate and interest rates. Alternatively, we can replace interest rates by the price of bills. Define these as P — e~n and P* — e-f*T. We then replace dr with dP using d P —
{-T)e~n
dr and dP*
—
(—r)e~r*T dr*. The
+ (Se~r*T)
Commodity Forwards The valuation of forward or futures contracts on com m odities is substantially more com plex than for financial assets such as currencies, bonds, or stock indices. Such financial assets have
risk of the forward contract becom es
Table 5.7
same size as that of the spot contract because exchange-rate
More generally, the same m ethodology can be used for long
forward contract?
.,
forward contract is USD 5.735 million. This number is about the
dP* P*
a well-defined income flow y, which is the foreign interest rate, (5.13)
the coupon paym ent, or the dividend yield, respectively.
Computing VaR for a EUR 100 Million Forward Contract (Monthly VaR at 95 Percent Level)
Position
Present-Value Factor
Cash Flows (CF)
EUR spot
PV of Flows, x
Individual VaR, x V
Component VaR, xAVaR
USD 125.89
USD 5.713
USD 5.704
Long EUR bill
0.977698
EU R100.00
USD 125.89
USD 0.176
USD 0.029
Short USD bill
0.967769
- USD 130.09
- U S D 125.89
USD 0.267
USD 0.002
Undiversified VaR Diversified VaR
USD 6.156 USD 5.735
Chapter 5 VaR Mapping
■ 67
Table 5.8
Risk of Commodity Contracts (Monthly VaR at 95 Percent Level) Energy Products
Maturity
Natural Gas
Heating Oil
Crude Oil-WTI
Unleaded Gasoline
1 month
28.77
22.07
20.17
19.20
3 months
22.79
20.60
18.29
17.46
6 months
16.01
16.67
16.26
15.87
12 months
12.68
14.61
14.05
—
Base Metals Maturity
Aluminum
Copper
Nickel
Zinc
Cash
11.34
13.09
18.97
13.49
3 months
11.0 1
12.34
18.41
13.18
15 months
8.99
10.51
15.44
11.95
27 months
7.27
9.57
11.59
—
Precious Metals Maturity
Silver
Gold
Cash
6.18
Platinum 14.97
7.70
Things are not so simple for com m odities, such as m etals, agri
per barrel. Using a present-value factor of 0.967769, this trans
cultural products, or energy products. Most products do not
lates into a current position of USD 43,743,000.
make m onetary paym ents but instead are consum ed, thus creat ing an implied benefit. This flow of benefit, net of storage cost,
Differentiating Equation (5.11), we have
is loosely called con ven ien ce y ie ld to represent the benefit from
e~n d F =
holding the cash product. This convenience yield, however, is
(5.14)
not tied to another financial variable, such as the foreign inter
The term between parentheses therefore represents the exp o
est rate for currency futures. It is also highly variable, creating its
sure. The contract VaR is
own source of risk. As a result, the risk m easurem ent of com m odity futures uses Equation (5.11) directly, where the main driver of the value of the contract is the current forward price for this com m odity.
VaR = USD 43,743,000 X 14.05/100 = USD 6,146,000 In general, the contract cash flows will fall between the maturities of the risk factors, and present values must be apportioned accordingly.
Table 5.8 illustrates the term structure of volatilities for selected energy products and base metals. First, we note that monthly VaR m easures are very high, reaching 29 percent for near con
Forward Rate Agreements
tracts. In contrast, currency and equity m arket VaRs are typically
Forward rate agreem ents (FRAs) are forward contracts that allow
around 6 percent. Thus com m odities are much more volatile
users to lock in an interest rate at some future date. The buyer
than typical financial assets.
of an FRA locks in a borrowing rate; the seller locks in a lending
Second, we observe that volatilities decrease with maturity. The effect is strongest for less storable products such as energy products
rate. In other w ords, the "long" receives a paym ent if the spot rate is above the forward rate.
and less so for base metals. It is actually imperceptible for precious
Define the timing of the short leg as t -i and of the long leg as t 2,
metals, which have low storage costs and no convenience yield. For
both expressed in years. Assum e linear com pounding for sim
financial assets, volatilities are driven primarily by spot prices, which
plicity. The forward rate can be defined as the implied rate that
implies basically constant volatilities across contract maturities.
equalizes the return on a r2-period investm ent with a Tr period
Let us now say that we wish to com pute the VaR for a 12-month forward position on 1 million barrels of oil priced at USD 45.2
68
■
investm ent rolled over, that is, (1 + R t 2) = (1 + f t o X I + F 1; 2(t 2 - n )] 2
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
(5.15)
Table 5.9
Computing the VaR of a USD 100 Million FRA (Monthly VaR at 95 Percent Level)
Position
PV of Flows, x
Risk (% ), V
180 days
- U S D 97.264
0.1629
1
360 days
USD 97.264
0.4696
0.8738
Correlation Matrix, R
Individual VaR, x V
Component VaR, xAVaR
0.8738
USD 0.158
- U S D 0.116
1
USD 0.457
USD 0.444
Undiversified VaR
USD 0.615
Diversified VaR
USD 0.327
For instance, suppose that you sold a 6 X 12 FRA on USD 100
To illustrate, let us com pute the VaR of a USD 100 million 5-year
million. This is equivalent to borrowing USD 100 million for 6
interest-rate swap. We enter a dollar swap that pays 6.195 per
months and investing the proceeds for 12 months. When the FRA
cent annually for 5 years in exchange for floating-rate paym ents
expires in 6 months, assume that the prevailing 6-month spot rate
indexed to London Interbank O ffer Rate (LIBO R). Initially, we
is higher than the locked-in forward rate. The seller then pays
consider a situation where the floating-rate note is about to be
the buyer the difference between the spot and forward rates
reset. Ju st before the reset period, we know that the coupon
applied to the principal. In effect, this payment offsets the higher
will be set at the prevailing m arket rate. Therefore, the note car
return that the investor otherwise would receive, thus guarantee
ries no m arket risk, and its value can be m apped on cash only.
ing a return equal to the forward rate. Therefore, an FRA can be
Right after the reset, however, the note becom es similar to a bill
decom posed into two zero-coupon building blocks.
with maturity equal to the next reset period.
Long 6 X 1 2 FRA = long 6 -month bill + short 1 2 -month bill Table 5.9 provides a worked-out exam ple. If the 360-day spot rate is 5.8125 percent and the 180-day rate is 5.6250 percent, the forward rate must be such that _ (1 + 5.8125/100)
1,2
(1 + 5.6250/200)
or F = 5.836 percent. The present value of the notional USD 100 million in 6 months is x = USD 100/(1 + 5.625/200) = USD 97.264 million. This amount is invested for 12 months. In the m eantim e, what is the risk of this FRA?
Interest-rate swaps can be viewed in two different ways: as (1) a com bined position in a fixed-rate bond and in a floating-rate bond or (2) a portfolio of forward contracts. W e first value the swap as a position in two bonds using risk data from Table 5.4. The analysis is detailed in Table 5.10. The second and third columns lay out the paym ents on both legs. Assum ing that this is an at-the-m arket sw ap, that is, that its coupon is equal to prevailing swap rates, the short position in the fixed-rate bond is worth USD 100 million. Ju st before reset, the long position in the FRN is also worth USD 100 million, so the m arket value of the swap is zero. To clarify the allocation of current values, the FRN is allocated to cash, with a zero maturity.
Table 5.9 displays the com putation of VaR for the FRA. The
This has no risk.
VaRsof 6 - and 12-month zeroes are 0.1629 and 0.4696, respec
The next column lists the zero-coupon swap rates for maturities
tively, with a correlation of 0.8738. Applied to the principal of USD 97.26 million, the individual VaRs are USD 0.158 million and USD 0.457 million, which gives an undiversified VaR of USD 0.615 million. Fortunately, the correlation substantially low ers the FRA risk. The largest amount the position can lose over a month at the 95 percent level is USD 0.327 million.
Interest-Rate Swaps Interest-rate swaps are the most actively used derivatives. They create exchanges of interest-rate flows from fixed to floating or vice versa. Swaps can be decom posed into two legs, a fixed leg and a floating leg. The fixed leg can be priced as a coupon
going from 1 to 5 years. The fifth column reports the present value of the net cash flow s, fixed minus floating. The last column presents the com ponent VaR, which adds up to a total diversi fied VaR of USD 2.152 million. The undiversified VaR is obtained from summing all individual VaRs. As usual, the USD 2.160 mil lion value som ew hat overestim ates risk. This swap can be view ed as the sum of five forward contracts, as shown in Table 5.11. The 1-year contract prom ises paym ent of USD 100 million plus the coupon of 6.195 percent; discounted at the spot rate of 5.813 percent, this yields a present value of —USD 100.36 million. This is in exchange for USD 100 million now, which has no risk.
paying bond; the floating leg is equivalent to a floating-rate
The next contract is a 1 X 2 forward contract that prom
note (FRN).
ises to pay the principal plus the fixed coupon in 2 years, or
Chapter 5 VaR Mapping
■ 69
Table 5.10
Computing the VaR of a USD 100 Million Interest-Rate Swap (Monthly VaR at 95 Percent Level) Cash Flows
Term (Year)
Fixed USD 0
0
Float
Spot Rate
PV of Net Cash Flows
+ USD 100
Individual VaR
+ USD 100.000
Component VaR
USD 0
USD 0
1
- U S D 6.195
USD 0
5.813%
- U S D 5.855
USD 0.027
USD 0.024
2
- U S D 6.195
USD 0
5.929%
- U S D 5.521
USD 0.054
USD 0.053
3
- U S D 6.195
USD 0
6.034%
- U S D 5.196
USD 0.077
USD 0.075
4
- U S D 6.195
USD 0
6.130%
- U S D 4.883
USD 0.096
USD 0.096
5
- U S D 106.195
USD 0
6.217%
- U S D 78.546
USD 1.905
USD 1.905
Total
USD 0.000
Undiversified VaR
USD 2.160 USD 2.152
Diversified VaR
Table 5.11
An Interest-Rate Swap Viewed as Forward Contracts (Monthly VaR at 95 Percent Level) PV of Flows: Contract
Term (Year)
1x 2
1
1
- U S D 100.36
3
VaR
USD 89.11 - U S D 89.08
4
USD 83.88 - U S D 83.70
5 VaR
4X5
USD 94.50 - U S D 94.64
2
3X4
2X3
USD 78.82 - U S D 78.55
USD 0.471
USD 0.571
USD 0.488
USD 0.446
USD 0.425
Undiversified VaR
USD 2.401
Diversified VaR
USD 2.152
—USD 106.195 million; discounted at the 2-year spot rate, this
from USD 2.152 million to USD 1.763 million. More generally,
yields —USD 94.64 million. This is in exchange for USD 100 mil
the swap's VaR will converge to zero as the swap m atures, dip
lion in 1 year, which is also USD 94.50 million when discounted
ping each tim e a coupon is set.
at the 1-year spot rate. And so on until the fifth contract, a 4 X 5 forward contract.
5.4 MAPPING OPTIONS
Table 5.11 shows the VaR of each contract. The undiversified VaR of USD 2.401 million is the result of a sim ple summation
We now consider the mapping process for nonlinear derivatives,
of the five VaRs. The fully diversified VaR is USD 2.152 million,
or options. O bviously, this nonlinearity may create problem s for
exactly the same as in the preceding table. This dem onstrates
risk m easurem ent system s based on the delta-normal approach,
the equivalence of the two approaches.
which is fundam entally linear.
Finally, we exam ine the change in risk after the first paym ent has
To sim plify, consider the Black-Scholes (BS) model for
just been set on the floating-rate leg. The FRN then becom es a
European o p tio n s .2 The model assum es, in addition to
1 -year bond initially valued at par but subject to fluctuations in rates. The only change in the pattern of cash flows in Table 5.10 is to add USD 100 million to the position on year 1 (from —USD 5.855 to USD 94.145). The resulting VaR then decreases
70
■
2 For a systematic approach to pricing derivatives, see the excellent book by Hull (2005).
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
Table 5.12
Derivatives for a European Call
Parameters: S — USD 100, a — 20%, r — 5%, r* — 3%, r = 3 months Exercise Price Variable c
K = 90
Unit Dollars
K = 100
K = 110
11.01
4.20
1.04
Change p e r
A
Spot price
Dollar
0.869
0.536
0.195
r
Spot price
Dollar
0.020
0.039
0.028
A
Volatility
(% pa)
0.102
0.197
0.138
P
Interest rate
(% pa)
0.190
0.123
0.046
P*
A sset yield
(% pa)
-0.217
-0.133
-0.049
0
Tim e
Day
-0.014
-0.024
-0.016
p erfect capital m arkets, that the underlying spot price follow s
Delta increases with the underlying spot price. The relationship
a continuous g e o m e tric brow nian m otion with constant vo latil
becom es more nonlinear for short-term options, for exam ple,
ity cr(dS/S). Based on these assum ptions, the Black-Scholes
with an option maturity of 10 days. Linear m ethods approxim ate
(1973) m odel, as expanded by M erton (1973), gives the value
delta by a constant value over the risk horizon. The quality of
of a European call as
this approxim ation depends on param eter values.
c = c(S,/C,T,r,r*,cr) = S e ^ TN(c/-,) - Ke~rTN{d2)
(5.16)
where N[d) is the cum ulative normal distribution function with argum ents
For instance, if the risk horizon is 1 day, the worst down move in the spot price is - a S a V f = - 1 .6 4 5 X USD 100 X 0.20 V l/ 2 5 2 = - U S D 2.08, leading to a worst price of U SD 97.92. W ith a 90-day option, delta changes from 0.536 to 0.452 only. With such a
ln(Se~r*7/ < - Ax) 11 = 1
where x t is the return of asset X at tim e t and n is the number of observed points in tim e. The volatility of asset Y is derived accordingly. Equation 7.2 can be com puted with = stdev in Excel and std in M A TLA B . From our exam ple in Table 7.1, we find that crx — 44.51% and cry — 47.58% .
Table 7.1
Performance of a Portfolio with Two Assets Return of Asset X
Return of Asset Y
Year
Asset X
2013
100
200
W hy study financial correlations? That's an easy one. Financial
2014
120
230
20.00%
15.00%
correlations appear in many areas in finance. W e will briefly dis
2015
108
460
- 1 0 .0 0 %
100.00%
2016
190
410
75.93%
- 1 0 .8 7 %
2017
160
480
- 1 5 .7 9 %
17.07%
2018
280
380
75.00%
- 2 0 .8 3 %
A verage
29.03%
20.07%
cuss five areas: (1) investm ents, (2) trading, (3) risk m anagem ent,
2 To hedge means to protect More precisely, hedging means to enter into a second trade to protect against the risk of an orginal trade.
108
■
Asset Y
Financial Risk Manager Exam Part II: Market Risk Measurement and Management
W ith equal w eights, ie, wx — w Y — 0.5, the exam ple in Table 7.1 results in op = 16.66% . Importantly, the standard deviation (or its square, the variance) is interpreted in finance as risk. The higher the standard deviation, the higher the risk of an asset or a portfolio. Is standard deviation a good measure of risk? The answer is: it's not great, but it's one of the best there are. A high standard deviation may mean high upside potential of the asset in question! So it penalises possible profits! But high standard deviation naturally also means high downside risk. In particular, risk-averse investors will not like a high standard deviation, ie, high fluctuation of their returns. An inform ative perform ance measure of an asset or a portfolio is the risk-adjusted return, ie, the return/risk ratio. For a portfolio it is /xp/ap, which we derived in
The negative relationship of the portfolio return/ portfolio risk ratio fxP/ j. Hence w e have
option, see w w w .dersoft.com /quanto.xls. The correlation betw een assets can also be traded directly with
Prealised =
a correlation swap. In a correlation sw ap, a fixed (ie, known) cor
3 2 _ 3 ^ ,5 + ° ‘3 +
= 0 -3
relation is exchanged with the correlation that will actually occur,
Following equation (7.7), the payoff for the correlation fixed-
called realised or stochastic (ie, unknown) correlation, as seen in
rate payer at swap maturity is USD 1,000,000 X (0.3 — 0.1) =
Figure 7.5.
USD 200,000.
Paying a fixed rate in a correlation swap is also called "buying cor relation". This is because the present value of the correlation swap will increase for the correlation buyer if the realised correlation increases. Naturally the fixed-rate receiver is "selling correlation".
Correlation swaps can indirectly protect against decreasing stock prices. As we will see in this chapter in "H ow does cor relation risk fit into the broader picture of risks in finance?", Figure 7.8, as well as in C hapter 8, when stock prices decrease,
The realised correlation p in Figure 7.5 is the correlation
typically the correlation betw een the stocks increases. Hence a
betw een the assets that actually occur during the tim e of the
fixed correlation payer protects them selves indirectly against a
swap. It is calculated as:
stock m arket decline.
P r e a l i s e d
?
2
5>y
n —n»
i
where p(J is the Pearson correlation betw een asset / and j, and n is the number of assets in the portfolio. The payoff of a correla tion swap for the correlation fixed rate payer at maturity is: W
p r e a l i s e d
~
P f i x e d )
A t the tim e of writing there is no industry-standard valuation model for correlation swaps. Traders often use historical data to anticipate /^realised- To apply swap valuation techniques, we require a term structure of correlation in tim e. However, no correlation term structure currently exists. We can also apply stochastic correlation m odels to value a correlation swap.
(?•?)
where N is the notional amount. Let's look at an exam ple of a correlation swap.
Stochastic correlation m odels are currently em erging. A nother way of buying correlation (ie, benefiting from an increase in correlation) is to buy put options on an index such as the Dow Jo n e s Industrial A verage (Dow) and sell put options on
Example 7.1
individual stock of the Dow. As we will see in C hapter 8, there is
W hat is the payoff of a correlation swap with three assets, a
a positive relationship between correlation and volatility.
fixed rate of 10%, a notional amount of USD 1,000,000 and a 1-year maturity?
Pairwise Pearson Correlation Coefficient at Swap Maturity Table 7.2
First, the daily log-returns ln(St/ S t_ 1) of the three assets are calculated for one year.3*Let's assume the realised pairwise
3 Log-returns ln(S1/S 0) are an approximation of percentage returns (S-1 — S0)/So- We typically use log-returns in finance since they are additive in time, whereas percentage returns are not. For details see Appendix A2.
Sj=2
SJ = 1
Sj=3
S;=1
1
0.5
0.1
Sj=2
0.5
1
0.3
$i=3
0.1
0.3
1
Chapter 7 Correlation Basics: Definitions, Applications, and Terminology
■ 111
Therefore, if the correlation between the stocks of the Dow
chapter on m arket risk. M arket risk consists of four types of risk:
increases, for exam ple in a m arket downturn, so will the implied
(1) equity risk, (2) interest-rate risk, (3) currency risk and (4) com
volatility4 of the put on the Dow. This increase is expected to
modity risk.
outperform the potential loss from the increase in the short put positions on the individual stocks. Creating exposure on an index and hedging with exposure on
There are several concepts to measure the m arket risk of a port folio such as VaR, expected shortfall (ES), enterprise risk m anagem ent (ERM ) and more. VaR is currently (year 2018) the
individual com ponents is exactly what the "London w h ale", JP
most w idely applied risk-m anagem ent m easure. Let's show the
Morgan's London trader Bruno Iksil, did in 2012. Iksil was called
im pact of asset correlation on V aR.6
the London whale because of his enorm ous positions in C D S s.5 He had sold C D Ss on an index of bonds, the C D X .N A .IG .9 , and "h ed g ed " it with buying C D Ss on individual bonds. In a recover ing econom y this is a promising trade: volatility and correlation
First, what is value-at-risk (VaR)? VaR m easures the maximum loss of a portfolio with respect to m arket risk for a certain proba bility level and for a certain tim e fram e. The equation for VaR is: VaRp — (Tpa\/x
typically decrease in a recovering econom y. Therefore, the sold C D Ss on the index should outperform (decrease more than) the losses on the C D Ss of the individual bonds.
(7.8)
where VaRP is the value-at-risk for portfolio P, and a: A bscise value of a standard normal distribution,
But what can be a good trade in the medium and long term s
corresponding to a certain confidence level. It can be derived
can be disastrous in the short term . The positions of the London
as = norm sinv(confidence level) in Excel or norm inv(confidence
whale w ere so large, that hedge funds "short squeezed" Iksil:
level) in M A TLA B ; a takes the values
they started to aggressively buy the CD S index C D X .N A .IG .9 . This increased the C D S values in the index and created a huge
— oo
< a