2022 FRM Exam Part 2 - Market Risk Measurement and Management [1] 0137686617, 9780137686612


Table of contents :
Contents
Preface
Chapter 1 Estimating Market Risk Measures
Chapter 2 Non-Parametric Approaches
Chapter 3 Parametric Approaches (II): Extreme Value
Chapter 4 Backtesting VaR
Chapter 5 VaR Mapping
Chapter 6 Messages from the Academic Literature on Risk Management for the Trading Book
Chapter 7 Correlation Basics: Definitions, Applications, and Terminology
Chapter 8 Empirical Properties of Correlation: How Do Correlations Behave in the Real World?
Chapter 9 Financial Correlation Modeling—Bottom-Up Approaches
Chapter 10 Empirical Approaches to Risk Metrics and Hedging
Chapter 11 The Science of Term Structure Models
Chapter 12 The Evolution of Short Rates and the Shape of the Term Structure
Chapter 13 The Art of Term Structure Models: Drift
Chapter 14 The Art of Term Structure Models: Volatility and Distribution
Chapter 15 Volatility Smiles
Chapter 16 Fundamental Review of the Trading Book
Index

GARP FRM Financial Risk Manager

Market Risk Measurement and Management

Pearson

Excerpts taken from: Options, Futures, and Other Derivatives, Tenth Edition by John C. Hull. Copyright © 2017, 2015, 2012, 2009, 2006, 2003, 2000 by Pearson Education, Inc., New York, New York 10013.

Copyright © 2022, 2021 Global Association of Risk Professionals. All rights reserved.

This copyright covers material written expressly for this volume by the editor/s as well as the compilation itself. It does not cover the individual selections herein that first appeared elsewhere. Permission to reprint these has been obtained by Pearson Learning Solutions for this edition only. Further reproduction by any means, electronic or mechanical, including photocopying and recording, or by any information storage or retrieval system, must be arranged with the individual copyright holders noted.

Grateful acknowledgment is made to the following sources for permission to reprint material copyrighted or controlled by them:

"Estimating Market Risk Measures," "Non-Parametric Approaches," and "Parametric Approaches (II): Extreme Value" by Kevin Dowd, reprinted from Measuring Market Risk, Second Edition (2005), by permission of John Wiley & Sons, Inc. All rights reserved. Used under license from John Wiley & Sons, Inc.

"Backtesting VaR" and "VaR Mapping," by Philippe Jorion, reprinted from Value at Risk: The New Benchmark for Managing Financial Risk, Third Edition (2007), by permission of The McGraw Hill Companies.

"Messages from the Academic Literature on Risk Measurement for the Trading Book," Working Paper No. 19, January 2011, reprinted by permission of the Basel Committee on Banking Supervision.

"Some Correlation Basics: Definitions, Applications, and Terminology," "Empirical Properties of Correlation: How Do Correlations Behave in the Real World?," and "Financial Correlation Modeling—Bottom-Up Approaches," by Gunter Meissner, reprinted from Correlation Risk Modeling and Management, Second Edition (2019), by permission of Risk Books/InfoPro Digital Services, Ltd.

"Empirical Approaches to Risk Metrics and Hedges," "The Science of Term Structure Models," "The Evolution of Short Rates and the Shape of the Term Structure," "The Art of Term Structure Models: Drift," and "The Art of Term Structure Models: Volatility and Distribution," by Bruce Tuckman and Angel Serrat, reprinted from Fixed Income Securities: Tools for Today's Markets, Third Edition (2012), by permission of John Wiley & Sons, Inc. All rights reserved. Used under license from John Wiley & Sons, Inc.

"Fundamental Review of the Trading Book," by John C. Hull, reprinted from Risk Management and Financial Institutions, Fifth Edition (2018), by permission of John Wiley & Sons, Inc. All rights reserved. Used under license from John Wiley & Sons, Inc.

Learning Objectives provided by the Global Association of Risk Professionals.

All trademarks, service marks, registered trademarks, and registered service marks are the property of their respective owners and are used herein for identification purposes only.

Pearson Education, Inc., 330 Hudson Street, New York, New York 10013
A Pearson Education Company
www.pearsoned.com
Printed in the United States of America


Pearson

ISBN 10: 0137686617 ISBN 13: 9780137686612

Chapter 1 Estimating Market Risk Measures
  1.1 Data
    Profit/Loss Data
    Loss/Profit Data
    Arithmetic Return Data
    Geometric Return Data
  1.2 Estimating Historical Simulation VaR
  1.3 Estimating Parametric VaR
    Estimating VaR with Normally Distributed Profits/Losses
    Estimating VaR with Normally Distributed Arithmetic Returns
    Estimating Lognormal VaR
  1.4 Estimating Coherent Risk Measures
    Estimating Expected Shortfall
    Estimating Coherent Risk Measures
  1.5 Estimating the Standard Errors of Risk Measure Estimators
    Standard Errors of Quantile Estimators
    Standard Errors in Estimators of Coherent Risk Measures
  1.6 The Core Issues: An Overview
  1.7 Appendix
    Preliminary Data Analysis
    Plotting the Data and Evaluating Summary Statistics
    QQ Plots

Chapter 2 Non-Parametric Approaches
  2.1 Compiling Historical Simulation Data
  2.2 Estimation of Historical Simulation VaR and ES
    Basic Historical Simulation
    Bootstrapped Historical Simulation
    Historical Simulation Using Non-parametric Density Estimation
    Estimating Curves and Surfaces for VaR and ES
  2.3 Estimating Confidence Intervals for Historical Simulation VaR and ES
    An Order Statistics Approach to the Estimation of Confidence Intervals for HS VaR and ES
    A Bootstrap Approach to the Estimation of Confidence Intervals for HS VaR and ES
  2.4 Weighted Historical Simulation
    Age-weighted Historical Simulation
    Volatility-weighted Historical Simulation
    Correlation-weighted Historical Simulation
    Filtered Historical Simulation
  2.5 Advantages and Disadvantages of Non-Parametric Methods
    Advantages
    Disadvantages
    Conclusions
  Appendix 1
    Estimating Risk Measures with Order Statistics
    Using Order Statistics to Estimate Confidence Intervals for VaR
    Conclusions
  Appendix 2
    The Bootstrap
    Limitations of Conventional Sampling Approaches
    The Bootstrap and Its Implementation
    Standard Errors of Bootstrap Estimators
    Time Dependency and the Bootstrap

Chapter 3 Parametric Approaches (II): Extreme Value
  3.1 Generalised Extreme-Value Theory
    Theory
    A Short-Cut EV Method
    Estimation of EV Parameters
  3.2 The Peaks-Over-Threshold Approach: The Generalised Pareto Distribution
    Theory
    Estimation
    GEV vs POT
  3.3 Refinements to EV Approaches
    Conditional EV
    Dealing with Dependent (or Non-iid) Data
    Multivariate EVT
  3.4 Conclusions

Chapter 4 Backtesting VaR
  4.1 Setup for Backtesting
    An Example
    Which Return?
  4.2 Model Backtesting with Exceptions
    Model Verification Based on Failure Rates
    The Basel Rules
    Conditional Coverage Models
    Extensions
  4.3 Applications
  4.4 Conclusions

Chapter 5 VaR Mapping
  5.1 Mapping for Risk Measurement
    Why Mapping?
    Mapping as a Solution to Data Problems
    The Mapping Process
    General and Specific Risk
  5.2 Mapping Fixed-Income Portfolios
    Mapping Approaches
    Stress Test
    Benchmarking
  5.3 Mapping Linear Derivatives
    Forward Contracts
    Commodity Forwards
    Forward Rate Agreements
    Interest-Rate Swaps
  5.4 Mapping Options
  5.5 Conclusions

Chapter 6 Messages from the Academic Literature on Risk Management for the Trading Book
  6.1 Introduction
  6.2 Selected Lessons on VaR Implementation
    Overview
    Time Horizon for Regulatory VaR
    Time-Varying Volatility in VaR
    Backtesting VaR Models
    Conclusions
  6.3 Incorporating Liquidity
    Overview
    Exogenous Liquidity
    Endogenous Liquidity: Motivation
    Endogenous Liquidity and Market Risk for Trading Portfolios
    Adjusting the VaR Time Horizon to Account for Liquidity Risk
    Conclusions
  6.4 Risk Measures
    Overview
    VaR
    Expected Shortfall
    Spectral Risk Measures
    Other Risk Measures
    Conclusions
  6.5 Stress Testing Practices for Market Risk
    Overview
    Incorporating Stress Testing into Market-Risk Modelling
    Stressed VaR
    Conclusions
  6.6 Unified Versus Compartmentalised Risk Measurement
    Overview
    Aggregation of Risk: Diversification versus Compounding Effects
    Papers Using the "Bottom-Up" Approach
    Papers Using the "Top-Down" Approach
    Conclusions
  6.7 Risk Management and Value-at-Risk in a Systemic Context
    Overview
    Intermediation, Leverage and Value-at-Risk: Empirical Evidence
    What Has All This to Do with VaR-Based Regulation?
    Conclusions
  Annex

Chapter 7 Correlation Basics: Definitions, Applications, and Terminology
  7.1 A Short History of Correlation
  7.2 What Are Financial Correlations?
  7.3 What Is Financial Correlation Risk?
  7.4 Motivation: Correlations and Correlation Risk Are Everywhere in Finance
    Investments and Correlation
  7.5 Trading and Correlation
    Risk Management and Correlation
    The Global Financial Crises 2007 to 2009 and Correlation
    Regulation and Correlation
  7.6 How Does Correlation Risk Fit into the Broader Picture of Risks in Finance?
    Correlation Risk and Market Risk
    Correlation Risk and Credit Risk
  7.7 Correlation Risk and Systemic Risk
  7.8 Correlation Risk and Concentration Risk
  7.9 A Word on Terminology
  Summary
  Appendix A1
    Dependence and Correlation
    Example A1: Statistical Independence
    Correlation
    Independence and Uncorrelatedness
  Appendix A2
    On Percentage and Logarithmic Changes
  Questions

Chapter 8 Empirical Properties of Correlation: How Do Correlations Behave in the Real World?
  8.1 How Do Equity Correlations Behave in a Recession, Normal Economic Period or Strong Expansion?
  8.2 Do Equity Correlations Exhibit Mean Reversion?
    How Can We Quantify Mean Reversion?
  8.3 Do Equity Correlations Exhibit Autocorrelation?
  8.4 How Are Equity Correlations Distributed?
  8.5 Is Equity Correlation Volatility an Indicator for Future Recessions?
  8.6 Properties of Bond Correlations and Default Probability Correlations
  Summary
  Questions

Chapter 9 Financial Correlation Modeling—Bottom-Up Approaches
  9.1 Copula Correlations
    The Gaussian Copula
    Simulating the Correlated Default Time for Multiple Assets

Chapter 10 Empirical Approaches to Risk Metrics and Hedging
  10.1 Single-Variable Regression-Based Hedging
    Least-Squares Regression Analysis
    The Regression Hedge
    The Stability of Regression Coefficients over Time
  10.2 Two-Variable Regression-Based Hedging
  10.3 Level Versus Change Regressions
  10.4 Principal Components Analysis
    Overview
    PCAs for USD Swap Rates
    Hedging with PCA and an Application to Butterfly Weights
    Principal Component Analysis of EUR, GBP, and JPY Swap Rates
    The Shape of PCs over Time
  Appendix A
    The Least-Squares Hedge Minimizes the Variance of the P&L of the Hedged Position
  Appendix B
    Constructing Principal Components from Three Rates

Chapter 11 The Science of Term Structure Models
  11.1 Rate and Price Trees
  11.2 Arbitrage Pricing of Derivatives
  11.3 Risk-Neutral Pricing
  11.4 Arbitrage Pricing in a Multi-Period Setting
  11.5 Example: Pricing a Constant-Maturity Treasury Swap
  11.6 Option-Adjusted Spread
  11.7 Profit and Loss Attribution with an OAS
  11.8 Reducing the Time Step
  11.9 Fixed Income Versus Equity Derivatives

Chapter 12 The Evolution of Short Rates and the Shape of the Term Structure
  12.1 Introduction
  12.2 Expectations
  12.3 Volatility and Convexity
  12.4 Risk Premium

Chapter 13 The Art of Term Structure Models: Drift
  13.1 Model 1: Normally Distributed Rates and No Drift
  13.2 Model 2: Drift and Risk Premium
  13.3 The Ho-Lee Model: Time-Dependent Drift
  13.4 Desirability of Fitting to the Term Structure
  13.5 The Vasicek Model: Mean Reversion

Chapter 14 The Art of Term Structure Models: Volatility and Distribution
  14.1 Time-Dependent Volatility: Model 3
  14.2 The Cox-Ingersoll-Ross and Lognormal Models: Volatility as a Function of the Short Rate
  14.3 Tree for the Original Salomon Brothers Model
  14.4 The Black-Karasinski Model: A Lognormal Model with Mean Reversion
  14.5 Appendix: Closed-Form Solutions for Spot Rates

Chapter 15 Volatility Smiles
  15.1 Why the Volatility Smile Is the Same for Calls and Puts
  15.2 Foreign Currency Options
    Empirical Results
    Reasons for the Smile in Foreign Currency Options
  15.3 Equity Options
    The Reason for the Smile in Equity Options
  15.4 Alternative Ways of Characterizing the Volatility Smile
  15.5 The Volatility Term Structure and Volatility Surfaces
  15.6 Minimum Variance Delta
  15.7 The Role of the Model
  15.8 When a Single Large Jump Is Anticipated
  Summary
  Appendix: Determining Implied Risk-Neutral Distributions from Volatility Smiles

Chapter 16 Fundamental Review of the Trading Book
  16.1 Background
  16.2 Standardized Approach
    Term Structures
    Curvature Risk Charge
    Default Risk Charge
    Residual Risk Add-On
    A Simplified Approach
  16.3 Internal Models Approach
    Back-Testing
    Profit and Loss Attribution
    Credit Risk
    Securitizations
  16.4 Trading Book vs. Banking Book
  Summary

Index

Preface

On behalf of our Board of Trustees, GARP's staff, and particularly its certification and educational programs teams, I would like to thank you for your interest in and support of our Financial Risk Manager (FRM®) program.

The past couple of years have been difficult due to COVID-19. And in that regard, our sincere sympathies go out to anyone who was ill or suffered a loss due to the pandemic.

The FRM program also experienced many virus-related challenges. Because we always place candidate safety first, we cancelled the May 2020 FRM exam offering and deferred all candidates to October, while reserving an optional date in January 2021 for candidates not able to sit for the examination in October. A change like this has never happened before. Ultimately, we were able to offer the FRM exam to all 2020 registered candidates who wanted to sit for it during the year and were not constrained by COVID-related restrictions, which was most of our registrants.

Since its inception in 1997, the FRM program has been the global industry benchmark for risk-management professionals wanting to demonstrate objectively their knowledge of financial risk-management concepts and approaches. Having FRM holders on staff also tells companies that their risk-management professionals have achieved a demonstrated and globally adopted level of expertise.

In a world where risks are becoming more complex daily due to any number of technological, capital, governmental, geopolitical, or other factors, companies know that FRM holders possess the skills to understand and adapt to a dynamic and rapidly changing financial environment.

The FRM program addresses the financial risks faced by both non-financial firms and those in the highly interconnected and sophisticated financial services industry, because its coverage is not static, but vibrant and forward looking.

The FRM curriculum is regularly reviewed by an oversight committee of highly qualified and experienced risk-management professionals from around the globe. These professionals consist of senior bank and consulting practitioners, government regulators, asset managers, insurance risk professionals, and academics. Their mission is to ensure the FRM program remains current and its content addresses not only standard credit and market risk issues, but also emerging issues and trends, ensuring FRM candidates are aware of what is or is expected to be important in the near future. We're committed to offering a program that reflects the dynamic and sophisticated nature of the risk-management profession and those who are making it a career.

We wish you the very best as you study for the FRM exams, and in your career as a risk-management professional.

Yours truly,

Richard Apostolik
President & CEO

FRM® Committee

Chairperson
Michelle McCarthy Beck, Former GARP Board Member

Members
Richard Apostolik, President and CEO, Global Association of Risk Professionals
Richard Brandt, MD Operational Risk Management, Citigroup
Julian Chen, FRM, SVP, FRM Program Manager, Global Association of Risk Professionals
Dr. Christopher Donohue, MD, GARP Benchmarking Initiative, Global Association of Risk Professionals
Donald Edgar, FRM, MD Risk & Quantitative Analysis, BlackRock
Herve Geny, Former Group Head of Internal Audit, London Stock Exchange Group and former CRO, ICAP
Keith Isaac, FRM, VP Capital Markets Risk Management, TD Bank Group
William May, SVP Global Head of Certifications and Educational Programs, Global Association of Risk Professionals
Dr. Attilio Meucci, CFA, Founder, ARPM
Dr. Victor Ng, MD Head of Risk Architecture, Goldman Sachs
Dr. Matthew Pritsker, Senior Financial Economist and Policy Advisor / Supervision, Regulation, and Credit, Federal Reserve Bank of Boston
Dr. Samantha Roberts, FRM, SVP Balance Sheet Analytics & Modeling, PNC Bank
Dr. Til Schuermann, Partner, Oliver Wyman
Nick Strange, Senior Technical Advisor, Operational Risk & Resilience, Supervisory Risk Specialists, Prudential Regulation Authority, Bank of England
Dr. Sverrir Porvaldsson, FRM, Senior Quant, SEB

Chapter 1
Estimating Market Risk Measures: An Introduction and Overview

Learning Objectives

After completing this reading you should be able to:

• Estimate VaR using a historical simulation approach.
• Estimate VaR using a parametric approach for both normal and lognormal return distributions.
• Estimate the expected shortfall given profit and loss (P/L) or return data.
• Describe coherent risk measures.
• Estimate risk measures by estimating quantiles.
• Evaluate estimators of risk measures by estimating their standard errors.
• Interpret quantile-quantile (QQ) plots to identify the characteristics of a distribution.

Excerpt is Chapter 3 of Measuring Market Risk, Second Edition, by Kevin Dowd.

This chapter provides a brief introduction and overview of the main issues in market risk measurement. Our main concerns are:

• Preliminary data issues: How to deal with data in profit/loss form, rate-of-return form, and so on.
• Basic methods of VaR estimation: How to estimate simple VaRs, and how VaR estimation depends on assumptions about data distributions.
• How to estimate coherent risk measures.
• How to gauge the precision of our risk measure estimators by estimating their standard errors.
• Overview: An overview of the different approaches to market risk measurement, and of how they fit together.

We begin with the data issues.

1.1 DATA

Profit/Loss Data

Our data can come in various forms. Perhaps the simplest is in terms of profit/loss (or P/L). The P/L generated by an asset (or portfolio) over the period t, P/L_t, can be defined as the value of the asset (or portfolio) at the end of t plus any interim payments D_t minus the asset value at the end of t - 1:

P/L_t = P_t + D_t - P_{t-1}    (1.1)

If data are in P/L form, positive values indicate profits and negative values indicate losses.

If we wish to be strictly correct, we should evaluate all payments from the same point of time (i.e., we should take account of the time value of money). We can do so in one of two ways. The first way is to take the present value of P/L_t evaluated at the end of the previous period, t - 1:

Present Value (P/L_t) = (P_t + D_t)/(1 + d) - P_{t-1}    (1.2)

where d is the discount rate and we assume for convenience that D_t is paid at the end of t. The alternative is to take the forward value of P/L_t evaluated at the end of period t:

Forward Value (P/L_t) = P_t + D_t - (1 + d)P_{t-1}    (1.3)

which involves compounding P_{t-1} by d. The differences between these values depend on the discount rate d, and will be small if the periods themselves are short. We will ignore these differences to simplify the discussion, but they can make a difference in practice when dealing with longer periods.

Loss/Profit Data

When estimating VaR and ES, it is sometimes more convenient to deal with data in loss/profit (L/P) form. L/P data are a simple transformation of P/L data:

L/P_t = -P/L_t    (1.4)

L/P observations assign a positive value to losses and a negative value to profits, and we will call these L/P data 'losses' for short. Dealing with losses is sometimes a little more convenient for risk measurement purposes because the risk measures are themselves denominated in loss terms.

Arithmetic Return Data

Data can also come in the form of arithmetic (or simple) returns. The arithmetic return r_t is defined as:

r_t = (P_t + D_t)/P_{t-1} - 1    (1.5)

which is the same as the P/L over period t divided by the value of the asset at the end of t - 1.

In using arithmetic returns, we implicitly assume that the interim payment D_t does not earn any return of its own. However, this assumption will seldom be appropriate over long periods because interim income is usually reinvested. Hence, arithmetic returns should not be used when we are concerned with long horizons.

Geometric Return Data

Returns can also be expressed in geometric (or compound) form. The geometric return R_t is

R_t = ln[(P_t + D_t)/P_{t-1}]    (1.6)

The geometric return implicitly assumes that interim payments are continuously reinvested. The geometric return is often more economically meaningful than the arithmetic return, because it ensures that the asset price (or portfolio value) can never become negative regardless of how negative the returns might be. With arithmetic returns, on the other hand, a very low realized return, or a high loss, implies that the asset value P_t can become negative, and a negative asset price seldom makes economic sense.1

1 This is mainly a point of principle rather than practice. In practice, any distribution we fit to returns is only likely to be an approximation, and many distributions are ill-suited to extreme returns anyway.

The geometric return is also more convenient. For example, if we are dealing with foreign currency positions, geometric returns will give us results that are independent of the reference currency. Similarly, if we are dealing with multiple periods, the geometric return over those periods is the sum of the one-period geometric returns. Arithmetic returns have neither of these convenient properties.

The relationship of the two types of return can be seen by rewriting Equation (1.6) (using a Taylor's series expansion for the natural log) as:

R_t = ln(1 + r_t) ≈ r_t    (1.7)

from which we can see that R_t ≈ r_t provided that returns are 'small'. This conclusion is illustrated by Figure 1.1, which plots the geometric return R_t against its arithmetic counterpart r_t. The difference between the two returns is negligible when both returns are small, but the difference grows as the returns get bigger, which is to be expected, as the geometric return is a log function of the arithmetic return. Since we would expect returns to be low over short periods and higher over longer periods, the difference between the two types of return is negligible over short periods but potentially substantial over longer ones. And since the geometric return takes account of earnings on interim income, and the arithmetic return does not, we should always use the geometric return if we are dealing with returns over longer periods.

[Figure 1.1 Geometric and arithmetic returns.]

Example 1.1 Arithmetic and Geometric Returns

If arithmetic returns r_t over some period are 0.05, Equation (1.7) tells us that the corresponding geometric returns are R_t = ln(1 + r_t) = ln(1.05) = 0.0488. Similarly, if geometric returns R_t are 0.05, Equation (1.7) implies that arithmetic returns are 1 + r_t = exp(R_t), so r_t = exp(R_t) - 1 = exp(0.05) - 1 = 0.0513. In both cases the arithmetic return is close to, but a little higher than, the geometric return, and this makes intuitive sense when one considers that the geometric return compounds at a faster rate.

1.2 ESTIMATING HISTORICAL SIMULATION VaR

The simplest way to estimate VaR is by means of historical simulation (HS). The HS approach estimates VaR by means of ordered loss observations.

Suppose we have 1000 loss observations and are interested in the VaR at the 95% confidence level. Since the confidence level implies a 5% tail, we know that there are 50 observations in the tail, and we can take the VaR to be the 51st highest loss observation.2

2 In theory, the VaR is the quantile that demarcates the tail region from the non-tail region, where the size of the tail is determined by the confidence level, but with finite samples there is a certain level of arbitrariness in how the ordered observations relate to the VaR itself: do we take the VaR to be the 50th observation, the 51st observation, or some combination of them? However, this is just an issue of approximation, and taking the VaR to be the 51st highest observation is not unreasonable.

We can estimate the VaR on a spreadsheet by ordering our data and reading off the 51st largest observation from the spreadsheet. We can also estimate it more directly by using the 'Large' command in Excel, which gives us the kth largest value in an array. Thus, if our data are an array called 'Loss_data', then our VaR is given by the Excel command 'Large(Loss_data,51)'. If we are using MATLAB, we first order the loss/profit data using the 'Sort()' command (i.e., by typing 'Loss_data = Sort(Loss_data)') and then derive the VaR by typing in 'Loss_data(51)' at the command line. More generally, if we have n observations, and our confidence level is α, we would want the (1 - α)n + 1th highest observation, and we would use the commands 'Large(Loss_data,(1 - alpha)*n + 1)' using Excel, or 'Loss_data((1 - alpha)*n + 1)' using MATLAB, provided in the latter case that our 'Loss_data' array is already sorted into ordered observations.3

3 We can also estimate HS VaR using percentile functions such as the 'Percentile' function in Excel or the 'prctile' function in MATLAB. However, such functions are less transparent (i.e., it is not obvious to the reader how the percentiles are calculated), and the Excel percentile function can be unreliable.
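The same calculation is straightforward to script. The following is a minimal Python sketch of the ordered-observation approach described above; the array name loss_data, the NumPy dependency, and the simulated data are illustrative assumptions rather than part of the original text:

```python
import numpy as np

def hs_var(loss_data, alpha=0.95):
    """Historical simulation VaR: the ((1 - alpha) * n + 1)-th highest loss."""
    losses = np.sort(np.asarray(loss_data))[::-1]    # order losses, largest first
    k = int((1 - alpha) * len(losses)) + 1           # n = 1000, alpha = 0.95 gives k = 51
    return losses[k - 1]

# Example: 1000 hypothetical losses drawn from a standard normal L/P distribution
rng = np.random.default_rng(42)
print(hs_var(rng.standard_normal(1000), alpha=0.95))   # close to the 1.645 quantile
```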

An example of an HS VaR is given in Figure 1.2. This figure shows the histogram of 1000 hypothetical loss observations and the 95% VaR. The figure is generated using the 'hsvarfigure' command in the MMR Toolbox. The VaR is 1.704 and separates the top 5% from the bottom 95% of loss observations.

[Figure 1.2 Historical simulation VaR. Note: Based on 1000 random numbers drawn from a standard normal L/P distribution, and estimated with the 'hsvarfigure' function.]

In practice, it is often helpful to obtain HS VaR estimates from a cumulative histogram, or empirical cumulative frequency function. This is a plot of the ordered loss observations against their empirical cumulative frequency (e.g., so if there are n observations in total, the empirical cumulative frequency of the ith such ordered observation is i/n). The empirical cumulative frequency function of our earlier data set is shown in Figure 1.3. The empirical frequency function makes it very easy to obtain the VaR: we simply move up the cumulative frequency axis to where the cumulative frequency equals our confidence level, draw a horizontal line along to the curve, and then draw a vertical line down to the x-axis, which gives us our VaR.

[Figure 1.3 Historical simulation via an empirical cumulative frequency function. Note: Based on the same data as Figure 1.2.]

1.3 ESTIMATING PARAMETRIC VaR

We can also estimate VaR using parametric approaches, the distinguishing feature of which is that they require us to explicitly specify the statistical distribution from which our data observations are drawn. We can also think of parametric approaches as fitting curves through the data and then reading off the VaR from the fitted curve. In making use of a parametric approach, we therefore need to take account of both the statistical distribution and the type of data to which it applies.

Estimating VaR with Normally Distributed Profits/Losses

Suppose that we wish to estimate VaR under the assumption that P/L is normally distributed. In this case our VaR at the confidence level α is:

αVaR = -μ_P/L + σ_P/L z_α    (1.8)

where z_α is the standard normal variate corresponding to α, and μ_P/L and σ_P/L are the mean and standard deviation of P/L. Thus, z_α is the value of the standard normal variate such that α of the probability density mass lies to its left, and 1 - α of the probability density mass lies to its right. For example, if our confidence level is 95%, z_α = z_0.95 will be 1.645.

In practice, μ_P/L and σ_P/L would be unknown, and we would have to estimate VaR based on estimates of these parameters. Our VaR estimate, αVaR^e, would then be:

αVaR^e = -m_P/L + s_P/L z_α    (1.9)

where m_P/L and s_P/L are estimates of the mean and standard deviation of P/L.

Figure 1.4 shows the 95% VaR for a normally distributed P/L with mean 0 and standard deviation 1. Since the data are in P/L form, the VaR is indicated by the negative of the cut-off point between the lower 5% and the upper 95% of P/L observations. The actual VaR is the negative of -1.645, and is therefore 1.645.

[Figure 1.4 VaR with standard normally distributed profit/loss data. Note: Obtained from Equation (1.9) with μ_P/L = 0 and σ_P/L = 1. Estimated with the 'normalvarfigure' function.]

If we are working with normally distributed L/P data, then μ_L/P = -μ_P/L and σ_L/P = σ_P/L, and it immediately follows that:

αVaR = μ_L/P + σ_L/P z_α    (1.10a)
αVaR^e = m_L/P + s_L/P z_α    (1.10b)

Figure 1.5 illustrates the corresponding VaR. This figure gives the same information as Figure 1.4, but is a little more straightforward to interpret because the VaR is defined in units of losses (or 'lost money') rather than P/L. In this case, the VaR is given by the point on the x-axis that cuts off the top 5% of the pdf mass from the bottom 95% of pdf mass. If we prefer to work with the cumulative density function, the VaR is the x-value that corresponds to a cdf value of 95%. Either way, the VaR is again 1.645, as we would (hopefully) expect.

[Figure 1.5 VaR with normally distributed loss/profit data. Note: Obtained from Equation (1.10a) with μ_L/P = 0 and σ_L/P = 1.]

Example 1.2 VaR with Normal P/L

If P/L over some period is normally distributed with mean 10 and standard deviation 20, then (by Equation (1.8)) the 95% VaR is -10 + 20 z_0.95 = -10 + 20 × 1.645 = 22.9. The corresponding 99% VaR is -10 + 20 z_0.99 = -10 + 20 × 2.326 = 36.52.
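A short Python sketch of Equations (1.8) and (1.9), reproducing the numbers in Example 1.2; the function name is an illustrative assumption:

```python
from scipy.stats import norm

def normal_var(mu_pl, sigma_pl, alpha=0.95):
    """Parametric VaR for normally distributed P/L (Equation 1.8): -mu + sigma * z_alpha."""
    return -mu_pl + sigma_pl * norm.ppf(alpha)

print(round(normal_var(10, 20, alpha=0.95), 2))   # 22.90, as in Example 1.2
print(round(normal_var(10, 20, alpha=0.99), 2))   # 36.53 (the text rounds z_0.99 to 2.326, giving 36.52)
```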

Estimating VaR with Normally Distributed Arithmetic Returns

We can also estimate VaR making assumptions about returns rather than P/L. Suppose then that we assume that arithmetic returns are normally distributed with mean μ_r and standard deviation σ_r. To derive the VaR, we begin by obtaining the critical value of r_t, r*, such that the probability that r_t exceeds r* is equal to our confidence level α. r* is therefore:

r* = μ_r - σ_r z_α    (1.11)

Since the actual return r_t is the loss/profit divided by the earlier asset value, P_{t-1}, it follows that:

r_t = (P_t + D_t - P_{t-1})/P_{t-1} = -L_t/P_{t-1}    (1.12)

Substituting r* for r_t then gives us the relationship between r* and the VaR:

r* = -αVaR/P_{t-1}    (1.13)

Substituting Equation (1.11) into Equation (1.13) and rearranging then gives us the VaR itself:

αVaR = -(μ_r - σ_r z_α)P_{t-1}    (1.14)

Equation (1.14) will give us equivalent answers to our earlier VaR equations. For example, if we set α = 0.95, μ_r = 0, σ_r = 1 and P_{t-1} = 1, which correspond to our earlier illustrative P/L and L/P parameter assumptions, αVaR is 1.645: the three approaches give the same results, because all three sets of underlying assumptions are equivalent.

Example 1.3 VaR with Normally Distributed Arithmetic Returns

Suppose arithmetic returns r_t over some period are distributed as normal with mean 0.1 and standard deviation 0.25, and we have a portfolio currently worth 1. Then (by Equation (1.14)) the 95% VaR is -0.1 + 0.25 × 1.645 = 0.311, and the 99% VaR is -0.1 + 0.25 × 2.326 = 0.482.

Estimating Lognormal VaR

BOX 1.1 THE LOGNORMAL DISTRIBUTION

A random variate X is said to be lognormally distributed if the natural log of X is normally distributed. The lognormal distribution can be specified in terms of the mean and standard deviation of ln X. Call these parameters μ and σ. The lognormal is often also represented in terms of m and σ, where m is the median of X, and m = exp(μ). The pdf of X can be written:

f(x) = [1/(xσ√(2π))] exp{-(1/2)[(ln(x) - μ)/σ]²}

for x > 0. Thus, the lognormal pdf is only defined for positive values of x and is skewed to the right as in Figure 1.6. Let ω = exp(σ²) for convenience. The mean and variance of the lognormal can be written as: mean = m exp(σ²/2) and variance = m²ω(ω - 1). The lognormal X is non-negative, and its distribution is skewed with a long right-hand tail.

Since the VaR is a loss, and since the loss is the difference between P_t (which is random) and P_{t-1} (which we can take here as given), the VaR itself has the same distribution as P_t. Normally distributed geometric returns imply that the VaR is lognormally distributed.

If we proceed as we did earlier with the arithmetic return, we begin by deriving the critical value of R, R*, such that the probability that R > R* is α:

R* = μ_R - σ_R z_α    (1.16)

ln P* = R* + ln P_{t-1}
=> P* = P_{t-1} exp[R*] = P_{t-1} exp[μ_R - σ_R z_α]
=> αVaR = P_{t-1} - P* = P_{t-1}(1 - exp[μ_R - σ_R z_α])

This gives us the lognormal VaR, which is consistent with normally distributed geometric returns. The lognormal VaR is illustrated in Figure 1.7, based on the standardised (but typically unrealistic) assumptions that μ_R = 0, σ_R = 1, and P_{t-1} = 1.
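A minimal Python sketch of the lognormal VaR expression derived above, evaluated under the standardised assumptions used for Figure 1.7; the function name is an illustrative assumption:

```python
import math
from scipy.stats import norm

def lognormal_var(mu_R, sigma_R, P_prev, alpha=0.95):
    """Lognormal VaR: P_{t-1} * (1 - exp(mu_R - sigma_R * z_alpha))."""
    return P_prev * (1.0 - math.exp(mu_R - sigma_R * norm.ppf(alpha)))

# Standardised (but typically unrealistic) assumptions: mu_R = 0, sigma_R = 1, P_{t-1} = 1
print(round(lognormal_var(0.0, 1.0, 1.0, alpha=0.95), 3))   # about 0.807, i.e. 1 - exp(-1.645)
```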

1.4 ESTIMATING COHERENT RISK MEASURES

Estimating Expected Shortfall

In practice, we would use a high value of n and carry out the calculations on a spreadsheet or using appropriate software. However, to show the procedure manually, Table 1.1 reports the tail VaRs of a standard normal loss distribution; averaging these tail VaRs gives the estimate of the expected shortfall, which is 2.0250. The true value is 2.0630, and the estimate converges towards it as the number of tail slices n is increased.

[Table 1.1 Tail VaRs. Note: VaRs estimated assuming the mean and standard deviation of losses are 0 and 1.]
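A minimal Python sketch of the 'average tail VaR' procedure sketched above for standard normal losses. The particular discretisation (n equal-probability tail slices, averaging the n - 1 interior tail VaRs) is an assumption, chosen because it reproduces the 2.0250 figure quoted for n = 10 and converges towards the true value of about 2.063:

```python
import numpy as np
from scipy.stats import norm

def average_tail_var_es(alpha=0.95, n=10, mu=0.0, sigma=1.0):
    """Estimate ES as the average of VaRs taken at evenly spaced tail confidence levels."""
    tail_levels = alpha + (1 - alpha) * np.arange(1, n) / n    # n - 1 levels inside the tail
    tail_vars = -mu + sigma * norm.ppf(tail_levels)            # Equation (1.8) at each level
    return tail_vars.mean()

print(round(average_tail_var_es(n=10), 4))     # 2.0250 with 10 slices
print(round(average_tail_var_es(n=5000), 4))   # approaches the true value of about 2.063
```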

Estimating Coherent Risk Measures

The more general coherent risk measure, M_φ, involves a potentially more sophisticated weighting function φ(p):

M_φ = ∫₀¹ φ(p) q_p dp    (1.17)

where the weighting function φ(p) is specified by the user and q_p is the p loss quantile. The ES gives all tail-loss quantiles an equal weight, and other quantiles a weight of 0. Thus the ES is a special case of M_φ obtained by setting φ(p) to equal 1/(1 - α) for tail-loss quantiles and 0 otherwise. We can therefore estimate any of these measures by replacing the equal weights in the 'average VaR' algorithm with the φ(p) weights appropriate to the risk measure being estimated.

To show how this might be done, suppose we have the exponential weighting function:

φ(p) = e^(-(1-p)/γ) / [γ(1 - e^(-1/γ))]    (1.19)

and we believe that we can represent the degree of our risk-aversion by setting γ = 0.05. To illustrate the procedure manually, we continue to assume that losses are standard normally distributed and we set n = 10 (i.e., we divide the complete losses density function into 10 equal-probability slices). With n = 10, we have n - 1 = 9 loss quantiles or VaRs spanning confidence levels from 0.1 to 0.90. These VaRs are shown in the second column of Table 1.3, and vary from -1.2816 (for the 10% VaR) to 1.2816 (for the 90% VaR). The third column shows the φ(p) weights corresponding to each confidence level, and the fourth column shows the products of each VaR and corresponding weight. Our estimated exponential spectral risk measure is the φ(p)-weighted average of the VaRs, and is therefore equal to 0.4228.

[Table 1.3 Estimating Exponential Spectral Risk Measure as a Weighted Average of VaRs, with columns: Confidence Level (α), αVaR, Weight φ(α), and their product. Note: The weights φ(α) are given by the exponential function (Equation (1.19)) with γ = 0.05.]

As when estimating the ES earlier, when using this type of routine in practice we would want a value of n large enough to give accurate results. Table 1.4 reports some alternative estimates obtained using this procedure with increasing values of n. These results show that the estimated risk measure rises with n, and gradually converges to a value in the region of about 1.854. The estimates in this table indicate that we may need a considerably larger value of n than we did earlier to get results of the same level of accuracy. Even so, a good computer should still be able to produce accurate estimates of spectral risk measures fairly quickly.

[Table 1.4 Estimates of Exponential Spectral Coherent Risk Measure as a Function of the Number of Tail Slices; with 10 tail slices, for example, the estimate is 0.4227. Note: The weights φ(α) are given by the exponential function (Equation (1.19)) with γ = 0.05.]

When estimating ES or more general coherent risk measures in practice, it also helps to have some guidance on how to choose the value of n. Granted that the estimate does eventually converge to the true value as n gets large, one useful approach is to start with some small value of n, and then double n repeatedly until we feel the estimates have settled down sufficiently. Each time we do so, we halve the width of the discrete slices, and we can monitor how this 'halving' process affects our estimates. This suggests that we look at the 'halving error' s_n given by:

s_n = M(n) - M(n/2)    (1.20)

where M(n) is our estimated risk measure based on n slices. We stop doubling n when s_n falls below some tolerance level that indicates an acceptable level of accuracy. The process is shown in Table 1.5. Starting from an arbitrary value of 100, we repeatedly double n (so it becomes 200, 400, 800, etc.). As we do so, the estimated risk measure gradually converges, and the halving error gradually falls. So, for example, for n = 6400, the estimated risk measure is 1.8477, and the halving error is 0.0055. If we double n to 12,800, the estimated risk measure becomes 1.8506, and the halving error falls to 0.0029, and so on.

Table 1.5 Estimated Risk Measures and Halving Errors

Number of Tail Slices    Estimated Spectral Risk Measure    Halving Error
100                      1.5853                             0.2114
200                      1.7074                             0.1221
400                      1.7751                             0.0678
800                      1.8120                             0.0368
1600                     1.8317                             0.0197
3200                     1.8422                             0.0105
6400                     1.8477                             0.0055
12 800                   1.8506                             0.0029
25 600                   1.8521                             0.0015
51 200                   1.8529                             0.0008

Note: VaRs estimated assuming the mean and standard deviation of losses are 0 and 1, using the 'normalvar' function in the MMR Toolbox. The weights φ(α) are given by the exponential function (Equation (1.19)) with γ = 0.05.

However, this 'weighted average quantile' procedure is rather crude, and (bearing in mind that the risk measure (Equation (1.17)) involves an integral) we can in principle expect to get substantial improvements in accuracy if we resorted to more 'respectable' numerical integration or quadrature methods. This said, the crude 'weighted average quantile' method actually seems to perform well for spectral exponential risk measures when compared against some of these alternatives, so one is not necessarily better off with the more sophisticated methods.5

5 There is an interesting reason for this: the spectral weights give the highest loss the highest weight, whereas quadrature methods such as the trapezoidal and Simpson's rules involve algorithms in which the two most extreme quantiles have their weights specifically cut, and this undermines the accuracy of the algorithm relative to the crude approach. However, there are ways round these sorts of problems, and in principle versions of the sophisticated approaches should give better results.

Thus, the key to estimating any coherent risk measure is to be able to estimate quantiles or VaRs: the coherent risk measures can then be obtained as appropriately weighted averages of quantiles. From a practical point of view, this is extremely helpful as all the building blocks that go into quantile or VaR estimation (databases, calculation routines, etc.) are exactly what we need for the estimation of coherent risk measures as well. If an institution already has a VaR engine, then that engine needs only small adjustments to produce estimates of coherent risk measures: indeed, in many cases, all that needs changing is the last few lines of code in a long data processing system. The costs of switching from VaR to more sophisticated risk measures are therefore very low.
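A minimal Python sketch of the 'weighted average quantile' procedure for the exponential spectral risk measure, together with the halving-error stopping rule of Equation (1.20). The discretisation (n - 1 interior quantiles, weighted by φ and averaged) is an assumption; under standard normal losses it reproduces the 0.4228 figure quoted for n = 10:

```python
import numpy as np
from scipy.stats import norm

def exp_weights(p, gamma=0.05):
    """Exponential spectral weighting function, Equation (1.19)."""
    return np.exp(-(1 - p) / gamma) / (gamma * (1 - np.exp(-1 / gamma)))

def spectral_measure(n, gamma=0.05, mu=0.0, sigma=1.0):
    """Estimate M_phi as the phi(p)-weighted average of n - 1 loss quantiles."""
    p = np.arange(1, n) / n                     # confidence levels 1/n, ..., (n-1)/n
    q = mu + sigma * norm.ppf(p)                # loss quantiles (VaRs)
    return np.mean(exp_weights(p, gamma) * q)

print(round(spectral_measure(10), 4))           # 0.4228 with n = 10

# Double n until the halving error s_n = M(n) - M(n/2) falls below a tolerance level
n, tol = 100, 0.001
while True:
    s_n = spectral_measure(2 * n) - spectral_measure(n)
    print(2 * n, round(spectral_measure(2 * n), 4), round(s_n, 4))
    n *= 2
    if abs(s_n) < tol:
        break
```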

1.5 ESTIMATING THE STANDARD ERRORS OF RISK MEASURE ESTIMATORS

We should always bear in mind that any risk measure estimates that we produce are just that: estimates. We never know the true value of any risk measure, and an estimate is only as good as its precision: if a risk measure is very imprecisely estimated, then the estimator is virtually worthless, because its imprecision tells us that the true value could be almost anything; on the other hand, if we know that an estimator is fairly precise, we can be confident that the true value is fairly close to the estimate, and the estimator has some value. Hence, we should always seek to supplement any risk estimates we produce with some indicator of their precision. This is a fundamental principle of good risk measurement practice.

We can evaluate the precision of estimators of risk measures by means of their standard errors, or (generally better) by producing confidence intervals for them. In this chapter we focus on the more basic indicator, the standard error of a risk measure estimator.

Standard Errors of Quantile Estimators

We first consider the standard errors of quantile (or VaR) estimators.

Following Kendall and Stuart,6 suppose we have a distribution (or cumulative density) function F(x), which might be a parametric distribution function or an empirical distribution function (i.e., a cumulative histogram) estimated from real data. Its corresponding density or relative-frequency function is f(x). Suppose also that we have a sample of size n, and we select a bin width h. Let dF be the probability that (k - 1) observations fall below some value q - h/2, that one observation falls in the range q ± h/2, and that (n - k) observations are greater than q + h/2. dF is proportional to

{F(q)}^(k-1) f(q) dq {1 - F(q)}^(n-k)    (1.21)

6 Kendall and Stuart (1972), pp. 251-252.

This gives us the frequency function for the quantile q not exceeded by a proportion k/n of our sample, i.e., the 100(k/n)th percentile. If this proportion is p, Kendall and Stuart show that Equation (1.21) is approximately equal to p^(np)(1 - p)^(n(1-p)) for large values of n. If ε is a very small increment to p, then

p^(np)(1 - p)^(n(1-p)) ≈ (p + ε)^(np)(1 - p - ε)^(n(1-p))    (1.22)

Taking logs and expanding, Equation (1.22) is itself approximately

(p + ε)^(np)(1 - p - ε)^(n(1-p)) ≈ p^(np)(1 - p)^(n(1-p)) exp[-nε²/(2p(1 - p))]    (1.23)

which implies that the distribution function dF is approximately proportional to

exp[-nε²/(2p(1 - p))]    (1.24)

Integrating this out,

dF = [1/√(2π p(1 - p)/n)] exp[-nε²/(2p(1 - p))] dε    (1.25)

which tells us that ε is normally distributed in the limit with variance p(1 - p)/n. However, we know that dε = dF(q) = f(q)dq, so the variance of q is

var(q) = p(1 - p)/(n[f(q)]²)    (1.26)

This gives us an approximate expression for the variance, and hence its square root, the standard error, of a quantile estimator q. This expression shows that the quantile standard error depends on p, the sample size n and the pdf value f(q). The way in which the (normal) quantile standard errors depend on these parameters is apparent from Figure 1.9. This shows that:

• The standard error falls as the sample size n rises.
• The standard error rises as the probabilities become more extreme and we move further into the tail; hence, the more extreme the quantile, the less precise its estimator.

[Figure 1.9. Note: Based on random samples of size n drawn from a standard normal distribution. The bin width h is set to 0.1.]

In addition, the quantile standard error depends on the probability density function f(.), so the choice of density function can make a difference to our estimates, and also on the bin width h, which is essentially arbitrary.

The standard error can be used to construct confidence intervals around our quantile estimates in the usual textbook way. For example, a 90% confidence interval for a quantile q is given by

[q - 1.645 √(p(1 - p)/n)/f(q),  q + 1.645 √(p(1 - p)/n)/f(q)]    (1.27)

Example 1.6 Obtaining VaR Confidence Intervals Using Quantile Standard Errors

Suppose we wish to estimate the 90% confidence interval for a 95% VaR estimated on a sample of size n = 1000 to be drawn from a standard normal distribution, based on an assumed bin width h = 0.1. We know that the 95% VaR of a standard normal is 1.645. We take this to be q in Equation (1.27), and we know that q falls in the bin spanning 1.645 ± 0.1/2 = [1.595, 1.695]. The probability of a loss exceeding 1.695 is 0.045, and this is also equal to p; the probability of a profit or a loss less than 1.595 is 0.9446. Hence f(q), the probability mass in the q range, is 1 - 0.0450 - 0.9446 = 0.0104.

We now plug the relevant values into Equation (1.27) to obtain the 90% confidence interval for the VaR:

[1.645 - 1.645 √(0.045(1 - 0.045)/1000)/0.0104,  1.645 + 1.645 √(0.045(1 - 0.045)/1000)/0.0104] = [0.6081, 2.6819]

This is a wide confidence interval, especially when compared to the OS and bootstrap confidence intervals. The confidence interval narrows if we take a wider bin width, so suppose that we now repeat the exercise using a bin width h = 0.2, which is probably as wide as we can reasonably go with these data. q now falls into the range 1.645 ± 0.2/2 = [1.545, 1.745]. p, the probability of a loss exceeding 1.745, is 0.0405, and the probability of a profit or a loss less than 1.545 is 0.9388. Hence f(q) = 1 - 0.0405 - 0.9388 = 0.0207. Plugging these values into Equation (1.27) now gives us a new estimate of the 90% confidence interval:

[1.645 - 1.645 √(0.0405(1 - 0.0405)/1000)/0.0207,  1.645 + 1.645 √(0.0405(1 - 0.0405)/1000)/0.0207] = [1.1496, 2.1404]

This is still a rather wide confidence interval.

This example illustrates that although we can use quantile standard errors to estimate VaR confidence intervals, the intervals can be wide and also sensitive to the arbitrary choice of bin width. The quantile-standard-error approach is easy to implement and has some plausibility with large sample sizes. However, it also has weaknesses relative to other methods of assessing the precision of quantile (or VaR) estimators: it relies on asymptotic theory and requires large sample sizes; it can produce imprecise estimators, or wide confidence intervals; it depends on the arbitrary choice of bin width; and the symmetric confidence intervals it produces are misleading for extreme quantiles whose 'true' confidence intervals are asymmetric, reflecting the increasing sparsity of extreme observations as we move further out into the tail.
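A minimal Python sketch of Equations (1.26) and (1.27), reproducing the second calculation in Example 1.6 (bin width h = 0.2); the bin-based estimate of f(q) follows the example, and the function name is illustrative:

```python
import math
from scipy.stats import norm

def quantile_confidence_interval(q, p, f_q, n, z=1.645):
    """90% confidence interval for a quantile estimator q, Equation (1.27)."""
    se = math.sqrt(p * (1 - p) / n) / f_q      # standard error from Equation (1.26)
    return q - z * se, q + z * se

q, h, n = 1.645, 0.2, 1000                     # 95% VaR of a standard normal, bin width 0.2
p = 1 - norm.cdf(q + h / 2)                    # P(loss > 1.745), about 0.0405
f_q = norm.cdf(q + h / 2) - norm.cdf(q - h / 2)   # probability mass in the bin, about 0.0207
lo, hi = quantile_confidence_interval(q, p, f_q, n)
print(round(lo, 4), round(hi, 4))              # about 1.15 and 2.14, matching the text
```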

Standard Errors in Estimators of Coherent Risk Measures

We now consider standard errors in estimators of coherent risk measures. One of the first studies to examine this issue (Yamai and Yoshiba (2001b)) did so by investigating the relative accuracy of VaR and ES estimators for Levy distributions with varying α stability parameters. Their results suggested that VaR and ES estimators had comparable standard errors for near-normal Levy distributions, but the ES estimators had much bigger standard errors for particularly heavy-tailed distributions. They explained this finding by saying that as tails became heavier, ES estimators became more prone to the effects of large but infrequent losses. This finding suggests the depressing conclusion that the presence of heavy tails might make ES estimators in general less accurate than VaR estimators.

Fortunately, there are grounds to think that such a conclusion might be overly pessimistic. For example, Inui and Kijima (2003) present alternative results showing that the application of a Richardson extrapolation method can produce ES estimators that are both unbiased and have comparable standard errors to VaR estimators.7 Acerbi (2004) also looked at this issue and, although he confirmed that tail heaviness did increase the standard errors of ES estimators relative to VaR estimators, he concluded that the relative accuracies of VaR and ES estimators were roughly comparable in empirically realistic ranges. However, the standard error of any estimator of a coherent risk measure will vary from one situation to another, and the best practical advice is to get into the habit of always estimating the standard error whenever one estimates the risk measure itself.

7 See Inui and Kijima (2003).

Estimating the standard error of an estimator of a coherent risk measure is also relatively straightforward. One way to do so starts from the recognition that a coherent risk measure is an L-estimator (i.e., a weighted average of order statistics), and L-estimators are asymptotically normal. If we take N discrete points in the density function, then as N gets large the variance of the estimator of the coherent risk measure (Equation (1.17)) is approximately

σ²(M_φ) → (1/N) ∫∫_(p<q) [φ(p)φ(q) p(1 - q)] / [f(F⁻¹(p)) f(F⁻¹(q))] dp dq = (1/N) ∫∫_(x<y) φ(F(x)) φ(F(y)) F(x)(1 - F(y)) dx dy    (1.28)

and this can be computed numerically using a suitable numerical integration procedure. Where the risk measure is the ES, the standard error becomes

σ(ES_α) → √{ (1/N) ∫₀^(F⁻¹(α)) ∫₀^(F⁻¹(α)) [min(F(x), F(y)) - F(x)F(y)] dx dy }    (1.29)

and used in conjunction with a suitable numerical integration method, this gives good estimates even for relatively low values of N.8

8 See Acerbi (2004, pp. 200-201).

If we wish to obtain confidence intervals for our risk measure estimators, we can make use of the asymptotic normality of these estimators to apply textbook formulas (e.g., such as Equation (1.27)) based on the estimated standard errors and centred around a 'good' best estimate of the risk measure.

An alternative approach to the estimation of standard errors for estimators of coherent risk measures is to apply a bootstrap: we bootstrap a large number of estimators from the given distribution function (which might be parametric or non-parametric, e.g., historical), and we estimate the standard error of the sample of bootstrapped estimators. Even better, we can also use a bootstrapped sample of estimators to estimate a confidence interval for our risk measure.
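A minimal Python sketch of the bootstrap approach just described, applied to a historical simulation ES estimator; the resample count, the ES estimator, and the simulated data are illustrative assumptions:

```python
import numpy as np

def hs_es(losses, alpha=0.95):
    """Historical simulation ES: the average of the losses beyond the alpha quantile."""
    losses = np.sort(losses)
    k = int(np.ceil(alpha * len(losses)))
    return losses[k:].mean()

def bootstrap_se_and_ci(losses, alpha=0.95, n_boot=5000, seed=0):
    """Standard error and 90% confidence interval of the ES from bootstrapped resamples."""
    rng = np.random.default_rng(seed)
    estimates = np.array([
        hs_es(rng.choice(losses, size=len(losses), replace=True), alpha)
        for _ in range(n_boot)
    ])
    return estimates.std(ddof=1), np.percentile(estimates, [5, 95])

losses = np.random.default_rng(1).standard_normal(1000)   # hypothetical loss data
se, ci = bootstrap_se_and_ci(losses)
print(round(se, 4), np.round(ci, 4))
```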

1.6 THE CORE ISSUES: AN OVERVIEW

Before proceeding to more detailed issues, it might be helpful to pause for a moment to take an overview of the structure, as it were, of the subject matter itself. This is very useful, as it gives the reader a mental frame of reference within which the 'detailed' material that follows can be placed. Essentially, there are three core issues, and all the material that follows can be related to these. They also have a natural sequence, so we can think of them as providing a roadmap that leads us to where we want to be.

Which risk measure? The first and most important issue is to choose the type of risk measure: do we want to estimate VaR, ES, etc.? This is logically the first issue, because we need to know what we are trying to estimate before we start thinking about how we are going to estimate it.

Which level? The second issue is the level of analysis. Do we wish to estimate our risk measure at the level of the portfolio as a whole or at the level of the individual positions in it? The former would involve us taking the portfolio as our basic unit of analysis (i.e., we take the portfolio to have a specified composition, which is taken as given for the purposes of our analysis), and this will lead to a univariate stochastic analysis. The alternative is to work from the position level, and this has the advantage of allowing us to accommodate changes in the portfolio composition within the analysis itself. The disadvantage is that we then need a multivariate stochastic framework, and this is considerably more difficult to handle: we have to get to grips with the problems posed by variance-covariance matrices, copulas, and so on, all of which are avoided if we work at the portfolio level. There is thus a trade-off: working at the portfolio level keeps the analysis simple, while working at the position level lets us allow for changes in portfolio composition.

Which estimation method? The third issue is the choice of estimation method. The available approaches make up the 'VaR trinity':

• Non-parametric methods
• Parametric methods
• Monte Carlo simulation methods

Each of these involves some complex issues.

1.7 APPENDIX

Preliminary Data Analysis

When confronted with a new data set, we should never proceed straight to estimation without some preliminary analysis to get to know our data. Preliminary data analysis is useful because it gives us a feel for our data, and because it can highlight problems with our data set. Remember that we never really know where our data come from, so we should always be a little wary of any new data set, regardless of how reputable the source might appear to be. For example, how do you know that a clerk hasn't made a mistake somewhere along the line in copying the data and, say, put a decimal point in the wrong place? The answer is that you don't, and never can. Even the most reputable data providers provide data with errors in them, however careful they are. Everyone who has ever done any empirical work will have encountered such problems at some time or other: the bottom line is that real data must always be viewed with a certain amount of suspicion.

Such preliminary analysis should consist of at least the first two and preferably all three of the following steps:

• The first and by far the most important step is to eyeball the data to see if they 'look right'—or, more to the point, we should eyeball the data to see if anything looks wrong. Does the pattern of observations look right? Do any observations stand out as questionable? And so on. The interocular trauma test is the most important test ever invented and also the easiest to carry out, and we should always perform it on any new data set.

• We should plot our data on a histogram and estimate the relevant summary statistics (i.e., mean, standard deviation, skewness, kurtosis, etc.). In risk measurement, we are particularly interested in any non-normal features of our data: skewness, excess kurtosis, outliers in our data, and the like.


We should therefore be on the lookout for any evidence of non-normality, and we should take any such evidence into account when considering whether to fit any parametric distribution to the data.

• Having done this initial analysis, we should consider what kind of distribution might fit our data, and there are a number of useful diagnostic tools available for this purpose, the most popular of which are QQ plots—plots of empirical quantiles against their theoretical equivalents.

Plotting the Data and Evaluating Summary Statistics

To get to know our data, we should first obtain their histogram and see what might stand out. Do the data look normal, or non-normal? Do they show one pronounced peak, or more than one? Do they seem to be skewed? Do they have fat tails or thin tails? Are there outliers? And so on.

As an example, Figure 1.10 shows a histogram of 100 random observations. In practice, we would usually wish to work with considerably longer data sets, but a data set this small helps to highlight the uncertainties one often encounters in practice. These observations show a dominant peak in the centre, which suggests that they are probably drawn from a unimodal distribution. On the other hand, there may be a negative skew, and there are some large outlying observations on the extreme left of the distribution, which might indicate fat tails on at least the left-hand side. In fact, these particular observations are drawn from a Student-t distribution with 5 degrees of freedom, so in this case we know that the underlying true distribution is unimodal, symmetric and heavy tailed. However, we would not know this in a situation with 'real' data, and it is precisely because we do not know the distributions of real-world data sets that preliminary analysis is so important.

Figure 1.10  A histogram.
Note: Data are 100 observations randomly drawn from a Student-t with 5 degrees of freedom.

Some summary statistics for this data set are shown in Table 1.6. The sample mean (-0.099) and the sample mode (-0.030) differ somewhat, but this difference is small relative to the sample standard deviation (1.363). However, the sample skew (-0.503) is somewhat negative and the sample kurtosis (3.985) is a little bigger than normal. The sample minimum (-4.660) and the sample maximum (3.010) are also not symmetric about the sample mean or mode, which is further evidence of asymmetry. If we encountered these results with 'real' data, we would be concerned about possible skewness and kurtosis. However, in this hypothetical case we know that the sample skewness is merely a product of sample variation, because we happen to know that the data are drawn from a symmetric distribution.

Depending on the context, we might also seriously consider carrying out some formal tests. For example, we might test whether the sample parameters (mean, standard deviation, etc.) are consistent with what we might expect under a null hypothesis (e.g., such as normality). The underlying principle is very simple: since we never know the true distribution in practice, all we ever have to work with are estimates based on the sample at hand; it therefore behoves us to make the best use of the data we have, and to extract as much information as possible from them.

QQ Plots

Having done our initial analysis, it is often good practice to ask what distribution might fit our data, and a very useful device for identifying the distribution of our data is a quantile-quantile or QQ plot—a plot of the quantiles of the empirical distribution against those of some specified distribution. The shape of the QQ plot tells us a lot about how the empirical distribution compares to the specified one. In particular, if the QQ plot is linear, then the specified distribution fits the data, and we have identified the distribution to which our data belong. In addition, departures of the QQ plot from linearity in the tails can tell us whether the tails of our empirical distribution are fatter, or thinner, than the tails of the reference distribution to which it is being compared.


Table 1.6  Summary Statistics

Parameter                  Value
Mean                       -0.099
Mode                       -0.030
Standard deviation         1.363
Skewness                   -0.503
Kurtosis                   3.985
Minimum                    -4.660
Maximum                    3.010
Number of observations     100

Note: Data are the same observations shown in Figure 1.10.

To illustrate, Figure 1.11 shows a QQ plot for a data sample drawn from a normal distribution, compared to a reference distribution that is also normal. The QQ plot is obviously close to linear: the central mass observations fit a linear QQ plot very closely, and the extreme tail observations somewhat less closely.
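The preliminary checks above (summary statistics plus a QQ plot) are easy to automate. The following Python sketch is illustrative rather than taken from the text: it computes Table 1.6-style statistics for a simulated Student-t sample and the coordinate pairs of a QQ plot against a normal reference distribution. The seed and sample size are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)
x = rng.standard_t(df=5, size=100)          # 100 draws from a Student-t with 5 d.o.f.

# Summary statistics in the spirit of Table 1.6
summary = {
    "mean": x.mean(),
    "std dev": x.std(ddof=1),
    "skewness": stats.skew(x),
    "kurtosis": stats.kurtosis(x, fisher=False),   # 'ordinary' kurtosis, normal = 3
    "min": x.min(),
    "max": x.max(),
    "n": x.size,
}
print(summary)

# QQ plot coordinates: empirical quantiles against normal reference quantiles
probs = (np.arange(1, x.size + 1) - 0.5) / x.size   # plotting positions
empirical_q = np.sort(x)
theoretical_q = stats.norm.ppf(probs)
# A roughly linear relationship between theoretical_q and empirical_q would
# indicate that the normal reference fits; curvature in the tails points to
# fatter (or thinner) tails than the reference distribution.
print(np.column_stack([theoretical_q, empirical_q])[:5])
```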

The usefulness of the MEF (mean excess function) stems from the fact that each distribution has its own distinctive MEF. A comparison of the empirical MEF with the theoretical MEF associated with some specified distribution function therefore gives us an indication of whether the chosen distribution fits the tails of our empirical distribution. However, the results of MEF plots need to be interpreted with some care, because data observations become more scarce as X gets larger. For more on these and how they can be used, see Embrechts et al. (1997, Chapters 3.4 and 6.2).


Non-Parametric Approaches

Learning Objectives

After completing this reading you should be able to:

• Apply the bootstrap historical simulation approach to estimate coherent risk measures.
• Describe historical simulation using non-parametric density estimation.
• Compare and contrast the age-weighted, the volatility-weighted, the correlation-weighted, and the filtered historical simulation approaches.
• Identify advantages and disadvantages of non-parametric estimation methods.

Excerpt is Chapter 4 of Measuring Market Risk, Second Edition, by Kevin Dowd.

This chapter looks at some of the most popular approaches to the estimation of risk measures—the non-parametric approaches, which seek to estimate risk measures without making strong assumptions about the relevant (e.g., P/L) distribution. The essence of these approaches is that we try to let the P/L data speak for themselves as much as possible, and use the recent empirical (or in some cases simulated) distribution of P/L—not some assumed theoretical distribution—to estimate our risk measures. All non-parametric approaches are based on the underlying assumption that the near future will be sufficiently like the recent past that we can use the data from the recent past to forecast risks over the near future—and this assumption may or may not be valid in any given context.

In deciding whether to use any non-parametric approach, we must make a judgment about the extent to which data from the recent past are likely to give us a good guide about the risks we face over the horizon period we are concerned with. To keep the discussion as clear as possible, we will focus on the estimation of non-parametric VaR and ES. However, the methods discussed here extend very naturally to the estimation of coherent and other risk measures as well. These can be estimated using an 'average quantile' approach along the lines discussed in Chapter 1: we would select our weighting function φ(p), decide on the number of probability 'slices' n to take, estimate the associated quantiles, and take the weighted average using an appropriate numerical algorithm (see Chapter 1).1 We can then obtain standard errors or confidence intervals for our risk measures using suitably modified forms.

In this chapter we begin by discussing how to assemble the P/L data to be used for estimating risk measures. We then discuss the most popular non-parametric approach—historical simulation (HS). Loosely speaking, HS is a histogram-based approach: it is conceptually simple, easy to implement, very widely used, and has a fairly good historical record. We focus on the estimation of VaR and ES, but as explained in the previous chapter, more general coherent risk measures can be estimated using appropriately weighted averages of any non-parametric VaR estimates. We then discuss refinements to basic HS using bootstrap and kernel methods, and the estimation of VaR or ES curves and surfaces. We will discuss how we can estimate confidence intervals for HS VaR and ES. Then we will address weighted HS—how we might weight our data to capture the effects of observation age and changing market conditions. These methods introduce parametric formulas (such as GARCH volatility forecasting equations) into the picture, and in so doing convert hitherto non-parametric methods into what are best described as semi-parametric methods. Such methods are very useful because they allow us to retain the broad HS framework while also taking account of ways in which we think that the risks we face over the foreseeable horizon period might differ from those in our sample period. Finally we review the main advantages and disadvantages of non-parametric and semi-parametric approaches, and offer some conclusions.

2.1 COMPILING HISTORICAL SIMULATION DATA

The first task is to assemble a suitable P/L series for our portfolio, and this requires a set of historical P/L or return observations on the positions in our current portfolio. These P/Ls or returns will be measured over a particular frequency (e.g., a day), and we want a reasonably large set of historical P/L or return observations over the recent past. Suppose we have a portfolio of n assets, and for each asset i we have the observed return for each of T subperiods (e.g., daily subperiods) in our historical sample period. If R_(i,t) is the (possibly mapped) return on asset i in subperiod t, and if w_i is the amount currently invested in asset i, then the historically simulated portfolio P/L over the subperiod t is:

P/L_t = Σ_(i=1 to n) w_i R_(i,t)    (2.1)

Equation (2.1) gives us a historically simulated P/L series for our current portfolio, and is the basis of HS VaR and ES. This series will not generally be the same as the P/L actually earned on our portfolio—because our portfolio may have changed in composition over time or be subject to mapping approximations, and so on. Instead, the historical simulation P/L is the P/L we would have earned on our current portfolio had we held it throughout the historical sample period.2

As an aside, the fact that multiple positions collapse into one single HS P/L as given by Equation (2.1) implies that it is very easy for non-parametric methods to accommodate high dimensions—unlike the case for some parametric methods. With non-parametric methods, there are no problems dealing with variance-covariance matrices, curses of dimensionality, and the like. This means that non-parametric methods will often be the most natural choice for high-dimension problems.

1 Nonetheless, there is an important caveat. This method was explained in Chapter 1 in an implicit context where the risk measurer could choose n, and this is sometimes not possible in a non-parametric context. For example, a risk measurer might be working with an n determined by the HS data set, and even where he/she has some freedom to select n, their range of choice might be limited by the data available. Such constraints can limit the degree of accuracy of any resulting estimated risk measures. However, a good solution to such problems is to increase the sample size by bootstrapping from the sample data. (The bootstrap is discussed further in Appendix 2 to this chapter.)

2 To be more precise, the historical simulation P/L is the P/L we would have earned over the sample period had we rearranged the portfolio at the end of each trading day to ensure that the amount left invested in each asset was the same as at the end of the previous trading day: we take out our profits, or make up for our losses, to keep the w_i constant from one end-of-day to the next.

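As a concrete illustration of Equation (2.1), the following Python sketch (not part of the original text; the returns matrix and position sizes are made-up assumptions) builds the historically simulated P/L series for a current portfolio from a T x n matrix of historical (possibly mapped) asset returns and the amounts currently invested in each asset.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

T, n = 500, 3                                        # 500 daily subperiods, 3 assets (illustrative)
returns = rng.normal(0.0, 0.01, size=(T, n))         # R_(i,t): historical asset returns
w = np.array([1_000_000.0, 750_000.0, 250_000.0])    # amounts currently invested in each asset

# Equation (2.1): P/L_t = sum_i w_i * R_(i,t), applied for every subperiod t
hs_pl = returns @ w

print(hs_pl.shape)       # one historically simulated P/L per subperiod
print(hs_pl[:5])
```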

2.2 ESTIMATION OF HISTORICAL SIMULATION VaR AND ES

Basic Historical Simulation

Having obtained our historical simulation P/L data, we can estimate VaR by plotting the P/L (or L/P) on a simple histogram and then reading off the VaR from the histogram. To illustrate, suppose we have 1000 observations in our HS P/L series and we plot the L/P histogram shown in Figure 2.1. If these were daily data, this sample size would be equivalent to four years' daily data at 250 trading days to a year. If we take our confidence level to be 95%, our VaR is given by the x-value that cuts off the upper 5% of very high losses from the rest of the distribution. Given 1000 observations, we can take this value (i.e., our VaR) to be the 51st highest loss value, or 1.704.3 The ES is then the average of the 50 highest losses, or 2.196.

The imprecision of these estimates should be obvious when we consider that the sample data set was drawn from a standard normal distribution. In this case the 'true' underlying VaR and ES are 1.645 and 2.063, and Figure 2.1 should (ideally) be normal. Of course, this imprecision underlines the need to work with large sample sizes where practically feasible.

Figure 2.1  Basic historical simulation VaR and ES.
Note: This figure and associated VaR and ES estimates are obtained using the 'hsesfigure' function. The figure marks the 95% VaR (1.704) and the 95% ES (2.196).

Bootstrapped Historical Simulation

One simple but powerful improvement over basic HS is to estimate VaR and ES from bootstrapped data. As explained in Appendix 2 to this chapter, a bootstrap procedure involves resampling from our existing data set with replacement. The bootstrap is very intuitive and easy to apply. A bootstrapped estimate will often be more accurate than a 'raw' sample estimate, and bootstraps are also useful for gauging the precision of our estimates. To apply the bootstrap, we create a large number of new samples, each observation of which is obtained by drawing at random from our original sample and replacing the observation after it has been drawn. Each new 'resampled' sample gives us a new VaR estimate, and we can take our 'best' estimate to be the mean of these resample-based estimates. The same approach can also be used to produce resample-based ES estimates—each one of which would be the average of the losses in each resample exceeding the resample VaR—and our 'best' ES estimate would be the mean of these estimates. In our particular case, if we take 1000 resamples, then our best VaR and ES estimates are (because of bootstrap sampling variation) about 1.669 and 2.114—and the fact that these are much closer to the known true values than our earlier basic HS estimates suggests that bootstrap estimates might be more accurate.

3 We can also estimate the HS VaR more directly (i.e., without bothering with the histogram) by using a spreadsheet function that gives us the 51st highest loss value (e.g., the 'Large' command in Excel), or we can sort our losses data with highest losses ranked first, and then obtain the VaR as the 51st observation in our sorted loss data. We could also take VaR to be any point between the 50th and 51st largest losses (e.g., such as their mid-point), but with a reasonable sample size (as here) there will seldom be much difference between these losses anyway. For convenience, we will adhere throughout to this convention of taking the VaR to be the highest loss observation outside the tail.

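A minimal Python sketch of basic HS follows. It is illustrative only: the P/L sample is simulated from a standard normal, mirroring the example above, and it follows the convention of taking the 95% VaR from 1000 observations to be the 51st highest loss and the ES to be the average of the 50 highest losses. The bootstrapped refinement simply repeats the same calculation on resampled P/L series and averages the results.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
pl = rng.standard_normal(1000)          # hypothetical P/L series (1000 observations)
losses = np.sort(-pl)[::-1]             # losses ranked from highest to lowest

# 95% confidence level with 1000 observations:
hs_var = losses[50]                     # VaR = 51st highest loss
hs_es = losses[:50].mean()              # ES = average of the 50 highest losses
print("95% HS VaR:", round(hs_var, 3))
print("95% HS ES: ", round(hs_es, 3))
```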

Historical Simulation Using Non-parametric Density Estimation

Another potential improvement over basic HS sometimes suggested is to make use of non-parametric density estimation. To appreciate what this involves, we must recognise that basic HS does not make the best use of the information we have. It also has the practical drawback that it only allows us to estimate VaRs at discrete confidence intervals determined by the size of our data set. For example, if we have 100 HS P/L observations, basic HS allows us to estimate VaR at the 95% confidence level, but not the VaR at the 95.1% confidence level. The VaR at the 95% confidence level is given by the sixth largest loss, but the VaR at the 95.1% confidence level is a problem because there is no corresponding loss observation to go with it. We know that it should be greater than the sixth largest loss (or the 95% VaR), and smaller than the fifth largest loss (or the 96% VaR), but with only 100 observations there is no observation that corresponds to any VaR whose confidence level involves a fraction of 1%. With n observations, basic HS only allows us to estimate the VaRs associated with, at best, n different confidence levels.

Non-parametric density estimation offers a potential solution to both these problems. The idea is to treat our data as if they were drawings from some unspecified or unknown empirical distribution function. This approach also encourages us to confront potentially important decisions about the width of bins and where bins should be centred, and these decisions can sometimes make a difference to our results. Besides using a histogram, we can also represent our data using naive estimators or, more generally, kernels, and the literature tells us that kernels are (or ought to be) superior. So, having assembled our 'raw' HS data, we need to make decisions on the widths of bins and where they should be centred, and whether to use a histogram, a naive estimator, or some form of kernel. If we make good decisions on these issues, we can hope to get better estimates of VaR and ES (and more general coherent measures).

Non-parametric density estimation also allows us to estimate VaRs and ESs for any confidence levels we like and so avoid constraints imposed by the size of our data set. In effect, it enables us to draw lines through points on or near the edges of the 'bars' of a histogram. We can then treat the areas under these lines as a surrogate pdf, and so proceed to estimate VaRs for arbitrary confidence levels. The idea is illustrated in Figure 2.2. The left-hand side of this figure shows three bars from a histogram (or naive estimator) close up. Assuming that the height of the histogram (or naive estimator) measures relative frequency, then one option is to treat the histogram itself as a pdf. Unfortunately, the resulting pdf would be a strange one—just look at the corners of each bar—and it makes more sense to approximate the pdf by drawing lines through the upper parts of the histogram. A simple way to do this is to draw in straight lines connecting the mid-points at the top of each histogram bar, as illustrated in the figure. Once we draw these lines, we can forget about the histogram bars and treat the area under the lines as if it were a pdf. Treating the area under the lines as a pdf then enables us to estimate VaRs at any confidence level, regardless of the size of our data set. Each possible confidence level would correspond to its own tail similar to the shaded area shown in Figure 2.2(b), and we can then use a suitable calculation method to estimate the VaR (e.g., we can carry out the calculations on a spreadsheet or, more easily, by using a purpose-built function such as the 'hsvar' function in the MMR Toolbox).4

Figure 2.2  Histograms and surrogate density functions: (a) original histogram; (b) surrogate density function.

Of course, drawing straight lines through the mid-points of the tops of histogram bars is not the best we can do: we could draw smooth curves that meet up nicely, and so on. This is exactly the point of non-parametric density estimation, the purpose of which is to give us some guidance on how 'best' to draw lines through the data points we have. Such methods are also straightforward to apply if we have suitable software.

Some empirical evidence by Butler and Schachter (1998) using real trading portfolios suggests that kernel-type methods produce VaR estimates that are a little different to those we would obtain under basic HS. However, their work also suggests that the different types of kernel methods produce quite similar VaR estimates, although to the extent that there are differences among them, they also found that the 'best' kernels were the adaptive Epanechnikov and adaptive Gaussian ones. To investigate these issues myself, I applied four standard kernel estimators—based on normal, box, triangular and Epanechnikov kernels—to the test data used in earlier examples, and found that each of these gave the same VaR estimate of 1.735. In this case, these different kernels produced the same VaR estimate, which is a little higher (and, curiously, a little less accurate) than the basic HS VaR estimate of 1.704 obtained earlier. Other results not reported here suggest that the different kernels can give somewhat different estimates with smaller samples, but again suggest that the exact kernel specification does not make a great deal of difference.

4 The actual programming is a little tedious, but the gist of it is that if the confidence level is such that the VaR falls between two loss observations, then we take the VaR to be a weighted average of these two observations. The weights are chosen so that a vertical line drawn through the VaR demarcates the area under the 'curve' in the correct proportions, with α to one side and 1 − α to the other. The details can be seen in the coding for the 'hsvar' and related functions.

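The interpolation idea in footnote 4 can be sketched in a few lines of Python. This is an illustrative implementation of one simple choice, linear interpolation between the two loss observations that bracket the requested confidence level; it is not the MMR Toolbox 'hsvar' code itself.

```python
import numpy as np

def hs_var_interpolated(pl, alpha):
    """HS VaR at an arbitrary confidence level via linear interpolation
    between the two bracketing loss observations."""
    losses = np.sort(-pl)                 # losses in ascending order
    n = losses.size
    pos = alpha * (n - 1)                 # position of the alpha-quantile in the ordered losses
    lo, hi = int(np.floor(pos)), int(np.ceil(pos))
    weight = pos - lo
    return (1 - weight) * losses[lo] + weight * losses[hi]

rng = np.random.default_rng(seed=3)
pl = rng.standard_normal(1000)
print(hs_var_interpolated(pl, 0.95))      # close to the discrete 95% HS VaR
print(hs_var_interpolated(pl, 0.951))     # a level that basic HS cannot hit exactly
```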

So although kernel methods are better in theory, they do not necessarily produce much better estimates in practice. There are also practical reasons why we might prefer simpler non-parametric density estimation methods over kernel ones. Although the kernel methods are theoretically better, crude methods like drawing straight-line 'curves' through the tops of histograms are more transparent and easier to check. We should also not forget that our results are subject to a number of sources of error (e.g., due to errors in P/L data, mapping approximations, and so on), so there is a natural limit to how much real fineness we can actually achieve.

Estimating Curves and Surfaces for VaR and ES

It is straightforward to produce plots of VaR or ES against the confidence level. For example, our earlier hypothetical P/L data yields the curves of VaR and ES against the confidence level shown in Figure 2.3. Note that the VaR curve is fairly unsteady, as it directly reflects the randomness of individual loss observations, but the ES curve is smoother, because each ES is an average of tail losses.

Figure 2.3  Plots of HS VaR and ES against confidence level.
Note: Obtained using the 'hsvaresplot2D_cl' function and the same hypothetical P/L data used in Figure 2.1.

It is more difficult constructing curves that show how non-parametric VaR or ES changes with the holding period. The methods discussed so far enable us to estimate the VaR or ES at a single holding period equal to the frequency period over which our data are observed (e.g., they give us VaR or ES for a daily holding period if P/L is measured daily). In theory, we can then estimate VaRs or ESs for any other holding periods we wish by constructing a HS P/L series whose frequency matches our desired holding period: if we wanted to estimate VaR over a weekly holding period, say, we could construct a weekly P/L series and estimate the VaR from that. There is, in short, no theoretical problem as such with estimating HS VaR or ES over any holding period we like.

However, there is a major practical problem: as the holding period rises, the number of observations rapidly falls, and we soon find that we don't have enough data. To illustrate, if we have 1000 observations of daily P/L, corresponding to four years' worth of data at 250 trading days a year, then we have 1000 P/L observations if we use a daily holding period. If we have a weekly holding period, with five days to a week, each weekly P/L will be the sum of five daily P/Ls, and we end up with only 200 observations of weekly P/L; if we have a monthly holding period, we have only 50 observations of monthly P/L; and so on. Given our initial data, the number of effective observations rapidly falls as the holding period rises, and the size of the data set imposes a major constraint on how large the holding period can practically be. In any case, even if we had a very long run of data, the older observations might have very little relevance for current market conditions.

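The VaR and ES curves against the confidence level described above are easy to generate. The short Python sketch below is illustrative, reusing a simulated P/L sample rather than the book's data, and it uses simple empirical quantiles rather than the 'hsvaresplot2D_cl' function.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
pl = rng.standard_normal(1000)
losses = -pl

alphas = np.arange(0.90, 0.996, 0.005)        # grid of confidence levels
var_curve = np.quantile(losses, alphas)       # HS VaR at each confidence level
es_curve = np.array([losses[losses >= v].mean() for v in var_curve])

for a, v, e in zip(alphas, var_curve, es_curve):
    print(f"cl={a:.3f}  VaR={v:.3f}  ES={e:.3f}")
# The VaR curve tends to be jagged (it tracks individual loss observations),
# while the ES curve is smoother because each point averages the tail losses.
```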

2.3 ESTIMATING CONFIDENCE INTERVALS FOR HISTORICAL SIMULATION VaR AND ES

The methods considered so far are good for giving point estimates of VaR or ES, but they don't give us any indication of the precision of these estimates or any indication of VaR or ES confidence intervals. However, there are methods to get around this limitation and produce confidence intervals for our risk estimates.5

An Order Statistics Approach to the Estimation of Confidence Intervals for HS VaR and ES

One of the most promising methods is to apply the theory of order statistics, explained in Appendix 1 to this chapter. This approach gives us, not just a VaR (or ES) estimate, but a complete VaR (or ES) distribution function from which we can read off the VaR (or ES) confidence interval. (The central tendency parameters (mean, mode, median) also give us alternative point estimates of our VaR or ES, if we want them.) This approach is (relatively) easy to programme and very general in its application. Applied to our earlier P/L data, the OS approach gives us estimates (obtained using the 'hsvarpdfperc' function) of the 5% and 95% points of the 95% VaR distribution function—that is, the bounds of the 90% confidence interval for our VaR—of 1.552 and 1.797. This tells us we can be 90% confident that the 'true' VaR lies in the range [1.552, 1.797].

The corresponding points of the ES distribution function can be obtained (using the 'hsesdfperc' function) by mapping from the VaR to the ES: we take a point on the VaR distribution function, and estimate the corresponding percentile point on the ES distribution function. Doing this gives us an estimated 90% confidence interval of [2.021, 2.224].6

A Bootstrap Approach to the Estimation of Confidence Intervals for HS VaR and ES

We can also estimate confidence intervals using a bootstrap approach: we produce a bootstrapped histogram of resample-based VaR (or ES) estimates, and then read the confidence interval from the quantiles of this histogram. For example, if we take 1000 bootstrapped samples from our P/L data set, estimate the 95% VaR of each, and then plot them, we get the histogram shown in Figure 2.4. Using the basic percentile interval approach outlined in Appendix 2 to this chapter, the 90% confidence interval for our VaR is [1.554, 1.797]. The simulated histogram is surprisingly disjointed, although the bootstrap seems to give a relatively robust estimate of the confidence interval if we keep repeating the exercise.

Figure 2.4  Bootstrapped VaR.
Note: Results obtained using the 'bootstrapvarfigure' function with 1000 resamples, and the same hypothetical data as in earlier figures.

We can also use the bootstrap to estimate ESs in much the same way: for each new resampled data set, we estimate the VaR, and then estimate the ES as the average of losses in excess of VaR. Doing this a large number of times gives us a large number of ES estimates, and we can plot them in the same way as the VaR estimates. The histogram of bootstrapped ES values is shown in Figure 2.5, and is better behaved than the VaR histogram in the last figure because the ES is an average of tail VaRs. The 90% confidence interval for our ES is [1.986, 2.271].

It is also interesting to compare the VaR and ES confidence intervals obtained by the two methods. These are summarised in Table 2.1, and we can see that the OS and bootstrap approaches give very similar results. This suggests that either approach is likely to be a reasonable one to use in practice.

5 In addition to the methods considered in this section, we can also estimate confidence intervals for VaR using estimates of the quantile standard errors. However, as made clear there, such confidence intervals are subject to a number of problems, and the methods suggested here are usually preferable.

6 Naturally, the order statistics approach can be combined with more sophisticated non-parametric density estimation approaches. Instead of applying the OS theory to the histogram or naive estimator, we could apply it to a more sophisticated kernel estimator, and thereby extract more information from our data. This approach has some merit and is developed in detail by Butler and Schachter (1998).

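The bootstrap percentile interval just described is straightforward to code. The Python sketch below is illustrative (simulated data, 1000 resamples as in the text) and reads the 90% confidence intervals for the 95% VaR and ES from the quantiles of the bootstrapped estimates; it is not the 'bootstrapvarfigure' routine itself.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
pl = rng.standard_normal(1000)          # hypothetical P/L data

var_boot, es_boot = [], []
for _ in range(1000):
    resample = rng.choice(pl, size=pl.size, replace=True)
    losses = np.sort(-resample)[::-1]   # losses, highest first
    var_boot.append(losses[50])         # 95% VaR: 51st highest loss
    es_boot.append(losses[:50].mean())  # 95% ES: mean of the 50 highest losses

# Basic percentile intervals: read the 5th and 95th percentiles of the estimates
print("90% CI for 95% VaR:", np.percentile(var_boot, [5, 95]))
print("90% CI for 95% ES: ", np.percentile(es_boot, [5, 95]))
```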

Figure 2.5  Bootstrapped ES.
Note: Results obtained using the 'bootstrapesfigure' function with 1000 resamples, and the same hypothetical data as in earlier figures.

Table 2.1  90% Confidence Intervals for Non-parametric VaR and ES

Approach             Lower bound   Upper bound
95% VaR
  Order statistics   1.552         1.797
  Bootstrap          1.554         1.797
95% ES
  Order statistics   2.021         2.224
  Bootstrap          1.986         2.271

Note: Bootstrap estimates based on 1000 resamples.

2.4 WEIGHTED HISTORICAL SIMULATION

One of the most important features of traditional HS is the way it weights past observations. Recall that R_(i,t) is the return on asset i in period t, and we are implementing HS using the past n observations. An observation R_(i,t-j) will therefore belong to our data set if j takes any of the values 1, ..., n, where j is the age of the observation (e.g., so j = 1 indicates that the observation is 1 day old, and so on). If we construct a new HS P/L series, P/L_t, each day, our observation R_(i,t-j) will first affect P/L_t, then affect P/L_(t+1), and so on, and finally affect P/L_(t+n): our return observation will affect each of the next n observations in our P/L series. Also, other things (e.g., position weights) being equal, R_(i,t-j) will affect each P/L in exactly the same way. But after n periods have passed, R_(i,t-j) will fall out of the data set used to calculate the current HS P/L series, and will thereafter have no effect on P/L. In short, our HS P/L series is constructed in a way that gives any observation the same weight on P/L provided it is less than n periods old, and no weight (i.e., a zero weight) if it is older than that.

This weighting structure has a number of problems. One problem is that it is hard to justify giving each observation in our sample period the same weight, regardless of age, market volatility, or anything else. A good example of the difficulties this can create is given by Shimko et al. (1998). It is well known that natural gas prices are usually more volatile in the winter than in the summer, so a raw HS approach that incorporates both summer and winter observations will tend to average the summer and winter observations together. As a result, treating all observations as having equal weight will tend to underestimate true risks in the winter, and overestimate them in the summer.7

7 If we have data that show seasonal volatility changes, a solution—suggested by Shimko et al. (1998)—is to weight the data to reflect seasonal volatility (e.g., so winter observations get more weight, if we are estimating a VaR in winter).

The equal-weight approach can also make risk estimates unresponsive to major events. For instance, a stock market crash


might have no effect on VaRs except at a very high confidence level, so we could have a situation where everyone might agree that risk had suddenly increased, and yet that increase in risk would be missed by most HS VaR estimates. The increase in risk would only show up later in VaR estimates if the stock market continued to fall in subsequent days—a case of the stable door closing only well after the horse had long since bolted. That said, the increase in risk would show up in ES estimates just after the first shock occurred—which is, incidentally, a good example of how ES can be a more informative risk measure than the VaR.8

The equal-weight structure also presumes that each observation in the sample period is equally likely and independent of the others over time. However, this 'iid' assumption is unrealistic because it is well known that volatilities vary over time, and that periods of high and low volatility tend to be clustered together. The natural gas example just considered is a good case in point. It is also hard to justify why an observation should have a weight that suddenly goes to zero when it reaches age n. Why is it that an observation of age n − 1 is regarded as having a lot of value (and, indeed, the same value as any more recent observation), but an observation of age n is regarded as having no value at all? Even old observations usually have some information content, and giving them zero value tends to violate the old statistical adage that we should never throw information away.

This weighting structure also creates the potential for ghost effects—we can have a VaR that is unduly high (or low) because of a small cluster of high loss observations, or even just a single high loss, and the measured VaR will continue to be high (or low) until n days or so have passed and the observation has fallen out of the sample period. At that point, the VaR will fall again, but the fall in VaR is only a ghost effect created by the weighting structure and the length of sample period used.

We now address various ways in which we might 'adjust' our data to overcome some of these problems and take account of ways in which current market conditions might differ from those in our sample. These fall under the broad heading of 'weighted historical simulation' and can be regarded as semi-parametric methods because they combine features of both parametric and non-parametric methods.

Age-weighted Historical Simulation

One such approach is to weight the relative importance of our observations by their age, as suggested by Boudoukh, Richardson and Whitelaw (BRW: 1998). Instead of treating each observation for asset i as having the same implied probability as any other (i.e., 1/n), we could weight their probabilities to discount the older observations in favour of newer ones. Thus, if w(1) is the probability weight given to an observation 1 day old, then w(2), the probability given to an observation 2 days old, could be λw(1); w(3) could be λ²w(1); and so on. The λ term is between 0 and 1, and reflects the exponential rate of decay in the weight or value given to an observation as it ages: a λ close to 1 indicates a slow rate of decay, and a λ far away from 1 indicates a high rate of decay. w(1) is set so that the sum of the weights is 1, and this is achieved if we set w(1) = (1 − λ)/(1 − λ^n). The weight given to an observation i days old is therefore:

w(i) = λ^(i−1) (1 − λ) / (1 − λ^n)    (2.2)

and this corresponds to the weight of 1/n given to any in-sample observation under basic HS.

Our core information—the information inputted to the HS estimation process—is the paired set of P/L values and associated probability weights. To implement age-weighting, we merely replace the old equal weights 1/n with the age-dependent weights w(i) given by (2.2). For example, if we are using a spreadsheet, we can order our P/L observations in one column, put their weights w(i) in the next column, and go down that column until we reach our desired percentile. Our VaR is then the negative of the corresponding value in the first column. And if our desired percentile falls between two percentiles, we can take our VaR to be the (negative of the) interpolated value of the corresponding first-column observations.

8 However, both VaR and ES suffer from a related problem. As Pritsker (2001, p. 5) points out, HS fails to take account of useful information from the upper tail of the P/L distribution. If the stock experiences a series of large falls, then a position that was long the market would experience large losses that should show up, albeit later, in HS risk estimates. However, a position that was short the market would experience a series of large profits, and risk estimates at the usual confidence levels would be completely unresponsive. Once again, we could have a situation where risk had clearly increased—because the fall in the market signifies increased volatility, and therefore a significant chance of losses due to large rises in the stock market—and yet our risk estimates had failed to pick up this increase in risk.

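Equation (2.2) and the spreadsheet recipe above translate directly into code. The Python sketch below is illustrative, with made-up P/L data and an assumed decay factor λ = 0.98, and it uses one simple reading convention: sort losses from highest to lowest, accumulate the age-based probability weights, and take the age-weighted 95% VaR as the first loss at which the cumulated weight of worse-or-equal outcomes reaches 5%.

```python
import numpy as np

def age_weighted_var(pl, ages, lam=0.98, alpha=0.95):
    """Age-weighted HS VaR with BRW-style weights (one simple reading convention)."""
    n = len(pl)
    # Equation (2.2): weight of an observation that is 'age' days old
    w = lam ** (ages - 1) * (1 - lam) / (1 - lam ** n)
    losses = -pl
    order = np.argsort(losses)[::-1]          # largest losses first
    cum_weight = np.cumsum(w[order])
    # first loss at which the cumulated weight reaches the tail probability 1 - alpha
    idx = np.searchsorted(cum_weight, 1 - alpha)
    return losses[order][idx]

rng = np.random.default_rng(seed=11)
pl = rng.standard_normal(500)                 # hypothetical daily P/L, newest last
ages = np.arange(len(pl), 0, -1)              # age 1 = most recent observation
print("age-weighted 95% VaR:", age_weighted_var(pl, ages))
print("equal-weight  95% VaR:", np.quantile(-pl, 0.95))
```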

This age-weighted approach has four major attractions. First, it provides a nice generalisation of traditional HS, because we can regard traditional HS as a special case with zero decay, or λ → 1. If HS is like driving along a road looking only at the rear-view mirror, then traditional equal-weighted HS is only safe if the road is straight, and the age-weighted approach is safe if the road bends gently.

Second, a suitable choice of λ can make the VaR (or ES) estimates more responsive to large loss observations: a large loss event will receive a higher weight than under traditional HS, and the resulting next-day VaR would be higher than it would otherwise have been. This not only means that age-weighted VaR estimates are more responsive to large losses, but also makes them better at handling clusters of large losses.

Third, age-weighting helps to reduce distortions caused by events that are unlikely to recur, and helps to reduce ghost effects. As an observation ages, its probability weight gradually falls and its influence diminishes gradually over time. Furthermore, when it finally falls out of the sample period, its weight will fall from λ^n w(1) to zero, instead of from 1/n to zero. Since λ^n w(1) is less than 1/n for any reasonable values of λ and n, then the shock—the ghost effect—will be less than it would be under equal-weighted HS.

Finally, we can also modify age-weighting in a way that makes our risk estimates more efficient and effectively eliminates any remaining ghost effects. Since age-weighting allows the impact of past extreme events to decline as past events recede in time, it gives us the option of letting our sample size grow over time. (Why can't we do this under equal-weighted HS? Because we would be stuck with ancient observations whose information content was assumed never to date.) Age-weighting allows us to let our sample period grow with each new observation, so we never throw potentially valuable information away. This would improve efficiency and eliminate ghost effects, because there would no longer be any 'jumps' in our sample resulting from old observations being thrown away.

However, age-weighting also reduces the effective sample size, other things being equal, and a sequence of major profits or losses can produce major distortions in its implied risk profile. In addition, Pritsker shows that even with age-weighting, VaR estimates can still be insufficiently responsive to changes in underlying risk.9 Furthermore, there is the disturbing point that the BRW approach is ad hoc, and that except for the special case where λ = 1 we cannot point to any asset-return process for which the BRW approach is theoretically correct.

Volatility-weighted Historical Simulation

We can also weight our data by volatility. The basic idea—suggested by Hull and White (HW; 1998b)—is to update return information to take account of recent changes in volatility. For example, if the current volatility in a market is 1.5% a day, and it was only 1% a day a month ago, then data a month old understate the changes we can expect to see tomorrow, and this suggests that historical returns would underestimate tomorrow's risks; on the other hand, if last month's volatility was 2% a day, month-old data will overstate the changes we can expect tomorrow, and historical returns would overestimate tomorrow's risks. We therefore adjust the historical returns to reflect how volatility tomorrow is believed to have changed from its past values.

Suppose we are interested in forecasting VaR for day T. Let r_(t,i) be the historical return on asset i on day t in our historical sample, σ_(t,i) be the historical GARCH (or EWMA) forecast of the volatility of the return on asset i for day t, made at the end of day t − 1, and σ_(T,i) be our most recent forecast of the volatility of asset i. We then replace the returns in our data set, r_(t,i), with volatility-adjusted returns, given by:

r*_(t,i) = (σ_(T,i) / σ_(t,i)) r_(t,i)    (2.3)

Actual returns in any period t are therefore increased (or decreased), depending on whether the current forecast of volatility is greater (or less than) the estimated volatility for period t. We now calculate the HS P/L using Equation (2.3) instead of the original data set r_(t,i), and then proceed to estimate HS VaRs or ESs in the traditional way (i.e., with equal weights, etc.).10

The HW approach has a number of advantages relative to the traditional equal-weighted and/or the BRW age-weighted approaches:

• It takes account of volatility changes in a natural and direct way, whereas equal-weighted HS ignores volatility changes and the age-weighted approach treats volatility changes in a rather arbitrary and restrictive way.

• It produces risk estimates that are appropriately sensitive to current volatility estimates, and so enables us to incorporate information from GARCH forecasts into HS VaR and ES estimation.

• It allows us to obtain VaR and ES estimates that can exceed the maximum loss in our historical data set: in periods of high volatility, historical returns are scaled upwards, and the HS P/L series used in the HW procedure will have values that exceed actual historical losses. This is a major advantage over traditional HS, which prevents the VaR or ES from being any bigger than the losses in our historical data set.

• Empirical evidence presented by HW indicates that their approach produces superior VaR estimates to the BRW one.

9 If VaR is estimated at the confidence level α, the probability of an HS estimate of VaR rising on any given day is equal to the probability of a loss in excess of VaR, which is of course 1 − α. However, if we assume a standard GARCH(1,1) process and volatility is at its long-run mean value, then Pritsker's proposition 2 shows that the probability that HS VaR should increase is about 32% (Pritsker (2001, pp. 7-9)). In other words, most of the time HS VaR estimates should increase (i.e., when risk rises), they fail to.

10 Naturally, volatility weighting presupposes that one has estimates of the current and past volatilities to work with.

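A minimal sketch of the Hull-White adjustment in Equation (2.3) follows. It is illustrative only: the volatilities are estimated here with a simple EWMA recursion (an assumption; the text allows GARCH or EWMA forecasts), and the return data are simulated.

```python
import numpy as np

def ewma_vol(returns, lam=0.94):
    """One-step-ahead EWMA volatility forecasts; sigma[t] is the forecast for day t
    made at the end of day t-1 (the recursion is seeded with the sample variance)."""
    var = np.empty(len(returns) + 1)
    var[0] = returns.var()
    for t in range(len(returns)):
        var[t + 1] = lam * var[t] + (1 - lam) * returns[t] ** 2
    return np.sqrt(var)            # length T+1: the last entry is the forecast for day T+1

rng = np.random.default_rng(seed=5)
r = rng.normal(0.0, 0.01, size=750)        # historical returns for one asset

sigma = ewma_vol(r)
sigma_t = sigma[:-1]                       # forecast for each historical day t
sigma_T = sigma[-1]                        # most recent forecast

r_adj = (sigma_T / sigma_t) * r            # Equation (2.3): volatility-adjusted returns

# HS VaR in the traditional way, on raw and volatility-adjusted returns
print("95% VaR, raw returns:      ", np.quantile(-r, 0.95))
print("95% VaR, vol-adjusted (HW):", np.quantile(-r_adj, 0.95))
```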

The HW approach is also capable of various extensions. For instance, we can combine it with the age-weighted approach if we wished to increase the sensitivity of risk estimates to large losses, and to reduce the potential for distortions and ghost effects. We can also combine the HW approach with OS or bootstrap methods to estimate confidence intervals for our VaR or ES—that is, we would work with order statistics or resample with replacement from the HW-adjusted P/L, rather than from the traditional HS P/L.

Correlation-weighted Historical Simulation

We can also adjust our historical returns to reflect changes between historical and current correlations. Correlation-weighting is a little more involved than volatility-weighting. To see the principles involved, suppose for the sake of argument that we have already made any volatility-based adjustments to our HS returns along Hull-White lines, but also wish to adjust those returns to reflect changes in correlations.11

To make the discussion concrete, we have m positions and our (perhaps volatility adjusted) 1 × m vector of historical returns R for some period t reflects an m × m variance-covariance matrix Σ. Σ in turn can be decomposed into the product σCσ^T, where σ is an m × m diagonal matrix of volatilities (i.e., so the ith element of σ is the ith volatility σ_i and the off-diagonal elements are zero), σ^T is its transpose, and C is the m × m matrix of historical correlations. R therefore reflects an historical correlation matrix C, and we wish to adjust R so that it becomes R̃, reflecting a current correlation matrix C̃. Now suppose for convenience that both correlation matrices are positive definite. This means that each correlation matrix has an m × m 'matrix square root', A and Ã respectively, given by a Choleski decomposition (which also implies that they are easy to obtain). We can now write R and R̃ as matrix products of the relevant Choleski matrices and an uncorrelated noise process ε:

R = Aε    (2.4a)
R̃ = Ãε    (2.4b)

We then invert Equation (2.4a) to obtain ε = A^(−1)R, and substitute this into Equation (2.4b) to obtain the correlation-adjusted series R̃ that we are seeking:

R̃ = ÃA^(−1)R    (2.5)

The returns adjusted in this way will then have the currently prevailing correlation matrix C̃ and, more generally, the currently prevailing covariance matrix Σ̃. This approach is a major generalisation of the HW approach, because it gives us a weighting system that takes account of correlations as well as volatilities.

11 The correlation adjustment discussed here is based on a suggestion by Duffie and Pan (1997).

Example 2.1  Correlation-weighted HS

Suppose we have only two positions in our portfolio, so m = 2. The historical correlation between our two positions is 0.3, and we wish to adjust our historical returns R to reflect a current correlation of 0.9.

If a_ij is the i, jth element of the 2 × 2 matrix A, then applying the Choleski decomposition tells us that

a_11 = 1,  a_12 = 0,  a_21 = ρ,  a_22 = √(1 − ρ²)

where ρ = 0.3. The matrix Ã is similar except for having ρ = 0.9. Standard matrix theory also tells us that (writing matrices with rows separated by semicolons)

A^(−1) = (1/(a_11 a_22 − a_12 a_21)) [a_22, −a_12; −a_21, a_11]

Substituting these into Equation (2.5), we find that

R̃ = ÃA^(−1)R
  = [1, 0; 0.9, √(1 − 0.9²)] × (1/√(1 − 0.3²)) × [√(1 − 0.3²), 0; −0.3, 1] × R
  = (1/√(1 − 0.3²)) × [√(1 − 0.3²), 0; 0.9√(1 − 0.3²) − 0.3√(1 − 0.9²), √(1 − 0.9²)] × R
  = [1, 0; 0.7629, 0.4569] × R



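Example 2.1 can be verified numerically. The Python sketch below is illustrative (it uses numpy's Cholesky routine): it builds the two Choleski factors for ρ = 0.3 and ρ = 0.9 and forms the adjustment matrix of Equation (2.5), which should reproduce the second row [0.7629, 0.4569] found above.

```python
import numpy as np

def choleski_factor(rho):
    """Lower-triangular Choleski factor of a 2x2 correlation matrix."""
    corr = np.array([[1.0, rho], [rho, 1.0]])
    return np.linalg.cholesky(corr)

A = choleski_factor(0.3)                # historical correlation
A_tilde = choleski_factor(0.9)          # current correlation we want to impose

adjust = A_tilde @ np.linalg.inv(A)     # Equation (2.5): R_tilde = (A_tilde A^-1) R
print(np.round(adjust, 4))              # second row approximately [0.7629, 0.4569]

# Applying the adjustment to a T x 2 matrix of (volatility-adjusted) returns:
rng = np.random.default_rng(seed=9)
R = rng.multivariate_normal([0, 0], [[1, 0.3], [0.3, 1]], size=1000)
R_tilde = R @ adjust.T                  # row-wise version of Equation (2.5)
print(np.round(np.corrcoef(R_tilde.T), 3))   # empirical correlation near 0.9
```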

Filtered Historical Simulation

Another promising approach is filtered historical simulation (FHS).12 This is a form of semi-parametric bootstrap which aims to combine the benefits of HS with the power and flexibility of conditional volatility models such as GARCH. It does so by bootstrapping returns within a conditional volatility (e.g., GARCH) framework, where the bootstrap preserves the non-parametric nature of HS, and the volatility model gives us a sophisticated treatment of volatility.

Suppose we wish to use FHS to estimate the VaR of a single-asset portfolio over a 1-day holding period. The first step in FHS is to fit, say, a GARCH model to our portfolio-return data. We want a model that is rich enough to accommodate the key features of our data, and Barone-Adesi and colleagues recommend an asymmetric GARCH, or AGARCH, model. This not only accommodates conditionally changing volatility, volatility clustering, and so on, but also allows positive and negative returns to have differential impacts on volatility, a phenomenon known as the leverage effect. The AGARCH postulates that portfolio returns obey the following process:

r_t = μ + ε_t    (2.6a)
σ_t² = ω + α(ε_(t−1) + γ)² + β σ_(t−1)²    (2.6b)

The daily return in Equation (2.6a) is the sum of a mean daily return (which can often be neglected in volatility estimation) and a random error ε_t. The volatility in Equation (2.6b) is the sum of a constant and terms reflecting last period's 'surprise' and last period's volatility, plus an additional term γ that allows for the surprise to have an asymmetric effect on volatility, depending on whether the surprise term is positive or negative.

The second step is to use the model to forecast volatility for each of the days in a sample period. These volatility forecasts are then divided into the realised returns to produce a set of standardised returns. These standardised returns should be independently and identically distributed (iid), and therefore be suitable for HS.

Assuming a 1-day VaR holding period, the third stage involves bootstrapping from our data set of standardised returns: we take a large number of drawings from this data set, which we now treat as a sample, replacing each one after it has been drawn, and multiply each random drawing by the AGARCH forecast of tomorrow's volatility. If we take M drawings, we therefore get M simulated returns, each of which reflects current market conditions because it is scaled by today's forecast of tomorrow's volatility.

Finally, each of these simulated returns gives us a possible end-of-tomorrow portfolio value, and a corresponding possible loss, and we take the VaR to be the loss corresponding to our chosen confidence level.13

We can easily modify this procedure to encompass the obvious complications of a multi-asset portfolio or a longer holding period. If we have a multi-asset portfolio, we would fit a multivariate GARCH (or AGARCH) to the set or vector of asset returns, and we would standardise this vector of asset returns. The bootstrap would then select, not just a standardised portfolio return for some chosen past (daily) period, but the standardised vector of asset returns for the chosen past period. This is important because it means that our simulations would keep any correlation structure present in the raw returns. The bootstrap thus maintains existing correlations, without our having to specify an explicit multivariate pdf for asset returns.

The other obvious extension is to a longer holding period. If we have a longer holding period, we would first take a drawing and use Equation (2.6) to get a return for tomorrow; we would then use this drawing to update our volatility forecast for the day after tomorrow, and take a fresh drawing to determine the return for that day; and we would carry on in the same manner—taking a drawing, updating our volatility forecasts, taking another drawing for the next period, and so on—until we had reached the end of our holding period. At that point we would have enough information to produce a single simulated P/L observation; and we would repeat the process as many times as we wished in order to produce the histogram of simulated P/L observations from which we can estimate our VaR.

FHS has a number of attractions: (i) It enables us to combine the non-parametric attractions of HS with a sophisticated (e.g., GARCH) treatment of volatility, and so take account of changing market volatility conditions. (ii) It is fast, even for large portfolios. (iii) As with the earlier HW approach, FHS allows us to get VaR and ES estimates that can exceed the maximum historical loss in our data set. (iv) It maintains the correlation structure in our return data without relying on knowledge of the variance-covariance matrix or the conditional distribution of asset returns. (v) It can be modified to take account of autocorrelation or past cross-correlations in asset returns. (vi) It can be modified to produce estimates of VaR or ES confidence intervals by combining it with an OS or bootstrap approach to confidence interval estimation.14 (vii) There is evidence that FHS works well.15

12 This approach is suggested in Barone-Adesi et al. (1998), Barone-Adesi et al. (1999), Barone-Adesi and Giannopoulos (2000) and in other papers by some of the same authors.

13 The FHS approach can also be extended easily to allow for the estimation of ES as well as VaR. For more on how this might be done, see Giannopoulos and Tunaru (2004).

14 The OS approach would require a set of paired P/L and associated probability observations, so we could apply this to FHS by using a P/L series that had been through the FHS filter. The bootstrap is even easier, since FHS already makes use of a bootstrap. If we want M bootstrapped estimates of VaR, we could produce, say, 100*M or 1000*M bootstrapped P/L values; each set of 100 (or 1000) P/L series would give us one HS VaR estimate, and the histogram of M such estimates would enable us to infer the bounds of the VaR confidence interval.

15 Barone-Adesi and Giannopoulos (2000), p. 17. However, FHS does have problems. In his thorough simulation study of FHS, Pritsker (2001, pp. 22-24) comes to the tentative conclusions that FHS VaR might not pay enough attention to extreme observations or time-varying correlations, and Barone-Adesi and Giannopoulos (2000, p. 18) largely accept these points. A partial response to the first point would be to use ES instead of VaR as our preferred risk measure, and the natural response to the second concern is to develop FHS with a more sophisticated past cross-correlation structure. Pritsker (2001, p. 22) also presents simulation results that suggest that FHS-VaR tends to underestimate 'true' VaR over a 10-day holding period by about 10%, but this finding conflicts with results reported by Barone-Adesi et al. (2000) based on real data. The evidence on FHS is thus mixed.

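The three FHS steps for a single asset and a 1-day horizon can be sketched compactly. The Python code below is illustrative only: rather than estimating the AGARCH model of Equations (2.6a)-(2.6b), it assumes fixed, made-up GARCH-type parameter values (with the asymmetry term set to zero), filters the returns into standardised residuals, and then bootstraps those residuals, rescaling each drawing by the one-day-ahead volatility forecast.

```python
import numpy as np

rng = np.random.default_rng(seed=13)
r = rng.normal(0.0, 0.01, size=1500)          # portfolio returns (illustrative)

# Step 1: conditional volatility filter. Parameters are assumed, not estimated,
# and gamma = 0 reduces the recursion to a plain GARCH(1,1) for simplicity.
omega, alpha, beta, gamma = 2e-6, 0.08, 0.90, 0.0
var = np.empty(len(r) + 1)
var[0] = r.var()
for t in range(len(r)):
    var[t + 1] = omega + alpha * (r[t] + gamma) ** 2 + beta * var[t]
sigma = np.sqrt(var)

# Step 2: standardised (approximately iid) returns
z = r / sigma[:-1]

# Step 3: bootstrap the standardised returns and rescale by tomorrow's forecast
M = 100_000
sim_returns = sigma[-1] * rng.choice(z, size=M, replace=True)

alpha_cl = 0.95
fhs_var = np.quantile(-sim_returns, alpha_cl)
fhs_es = (-sim_returns[-sim_returns >= fhs_var]).mean()
print("1-day FHS 95% VaR:", fhs_var)
print("1-day FHS 95% ES: ", fhs_es)
```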

2.5 ADVANTAGES AND DISADVANTAGES OF NON-PARAMETRIC METHODS

Advantages

In drawing our discussion to a close, it is perhaps a good idea to summarise the main advantages and disadvantages of non-parametric approaches. The advantages include:

• Non-parametric approaches are intuitive and conceptually simple.
• Since they do not depend on parametric assumptions about P/L, they can accommodate fat tails, skewness, and any other non-normal features that can cause problems for parametric approaches.
• They can in theory accommodate any type of position, including derivatives positions.
• There is a widespread perception among risk practitioners that HS works quite well empirically, although formal empirical evidence on this issue is inevitably mixed.
• They are (in varying degrees, fairly) easy to implement on a spreadsheet.
• Non-parametric methods are free of the operational problems to which parametric methods are subject when applied to high-dimensional problems: no need for covariance matrices, no curses of dimensionality, etc.
• They use data that are (often) readily available, either from public sources (e.g., Bloomberg) or from in-house data sets (e.g., collected as a by-product of marking positions to market).
• They provide results that are easy to report and communicate to senior managers and interested outsiders (e.g., bank supervisors or rating agencies).
• It is easy to produce confidence intervals for non-parametric VaR and ES.
• Non-parametric approaches are capable of considerable refinement and potential improvement if we combine them with parametric 'add-ons' to make them semi-parametric: such refinements include age-weighting (as in BRW), volatility-weighting (as in HW and FHS), and correlation-weighting.

Disadvantages

Perhaps their biggest potential weakness is that their results are very (and in most cases, completely) dependent on the historical data set.16 There are various other related problems:

• If our data period was unusually quiet, non-parametric methods will often produce VaR or ES estimates that are too low for the risks we are actually facing; and if our data period was unusually volatile, they will often produce VaR or ES estimates that are too high.
• Non-parametric approaches can have difficulty handling shifts that take place during our sample period. For example, if there is a permanent change in exchange rate risk, it will usually take time for the HS VaR or ES estimates to reflect the new exchange rate risk. Similarly, such approaches are sometimes slow to reflect major events, such as the increases in risk associated with sudden market turbulence.
• If our data set incorporates extreme losses that are unlikely to recur, these losses can dominate non-parametric risk estimates even though we don't expect them to recur.
• Most (if not all) non-parametric methods are subject (to a greater or lesser extent) to the phenomenon of ghost or shadow effects.
• In general, non-parametric estimates of VaR or ES make no allowance for plausible events that might occur, but did not actually occur, in our sample period.
• Non-parametric estimates of VaR and ES are to a greater or lesser extent constrained by the largest loss in our historical data set. In the simpler versions of HS, we cannot extrapolate from the largest historical loss to anything larger that might conceivably occur in the future. More sophisticated versions of HS can relax this constraint, but even so, the fact remains that non-parametric estimates of VaR or ES are still constrained by the largest loss in a way that parametric estimates are not. This means that such methods are not well suited to handling extremes, particularly with small- or medium-sized samples.

However, we can often ameliorate these problems by suitable refinements. For example, we can ameliorate volatility, market turbulence, correlation and other problems by semi-parametric adjustments, and we can ameliorate ghost effects by age-weighting our data and allowing our sample size to rise over time.

There can also be problems associated with the length of the sample window period. We need a reasonably long window to have a sample size large enough to get risk estimates of acceptable precision, and as a broad rule of thumb, most experts believe that we usually need at least a couple of years' worth of daily observations (i.e., 500 observations, at 250 trading days to the year), and often more. On the other hand, a very long window can also create its own problems. The longer the window:

• the greater the problems with aged data;
• the longer the period over which results will be distorted by unlikely-to-recur past events, and the longer we will have to wait for ghost effects to disappear;
• the more the news in current market observations is likely to be drowned out by older observations, and the less responsive will be our risk estimates to current market conditions; and
• the greater the potential for data-collection problems. This is a particular concern with new or emerging market instruments, where long runs of historical data don't exist and are not necessarily easy to proxy.

16 There can also be problems getting the data set. We need time series data on all current positions, and such data are not always available (e.g., if the positions are in emerging markets). We also have to ensure that data are reliable, compatible, and delivered to the risk estimation system on a timely basis.

CONCLUSIONS

Non-parametric methods are widely used and in many respects highly attractive approaches to the estimation of financial risk measures. They have a reasonable track record and are often superior to parametric approaches based on simplistic assumptions such as normality. They are also capable of considerable refinement to deal with some of the weaknesses of more basic non-parametric approaches. As a general rule, they work fairly well if market conditions remain reasonably stable. However, they have their limitations and it is often a good idea to supplement them with other approaches. Wherever possible, we should also complement non-parametric methods with stress testing to gauge our vulnerability to 'what if' events. We should never rely on non-parametric methods alone.

APPENDIX 1

Estimating Risk Measures with Order Statistics

The theory of order statistics is very useful for risk measurement because it gives us a practical and accurate means of estimating the distribution function for a risk measure, and this is useful because it enables us to estimate confidence intervals for it.

Using Order Statistics to Estimate Confidence Intervals for VaR

If we have a sample of n P/L observations, we can regard each observation as giving an estimate of VaR at an implied confidence level. For example, if n = 1000, we might take the 95% VaR as the negative of the 51st smallest P/L observation, we might take the 99% VaR as the negative of the 11th smallest, and so on. We therefore take the alpha VaR to be equal to the negative of the rth lowest observation, where r is equal to 1000(1 − alpha) + 1. More generally, with n observations, we take the VaR as equal to the negative of the rth lowest observation, where r = n(1 − alpha) + 1.

The rth order statistic is the rth lowest (or, alternatively, highest) observation in a sample of n observations, and the theory of order statistics is well established in the statistical literature. Suppose our observations x1, x2, . . . , xn come from some known distribution (or cumulative density) function F(x), with rth order statistic x(r), and suppose that x(1) ≤ x(2) ≤ . . . ≤ x(n). The probability that j of our n observations do not exceed a fixed value x must obey the following binomial distribution:

Pr{ j observations ≤ x } = [n!/(j!(n − j)!)] {F(x)}^j {1 − F(x)}^(n−j)    (2.7)

It follows that the probability that at least r observations in the sample do not exceed x is also a binomial:

G_r(x) = Σ_{j=r}^{n} [n!/(j!(n − j)!)] {F(x)}^j {1 − F(x)}^(n−j)    (2.8)

G_r(x) is therefore the distribution function of our order statistic and, hence, of our quantile or VaR.17 This VaR distribution function provides us with estimates of our VaR and of its associated confidence intervals. The median (i.e., 50 percentile) of the estimated VaR distribution function gives us a natural 'best' estimate of our VaR, and estimates of the lower and upper percentiles of the VaR distribution function give us estimates of the bounds of our VaR confidence interval. This is useful, because the calculations are accurate and easy to carry out on a spreadsheet. Equation (2.8) is also very general and gives us confidence intervals for any distribution function F(x), parametric (normal, t, etc.) or empirical.

17 See, e.g., Kendall and Stuart (1973), p. 348, or Reiss (1989), p. 20.
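The following sketch shows one way to evaluate Equation (2.8) in practice. It relies on the standard identity that, for a continuous F, the binomial sum in Equation (2.8) equals a regularized incomplete beta function, so the percentiles of the rth order statistic can be read off scipy's beta distribution. Treating the observations as losses and taking r = n·alpha are illustrative choices made here; small differences from Table 2.2 can arise from the exact choice of r.

```python
from scipy.stats import beta, norm

def os_var_interval(n, cl=0.95, q_lo=0.05, q_hi=0.95):
    """Order-statistics estimates of standard normal VaR and a 90% CI.

    For continuous F, G_r(x) in Equation (2.8) equals the regularized
    incomplete beta function evaluated at F(x), so quantiles of the r-th
    order statistic follow from the beta percent-point function.
    """
    r = int(round(n * cl))            # order statistic used as the VaR estimate
    a, b = r, n - r + 1               # Beta(r, n - r + 1) parameters
    to_var = lambda q: norm.ppf(beta.ppf(q, a, b))
    return to_var(q_lo), to_var(0.5), to_var(q_hi)

lo, med, hi = os_var_interval(500, cl=0.95)
print(round(lo, 3), round(med, 3), round(hi, 3))   # roughly 1.49, 1.63, 1.80
```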


To use this approach, all we need to do is specify F(x) (as normal, t, etc.), set our parameter values, and use Equation (2.8) to estimate our VaR distribution function. To illustrate, suppose we want to apply the order-statistics (OS) approach to estimate the distribution function of a standard normal VaR. We then assume that F(x) is standard normal and use Equation (2.8) to estimate three key parameters of the VaR distribution: the median or 50 percentile of the estimated VaR distribution, which can be interpreted as an OS estimate of normal VaR; and the 5 and 95 percentiles of the estimated VaR distribution, which can be interpreted as the OS estimates of the bounds of the 90% confidence interval for standard normal VaR.

Some illustrative estimates for the 95% VaR are given in Table 2.2. To facilitate comparison, the table also shows the estimates of standard normal VaR based on the conventional normal VaR formula as explained in Chapter 1. The main results are:

• The confidence interval (the gap between the 5 and 95 percentiles) is quite wide for low values of n, but narrows as n gets larger.
• As n rises, the median of the estimated VaR distribution converges to the conventional estimate.
• The confidence interval is (in this case, a little) wider for more extreme VaR confidence levels than it is for the more central ones.

The same approach can also be used to estimate the percentiles of other VaR distribution functions. If we wish to estimate the percentiles of a non-normal parametric VaR, we replace the normal distribution function F(x) by the non-normal equivalent: the t-distribution function, the Gumbel distribution function, and so on. We can also use the same approach to estimate the confidence intervals for an empirical distribution function (i.e., for historical simulation VaR), where F(x) is some empirical distribution function.

Conclusions

The OS approach provides an ideal method for estimating the confidence intervals for our VaRs and ESs. In particular, the OS approach is:

• Completely general, in that it can be applied to any parametric or non-parametric VaR or ES.
• Reasonable even for relatively small samples, because it is not based on asymptotic theory, although it is also the case that estimates based on small samples will be less accurate, precisely because the samples are small.
• Easy to implement in practice.

The OS approach is also superior to confidence-interval estimation methods based on estimates of quantile standard errors (see Chapter 1), because it does not rely on asymptotic theory and/or force estimated confidence intervals to be symmetric (which can be a problem for extreme VaRs and ESs).

Table 2.2  Order Statistics Estimates of Standard Normal 95% VaRs and Associated Confidence Intervals

(a) As n varies

No. of observations                    100      500     1000     5000    10,000
Lower bound of confidence interval   1.267    1.482    1.531    1.595    1.610
Median of VaR distribution           1.585    1.632    1.639    1.644    1.644
Standard estimate of VaR             1.645    1.645    1.645    1.645    1.645
Upper bound of confidence interval   1.936    1.791    1.750    1.693    1.679
Width of interval/median             42.2%    18.9%    13.4%     6.0%     4.2%

(b) As VaR confidence level varies (with n = 500)

VaR confidence level                  0.90     0.95     0.99
Lower bound of confidence interval   1.151    1.482    2.035
Median of VaR distribution           1.274    1.632    2.279
Standard estimate of VaR             1.282    1.645    2.326
Upper bound of confidence interval   1.402    1.791    2.560
Width of interval/median             19.7%    18.9%    23.0%

Notes: The confidence interval is specified at a 90% level of confidence, and the lower and upper bounds of the confidence interval are estimated as the 5 and 95 percentiles of the estimated VaR distribution (Equation (2.8)).


APPENDIX 2

The Bootstrap

The bootstrap is a simple and useful method for assessing uncertainty in estimation procedures. Its distinctive feature is that it replaces mathematical or statistical analysis with simulation-based resampling from a given data set. It therefore provides a means of assessing the accuracy of parameter estimators without having to resort to strong parametric assumptions or closed-form confidence-interval formulas. The roots of the bootstrap go back a couple of centuries, but the idea only took off in the last three decades after it was developed and popularised by the work of Bradley Efron. It was Efron, too, who first gave it its name, which refers to the phrase 'to pull oneself up by one's bootstraps'. The bootstrap is a form of statistical 'trick', and is therefore very aptly named.

The main purpose of the bootstrap is to assess the accuracy of parameter estimates. Suppose we have a sample drawn from a population whose parameters of the distribution are unknown, and, more likely than not, so too is the distribution itself. We are interested in a particular parameter theta, where theta might be a mean, variance (or standard deviation), quantile, or some other parameter. The obvious approach is to estimate theta using a suitable sample estimator: so if theta is the mean, our estimator would be the sample mean; if theta is the variance, our estimator would be based on some sample variance; and so on. Obtaining an estimator for theta is therefore straightforward, but how do we obtain a confidence interval for it?

To estimate confidence intervals for theta using traditional closed-form approaches requires us to resort to statistical theory, and the theory available is of limited use. For example, suppose we wish to obtain a confidence interval for a variance. If we assume that the underlying distribution is normal, then we know that (n − 1) times the sample variance divided by the true variance is distributed as chi-squared with n − 1 degrees of freedom, which gives us a confidence interval for the variance; but results of this kind are available only under restrictive assumptions.

The bootstrap avoids this reliance on closed-form theory. Suppose we have an original sample of size n.22 We create a new sample, or 'resample', of the same size by drawing observations at random, with replacement, from the original sample.21 Once we have our resample, we use it to estimate the parameter we are interested in. This gives us a resample estimate of the parameter. We then repeat the 'resampling' process again and again, and obtain a set of B resample parameter estimates. This set of B resample estimates can also be regarded as a bootstrapped sample of parameter estimates.

We can then use the bootstrapped sample to estimate a confidence interval for our parameter theta. For example, if each resample i gives us a resample estimator theta^B(i), we might construct a simulated density function from the distribution of our theta^B(i) values and infer the confidence intervals from its percentile points. If our confidence interval spans the central 1 − 2a of the probability mass, then it is given by:

Confidence interval = [theta^B_a, theta^B_(1−a)]    (2.11)

where theta^B_a is the a quantile of the distribution of bootstrapped theta^B(i) values. This 'percentile interval' approach is very easy to apply and does not rely on any parametric theory, asymptotic or otherwise.

Nonetheless, this basic percentile interval approach is limited itself, particularly if parameter estimators are biased. It is therefore often better to use more refined percentile approaches, and perhaps the best of these is the bias-corrected and accelerated (BCa) approach, whose end-points are given by:

a1 = Phi( z0 + (z0 + z_a)/(1 − a(z0 + z_a)) ),  a2 = Phi( z0 + (z0 + z_(1−a))/(1 − a(z0 + z_(1−a))) )    (2.12)

so that the BCa confidence interval is given by the a1 and a2 percentiles of the bootstrapped theta^B(i) values. If the parameters a and z0 are zero, this BCa confidence interval will coincide with the earlier percentile interval. However, in general, they will not be 0, and we can think of the BCa method as correcting the end-points of the confidence interval. The parameter a refers to the rate of change of the standard error of the estimator with respect to the true parameter theta, and it can be regarded as a correction for skewness; it can be estimated, via Equation (2.13), from an initial bootstrap or jackknife exercise. The parameter z0 can be estimated as the standard normal inverse of the proportion of bootstrap replications that is less than the original estimate. The BCa method is therefore (relatively) straightforward to implement, and it has the theoretical advantages over the percentile interval approach of being both more accurate and of being transformation-respecting, the latter property meaning that if we take a transformation of theta (e.g., if theta is a variance, we might wish to take its square root to obtain the standard deviation), then the BCa method will automatically correct the end-points of the confidence interval of the transformed parameter.23

We can also use a bootstrapped sample of parameter estimates to provide an alternative point estimator of a parameter that is often superior to the raw sample estimator. Given that there are B resample estimators, we can take our bootstrapped point estimator as the sample mean of our B theta^B(i) values:24

theta-bar^B = (1/B) Σ_{i=1}^{B} theta^B(i)    (2.14)

21 This application of the bootstrap can be described as a non-parametric one because we bootstrap from a given data sample. The bootstrap can also be implemented parametrically, where we bootstrap from the assumed distribution. When used in parametric mode, the bootstrap provides more accurate answers than textbook formulas usually do, and it can provide answers to problems for which no textbook formulas exist. The bootstrap can also be implemented semi-parametrically, and a good example of this is the FHS approach.

22 In practice, it might be possible to choose the value of n, but we will assume for the sake of argument that n is given.

23 For more on BCa and other refinements to the percentile interval approach, see Efron and Tibshirani (1993, Chapters 14 and 22) or Davison and Hinkley (1997, Chapter 5).

24 This basic bootstrap estimation method can also be supplemented by variance-reduction methods (e.g., importance sampling) to improve accuracy at a given computational cost. See Efron and Tibshirani (1993, Chapter 23) or Davison and Hinkley (1997, Chapter 9).
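As a concrete illustration of the percentile interval (Equation (2.11)) and the bootstrapped point estimator (Equation (2.14)), the following minimal sketch bootstraps a historical-simulation VaR. The P/L series, the number of resamples B and the confidence levels are hypothetical choices.

```python
import numpy as np

def bootstrap_var(pnl, alpha=0.95, B=5000, conf=0.90, seed=1):
    """Percentile-interval bootstrap for a historical-simulation VaR."""
    rng = np.random.default_rng(seed)
    pnl = np.asarray(pnl, dtype=float)
    n = len(pnl)

    # HS VaR estimator: negative of the (1 - alpha) quantile of P/L.
    hs_var = lambda x: -np.quantile(x, 1 - alpha)

    # B resample estimates of the VaR.
    boot = np.array([hs_var(rng.choice(pnl, size=n, replace=True))
                     for _ in range(B)])

    a = (1 - conf) / 2
    interval = np.quantile(boot, [a, 1 - a])   # percentile interval, Equation (2.11)
    point = boot.mean()                        # bootstrapped point estimator, Equation (2.14)
    return hs_var(pnl), point, interval

pnl = 0.01 * np.random.default_rng(0).standard_normal(750)
print(bootstrap_var(pnl))
```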


Relatedly, we can also use a bootstrap to estimate the bias in an estimator. The bias is the difference between the expectation of an estimator and the quantity estimated, and it can be estimated by plugging Equation (2.14) and a basic sample estimator into the bias equation:

Bias = E[estimator] − theta  =>  Estimated bias = theta-bar^B − sample estimator    (2.15)

We can use an estimate of bias for various purposes (e.g., to correct a biased estimator, to correct prediction errors, etc.). However, the bias can have a (relatively) large standard error. In such cases, correcting for the bias is not always a good idea, because the bias-corrected estimate can have a larger standard error than the unadjusted, biased, estimate.

The programs to compute bootstrap statistics are easy to write, and the most obvious price of the bootstrap, increased computation, is no longer a serious problem.25

Standard Errors of Bootstrap Estimators

Naturally, bootstrap estimates are themselves subject to error. Typically, bootstrap estimates have little bias, but they often have substantial variance. The latter comes from basic sampling variability (i.e., the fact that we have a sample of size n drawn from our population, rather than the population itself) and from resampling variability (i.e., the fact that we take only B bootstrap resamples rather than an infinite number of them). The estimated standard error for the parameter estimator, s_B, can be obtained from:

s_B = [ (1/B) Σ_{i=1}^{B} (theta^B(i) − theta-bar^B)² ]^(1/2)    (2.16)

where theta-bar^B = (1/B) Σ_{i=1}^{B} theta^B(i). s_B is of course also easy to estimate. However, s_B is itself variable, and the variance of s_B is:

var(s_B) = var[E(s_B)] + E[var(s_B)]    (2.17)

Following Efron and Tibshirani (1993, Chapter 19), this can be rearranged to express var(s_B) in terms of the moments m_i of the bootstrap distribution of the theta^B(i) values (Equation (2.18)). In the case where theta is the mean, Equation (2.18) reduces to a simpler expression (Equation (2.19)), and if the distribution is also normal it reduces further still (Equation (2.20)). We can then set B to reduce var(s_B) to a desired level, and so achieve a target level of accuracy in our estimate of s_B. However, these results are limited, because Equation (2.19) only applies to the mean and Equation (2.20) presupposes normality as well.

We therefore face two related questions: (a) how can we estimate var(s_B) in general? and (b) how can we choose B to achieve a given level of accuracy in our estimate of s_B? One approach to these problems is to apply brute force: we can estimate var(s_B) using a jackknife-after-bootstrap (in which we first bootstrap the data and then estimate var(s_B) by jackknifing from the bootstrapped data), or by using a double bootstrap (in which we estimate a sample of bootstrapped s_B values and then estimate their variance). We can then experiment with different values of B to determine the values of these parameters needed to bring var(s_B) down to an acceptable level.

If we are more concerned about the second problem (i.e., how to choose B), a more elegant approach is the following, suggested by Andrews and Buchinsky (1997). Suppose we take as our 'ideal' the value of s_B associated with an infinite number of resamples, i.e., s_inf. Let tau be a target probability that is close to 1, and let bound be a chosen bound on the percentage deviation of s_B from s_inf. We want to choose B = B(bound, tau) such that the probability that s_B is within the desired bound is tau:

Pr( 100·|s_B − s_inf|/s_inf ≤ bound ) = tau    (2.21)

If B is large, then the required number of resamples is approximately

B ≈ 2500(kappa − 1)·x_tau²/bound²    (2.22)

where x_tau is a critical value that depends on the target probability tau. However, this formula is not operational because kappa, the kurtosis of the distribution of the estimator, is unknown. To get around this problem, we replace kappa with a consistent estimator of kappa, and this leads Andrews and Buchinsky to suggest the following three-step method to determine B:

• We initially assume that kappa = 3, and plug this into Equation (2.22) to obtain a preliminary value of B, denoted by B0, where B0 = int(5000·x_tau²/bound²) (2.23), and where int(a) refers to the smallest integer greater than or equal to a.
• We simulate B0 resamples, and estimate the sample kurtosis of the bootstrapped estimates, kappa-hat.
• We take the desired number of bootstrap resamples as equal to max(B0, B1), where B1 = 2500(kappa-hat − 1)·x_tau²/bound² (2.24).

25 An example of the bootstrap approach applied to VaR is given earlier in this chapter discussing the bootstrap point estimator and bootstrapped confidence intervals for VaR.



This method does not directly tell us what the variance of s_B might be, but we already know how to estimate this in any case. Instead, this method gives us something more useful: it tells us how to set B to achieve a target level of precision in our bootstrap estimators, and (unlike Equations (2.19) and (2.20)) it applies for any parameter theta and applies however the estimator is distributed.26

Time Dependency and the Bootstrap

Perhaps the main limitation of the bootstrap is that standard bootstrap procedures presuppose that observations are independent over time, and they can be unreliable if this assumption does not hold. Fortunately, there are various ways in which we can modify bootstraps to allow for such dependence:

• If we are prepared to make parametric assumptions, we can model the dependence parametrically (e.g., using a GARCH procedure). We can then bootstrap from the residuals, which should be independent. However, this solution requires us to identify the underlying stochastic model and estimate its parameters, and this exposes us to model and parameter risk.
• An alternative is to use a block approach: we divide sample data into non-overlapping blocks of equal length, and select a block at random (a minimal sketch follows this list). However, this approach can 'whiten' the data (as the joint observations spanning different blocks are taken to be independent), which can undermine our results. On the other hand, there are also various methods of dealing with this problem (e.g., making block lengths stochastic, etc.), but these refinements also make the block approach more difficult to implement.
• A third solution is to modify the probabilities with which individual observations are chosen. Instead of assuming that each observation is chosen with the same probability, we can make the probabilities of selection dependent on the time indices of recently selected observations: so, for example, if the sample data are in chronological order and observation i has just been chosen, then observation i + 1 is more likely to be chosen next than most other observations.

26 This three-step method can also be improved and extended. For example, it can be improved by correcting for bias in the kurtosis estimator, and a similar (although more involved) three-step method can be used to achieve given levels of accuracy in estimates of confidence intervals as well. For more on these refinements, see Andrews and Buchinsky (1997).
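A minimal sketch of the basic non-overlapping block resampling described in the second bullet above is given here; the block length and the data are hypothetical choices, and the refinements mentioned in the text (stochastic block lengths, etc.) are not implemented.

```python
import numpy as np

def block_bootstrap(series, block_len=10, seed=7):
    """One block-bootstrap resample: draw non-overlapping blocks at random
    (with replacement) and concatenate them, preserving the dependence
    structure within each block."""
    rng = np.random.default_rng(seed)
    x = np.asarray(series, dtype=float)
    n_blocks = len(x) // block_len
    blocks = x[: n_blocks * block_len].reshape(n_blocks, block_len)
    picks = rng.integers(0, n_blocks, size=n_blocks)
    return blocks[picks].ravel()

data = np.random.default_rng(0).standard_normal(500)
print(block_bootstrap(data).shape)   # (500,)
```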


Parametric Approaches (II): Extreme Value

Learning Objectives

After completing this reading you should be able to:

• Explain the importance and challenges of extreme values in risk management.
• Describe extreme value theory (EVT) and its use in risk management.
• Describe the peaks-over-threshold (POT) approach.
• Compare and contrast the generalized extreme value and POT approaches to estimating extreme risks.
• Evaluate the tradeoffs involved in setting the threshold level when applying the generalized Pareto (GP) distribution.
• Explain the multivariate EVT for risk management.

Excerpt is Chapter 7 of Measuring Market Risk, Second Edition, by Kevin Dowd.


There are many problems in risk management that deal with extreme events: events that are unlikely to occur, but can be very costly when they do. These events are often referred to as low-probability, high-impact events, and they include large market falls, the failures of major institutions, the outbreak of financial crises and natural catastrophes. Given the importance of such events, the estimation of extreme risk measures is a key concern for risk managers.

However, to estimate such risks we have to confront a difficult problem: extreme events are rare by definition, so we have relatively few extreme observations on which to base our estimates. Estimates of extreme risks must therefore be very uncertain, and this uncertainty is especially pronounced if we are interested in extreme risks not only within the range of observed data, but well beyond it, as might be the case if we were interested in the risks associated with events more extreme than any in our historical data set (e.g., an unprecedented stock market fall).

Practitioners can only respond by relying on assumptions to make up for lack of data. Unfortunately, the assumptions they make are often questionable. Typically, a distribution is selected arbitrarily, and then fitted to the whole data set. However, this means that the fitted distribution will tend to accommodate the more central observations, because there are so many of them, rather than the extreme observations, which are much sparser. Hence, this type of approach is often good if we are interested in the central part of the distribution, but is ill-suited to handling extremes.

When dealing with extremes, we need an approach that comes to terms with the basic problem posed by extreme-value estimation: that the estimation of the risks associated with low-frequency events with limited data is inevitably problematic, and that these difficulties increase as the events concerned become rarer. Such problems are not unique to risk management, but occur in other disciplines as well. The standard example is hydrology, where engineers have long struggled with the question of how high dikes, sea walls and similar barriers should be to contain the probabilities of floods within reasonable limits. They have had to do so with even less data than financial risk practitioners usually have, and their quantile estimates (the flood water levels they were contending with) were also typically well out of the range of their sample data. So they have had to grapple with comparable problems to those faced by insurers and risk managers, but have had to do so with even less data and potentially much more at stake. The result of their efforts is extreme-value theory (EVT), a branch of applied statistics that is tailor-made to these problems.1

EVT focuses on the distinctiveness of extreme values and makes as much use as possible of what theory has to offer. Not surprisingly, EVT is quite different from the more familiar 'central tendency' statistics that most of us have grown up with. The underlying reason for this is that central tendency statistics are governed by central limit theorems, but central limit theorems do not apply to extremes. Instead, extremes are governed, appropriately enough, by extreme-value theorems. EVT uses these theorems to tell us what distributions we should (and should not!) fit to our extremes data, and also guides us on how we should estimate the parameters involved. These EV distributions are quite different from the more familiar distributions of central tendency statistics. Their parameters are also different, and the estimation of these parameters is more difficult.

This chapter provides an overview of EV theory, and of how it can be used to estimate measures of financial risk. We will focus mainly on the VaR (and to a lesser extent, the ES) to keep the discussion brief, but the approaches considered here extend naturally to the estimation of other coherent risk measures as well. The chapter itself is divided into four sections. The first two discuss the two main branches of univariate EV theory, the next discusses some extensions, including multivariate EVT, and the last concludes.

3.1 GENERALISED EXTREME-VALUE THEORY

Theory

Suppose we have a random loss variable X, and we assume to begin with that X is independent and identically distributed (iid) from some unknown distribution F(x) = Prob(X < x). We wish to estimate the extreme risks (e.g., extreme VaR) associated with the distribution of X. Clearly, this poses a problem because we don't know what F(x) actually is. This is where EVT comes to our rescue. Consider a sample of size n drawn from F(x), and let the maximum of this sample be Mn.2 If n is large, we can regard Mn as an extreme value.

have, and th eir quantile estim ates— the flood w ate r levels they w ere contending w ith— w ere also typ ically w ell out of the range of th e ir sam ple data. So they have had to grapp le with co m p arab le problem s to those faced by insurers and risk m anagers, but have had to do so with even less data and p o ten tially much more at stake. The result of their efforts is extrem e-value theory (EVT)— a branch of applied statistics that is tailor-made




1 The literature on EVT is vast. However, some standard book references on EVT and its finance applications are Embrechts et al. (1997), Reiss and Thomas (1997) and Beirlant et al. (2004). There is also a plethora of good articles on the subject, e.g., Bassi et al. (1998), Longin (1996, 1999), Danielsson and de Vries (1997a,b), McNeil (1998), McNeil and Saladin (1997), Cotter (2001, 2004), and many others.

2 The same theory also works for extremes that are the minima rather than the maxima of a (large) sample: to apply the theory to minima extremes, we simply apply the maxima extremes results but multiply our data by −1.


Under relatively general conditions, the celebrated Fisher-Tippett theorem (1928) then tells us that as n gets large, the distribution of extremes (i.e., Mn) converges to the following generalised extreme-value (GEV) distribution:

H(x) = exp[ −(1 + ξ(x − μ)/σ)^(−1/ξ) ],    if ξ ≠ 0
H(x) = exp[ −exp(−(x − μ)/σ) ],            if ξ = 0    (3.1)

where x satisfies the condition 1 + ξ(x − μ)/σ > 0.3 This distribution has three parameters. The first two are μ, the location parameter of the limiting distribution, which is a measure of the central tendency of Mn, and σ, the scale parameter of the limiting distribution, which is a measure of the dispersion of Mn. These are related to, but distinct from, the more familiar mean and standard deviation, and we will return to these presently. The third parameter, ξ, the tail index, gives an indication of the shape (or heaviness) of the tail of the limiting distribution.

The GEV Equation (3.1) has three special cases:

• If ξ > 0, the GEV becomes the Frechet distribution. This case applies where the tail of F(x) obeys a power function and is therefore heavy (e.g., as would be the case if F(x) were a Levy distribution, a t-distribution, a Pareto distribution, etc.). This case is particularly useful for financial returns because they are typically heavy-tailed, and we often find that estimates of ξ for financial return data are positive but less than 0.35.
• If ξ = 0, the GEV becomes the Gumbel distribution, corresponding to the case where F(x) has exponential tails. These are relatively light tails such as those we would get with normal or lognormal distributions.
• If ξ < 0, the GEV becomes the Weibull distribution, corresponding to the case where F(x) has lighter than normal tails. However, the Weibull distribution is not particularly useful for modelling financial returns, because few empirical financial returns series are so light-tailed.4

The standardised (i.e., μ = 0, σ = 1) Frechet and Gumbel probability density functions are illustrated in Figure 3.1. Both are skewed to the right, but the Frechet is more skewed than the Gumbel and has a noticeably longer right-hand tail. This means that the Frechet has considerably higher probabilities of producing very large X-values. Observe that most of the probability mass is located between x values of −2 and +6. More generally, this means most of the probability mass will lie between x values of μ − 2σ and μ + 6σ.

Figure 3.1  Standardised Gumbel and Frechet probability density functions.

To obtain the quantiles associated with the GEV distribution, we set the left-hand side of Equation (3.1) to p, take logs of both sides of Equation (3.1) and rearrange to get:

ln(p) = −(1 + ξ(x − μ)/σ)^(−1/ξ),   if ξ ≠ 0
ln(p) = −exp(−(x − μ)/σ),           if ξ = 0    (3.2)

We then unravel the x-values to get the quantiles associated with any chosen (cumulative) probability p:5

x = μ − (σ/ξ)[1 − (−ln(p))^(−ξ)]   (Frechet, ξ > 0)    (3.3a)
x = μ − σ ln[−ln(p)]               (Gumbel, ξ = 0)     (3.3b)

Example 3.1  Gumbel quantiles

For the standardised Gumbel, the 5% quantile is −ln[−ln(0.05)] = −1.0972 and the 95% quantile is −ln[−ln(0.95)] = 2.9702.

Example 3.2  Frechet quantiles

For the standardised Frechet with ξ = 0.2, the 5% quantile is −(1/0.2)[1 − (−ln(0.05))^(−0.2)] = −0.9851 and the 95% quantile is −(1/0.2)[1 − (−ln(0.95))^(−0.2)] = 4.0564. For ξ = 0.3, the 5% quantile is −(1/0.3)[1 − (−ln(0.05))^(−0.3)] = −0.9349 and the 95% quantile is −(1/0.3)[1 − (−ln(0.95))^(−0.3)] = 4.7924. Thus, Frechet quantiles are sensitive to the value of the tail index ξ, and tend to rise with ξ. Conversely, as ξ → 0, the Frechet quantiles tend to their Gumbel equivalents.

3 See, e.g., Embrechts et al. (1997), p. 316.

4 We can also explain these three cases in terms of domains of attraction. Extremes drawn from Levy or t-distributions fall in the domain of attraction of the Frechet distribution, and so obey a Frechet distribution as n gets large; extremes drawn from normal and lognormal distributions fall in the domain of attraction of the Gumbel, and obey the Gumbel as n gets large, and so on.

5 We can obtain estimates of EV VaR over longer time periods by using appropriately scaled parameters, bearing in mind that the mean scales proportionately with the holding period h, the standard deviation scales with the square root of h, and (subject to certain conditions) the tail index does not scale at all. In general, we find that the VaR scales with a parameter k (i.e., so VaR(h) = VaR(1)·h^k, where h is the holding period), and empirical evidence reported by Hauksson et al. (2001, p. 93) suggests an average value for k of about 0.45. The square-root scaling rule (i.e., k = 0.5) is therefore usually inappropriate for EV distributions.
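The quantile formulas in Equations (3.3a) and (3.3b) are easy to check numerically. The short sketch below implements them and reproduces the figures in Examples 3.1 and 3.2 (to rounding); the function name and defaults are illustrative assumptions.

```python
import math

def gev_quantile(p, mu=0.0, sigma=1.0, xi=0.0):
    """Quantile of the GEV distribution, Equations (3.3a) and (3.3b)."""
    if xi == 0.0:                                                   # Gumbel, (3.3b)
        return mu - sigma * math.log(-math.log(p))
    return mu - (sigma / xi) * (1.0 - (-math.log(p)) ** (-xi))     # Frechet/Weibull, (3.3a)

# Example 3.1: standardised Gumbel 5% and 95% quantiles (-1.0972 and 2.9702)
print(round(gev_quantile(0.05), 4), round(gev_quantile(0.95), 4))
# Example 3.2: standardised Frechet quantiles for xi = 0.2 and xi = 0.3
print(round(gev_quantile(0.05, xi=0.2), 4), round(gev_quantile(0.95, xi=0.2), 4))
print(round(gev_quantile(0.05, xi=0.3), 4), round(gev_quantile(0.95, xi=0.3), 4))
```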


We need to remember that the probabilities in Equations (3.1)-(3.3) refer to the probabilities associated with the extreme loss distribution, not to those associated with the 'parent' loss distribution from which the extreme losses are drawn. For example, a 5th percentile in Equation (3.3) is the cut-off point between the lowest 5% of extreme (high) losses and the highest 95% of extreme (high) losses; it is not the 5th percentile point of the parent distribution. The 5th percentile of the extreme loss distribution is therefore on the left-hand side of the distribution of extreme losses (because it is a small extreme loss), but on the right-hand tail of the original loss distribution (because it is an extreme loss).

To see the connection between the probabilities associated with the distribution of Mn and those associated with the distribution of X, we now let Mn* be some extreme threshold value. It then follows that:

Pr[Mn < Mn*] = p = {Pr[X < Mn*]}^n = [a]^n    (3.4)

where a is the VaR confidence level associated with the threshold Mn*. To obtain the a VaR, we now use Equation (3.4) to substitute [a]^n for p in Equation (3.3), and this gives us:

VaR = μn − (σn/ξn)[1 − (−n ln(a))^(−ξn)]   (Frechet, ξ > 0)    (3.5a)
VaR = μn − σn ln[−n ln(a)]                 (Gumbel, ξ = 0)     (3.5b)

(Since n is now explicit, we have also subscripted the parameters with n to make explicit that in practice these would refer to the parameters associated with maxima drawn from samples of size n. This helps to avoid errors with the limiting VaRs as n gets large.) Given values for the extreme-loss distribution parameters μn, σn and (where needed) ξn, Equation (3.5) allows us to estimate the relevant VaRs. Of course, the VaR formulas given by Equation (3.5) are meant only for extremely high confidence levels, and we cannot expect them to provide accurate estimates for VaRs at low confidence levels.

Example 3.3  Gumbel VaR

For the standardised Gumbel and n = 100, the 99.5% VaR is −ln[−100 × ln(0.995)] = 0.6906, and the 99.9% VaR is −ln[−100 × ln(0.999)] = 2.3021.

Example 3.4  Frechet VaR

For the standardised Frechet with ξ = 0.2 and n = 100, the 99.5% VaR is −(1/0.2)[1 − (−100 × ln(0.995))^(−0.2)] = 0.7406 and the 99.9% VaR is −(1/0.2)[1 − (−100 × ln(0.999))^(−0.2)] = 2.9237. For ξ = 0.3, the 99.5% VaR is −(1/0.3)[1 − (−100 × ln(0.995))^(−0.3)] = 0.7674 and the 99.9% VaR is −(1/0.3)[1 − (−100 × ln(0.999))^(−0.3)] = 3.3165.

These results tell us that EV-VaRs (and, by implication, other EV risk measures) are sensitive to the value of the tail index ξn, which highlights the importance of getting a good estimate of ξn when applying EVT. This applies even if we use a Gumbel, because we should use the Gumbel only if we think ξn is insignificantly different from zero.
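Equation (3.5) can be implemented in the same way as the quantile formulas. The sketch below reproduces Examples 3.3 and 3.4 (to rounding); the function name and defaults are illustrative rather than part of the original text.

```python
import math

def gev_var(alpha, n, mu=0.0, sigma=1.0, xi=0.0):
    """EV VaR from Equation (3.5) for maxima drawn from samples of size n."""
    if xi == 0.0:                                                       # Gumbel, (3.5b)
        return mu - sigma * math.log(-n * math.log(alpha))
    return mu - (sigma / xi) * (1.0 - (-n * math.log(alpha)) ** (-xi))  # Frechet, (3.5a)

# Example 3.3: standardised Gumbel, n = 100 (0.6906 and 2.3021)
print(round(gev_var(0.995, 100), 4), round(gev_var(0.999, 100), 4))
# Example 3.4: standardised Frechet, n = 100, xi = 0.2 and 0.3
print(round(gev_var(0.995, 100, xi=0.2), 4), round(gev_var(0.999, 100, xi=0.2), 4))
print(round(gev_var(0.995, 100, xi=0.3), 4), round(gev_var(0.999, 100, xi=0.3), 4))
```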


Example 3.5  Realistic Frechet VaR

Suppose we wish to estimate Frechet VaRs with more realistic parameters. For US stock markets, some fairly plausible parameters are μ = 2%, σ = 0.7% and ξ = 0.3. If we put these into our Frechet VaR formula, Equation (3.5a), and retain the earlier n value, the estimated 99.5% VaR (in %) is 2 − (0.7/0.3)[1 − (−100 × ln(0.995))^(−0.3)] = 2.537, and the estimated 99.9% VaR (in %) is 2 − (0.7/0.3)[1 − (−100 × ln(0.999))^(−0.3)] = 4.322. For the next trading day, these estimates tell us that we can be 99.5% confident of not making a loss in excess of 2.537% of the value of our portfolio, and so on.

It is also interesting to note that had we assumed a Gumbel (i.e., ξ = 0) we would have estimated these VaRs (again in %) to be 2 − 0.7 × ln[−100 × ln(0.995)] = 2.483 and 2 − 0.7 × ln[−100 × ln(0.999)] = 3.612. These are lower than the Frechet VaRs, which underlines the importance of getting the ξ right.

How do we choose between the Gumbel and the Frechet? There are various ways we can decide which EV distribution to use:

• If we are confident that we can identify the parent loss distribution, we can choose the EV distribution in whose domain of attraction the parent distribution resides. For example, if we are confident that the original distribution is a t, then we would choose the Frechet distribution because the t belongs in the domain of attraction of the Frechet. In plain English, we choose the EV distribution to which the extremes from the parent distribution will tend.
• We could test the significance of the tail index, and we might choose the Gumbel if the tail index was insignificant and the Frechet otherwise. However, this leaves us open to the danger that we might incorrectly conclude that ξ is 0, and this could lead us to underestimate our extreme risk measures.
• Given the dangers of model risk and bearing in mind that the estimated risk measure increases with the tail index, a safer option is always to choose the Frechet.

A Short-Cut EV Method

There are also short-cut ways to estimate VaR (or ES) using EV theory. These are based on the idea that if ξ > 0, the tail of an extreme loss distribution follows a power law times a slowly varying function:

F(x) = k(x)·x^(−1/ξ)    (3.6)

where k(x) varies slowly with x. For example, if we assume for convenience that k(x) is approximately constant, then Equation (3.6) becomes:

F(x) ≈ k·x^(−1/ξ)    (3.7)

Now consider two probabilities, a first, 'in-sample' probability p_in-sample, and a second, smaller and typically out-of-sample probability p_out-of-sample. Equation (3.7) implies:

p_in-sample ≈ k·x_in-sample^(−1/ξ)  and  p_out-of-sample ≈ k·x_out-of-sample^(−1/ξ)    (3.8)

which in turn implies:

p_in-sample/p_out-of-sample ≈ (x_in-sample/x_out-of-sample)^(−1/ξ)    (3.9)

This allows us to estimate one quantile (denoted here as x_out-of-sample) based on a known in-sample quantile x_in-sample, a known out-of-sample probability p_out-of-sample (which is known because it comes directly from our VaR confidence level), and an unknown in-sample probability p_in-sample. The latter can easily be proxied by its empirical counterpart, t/n, where n is the sample size and t the number of observations higher than x_in-sample. Using this proxy then gives us:

x_out-of-sample ≈ x_in-sample·(n·p_out-of-sample/t)^(−ξ)    (3.10)

which is easy to estimate using readily available information. To use this approach, we take an arbitrarily chosen in-sample quantile, x_in-sample, and determine its counterpart empirical probability, t/n. We then determine our out-of-sample probability from our chosen confidence level, estimate our tail index using a suitable method, and our out-of-sample quantile estimator immediately follows from Equation (3.10).6

6 An alternative short-cut is suggested by Diebold et al. (2000). They suggest that we take logs of Equation (3.7) and estimate the log-transformed relationship using regression methods. However, their method is still relatively untried, and its reliability is doubtful because there is no easy way to ensure that the regression procedure will produce a 'sensible' estimate of the tail index.
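As a small illustration of Equation (3.10), the sketch below extrapolates an in-sample quantile to a smaller out-of-sample probability. All the numbers used (the in-sample quantile, t, n, the target probability and the tail index) are hypothetical.

```python
def shortcut_quantile(x_in, t, n, p_out, xi):
    """Out-of-sample quantile from Equation (3.10):
    x_out = x_in * (n * p_out / t) ** (-xi),
    where t/n proxies the in-sample exceedance probability of x_in."""
    return x_in * (n * p_out / t) ** (-xi)

# Hypothetical illustration: the in-sample quantile exceeded by t = 50 of
# n = 1000 losses, extrapolated to a 0.1% out-of-sample probability.
print(round(shortcut_quantile(x_in=2.5, t=50, n=1000, p_out=0.001, xi=0.3), 3))
```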


Estimation of EV Parameters

To estimate EV risk measures, we need to estimate the relevant EV parameters μ, σ and, in the case of the Frechet, the tail index ξ, so we can insert their values into our quantile formulas (i.e., Equation (3.5)). We can obtain estimators using maximum likelihood (ML) methods, regression methods, or moment-based or semi-parametric methods.

ML Estimation Methods

ML methods obtain the parameter estimates by maximising the likelihood of the observed sample of maxima under the GEV distribution (Equation (3.1)). The ML approach has some attractive properties (e.g., it is statistically well grounded, parameter estimators are consistent and asymptotically normal if ξn > −1/2, we can easily test hypotheses about parameters using likelihood ratio statistics, etc.). However, it also lacks closed-form solutions for the parameters, so the ML approach requires the use of an appropriate numerical solution method. This requires suitable software, and there is the danger that ML estimators might not be robust. In addition, because the underlying theory is asymptotic, there is also the potential for problems arising from smallness of samples.

Regression Methods

An easier method to apply is a regression method due to Gumbel (1958).7 To see how the method works, we begin by ordering our sample of M'n values from lowest to highest, so that M'n(1) ≤ M'n(2) ≤ . . . ≤ M'n(n). Because these are order statistics, it follows that, for large n:

E[H(M'n(i))] ≈ i/(1 + n)    (3.13)

where H(M'n) is the cumulative density function of maxima, and we drop all redundant scripts for convenience. (See Equation (3.1) above.) In the case where ξ ≠ 0, Equations (3.1) and (3.13) together give

i/(1 + n) ≈ exp[ −(1 + ξ(M'n(i) − μn)/σn)^(−1/ξ) ]

which can be rearranged and estimated as a regression relationship to recover the parameters.

Semi-Parametric Estimation Methods

These are typically used to estimate the tail index ξ, and the most popular of these is the Hill estimator. This estimator is directly applied to the ordered parent loss observations. Denoting these from highest to lowest by X1, X2, . . . , Xn, the Hill estimator based on the k largest observations is:

ξ(H) = (1/k) Σ_{i=1}^{k} ln(Xi) − ln(X_{k+1})    (3.17)

where k, the tail threshold used to estimate the Hill estimator, has to be chosen in an appropriate way. The Hill estimator is thus the average of the logs of the k most extreme (i.e., tail) observations, minus the log of the k + 1th observation, the one next to the tail. The Hill estimator is known to be consistent and asymptotically normally distributed, but its properties in finite samples are not well understood, and there are concerns in the literature about its small-sample properties and its sensitivity to the choice of threshold k. However, these (and other) reservations notwithstanding, many EVT practitioners regard the Hill estimator as being as good as any other.8

The main problem in practice is choosing a cut-off value for k. We know that our tail index estimates can be sensitive to the choice of k, but theory gives us little guidance on what the value of k should be. A suggestion often given to this problem is to estimate the tail index over a range of values of k and to choose a value in a region where the estimates are relatively stable.

7 See Gumbel (1958), pp. 226, 260, 296.

8 An alternative is the Pickands estimator (see, e.g., Bassi et al. (1998), p. 125 or Longin (1996), p. 389). This estimator does not require a positive tail index (unlike the Hill estimator) and is asymptotically normal and weakly consistent under reasonable conditions, but is otherwise less efficient than the Hill estimator.
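The Hill estimator in Equation (3.17) is straightforward to compute. The sketch below applies it to simulated Pareto-tailed data with a true tail index of 0.5, which it recovers approximately; the sample, the choice of k and the function name are illustrative assumptions.

```python
import numpy as np

def hill_estimator(losses, k):
    """Hill estimator of the tail index (Equation (3.17)): the average of the
    logs of the k largest observations minus the log of the (k+1)-th largest."""
    x = np.sort(np.asarray(losses, dtype=float))[::-1]   # highest to lowest
    return np.mean(np.log(x[:k])) - np.log(x[k])

# Hypothetical illustration with Pareto-tailed data (true tail index 0.5).
rng = np.random.default_rng(0)
data = (1.0 - rng.random(5000)) ** -0.5      # P(X > x) = x**(-2), so xi = 0.5
print(round(hill_estimator(data, k=250), 3))
```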


BOX 3.1  MOMENT-BASED ESTIMATORS OF EV PARAMETERS

An alternative approach is to estimate EV parameters using empirical moments. Let m_i be the ith empirical moment of our extremes data set. Assuming ξ ≠ 0, we can adapt Embrechts et al. (1997, pp. 293-295) to show how the EV parameters can be recovered from these moments.

Figure 4.4  Bank VaR and trading profits.

ple failing to account for diversification effects. Yet another explanation is that capital requirements are currently not binding. The amount of economic capital U.S. banks currently hold is in excess of their regulatory capital. As a result, banks may prefer to report high VaR numbers to avoid the possibility of regulatory intrusion. Still, these practices impoverish the informational content of VaR numbers.

4 Including fees increases the P&L, reducing the number of violations. Using hypothetical income, as currently prescribed in the European Union, could reduce this effect. Jaschke, Stahl, and Stehle (2003) compare the VaRs for 13 German banks and find that VaR measures are, on average, less conservative than for U.S. banks. Even so, VaR forecasts are still too high.

4.4 CONCLUSIONS

Model verification is an integral component of the risk management process. Backtesting VaR numbers provides valuable feedback to users about the accuracy of their models. The procedure also can be used to search for possible improvements.


BOX 4.2  NO EXCEPTIONS

The CEO of a large bank receives a daily report of the bank's VaR and P&L. Whenever there is an exception, the CEO calls in the risk officer for an explanation. Initially, the risk officer explained that a 99 percent VaR confidence level implies an average of 2 to 3 exceptions per year. The CEO is never quite satisfied, however. Later, tired of going "upstairs," the risk officer simply increases the confidence level to cut down on the number of exceptions.

Annual reports suggest that this is frequently the case. Financial institutions routinely produce plots of P&L that show no violation of their 99 percent confidence VaR over long periods, proclaiming that this supports their risk model.

Verification tests usually are based on "exception" counts, defined as the number of exceedences of the VaR measure. The goal is to check whether this count is in line with the selected VaR confidence level. The method also can be modified to pick up bunching of deviations.

Due thought should be given to the choice of VaR quantitative parameters for backtesting purposes. First, the horizon should be as short as possible in order to increase the number of observations and to mitigate the effect of changes in the portfolio composition. Second, the confidence level should not be too high because this decreases the effectiveness, or power, of the statistical tests.

Backtesting involves balancing two types of errors: rejecting a correct model versus accepting an incorrect model. Ideally, one would want a framework that has very high power, or high probability of rejecting an incorrect model. The problem is that the power of exception-based tests is low. The current framework could be improved by choosing a lower VaR confidence level or by increasing the number of data observations.

Adding to these statistical difficulties, we have to recognize other practical problems. Trading portfolios do change over the horizon. Models do evolve over time as risk managers improve their risk modeling techniques. All this may cause further structural instability.

Despite all these issues, backtesting has become a central component of risk management systems. The methodology allows risk managers to improve their models constantly. Perhaps most important, backtesting should ensure that risk models do not go astray.
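As a minimal illustration of the exception-count idea described above, the sketch below compares an observed number of VaR exceptions with the binomial distribution implied by the VaR confidence level. The normal-approximation z-score and tail probability shown are simple diagnostic summaries rather than the specific test statistics developed in this chapter, and the numbers in the usage example are hypothetical.

```python
from scipy.stats import binom

def exception_backtest(n_exceptions, n_days, var_cl=0.99):
    """Compare an observed exception count with the binomial distribution
    implied by the VaR confidence level (a simple coverage check)."""
    p = 1.0 - var_cl                                   # expected exception probability
    expected = n_days * p
    # Normal approximation z-score for the exception count.
    z = (n_exceptions - expected) / (n_days * p * (1 - p)) ** 0.5
    # Probability of at least this many exceptions if the model is correct.
    p_tail = binom.sf(n_exceptions - 1, n_days, p)
    return expected, z, p_tail

# Hypothetical example: 10 exceptions of a 99% VaR over 250 trading days.
print(exception_backtest(10, 250, var_cl=0.99))        # expected 2.5, z about 4.8
```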


VaR Mapping

Learning Objectives

After completing this reading you should be able to:

• Explain the principles underlying VaR mapping and describe the mapping process.
• Explain how the mapping process captures general and specific risks.
• Differentiate among the three methods of mapping portfolios of fixed income securities.
• Summarize how to map a fixed income portfolio into positions of standard instruments.
• Describe how mapping of risk factors can support stress testing.
• Explain how VaR can be computed and used relative to a performance benchmark.
• Describe the method of mapping forwards, forward rate agreements, interest rate swaps, and options.

Excerpt is Chapter 17 of Value at Risk: The New Benchmark for Managing Financial Risk, Third Edition, by Philippe Jorion.


The second [principle], to divide each of the difficulties under examination into as many parts as possible, and as might be necessary for its adequate solution. (Rene Descartes)

Whichever value-at-risk (VaR) method is used, the risk measurement process needs to simplify the portfolio by mapping the positions on the selected risk factors. Mapping is the process by which the current values of the portfolio positions are replaced by exposures on the risk factors.

Mapping arises because of the fundamental nature of VaR, which is portfolio measurement at the highest level. As a result, this is usually a very large-scale aggregation problem. It would be too complex and time-consuming to model all positions individually as risk factors. Furthermore, this is unnecessary because many positions are driven by the same set of risk factors and can be aggregated into a small set of exposures without loss of risk information. Once a portfolio has been mapped on the risk factors, any of the three VaR methods can be used to build the distribution of profits and losses.

This chapter illustrates the mapping process for major financial instruments. First we review the basic principles behind mapping for VaR. We then proceed to illustrate cases where instruments are broken down into their constituent components. We will see that the mapping process is instructive because it reveals useful insights into the risk drivers of derivatives. The next sections deal with fixed-income securities and linear derivatives. We cover the most important instruments, forward contracts, forward rate agreements, and interest-rate swaps. Then we describe nonlinear derivatives, or options.

5.1 MAPPING FOR RISK MEASUREMENT

Why Mapping?

The essence of VaR is aggregation at the highest level. This generally involves a very large number of positions, including bonds, stocks, currencies, commodities, and their derivatives. As a result, it would be impractical to consider each position separately (see Box 5.1). Too many computations would be required, and the time needed to measure risk would slow to a crawl.

BOX 5.1  WHY MAPPING?

"J.P. Morgan Chase's VaR calculation is highly granular, comprising more than 2.1 million positions and 240,000 pricing series (e.g., securities prices, interest rates, foreign exchange rates)." (Annual report, 2004)

Fortunately, mapping provides a shortcut. Many positions can be simplified to a smaller number of positions on a set of elementary, or primitive, risk factors. Consider, for instance, a trader's desk with thousands of open dollar/euro forward contracts. The positions may differ owing to different maturities and delivery prices. It is unnecessary, however, to model all these positions individually. Basically, the positions are exposed to a single major risk factor, which is the dollar/euro spot exchange rate. Thus they could be summarized by a single aggregate exposure on this risk factor. Such aggregation, of course, is not appropriate for the pricing of the portfolio. For risk measurement purposes, however, it is perfectly acceptable. This is why risk management methods can differ from pricing methods.

Mapping is also the only solution when the characteristics of the instrument change over time. The risk profile of bonds, for instance, changes as they age. One cannot use the history of prices on a bond directly. Instead, the bond must be mapped on yields that best represent its current profile. Similarly, the risk profile of options changes very quickly. Options must be mapped on their primary risk factors. Mapping provides a way to tackle these practical problems.

Mapping as a Solution to Data Problems

Mapping is also required in many common situations. Often a complete history of all securities may not exist or may not be relevant. Consider a mutual fund with a strategy of investing in initial public offerings (IPOs) of common stock. By definition, these stocks have no history. They certainly cannot be ignored in the risk system, however. The risk manager would have to replace these positions by exposures on similar risk factors already in the system.

Another common problem with global markets is the time at which prices are recorded. Consider, for instance, a portfolio of mutual funds invested in international stocks. As much as 15 hours can elapse from the time the market closes in Tokyo at 1:00 a.m. EST (3:00 p.m. in Japan) to the time it closes in the United States at 4:00 p.m. As a result, prices from the Tokyo close ignore intervening information and are said to be stale. This led to the mutual-fund scandal of 2003, which is described in Box 5.2.

For risk managers, stale prices cause problems. Because returns are not synchronous, daily correlations across markets are too low, which will affect the measurement of portfolio risk.

60



Financial Risk Manager Exam Part II: Market Risk Measurement and Management

BOX 5.2 MARKET TIMING AND STALE PRICES In Septem ber 2003, New York Attorney G eneral Eliot Spitzer accused a number of investm ent com panies of allowing m arket tim ing into their funds. M arket timing is a short-term trading strategy of buying and selling the same funds. Consider, for exam ple, our portfolio of Jap anese and U.S. stocks, for which prices are set in different time zones. The problem is that U.S. investors can trade up to the close of the U.S. market. M arket tim ers could take advantage of this discrepancy by rapid trading. For instance, if the U.S. mar­ ket moves up following good news, it is likely the Japanese market will move up as well the following day. M arket timers would buy the fund at the stale price and resell it the next day.

O ne possible solution is mapping. For instance,

Such trading, however, creates transactions costs that are borne by the other investors in the fund. A s a result, fund com panies usually state in their prospectus that this practice is not allow ed. In practice, Eliot Spitzer found out that many mutual-fund com panies had encouraged m arket tim ers, which he argued was fraudulent. Eventually, a number of funds settled by paying more than USD 2 billion. This practice can be stopped in a number of ways. Many mutual funds now im pose short-term redem ption fees, which make m arket timing uneconom ical. Alternatively, the cutoff tim e for placing trades can be moved earlier.

Instruments

prices at the close of the U.S. market can be esti­ mated from a regression of Japanese returns on U.S. returns and using the forecast value conditional on the latest U.S. information. Alternatively, correlations can be measured from returns taken over longer time intervals, such as weekly. In practice, the risk manager needs to make sure that the data-collection process will lead to meaningful risk estim ates.

The Mapping Process

Figure 5.1 illustrates a simple mapping process, where six instruments are mapped on three risk factors. The first step in the analysis is marking all positions to market in current dollars or whatever reference currency is used. The market value for each instrument then is allocated to the three risk factors.

Table 5.1 shows that the first instrument has a market value of V1, which is allocated to three exposures, x11, x12, and x13. If the current market value is not fully allocated to the risk factors, it must mean that the remainder is allocated to cash, which is not a risk factor because it has no risk.

Table 5.1  Mapping Exposures

                                      Exposure on Risk Factor
                  Market Value        1               2               3
Instrument 1          V1              x11             x12             x13
Instrument 2          V2              x21             x22             x23
...                   ...             ...             ...             ...
Instrument 6          V6              x61             x62             x63
Total portfolio       V               x1 = Σi xi1     x2 = Σi xi2     x3 = Σi xi3


Next, the system allocates the position for instrument 2 and so on. At the end of the process, positions are summed for each risk factor. For the first risk factor, the dollar exposure is x1 = Σi xi1. This creates a vector x of three exposures that can be fed into the risk measurement system.

Mapping can be of two kinds. The first provides an exact allocation of exposures on the risk factors. This is obtained for derivatives, for instance, when the price is an exact function of the risk factors. As we shall see in the rest of this chapter, the partial derivatives of the price function generate analytical measures of exposures on the risk factors. Alternatively, exposures may have to be estimated. This occurs, for instance, when a stock is replaced by a position in the stock index. The exposure then is estimated by the slope coefficient from a regression of the stock return on the index return.

General and Specific Risk

This brings us to the issue of the choice of the set of primitive risk factors. This choice should reflect the trade-off between better quality of the approximation and faster processing. More factors lead to tighter risk measurement but also require more time devoted to the modeling process and risk computation.

The choice of primitive risk factors also influences the size of specific risks. Specific risk can be defined as risk that is due to issuer-specific price movements, after accounting for general market factors. Hence the definition of specific risk depends on that of general market risk. The Basel rules have a separate charge for specific risk.¹

To illustrate this decomposition, consider a portfolio of N stocks. We are mapping each stock on a position in the stock market index, which is our primitive risk factor. The return on a stock R_i is regressed on the return on the stock market index R_m, that is,

R_i = α_i + β_i R_m + ε_i    (5.1)

which gives the exposure β_i. In what follows, ignore α, which does not contribute to risk. We assume that the specific risk owing to ε is not correlated across stocks or with the market. The relative weight of each stock in the portfolio is given by w_i. Thus the portfolio return is

R_P = Σ_i w_i R_i = Σ_i w_i β_i R_m + Σ_i w_i ε_i    (5.2)

These exposures are aggregated across all the stocks in the portfolio. This gives

β_P = Σ_i w_i β_i    (5.3)

If the portfolio value is W, the mapping on the index is x = W β_P. Next, we decompose the variance of R_P in Equation (5.2) and find

V(R_P) = β_P² V(R_m) + Σ_i w_i² σ_ε,i²    (5.4)

The first component is the general market risk. The second component is the aggregate of specific risk for the entire portfolio. This decomposition shows that with more detail on the primitive or general-market risk factors, there will be less specific risk for a fixed amount of total risk V(R_P).

As another example, consider a corporate bond portfolio. Bond positions describe the distribution of money flows over time by their amount, timing, and credit quality of the issuer. This creates a continuum of risk factors, going from overnight to long maturities for various credit risks.

In practice, we have to restrict the number of risk factors to a small set. For some portfolios, one risk factor may be sufficient. For others, 15 maturities may be necessary. For portfolios with options, we need to model movements not only in yields but also in their implied volatilities.

Our primitive risk factors could be movements in a set of J government bond yields z_j and in a set of K credit spreads s_k sorted by credit rating. We model the movement in each corporate bond yield dy_i by a movement in z at the closest maturity and in s for the same credit rating. The remaining component is ε_i. The movement in value dW then is

dW = Σ_i DVBP_i dy_i = Σ_j DVBP_j dz_j + Σ_k DVBP_k ds_k + Σ_i DVBP_i dε_i    (5.5)

where DVBP is the total dollar value of a basis point for the associated risk factor. The values for DVBP_j then represent the summation of the DVBP across all individual bonds for each maturity. This leads to a total risk decomposition of

V(dW) = general risk + Σ_i DVBP_i² V(dε_i)    (5.6)

A greater number of general risk factors should create less residual risk. Even so, we need to ascertain the size of the second, specific risk term. In practice, there may not be sufficient history to measure the specific risk of individual bonds, which is why it is often assumed that all issuers within the same risk class have the same risk.

¹ Typically, the charge is 4 percent of the position value for equities and unrated debt, assuming that the banks' models do not incorporate specific risks.


5.2 MAPPING FIXED-INCOME PORTFOLIOS

Mapping Approaches

Once the risk factors have been selected, the question is how to map the portfolio positions into exposures on these risk factors. We can distinguish three mapping systems for fixed-income portfolios: principal, duration, and cash flows. With principal mapping, one risk factor is chosen that corresponds to the average portfolio maturity. With duration mapping, one risk factor is chosen that corresponds to the portfolio duration. With cash-flow mapping, the portfolio cash flows are grouped into maturity buckets. Mapping should preserve the market value of the position. Ideally, it also should preserve its market risk.

As an example, Table 5.2 describes a two-bond portfolio consisting of a USD 100 million 5-year 6 percent issue and a USD 100 million 1-year 4 percent issue. Both issues are selling at par, implying a market value of USD 200 million. The portfolio has an average maturity of 3 years and a duration of 2.733 years. The table lays out the present value of all portfolio cash flows discounted at the appropriate zero-coupon rate.

Principal mapping considers the timing of redemption payments only. Since the average maturity of this portfolio is 3 years, the VaR can be found from the risk of a 3-year maturity, which is 1.484 percent from Table 5.3. VaR then is USD 200 × 1.484/100 = USD 2.97 million. The only positive aspect of this method is its simplicity. This approach overstates the true risk because it ignores intervening coupon payments.

The next step in precision is duration mapping. We replace the portfolio by a zero-coupon bond with maturity equal to the duration of the portfolio, which is 2.733 years. Table 5.3 shows VaRs of 0.987 and 1.484 for these maturities, respectively. Using a linear interpolation, we find a risk of 0.987 + (1.484 − 0.987) × (2.733 − 2) = 1.351 percent for this hypothetical zero. With a USD 200 million portfolio, the duration-based VaR is USD 200 × 1.351/100 = USD 2.70 million, slightly less than before.
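The two calculations above can be written out directly. The snippet below is a sketch using only figures quoted in the text (the 2- and 3-year vertex VaRs of 0.987 and 1.484 percent and the 2.733-year duration); it reproduces the principal-mapping and duration-mapping VaR numbers.

```python
portfolio_value = 200.0          # USD millions
var_2y, var_3y = 0.987, 1.484    # monthly 95% VaR of 2- and 3-year zeroes, in percent
duration = 2.733

# Principal mapping: use the risk of the 3-year vertex (average maturity)
principal_var = portfolio_value * var_3y / 100                 # about USD 2.97 million

# Duration mapping: interpolate the risk between the 2- and 3-year vertices
risk_duration = var_2y + (var_3y - var_2y) * (duration - 2)    # about 1.351 percent
duration_var = portfolio_value * risk_duration / 100           # about USD 2.70 million

print(round(principal_var, 2), round(duration_var, 2))
```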

Table 5.2  Mapping for a Bond Portfolio (USD millions)

              Cash Flows                             Mapping (PV)
Term (Year)   5-Year     1-Year     Spot Rate    Principal    Duration     Cash Flow
1             USD 6      USD 104    4.000%          0.00         0.00      USD 105.77
2             USD 6      0          4.618%          0.00         0.00      USD   5.48
2.733                                               0.00      USD 200.00
3             USD 6      0          5.192%       USD 200.00      0.00      USD   5.15
4             USD 6      0          5.716%          0.00         0.00      USD   4.80
5             USD 106    0          6.112%          0.00         0.00      USD  78.79
Total                                            USD 200.00   USD 200.00   USD 200.00

Table 5.3  Computing VaR from Change in Prices of Zeroes

Term (Year)   Cash Flows   Old Zero Value   Old PV of Flows   Risk (%)   New Zero Value   New PV of Flows
1             USD 110         0.9615          USD 105.77       0.4696       0.9570          USD 105.27
2             USD 6           0.9136          USD   5.48       0.9868       0.9046          USD   5.43
3             USD 6           0.8591          USD   5.15       1.4841       0.8463          USD   5.08
4             USD 6           0.8006          USD   4.80       1.9714       0.7848          USD   4.71
5             USD 106         0.7433          USD  78.79       2.4261       0.7252          USD  76.88
Total                                         USD 200.00                                    USD 197.37
Loss                                                                                        USD   2.63


Finally, the cash-flow mapping method consists of grouping all cash flows on term-structure "vertices" that correspond to maturities for which volatilities are provided. Each cash flow is represented by the present value of the cash payment, discounted at the appropriate zero-coupon rate.

The diversified VaR is computed as

VaR = α √(x'Σx) = √((x × V)' R (x × V))    (5.7)

where V = ασ is the vector of VaR for zero-coupon bond returns, and R is the correlation matrix.

Table 5.4 shows how to compute the portfolio VaR using cash-flow mapping. The second column reports the cash flows x from Table 5.2. Note that the current value of USD 200 million is fully allocated to the five risk factors. The third column presents the product of these cash flows with the risk of each vertex, x × V, which represents the individual VaRs. With perfect correlation across all zeroes, the VaR of the portfolio is

Undiversified VaR = Σ_i |x_i| V_i

which is USD 2.63 million. This number is close to the VaR obtained from the duration approximation, which was USD 2.70 million.

The right side of the table presents the correlation matrix of zeroes for maturities ranging from 1 to 5 years. To obtain the portfolio VaR, we premultiply and postmultiply the matrix by the dollar amounts (xV) at each vertex. Taking the square root, we find a diversified VaR measure of USD 2.57 million. The last column presents the component VaR using computations as explained earlier.

Note that this is slightly less than the duration VaR of USD 2.70 million. This difference is due to two factors. First, risk measures are not perfectly linear with maturity, as we have seen in a previous section. Second, correlations are below unity, which reduces risk even further. Thus, of the USD 130,000 difference in these measures (USD 2.70 − USD 2.57 million), USD 70,000 is due to differences in yield volatility (USD 2.70 − USD 2.63 million), and USD 60,000 is due to imperfect correlations.

Stress Test

Table 5.3 presents another approach to VaR that is directly derived from movements in the value of zeroes. This is an example of stress testing.

Assume that all zeroes are perfectly correlated. Then we could decrease all zeroes' values by their VaR. For instance, the 1-year zero is worth 0.9615. Given the VaR in Table 5.3 of 0.4696, a 95 percent probability move would be for the zero to fall to 0.9615 × (1 − 0.4696/100) = 0.9570. If all zeroes are perfectly correlated, they should all fall by their respective VaR. This generates a new distribution of present-value factors that can be used to price the portfolio. Table 5.3 shows that the new value is USD 197.37 million, which is exactly USD 2.63 million below the original value. This number is exactly the same as the undiversified VaR just computed.

The two approaches illustrate the link between computing VaR through matrix multiplication and through movements in underlying prices. Computing VaR through matrix multiplication is much more direct, however, and more appropriate because it allows nonperfect correlations across different sectors of the yield curve.

Benchmarking

Next, we provide a practical fixed-income example, in which we compute VaR in relative terms, that is, relative to a performance benchmark. Table 5.5 presents the cash-flow decomposition of the J.P. Morgan U.S. bond index, which has a duration of 4.62 years.

Table 5.4  Computing the VaR of a USD 200 Million Bond Portfolio (Monthly VaR at 95 Percent Level)

              PV Cash       Individual              Correlation Matrix R                  Component VaR,
Term (Year)   Flows, x      VaR, x × V      1Y      2Y      3Y      4Y      5Y            xΔVaR
1             USD 105.77      0.4966        1                                             USD 0.45
2             USD   5.48      0.0540        0.897   1                                     USD 0.05
3             USD   5.15      0.0765        0.886   0.991   1                             USD 0.08
4             USD   4.80      0.0947        0.866   0.976   0.994   1                     USD 0.09
5             USD  78.79      1.9115        0.855   0.966   0.988   0.998   1             USD 1.90
Total         USD 200.00      2.6335
Undiversified VaR                                                                         USD 2.63
Diversified VaR                                                                           USD 2.57
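To make the matrix computation in Equation (5.7) concrete, the short Python sketch below reproduces the undiversified and diversified VaR figures of Table 5.4 from the mapped cash flows, the vertex risks, and the correlation matrix. The numbers are taken from the tables above; the code itself is only an illustrative sketch, not part of the original text.

```python
import numpy as np

# PV of mapped cash flows at each vertex (USD millions), from Table 5.4
x = np.array([105.77, 5.48, 5.15, 4.80, 78.79])

# Monthly 95% VaR of each zero-coupon vertex, in percent (risk column of Table 5.3)
V = np.array([0.4696, 0.9868, 1.4841, 1.9714, 2.4261]) / 100

# Correlation matrix of zero-coupon returns (Table 5.4), filled in symmetrically
R = np.array([
    [1.000, 0.897, 0.886, 0.866, 0.855],
    [0.897, 1.000, 0.991, 0.976, 0.966],
    [0.886, 0.991, 1.000, 0.994, 0.988],
    [0.866, 0.976, 0.994, 1.000, 0.998],
    [0.855, 0.966, 0.988, 0.998, 1.000],
])

xV = x * V                                   # individual VaRs, x times V
undiversified = xV.sum()                     # assumes perfect correlation, about 2.63
diversified = np.sqrt(xV @ R @ xV)           # Equation (5.7), about 2.57
component = (R @ xV) * xV / diversified      # component VaRs, which sum to the diversified VaR

print(f"Undiversified VaR: USD {undiversified:.2f} million")
print(f"Diversified VaR:   USD {diversified:.2f} million")
print("Component VaR:", np.round(component, 2))
```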

Table 5.5  Benchmarking a USD 100 Million Bond Index (Monthly Tracking Error VaR at 95 Percent Level)

                            Position:              Position: Portfolio (USD)
Vertex       Risk (%)     Index (USD)        1         2         3         4         5
< 1m          0.022           1.05          0.0       0.0       0.0       0.0      84.8
3m            0.065           1.35          0.0       0.0       0.0       0.0       0.0
6m            0.163           2.49          0.0       0.0       0.0       0.0       0.0
1Y            0.470          13.96          0.0       0.0       0.0      59.8       0.0
2Y            0.987          24.83          0.0       0.0      62.6       0.0       0.0
3Y            1.484          15.40          0.0      59.5       0.0       0.0       0.0
4Y            1.971          11.57         38.0       0.0       0.0       0.0       0.0
5Y            2.426           7.62         62.0       0.0       0.0       0.0       0.0
7Y            3.192           6.43          0.0      40.5       0.0       0.0       0.0
9Y            3.913           4.51          0.0       0.0      37.4       0.0       0.0
10Y           4.250           3.34          0.0       0.0       0.0      40.2       0.0
15Y           6.234           3.00          0.0       0.0       0.0       0.0       0.0
20Y           8.146           3.15          0.0       0.0       0.0       0.0       0.0
30Y          11.119           1.31          0.0       0.0       0.0       0.0      15.2
Total                       100.00        100.0     100.0     100.0     100.0     100.0
Duration                      4.62         4.62      4.62      4.62      4.62      4.62
Absolute VaR             USD 1.99      USD 2.25  USD 2.16  USD 2.04  USD 1.94  USD 1.71
Tracking error VaR       USD 0.00      USD 0.43  USD 0.29  USD 0.16  USD 0.20  USD 0.81

Assume that we are trying to benchmark a portfolio of USD 100 million. Over a monthly horizon, the VaR of the index at the 95 percent confidence level is USD 1.99 million. This is about equivalent to the risk of a 4-year note.

Next, we try to match the index with two bonds. The rightmost columns in the table display the positions of two-bond portfolios with duration matched to that of the index. Since no zero-coupon has a maturity of exactly 4.62 years, the closest portfolio consists of two positions, each in a 4- and a 5-year zero. The respective weights for this portfolio are USD 38 million and USD 62 million.

Define the new vector of positions for this portfolio as x and for the index as x0. The VaR of the deviation relative to the benchmark is

Tracking Error VaR = α √((x − x0)' Σ (x − x0))    (5.8)

After performing the necessary calculations, we find that the tracking error VaR (TE-VaR) of this duration-hedged portfolio is USD 0.43 million. Thus the maximum deviation between the index and the portfolio is at most USD 0.43 million under normal market conditions. This potential shortfall is much less than the USD 1.99 million absolute risk of the index. The remaining tracking error is due to nonparallel moves in the term structure.

Relative to the original index, the tracking error can be measured in terms of variance reduction, similar to an R² in a regression. The variance improvement is

1 − (0.43/1.99)² = 95.4 percent

which is in line with the explanatory power of the first factor in the variance decomposition.

Next, we explore the effect of altering the composition of the tracking portfolio. Portfolio 2 widens the bracket of cash flows in years 3 and 7. The TE-VaR is USD 0.29 million, which is an improvement over the previous number. Next, portfolio 3 has positions in years 2 and 9. This comes the closest to approximating the cash-flow positions in the index, which has the greatest weight on the 2-year vertex. The TE-VaR is reduced further to USD 0.16 million. Portfolio 4 has positions in years 1 and 10. Now the TE-VaR increases to USD 0.20 million. This mistracking is even more pronounced for a portfolio consisting of 1-month bills and 30-year zeroes, for which the TE-VaR increases to USD 0.81 million.
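Equation (5.8) differs from the absolute VaR formula only in that the position vector is replaced by the deviation from the benchmark. The sketch below shows the structure of that calculation; the covariance matrix and positions are placeholder inputs, since the full 14-vertex covariance matrix behind Table 5.5 is not reproduced in the text.

```python
import numpy as np

def tracking_error_var(x, x0, cov, alpha=1.645):
    """Equation (5.8): VaR of the deviation (x - x0) from the benchmark."""
    d = np.asarray(x) - np.asarray(x0)
    return alpha * np.sqrt(d @ cov @ d)

# Placeholder two-vertex example (illustrative numbers only, not from the text)
cov = np.array([[0.0004, 0.0003],
                [0.0003, 0.0009]])   # monthly covariance of vertex returns
x  = [60.0, 40.0]                    # portfolio positions (USD millions)
x0 = [50.0, 50.0]                    # benchmark positions (USD millions)
print(round(tracking_error_var(x, x0, cov), 3))
```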


Among the portfolios considered here, the lowest tracking error is obtained with portfolio 3. Note that the absolute risk of these portfolios is lowest for portfolio 5. As correlations decrease for more distant maturities, we should expect that a duration-matched portfolio should have the lowest absolute risk for the combination of most distant maturities, such as a barbell portfolio of cash and a 30-year zero. However, minimizing absolute market risk is not the same as minimizing relative market risk. This example demonstrates that duration hedging only provides a first approximation to interest-rate risk management. If the goal is to minimize tracking error relative to an index, it is essential to use a fine decomposition of the index by maturity.

5.3 MAPPING LINEAR DERIVATIVES

Forward Contracts

Forward and futures contracts are the simplest types of derivatives. Since their value is linear in the underlying spot rates, their risk can be constructed easily from basic building blocks. Assume, for instance, that we are dealing with a forward contract on a foreign currency. The basic valuation formula can be derived from an arbitrage argument.

To establish notations, define

S_t = spot price of one unit of the underlying cash asset
K = contracted forward price
r = domestic risk-free rate
y = income flow on the asset
τ = time to maturity

When the asset is a foreign currency, y represents the foreign risk-free rate r*. We will use these two notations interchangeably. For convenience, we assume that all rates are compounded continuously.

We seek to find the current value of a forward contract f_t to buy one unit of foreign currency at K after time τ. To do this, we consider the fact that investors have two alternatives that are economically equivalent: (1) buy e^(−yτ) units of the asset at the price S_t and hold for one period, or (2) enter a forward contract to buy one unit of the asset in one period. Under alternative 1, the investment will grow, with reinvestment of dividend, to exactly one unit of the asset after one period. Under alternative 2, the contract costs f_t upfront, and we need to set aside enough cash to pay K in the future, which is Ke^(−rτ). After 1 year, the two alternatives lead to the same position, one unit of the asset. Therefore, their initial cost must be identical. This leads to the following valuation formula for outstanding forward contracts:

f_t = S_t e^(−yτ) − K e^(−rτ)    (5.9)

Note that we can repeat the preceding reasoning to find the current forward rate F_t that would set the value of the contract to zero. Setting K = F_t and f_t = 0 in Equation (5.9), we have

F_t = (S_t e^(−yτ)) e^(rτ)    (5.10)

This allows us to rewrite Equation (5.9) as

f_t = F_t e^(−rτ) − K e^(−rτ) = (F_t − K) e^(−rτ)    (5.11)

In other words, the current value of the forward contract is the present value of the difference between the current forward rate and the locked-in delivery rate. If we are long a forward contract with contracted rate K, we can liquidate the contract by entering a new contract to sell at the current rate F_t. This will lock in a profit of (F_t − K), which we need to discount to the present time to find f_t.

Let us examine the risk of a 1-year forward contract to purchase 100 million euros in exchange for USD 130.086 million. Table 5.6 displays pricing information for the contract (current spot, forward, and interest rates), risk, and correlations. The first step is to find the market value of the contract. We can use Equation (5.9), accounting for the fact that the quoted interest rates are discretely compounded, as

f_t = USD 1.2877 × 1/(1 + 2.2810/100) − USD 1.3009 × 1/(1 + 3.3304/100)
    = USD 1.2589 − USD 1.2589 = 0
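The valuation in Equation (5.9) can be checked with a few lines of Python. This is a minimal sketch using the discretely compounded rates quoted in the text; the function name and structure are illustrative, not part of the original.

```python
def forward_value(spot, strike, r_domestic, r_foreign, tau=1.0):
    """Value of a long forward on one unit of foreign currency (discrete compounding)."""
    return spot / (1 + r_foreign) ** tau - strike / (1 + r_domestic) ** tau

f = forward_value(spot=1.2877, strike=1.3009,
                  r_domestic=0.033304, r_foreign=0.022810)
print(round(f, 4))  # approximately 0: the contract is initially at market
```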

Table 5.6  Risk and Correlations for Forward Contract Risk Factors (Monthly VaR at 95 Percent Level)

                                                              Correlations
Risk Factor       Price or Rate     VaR (%)      EUR Spot      EUR 1Y       USD 1Y
EUR spot          USD 1.2877        4.5381       1             0.1289       0.0400
Long EUR bill     2.2810%           0.1396       0.1289        1           -0.0583
Short USD bill    3.3304%           0.2121       0.0400       -0.0583       1
EUR forward       USD 1.3009

Thus the initial value of the contract is zero. This value, however, may change, creating market risk.

Among the three sources of risk, the volatility of the spot contract is the highest by far, with a 4.54 percent VaR (corresponding to 1.65 standard deviations over a month for a 95 percent confidence level). This is much greater than the 0.14 percent VaR for the EUR 1-year bill or even the 0.21 percent VaR for the USD bill. Thus most of the risk of the forward contract is driven by the cash EUR position.

But risk is also affected by correlations. The positive correlation of 0.13 between the EUR spot and bill positions indicates that when the EUR goes up in value against the dollar, the value of a 1-year EUR investment is likely to appreciate. Therefore, higher values of the EUR are associated with lower EUR interest rates. This positive correlation increases the risk of the combined position. On the other hand, the position is also short a 1-year USD bill, which is correlated with the other two legs of the transaction. The issue is, what will be the net effect on the risk of the forward contract?

VaR provides an exact answer to this question, which is displayed in Table 5.7. But first we have to compute the positions x on each of the three building blocks of the contract. By taking the partial derivative of Equation (5.9) with respect to the risk factors, we have

df = (∂f/∂S) dS + (∂f/∂r*) dr* + (∂f/∂r) dr = e^(−r*τ) dS − S e^(−r*τ) τ dr* + K e^(−rτ) τ dr    (5.12)

Here, the building blocks consist of the spot rate and interest rates. Alternatively, we can replace interest rates by the price of bills. Define these as P = e^(−rτ) and P* = e^(−r*τ). We then replace dr with dP using dP = (−τ) e^(−rτ) dr, and dP* = (−τ) e^(−r*τ) dr*. The risk of the forward contract becomes

df = e^(−r*τ) dS + (S e^(−r*τ)) (dP*/P*) − (K e^(−rτ)) (dP/P)    (5.13)

This shows that the forward position can be separated into three cash flows: (1) a long spot position in EUR, worth EUR 100 million = USD 130.09 million in a year, or (S e^(−r*τ)) = USD 125.89 million now, (2) a long position in a EUR investment, also worth USD 125.89 million now, and (3) a short position in a USD investment, worth USD 130.09 million in a year, or (K e^(−rτ)) = USD 125.89 million now. Thus a position in the forward contract has three building blocks:

Long forward contract = long foreign currency spot + long foreign currency bill + short U.S. dollar bill

Considering only the spot position, the VaR is USD 125.89 million times the risk of 4.538 percent, which is USD 5.713 million. To compute the diversified VaR, we use the risk matrix from the data in Table 5.7 and pre- and postmultiply by the vector of positions (PV of flows column in the table). The total VaR for the forward contract is USD 5.735 million. This number is about the same size as that of the spot contract because exchange-rate volatility dominates the volatility of 1-year bonds.

More generally, the same methodology can be used for long-term currency swaps, which are equivalent to portfolios of forward contracts. For instance, a 10-year contract to pay dollars and receive euros is equivalent to a series of 10 forward contracts to exchange a set amount of dollars into euros. To compute the VaR, the contract must be broken down into a currency-risk component and a string of USD and EUR fixed-income components. As before, the total VaR will be driven primarily by the currency component.

Commodity Forwards

The valuation of forward or futures contracts on commodities is substantially more complex than for financial assets such as currencies, bonds, or stock indices. Such financial assets have a well-defined income flow y, which is the foreign interest rate, the coupon payment, or the dividend yield, respectively.

Table 5.7  Computing VaR for a EUR 100 Million Forward Contract (Monthly VaR at 95 Percent Level)

                   Present-Value    Cash Flows       PV of            Individual      Component
Position           Factor           (CF)             Flows, x         VaR, xV         VaR, xΔVaR
EUR spot                                             USD 125.89       USD 5.713       USD 5.704
Long EUR bill      0.977698         EUR 100.00       USD 125.89       USD 0.176       USD 0.029
Short USD bill     0.967769        -USD 130.09      -USD 125.89       USD 0.267       USD 0.002
Undiversified VaR                                                     USD 6.156
Diversified VaR                                                       USD 5.735
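The numbers in Table 5.7 can be reproduced with the same matrix logic as Equation (5.7), applied to the three building blocks of the forward. The sketch below uses the risks and correlations of Table 5.6 and the PV of flows of Table 5.7; it is illustrative only.

```python
import numpy as np

# Building blocks of the long EUR forward: EUR spot, long EUR bill, short USD bill.
# Signed PV of flows in USD millions and monthly 95% VaR of each factor in percent.
x = np.array([125.89, 125.89, -125.89])
vol = np.array([4.5381, 0.1396, 0.2121]) / 100
R = np.array([
    [1.0000,  0.1289,  0.0400],
    [0.1289,  1.0000, -0.0583],
    [0.0400, -0.0583,  1.0000],
])

xV = x * vol
undiversified = np.abs(xV).sum()      # about USD 6.16 million
diversified = np.sqrt(xV @ R @ xV)    # about USD 5.74 million
print(round(undiversified, 3), round(diversified, 3))
```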


Table 5.8  Risk of Commodity Contracts (Monthly VaR at 95 Percent Level)

Energy Products
Maturity       Natural Gas    Heating Oil    Crude Oil-WTI    Unleaded Gasoline
1 month           28.77          22.07           20.17              19.20
3 months          22.79          20.60           18.29              17.46
6 months          16.01          16.67           16.26              15.87
12 months         12.68          14.61           14.05

Base Metals
Maturity       Aluminum       Copper         Nickel           Zinc
Cash              11.34          13.09           18.97              13.49
3 months          11.01          12.34           18.41              13.18
15 months          8.99          10.51           15.44              11.95
27 months          7.27           9.57           11.59

Precious Metals
Maturity       Gold           Silver         Platinum
Cash               6.18          14.97            7.70

Things are not so simple for commodities, such as metals, agricultural products, or energy products. Most products do not make monetary payments but instead are consumed, thus creating an implied benefit. This flow of benefit, net of storage cost, is loosely called convenience yield to represent the benefit from holding the cash product. This convenience yield, however, is not tied to another financial variable, such as the foreign interest rate for currency futures. It is also highly variable, creating its own source of risk. As a result, the risk measurement of commodity futures uses Equation (5.11) directly, where the main driver of the value of the contract is the current forward price for this commodity.

Table 5.8 illustrates the term structure of volatilities for selected energy products and base metals. First, we note that monthly VaR measures are very high, reaching 29 percent for near contracts. In contrast, currency and equity market VaRs are typically around 6 percent. Thus commodities are much more volatile than typical financial assets.

Second, we observe that volatilities decrease with maturity. The effect is strongest for less storable products such as energy products and less so for base metals. It is actually imperceptible for precious metals, which have low storage costs and no convenience yield. For financial assets, volatilities are driven primarily by spot prices, which implies basically constant volatilities across contract maturities.

Let us now say that we wish to compute the VaR for a 12-month forward position on 1 million barrels of oil priced at USD 45.2 per barrel. Using a present-value factor of 0.967769, this translates into a current position of USD 43,743,000. Differentiating Equation (5.11), we have

df = e^(−rτ) dF = (F e^(−rτ)) (dF/F)    (5.14)

The term between parentheses therefore represents the exposure. The contract VaR is

VaR = USD 43,743,000 × 14.05/100 = USD 6,146,000

In general, the contract cash flows will fall between the maturities of the risk factors, and present values must be apportioned accordingly.

Forward Rate Agreements

Forward rate agreements (FRAs) are forward contracts that allow users to lock in an interest rate at some future date. The buyer of an FRA locks in a borrowing rate; the seller locks in a lending rate. In other words, the "long" receives a payment if the spot rate is above the forward rate.

Define the timing of the short leg as t1 and of the long leg as t2, both expressed in years. Assume linear compounding for simplicity. The forward rate can be defined as the implied rate that equalizes the return on a t2-period investment with a t1-period investment rolled over, that is,

(1 + R_2 t_2) = (1 + R_1 t_1) [1 + F_1,2 (t_2 − t_1)]    (5.15)
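Equation (5.15) can be solved directly for the forward rate. The sketch below does this for the 6 × 12 case worked out below (180-day rate of 5.6250 percent, 360-day rate of 5.8125 percent), recovering the 5.836 percent forward rate quoted there; the function name is illustrative.

```python
def implied_forward_rate(r1, t1, r2, t2):
    """Solve Equation (5.15), with linear compounding, for F_{1,2}."""
    growth = (1 + r2 * t2) / (1 + r1 * t1)
    return (growth - 1) / (t2 - t1)

# 6 x 12 FRA example: 180-day rate 5.6250%, 360-day rate 5.8125%
f = implied_forward_rate(r1=0.056250, t1=0.5, r2=0.058125, t2=1.0)
print(f"{f:.3%}")   # about 5.836%
```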

Table 5.9  Computing the VaR of a USD 100 Million FRA (Monthly VaR at 95 Percent Level)

                PV of            Risk (%),     Correlation          Individual      Component
Position        Flows, x         V             Matrix R             VaR, xV         VaR, xΔVaR
180 days        -USD 97.264      0.1629        1         0.8738     USD 0.158       -USD 0.116
360 days         USD 97.264      0.4696        0.8738    1          USD 0.457        USD 0.444
Undiversified VaR                                                   USD 0.615
Diversified VaR                                                     USD 0.327

For instance, suppose that you sold a 6 × 12 FRA on USD 100 million. This is equivalent to borrowing USD 100 million for 6 months and investing the proceeds for 12 months. When the FRA expires in 6 months, assume that the prevailing 6-month spot rate is higher than the locked-in forward rate. The seller then pays the buyer the difference between the spot and forward rates applied to the principal. In effect, this payment offsets the higher return that the investor otherwise would receive, thus guaranteeing a return equal to the forward rate. Therefore, an FRA can be decomposed into two zero-coupon building blocks.

Long 6 × 12 FRA = long 6-month bill + short 12-month bill

Table 5.9 provides a worked-out example. If the 360-day spot rate is 5.8125 percent and the 180-day rate is 5.6250 percent, the forward rate must be such that

1 + F_1,2/2 = (1 + 5.8125/100)/(1 + 5.6250/200)

or F = 5.836 percent. The present value of the notional USD 100 million in 6 months is x = USD 100/(1 + 5.625/200) = USD 97.264 million. This amount is invested for 12 months. In the meantime, what is the risk of this FRA?

Table 5.9 displays the computation of VaR for the FRA. The VaRs of 6- and 12-month zeroes are 0.1629 and 0.4696, respectively, with a correlation of 0.8738. Applied to the principal of USD 97.26 million, the individual VaRs are USD 0.158 million and USD 0.457 million, which gives an undiversified VaR of USD 0.615 million. Fortunately, the correlation substantially lowers the FRA risk. The largest amount the position can lose over a month at the 95 percent level is USD 0.327 million.

Interest-Rate Swaps

Interest-rate swaps are the most actively used derivatives. They create exchanges of interest-rate flows from fixed to floating or vice versa. Swaps can be decomposed into two legs, a fixed leg and a floating leg. The fixed leg can be priced as a coupon-paying bond; the floating leg is equivalent to a floating-rate note (FRN).

To illustrate, let us compute the VaR of a USD 100 million 5-year interest-rate swap. We enter a dollar swap that pays 6.195 percent annually for 5 years in exchange for floating-rate payments indexed to the London Interbank Offer Rate (LIBOR). Initially, we consider a situation where the floating-rate note is about to be reset. Just before the reset period, we know that the coupon will be set at the prevailing market rate. Therefore, the note carries no market risk, and its value can be mapped on cash only. Right after the reset, however, the note becomes similar to a bill with maturity equal to the next reset period.

Interest-rate swaps can be viewed in two different ways: as (1) a combined position in a fixed-rate bond and in a floating-rate bond or (2) a portfolio of forward contracts. We first value the swap as a position in two bonds using risk data from Table 5.4. The analysis is detailed in Table 5.10. The second and third columns lay out the payments on both legs. Assuming that this is an at-the-market swap, that is, that its coupon is equal to prevailing swap rates, the short position in the fixed-rate bond is worth USD 100 million. Just before reset, the long position in the FRN is also worth USD 100 million, so the market value of the swap is zero. To clarify the allocation of current values, the FRN is allocated to cash, with a zero maturity. This has no risk.

The next column lists the zero-coupon swap rates for maturities going from 1 to 5 years. The fifth column reports the present value of the net cash flows, fixed minus floating. The last column presents the component VaR, which adds up to a total diversified VaR of USD 2.152 million. The undiversified VaR is obtained from summing all individual VaRs. As usual, the USD 2.160 million value somewhat overestimates risk.

This swap can be viewed as the sum of five forward contracts, as shown in Table 5.11. The 1-year contract promises payment of USD 100 million plus the coupon of 6.195 percent; discounted at the spot rate of 5.813 percent, this yields a present value of −USD 100.36 million. This is in exchange for USD 100 million now, which has no risk.

Table 5.10  Computing the VaR of a USD 100 Million Interest-Rate Swap (Monthly VaR at 95 Percent Level)

              Cash Flows                                  PV of Net         Individual     Component
Term (Year)   Fixed            Float         Spot Rate    Cash Flows        VaR            VaR
0             USD 0            +USD 100                   +USD 100.000      USD 0          USD 0
1             -USD 6.195       USD 0         5.813%       -USD 5.855        USD 0.027      USD 0.024
2             -USD 6.195       USD 0         5.929%       -USD 5.521        USD 0.054      USD 0.053
3             -USD 6.195       USD 0         6.034%       -USD 5.196        USD 0.077      USD 0.075
4             -USD 6.195       USD 0         6.130%       -USD 4.883        USD 0.096      USD 0.096
5             -USD 106.195     USD 0         6.217%       -USD 78.546       USD 1.905      USD 1.905
Total                                                      USD 0.000
Undiversified VaR                                                           USD 2.160
Diversified VaR                                                                            USD 2.152

Table 5.11  An Interest-Rate Swap Viewed as Forward Contracts (Monthly VaR at 95 Percent Level)

                                 PV of Flows: Contract
Term (Year)        1             1 x 2          2 x 3          3 x 4          4 x 5
1             -USD 100.36     USD  94.50
2                            -USD  94.64     USD  89.11
3                                           -USD  89.08     USD  83.88
4                                                          -USD  83.70     USD  78.82
5                                                                         -USD  78.55
VaR             USD 0.471      USD 0.571      USD 0.488      USD 0.446      USD 0.425
Undiversified VaR                                                           USD 2.401
Diversified VaR                                                             USD 2.152
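The present values in Table 5.11 can be reproduced by discounting each forward contract's two legs at the zero-coupon swap rates listed in Table 5.10. The short sketch below does this with annual compounding, as in the text; it is an illustration of the decomposition, and small differences from the table are due to rounding.

```python
# Zero-coupon swap rates from Table 5.10 for years 1..5
rates = {1: 0.05813, 2: 0.05929, 3: 0.06034, 4: 0.06130, 5: 0.06217}
coupon, notional = 6.195, 100.0   # USD millions

def pv(cash_flow, year):
    """Present value of a cash flow received in `year`, annual compounding."""
    return cash_flow / (1 + rates[year]) ** year

# Contract "1": receive 100 now (riskless), pay principal plus coupon in one year
print(round(pv(-(notional + coupon), 1), 2))            # about -100.36

# Forward contracts 1x2 ... 4x5: receive the notional at year t,
# pay the notional plus the fixed coupon at year t+1
for t in range(1, 5):
    receive = pv(notional, t)
    pay = pv(-(notional + coupon), t + 1)
    print(f"{t}x{t+1}: {receive:.2f}  {pay:.2f}")        # close to the Table 5.11 entries
```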

The next contract is a 1 × 2 forward contract that promises to pay the principal plus the fixed coupon in 2 years, or −USD 106.195 million; discounted at the 2-year spot rate, this yields −USD 94.64 million. This is in exchange for USD 100 million in 1 year, which is also USD 94.50 million when discounted at the 1-year spot rate. And so on until the fifth contract, a 4 × 5 forward contract.

Table 5.11 shows the VaR of each contract. The undiversified VaR of USD 2.401 million is the result of a simple summation of the five VaRs. The fully diversified VaR is USD 2.152 million, exactly the same as in the preceding table. This demonstrates the equivalence of the two approaches.

Finally, we examine the change in risk after the first payment has just been set on the floating-rate leg. The FRN then becomes a 1-year bond initially valued at par but subject to fluctuations in rates. The only change in the pattern of cash flows in Table 5.10 is to add USD 100 million to the position on year 1 (from −USD 5.855 to USD 94.145). The resulting VaR then decreases from USD 2.152 million to USD 1.763 million. More generally, the swap's VaR will converge to zero as the swap matures, dipping each time a coupon is set.

5.4 MAPPING OPTIONS

We now consider the mapping process for nonlinear derivatives, or options. Obviously, this nonlinearity may create problems for risk measurement systems based on the delta-normal approach, which is fundamentally linear.

To simplify, consider the Black-Scholes (BS) model for European options.²

² For a systematic approach to pricing derivatives, see the excellent book by Hull (2005).


Table 5.12  Derivatives for a European Call
Parameters: S = USD 100, σ = 20%, r = 5%, r* = 3%, τ = 3 months

Variable        Change per          Unit         K = 90     K = 100    K = 110
c                                   Dollars       11.01       4.20       1.04
Δ               Spot price          Dollar         0.869      0.536      0.195
Γ               Spot price          Dollar         0.020      0.039      0.028
Λ               Volatility          (% pa)         0.102      0.197      0.138
ρ               Interest rate       (% pa)         0.190      0.123      0.046
ρ*              Asset yield         (% pa)        -0.217     -0.133     -0.049
Θ               Time                Day           -0.014     -0.024     -0.016

The model assumes, in addition to perfect capital markets, that the underlying spot price follows a continuous geometric brownian motion with constant volatility σ(dS/S). Based on these assumptions, the Black-Scholes (1973) model, as expanded by Merton (1973), gives the value of a European call as

c = c(S, K, τ, r, r*, σ) = S e^(−r*τ) N(d1) − K e^(−rτ) N(d2)    (5.16)

where N(d) is the cumulative normal distribution function with arguments

d1 = ln(S e^(−r*τ) / (K e^(−rτ))) / (σ√τ) + σ√τ/2,    d2 = d1 − σ√τ

Delta increases with the underlying spot price. The relationship becomes more nonlinear for short-term options, for example, with an option maturity of 10 days. Linear methods approximate delta by a constant value over the risk horizon. The quality of this approximation depends on parameter values. For instance, if the risk horizon is 1 day, the worst down move in the spot price is −αSσ√Δt = −1.645 × USD 100 × 0.20 × √(1/252) = −USD 2.08, leading to a worst price of USD 97.92. With a 90-day option, delta changes from 0.536 to 0.452 only. With such a

σ_X = √( Σ_{t=1..n} (x_t − μ_X)² / (n − 1) )    (7.2)

where x_t is the return of asset X at time t and n is the number of observed points in time. The volatility of asset Y is derived accordingly. Equation 7.2 can be computed with =stdev in Excel and std in MATLAB. From our example in Table 7.1, we find that σ_X = 44.51% and σ_Y = 47.58%.

Why study financial correlations? That's an easy one. Financial correlations appear in many areas in finance. We will briefly discuss five areas: (1) investments, (2) trading, (3) risk management,

Table 7.1  Performance of a Portfolio with Two Assets

Year        Asset X    Asset Y    Return of Asset X    Return of Asset Y
2013          100        200
2014          120        230            20.00%               15.00%
2015          108        460           -10.00%              100.00%
2016          190        410            75.93%              -10.87%
2017          160        480           -15.79%               17.07%
2018          280        380            75.00%              -20.83%
Average                                  29.03%               20.07%

² To hedge means to protect. More precisely, hedging means to enter into a second trade to protect against the risk of an original trade.
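The volatility figures quoted above can be reproduced from the Table 7.1 returns with a few lines of Python. This is a sketch equivalent to =stdev in Excel or std in MATLAB, that is, the sample standard deviation with n − 1 in the denominator.

```python
import numpy as np

# Annual returns of assets X and Y from Table 7.1 (2014-2018)
x = np.array([0.2000, -0.1000, 0.7593, -0.1579, 0.7500])
y = np.array([0.1500, 1.0000, -0.1087, 0.1707, -0.2083])

print(f"mean X = {x.mean():.2%}, mean Y = {y.mean():.2%}")                 # 29.03%, 20.07%
print(f"sigma X = {x.std(ddof=1):.2%}, sigma Y = {y.std(ddof=1):.2%}")     # 44.51%, 47.58%
```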

With equal weights, ie, wX = wY = 0.5, the example in Table 7.1 results in σP = 16.66%. Importantly, the standard deviation (or its square, the variance) is interpreted in finance as risk. The higher the standard deviation, the higher the risk of an asset or a portfolio. Is standard deviation a good measure of risk? The answer is: it's not great, but it's one of the best there are. A high standard deviation may mean high upside potential of the asset in question! So it penalises possible profits! But high standard deviation naturally also means high downside risk. In particular, risk-averse investors will not like a high standard deviation, ie, high fluctuation of their returns. An informative performance measure of an asset or a portfolio is the risk-adjusted return, ie, the return/risk ratio. For a portfolio it is μP/σP, which we derived in

The negative relationship of the portfolio return/portfolio risk ratio μP/σP

option, see www.dersoft.com/quanto.xls.

The correlation between assets can also be traded directly with a correlation swap. In a correlation swap, a fixed (ie, known) correlation is exchanged with the correlation that will actually occur, called realised or stochastic (ie, unknown) correlation, as seen in Figure 7.5.

Paying a fixed rate in a correlation swap is also called "buying correlation". This is because the present value of the correlation swap will increase for the correlation buyer if the realised correlation increases. Naturally the fixed-rate receiver is "selling correlation".

The realised correlation ρ in Figure 7.5 is the correlation between the assets that actually occurs during the time of the swap. It is calculated as:

ρ_realised = 2/(n² − n) Σ_{i>j} ρ_{i,j}    (7.6)

where ρ_{i,j} is the Pearson correlation between asset i and j, and n is the number of assets in the portfolio. The payoff of a correlation swap for the correlation fixed-rate payer at maturity is:

N (ρ_realised − ρ_fixed)    (7.7)

where N is the notional amount. Let's look at an example of a correlation swap.

Example 7.1
What is the payoff of a correlation swap with three assets, a fixed rate of 10%, a notional amount of USD 1,000,000 and a 1-year maturity?

First, the daily log-returns ln(St/St−1) of the three assets are calculated for one year.³ Let's assume the realised pairwise correlations at swap maturity are as displayed in Table 7.2.

Table 7.2  Pairwise Pearson Correlation Coefficient at Swap Maturity

           S_i=1     S_i=2     S_i=3
S_i=1      1         0.5       0.1
S_i=2      0.5       1         0.3
S_i=3      0.1       0.3       1

Hence we have

ρ_realised = 2/(3² − 3) × (0.5 + 0.3 + 0.1) = 0.3

Following equation (7.7), the payoff for the correlation fixed-rate payer at swap maturity is USD 1,000,000 × (0.3 − 0.1) = USD 200,000.

³ Log-returns ln(S1/S0) are an approximation of percentage returns (S1 − S0)/S0. We typically use log-returns in finance since they are additive in time, whereas percentage returns are not. For details see Appendix A2.

Correlation swaps can indirectly protect against decreasing stock prices. As we will see in this chapter in "How does correlation risk fit into the broader picture of risks in finance?", Figure 7.8, as well as in Chapter 8, when stock prices decrease, typically the correlation between the stocks increases. Hence a fixed correlation payer protects themselves indirectly against a stock market decline.

At the time of writing there is no industry-standard valuation model for correlation swaps. Traders often use historical data to anticipate ρ_realised. To apply swap valuation techniques, we require a term structure of correlation in time. However, no correlation term structure currently exists. We can also apply stochastic correlation models to value a correlation swap. Stochastic correlation models are currently emerging.
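Example 7.1 can be checked with a short script: it averages the pairwise correlations of Table 7.2, as in equation (7.6), and applies the payoff formula (7.7). This is only an illustrative sketch of the calculation.

```python
import numpy as np

corr = np.array([[1.0, 0.5, 0.1],
                 [0.5, 1.0, 0.3],
                 [0.1, 0.3, 1.0]])   # Table 7.2
n = corr.shape[0]

# Equation (7.6): average of the n(n-1)/2 distinct pairwise correlations
rho_realised = 2.0 / (n**2 - n) * corr[np.triu_indices(n, k=1)].sum()

# Equation (7.7): payoff to the correlation fixed-rate payer
notional, rho_fixed = 1_000_000, 0.10
payoff = notional * (rho_realised - rho_fixed)
print(rho_realised, payoff)   # 0.3 and 200000.0
```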


Another way of buying correlation (ie, benefiting from an increase in correlation) is to buy put options on an index such as the Dow Jones Industrial Average (Dow) and sell put options on individual stocks of the Dow. As we will see in Chapter 8, there is a positive relationship between correlation and volatility. Therefore, if the correlation between the stocks of the Dow increases, for example in a market downturn, so will the implied volatility⁴ of the put on the Dow. This increase is expected to outperform the potential loss from the increase in the short put positions on the individual stocks.

Creating exposure on an index and hedging with exposure on individual components is exactly what the "London whale", JP Morgan's London trader Bruno Iksil, did in 2012. Iksil was called the London whale because of his enormous positions in CDSs.⁵ He had sold CDSs on an index of bonds, the CDX.NA.IG.9, and "hedged" it with buying CDSs on individual bonds. In a recovering economy this is a promising trade: volatility and correlation typically decrease in a recovering economy. Therefore, the sold CDSs on the index should outperform (decrease more than) the losses on the CDSs of the individual bonds.

But what can be a good trade in the medium and long terms can be disastrous in the short term. The positions of the London whale were so large that hedge funds "short squeezed" Iksil: they started to aggressively buy the CDS index CDX.NA.IG.9. This increased the CDS values in the index and created a huge

chapter on market risk. Market risk consists of four types of risk: (1) equity risk, (2) interest-rate risk, (3) currency risk and (4) commodity risk.

There are several concepts to measure the market risk of a portfolio such as VaR, expected shortfall (ES), enterprise risk management (ERM) and more. VaR is currently (year 2018) the most widely applied risk-management measure. Let's show the impact of asset correlation on VaR.⁶

First, what is value-at-risk (VaR)? VaR measures the maximum loss of a portfolio with respect to market risk for a certain probability level and for a certain time frame. The equation for VaR is:

VaR_P = σ_P α √x    (7.8)

where VaR_P is the value-at-risk for portfolio P, and α is the abscise value of a standard normal distribution, corresponding to a certain confidence level. It can be derived as =normsinv(confidence level) in Excel or norminv(confidence level) in MATLAB; α takes the values −∞ < α