BestMasters
Amia Santini
Excess Volatility in the Term Structure of Interest Rates, in Share Prices and in Eurozone Derivatives
BestMasters
Springer awards “BestMasters” to the best master’s theses which have been completed at renowned universities in Germany, Austria, and Switzerland. The studies received highest marks and were recommended for publication by supervisors. They address current issues from various fields of research in natural sciences, psychology, technology, and economics. The series addresses practitioners as well as scientists and, in particular, offers guidance for early stage researchers.
More information about this series at https://link.springer.com/bookseries/13198
Amia Santini University of Bologna Bologna, Italy
ISSN 2625-3577 ISSN 2625-3615 (electronic) BestMasters ISBN 978-3-658-37449-5 ISBN 978-3-658-37450-1 (eBook) https://doi.org/10.1007/978-3-658-37450-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Responsible Editor: Marija Kojic This Springer Gabler imprint is published by the registered company Springer Fachmedien Wiesbaden GmbH part of Springer Nature. The registered company address is: Abraham-Lincoln-Str. 46, 65189 Wiesbaden, Germany
Contents
1 Abstract
2 Introduction
  2.1 Definitions
3 Chapter I: Literature on the Subject of Excess Volatility
  3.1 Findings of Excess Volatility in Long-term Interest Rates
    3.1.1 The Determinants of Interest Rates: The Loanable Funds Theory
    3.1.2 Rational Expectations Models
  3.2 Findings of Excess Volatility in Long-maturity Stock Prices
    3.2.1 Ex Post Rational Prices
    3.2.2 The Dividend Series and Variance Bounds
    3.2.3 A Small Excursus on the Distributions of Prices and Information
    3.2.4 Other Interpretations of the Efficient Markets Model
    3.2.5 Empirical Results of Shiller (1981a)
  3.3 Criticisms of Shiller (1981a)
  3.4 Volatility Measures
    3.4.1 The Variance Inequalities
    3.4.2 Possible Shortcomings of the Models
  3.5 Alternatives to, or Possible Explanations in Line With, the Efficient-markets Hypothesis
    3.5.1 “Fads” Models
    3.5.2 Time-varying Real Discount Factors
    3.5.3 Time-varying Term Premia
    3.5.4 The “Peso” Problem
    3.5.5 Some Final Observations
4 Chapter II: Excess Volatility Beyond Discount Rates
  4.1 The Risk-neutral Focus
  4.2 The Methodology
    4.2.1 A Simplified Model for Measuring Excess Volatility
    4.2.2 The Affine-Q Model
    4.2.3 The Variance Ratio Test
  4.3 The Findings
    4.3.1 Variance Swaps
    4.3.2 Equity Options
    4.3.3 Currency Options
    4.3.4 Government Bond Yields
    4.3.5 Inflation Swaps
    4.3.6 Commodity Futures
    4.3.7 Credit Default Swaps
5 Chapter III: Evidence of Excess Volatility in the Eurozone Market
  5.1 The Data
    5.1.1 The EURO STOXX 50 Index
    5.1.2 The DAX Index
  5.2 The Computations
  5.3 The Results
    5.3.1 Variance Swaps
    5.3.2 Equity Options
    5.3.3 Currency Options
    5.3.4 German Bunds
  5.4 Potential Explanations
    5.4.1 Long-memory Cash Flow Dynamics
    5.4.2 Absence of Relevant Factors
    5.4.3 Non-linearities
    5.4.4 Potential Mispricing
  5.5 Natural Expectations
6 Conclusions
Bibliography
List of Tables
Table 5.1 Euro Stoxx 50 variance swap results
Table 5.2 DAX variance swap results
Table 5.3 Euro Stoxx 50 option results
Table 5.4 DAX option results
Table 5.5 EUR/USD option results
Table 5.6 German Bund results
1 Abstract
This master's thesis aims to document and study the phenomenon of excess volatility in share prices, in the term structure of interest rates and in derivative instruments. The literature on the first two is already substantial, and for this reason it is presented with no empirical extension, simply with the aim of providing an encompassing view of the results obtained thus far. The limitations of traditional rational expectations models and of the reliance on the efficient-market hypothesis are shown, as the data violate the volatility bounds derived from them. Emphasis is also placed, in turn, on the possible shortcomings of the methodologies used to uncover those inconsistencies, and on potential explanations of the observed phenomenon that can be considered in line with the rational expectations framework. Next, the focus shifts to a relatively newer field of study: derivative instruments. Previous findings of excess volatility, obtained with a worldwide focus, are presented, and an empirical analysis is performed to assess whether a similar outcome is obtained in the Eurozone market. The answer is affirmative. The study of financial information that falls under the risk-neutral measure, such as derivative prices, reduces the importance of time-varying discount rates as a potential explanation of excess volatility. In fact, the martingale measure already incorporates all potential variation in risk premia, which is the main driver of changes in discount rates. This opens the door to different and innovative prospects, and specific attention is paid to a new model of investor behaviour, that of natural expectations, which is shown to be able to fit the phenomena that are the object of this study.
Natural expectations consist of an average of rational and intuitive expectations, with a view to capturing the extrapolative bias that characterises observed investor behaviour. It is suggested that future analysis be carried out in the direction of further developing this framework.
© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2022 A. Santini, Excess Volatility in the Term Structure of Interest Rates, in Share Prices and in Eurozone Derivatives, BestMasters, https://doi.org/10.1007/978-3-658-37450-1_1
2 Introduction
The starting point of this research in the literature is Shiller (1979). Therein, the volatility at the long end of the term structure following changes in short-term interest rates is documented to exceed the level allowed for by the traditional representation of long-term interest rates. In the economic literature, the analysis is afterwards extended to stock prices, with a focus on how large the changes in expected real dividends would have to be in order to account for the high volatility of changes in detrended real share prices. In fact, standard economic models price stocks as the optimal forecast of the ex-post rational price, where the ex-post rational price is the present value of the subsequent detrended real dividends. However, by their nature, ex-post rational prices are moving averages of real ones, and they naturally present a much smoother behaviour than that of quoted prices. Bounds on the volatility of both time series are derived, and possible explanations for their violation by the empirical data are presented. The next object of study is the paper by Giglio and Kelly (2017), wherein the authors analyse the behaviour of long-end derivative prices with respect to short-end ones, with a revolutionary approach. The results reject traditional internal-consistency conditions (derived from no-arbitrage relationships) between short-term and long-term prices. However, the use of derivative prices shifts the problem under the risk-neutral (Q) measure, which already incorporates the effect of variations in discount rates. This allows the authors, and the reader, to consider the findings of excess volatility as coming from sources other than the discount rate changes that are accounted for under the Q measure. A new possible explanation, a calibrated model of investor natural expectations, is then explored.
This opens the door to a new understanding of investor rationality and irrationality, which takes elements and results of behavioural economics and develops them into well-specified economic models.
This master's thesis not only focusses on the past findings of excess volatility in the term structure of interest rates, in share prices and in derivative prices, but also seeks to extend them. In the literature explored, the results on derivatives have been documented almost exclusively for the U.S. market, with a minimal extension to the European one and to a handful of other countries. The empirical portion of this work therefore seeks to understand whether the derivatives market of Eurozone countries also exhibits this phenomenon, to what extent, and why. The expectation is for the findings to be comparable to, if not larger than, those observed for American financial products. The reason for expecting a possibly larger effect is the relatively smaller size of European capital markets, and thus the lower liquidity, which may increase deviations from the models. The metrics used in the past research explored and in this work are also analysed in terms of their limitations and potential. Possible future prospects for research are also illustrated, with an eye to the potential of the natural expectations model and to whether, and in which cases, it is able to account for the empirically observed excess volatility. One final goal of this thesis is to provide a comprehensive view and collection of the aforementioned area of academic interest, a task which seems not to have yet been carried out to such an extent. Previous efforts found on the subject seem to stop at the analysis of interest rates and share prices. Special attention is also devoted to the inclusion of critiques, made by previous authors or by the writer of this thesis, of the articles that are illustrated, with the aim of presenting the topic from a multitude of perspectives and not simply absorbing the literature uncritically.
Some of the arguments in the articles presented are of an econometric nature and revolve around the mathematical limitations of the models used for the study. Others are more intuitive and critique the logic behind some of the essential intuitions used to reach the conclusions presented. Concerning the methodological approach, the first portion of the thesis focusses on the analysis of past findings of excess volatility in the market. The methods used are qualitative, with attention paid to literature research and to the theoretical understanding of the models used and the assumptions that they rely on. The empirical portion of the master's thesis, on the other hand, involves an analysis of derivative instruments on the Eurozone market, and of whether the findings of excess volatility match, exceed or fall short of those evidenced by Giglio and Kelly (2017). It therefore mostly employs quantitative methods, such as linear regressions, variance ratio tests, linear algebra computations, and software programming. The software used is primarily “Python”, with the
open-source package extensions “NumPy”, “SciPy”, and “SymPy”. The software “Gretl” is also instrumental in estimating the parameters of linear regressions, in performing “White tests” for heteroscedasticity and in estimating models with heteroscedasticity-robust standard errors. Finally, the “Matlab” numerical computing environment is also used, though to a more limited extent and as a way to double-check the results, and “Python” scripts of the relevant portions of the analysis are reported. The empirical section also includes some qualitative methods insofar as the results obtained are linked to papers and theories that have already been elaborated by academics and practitioners. The relevant literature expands on the foundations laid by Shiller (1979, 1981a, 1981b), with the backbone of the computational analysis being the more recent paper by Giglio and Kelly (2017). As for the structure of the study, this master's thesis consists of three chapters. The first one analyses the past findings and academic publications on the subject of excess volatility, from interest rates to stock prices. It includes a section on the volatility measures used, with a careful eye on their possible shortcomings. A first series of possible explanations brought forward by the past literature is also explored. The second chapter focusses on the innovative paper by Giglio and Kelly (2017), with emphasis placed on the potential of the risk-neutral analysis that they performed. The research methodology is also illustrated, both in its technical nature and in its results. The third and final chapter contains the empirical part of the thesis. It performs the same tests for excess volatility as Giglio and Kelly, but on Eurozone market data collected up to the first weeks of April 2019. Possible explanations for the observed phenomena are brought forward, and the subject of new behavioural models for investor preferences is explored.
Finally, in the last section of the thesis, further questions are raised, and conclusions are drawn.
2.1 Definitions
In order to facilitate the understanding of the contents of this master's thesis, the following terms require definition.
Efficient-market hypothesis (EMH): the hypothesis of a market in which “prices at every point in time represent best estimates of intrinsic values”¹ and adjust instantaneously to changes in the intrinsic value.
¹ Fama (1965) p. 94.
Excess volatility: a level of volatility exceeding that predicted by models relying on the traditional efficient-market hypothesis.
Homoscedasticity: the feature of a sequence or a vector of random variables by which all random variables have the same finite variance.²
Kurtosis: in statistics, it is “the degree of peakedness of the graph of a statistical distribution, indicative of the concentration around the mean”.³ It is computed as the ratio between the fourth central moment of a distribution and the square of its second central moment (the variance). Whether kurtosis is “high” or “low” is judged against the value taken by the normal distribution, which is 3. Thicker (“fatter”) tails are associated with high kurtosis, whereas lighter tails are associated with low kurtosis.
Risk-neutral measure (or equivalent martingale measure): the probability measure under which the current price of a stock equals the expectation of its future price discounted at the risk-free rate. In more general terms, under the risk-neutral measure, the unique no-arbitrage price associated with any attainable contingent claim is given by the expectation of its discounted payoff, where the relevant rate is again the risk-free one. Furthermore, “the existence of a unique equivalent martingale measure […] not only makes the markets arbitrage free, but also allows the derivation of a unique price associated with any contingent claim.”⁴
² Ramanathan (1995) p. 86.
³ Kurtosis (n. d.): In Collins English Dictionary, retrieved from https://www.collinsdictionary.com/us/dictionary/english/kurtosis, last update April 15, 2019, accessed April 20, 2019.
⁴ Brigo/Mercurio (2006) p. 26.
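The definition of kurtosis given in this section can be illustrated with a short Python sketch. It is not taken from the thesis: the function name `kurtosis`, the sample size and the choice of a Laplace distribution as the fat-tailed example are assumptions made purely for illustration. The sketch computes the ratio of the fourth central moment to the squared variance and compares a normal sample (benchmark value 3) with a fat-tailed one (theoretical value 6 for the Laplace distribution).

```python
import random


def kurtosis(sample):
    """Return the kurtosis: fourth central moment divided by the squared variance."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in sample) / n  # fourth central moment
    return m4 / m2 ** 2


rng = random.Random(0)  # fixed seed so that the run is reproducible
n_obs = 200_000         # hypothetical sample size

# Normal draws: the sample kurtosis should be close to the benchmark of 3.
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]

# Laplace draws, built as the difference of two independent exponentials:
# an assumed fat-tailed example with theoretical kurtosis 6.
laplace_sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n_obs)]

k_normal = kurtosis(normal_sample)
k_laplace = kurtosis(laplace_sample)
print(f"normal: {k_normal:.2f}, Laplace: {k_laplace:.2f}")
```

The fat-tailed sample returns a value clearly above 3, which is the sense in which the text associates thicker tails with high kurtosis.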
3 Chapter I: Literature on the Subject of Excess Volatility
3.1 Findings of Excess Volatility in Long-term Interest Rates
3.1.1 The Determinants of Interest Rates: The Loanable Funds Theory
The traditional understanding of the determination of interest rates derives from macroeconomics, more specifically from the loanable funds theory. All sources of credit, from household savings to bank loans, make up the demand for and supply of loanable funds, which meet at the equilibrium level of interest rates in the economy. In such a setting, the main drivers of the behaviour of interest rates are expected economic growth, unexpected inflation, public deficits and public consumption. This theory, like most theories in macroeconomics, provides a simplified understanding of reality. Naturally, in order to be tractable, models cannot include all real variables, even if it were possible to identify all of them. However, already at the conference The Determination of Long-Term Interest Rates and Exchange Rates and the Role of Expectations, held in 1996 by the Bank for International Settlements, speakers M. Dombrecht and R. Wouters highlighted two shortcomings of this theory. The first one involves the role of public deficits, which are expected to increase real rates of interest because they decrease the supply of loanable funds. However, the direction of this effect might be inverted if private agents increase savings in expectation of a future rise in taxation. This highlights the disconnect between the model and microeconomic foundations, including the problem of intertemporal allocation. The second shortcoming is the absence of
a division between short- and long-term interest rates, in terms of the different effects that their determinants have on them. Perhaps a model more consistent with the needs of this research is one that centres on the intertemporal allocation problem of consumption and savings, with the maximisation of the expected value of a logarithmic utility function $U$ of consumption $C_t$, discounted by $\rho^k$, as proposed by Dombrecht and Wouters. The household allocates its resources between domestic and foreign bonds, in order to state the optimization problem in terms of bond holding-period returns. The relevant formula is:

$$\max \; E_t \sum_{k=0}^{\infty} \rho^{k}\, U(C_{t+k}),$$

and the budget constraint is:

$$\sum_{i=k+1}^{\infty} b_{t+k}^{t+i} B_{t+k}^{t+i} + \sum_{i=k+1}^{\infty} f_{t+k}^{t+i} s_{t+k} F_{t+k}^{t+i} = B_{t+k-1}^{t+k} + s_{t+k} F_{t+k-1}^{t+k} + \sum_{i=k+1}^{\infty} b_{t+k}^{t+i} B_{t+k-1}^{t+i} + \sum_{i=k+1}^{\infty} f_{t+k}^{t+i} s_{t+k} F_{t+k-1}^{t+i} + Y_{t+k} - C_{t+k}.$$

The factors $b_t^{t+i}$ and $B_t^{t+i}$ represent the time $t$ price of a domestic discount bond which pays one unit of domestic currency at time $t+i$ and the number of such bonds held by the household at time $t$, respectively. The factors $f_t^{t+i}$ and $F_t^{t+i}$ represent the time $t$ price of a foreign discount bond which pays one unit of foreign currency at time $t+i$ and the number of such bonds held by the household at time $t$, respectively, while $s$ is the exchange rate from foreign to domestic currency and $Y$ is the household income. Following a standard Lagrangian optimization procedure (for more detail, please refer to Dombrecht and Wouters (1996)), the problem is restated from holding-period returns to bond yields. In the paper, the authors are able to identify how the key macroeconomic determinants of equilibrium rates mentioned at the beginning of this section (expected economic growth, unexpected inflation, public deficits and public consumption) affect short-term and long-term rates differently. To achieve this, they make use of historical data for the German bond market. However, due to the changes that have occurred since 1996, including the creation of the single European currency and market and the multiple financial crises, the empirical results will not be reported. The replication of such an analysis with current market data could be an interesting further field of research
in the areas of macroeconomics and monetary policy; it is, however, beyond the scope of this thesis. For the purpose of this study, it suffices to note that a distinction between short-term and long-term rate determinants was first made through the extension of the loanable funds theory, and then supported by the empirical data. This is an important step towards defining the links between the two.
3.1.2 Rational Expectations Models
It is now time to focus on the theoretical background that acts as a basis for the findings of excess volatility in both share prices and interest rates: the rational expectations model, according to which prices at any point in time reflect all the information that is available to the public at that moment. Such information can be linked back to macroeconomic factors, such as, among others, changes in the monetary base, in fiscal policy, in factor prices and in expectations about inflation:

$$E(P_t) = E(P_t \mid I_{t-1}).$$

In the context of interest rates, the underlying hypothesis postulates that the long-term rate ($R_t$) is composed of the market's expectations of the future one-period short-term rates ($r_t$) at all discrete points in time leading up to $t + n$. To this, a risk premium, which in standard approaches is taken as constant, is added, as in Guidolin and Thornton (2008). In Shiller (1979), the starting point of the analysis of excess volatility is the following equation:

$$R_t^{n} = \frac{1-\gamma}{1-\gamma^{n}} \sum_{k=0}^{n-1} \gamma^{k} E_t(r_{t+k}) + \phi_n, \qquad \text{(i)}$$

where $\gamma$ is a constant between 0 and 1, $\gamma = 1/(1 + \bar{R})$, $\phi_n$ is the constant liquidity premium for maturity $n$, and $\bar{R}$ is the discount rate. In this formula, the author employs truncated-exponential (or “Koyck” distribution) weights, scaled so that they add up to one. The scaling follows from the closed form of a finite geometric sum:

$$\sum_{k=0}^{n-1} a r^{k} = a\,\frac{1-r^{n}}{1-r}.$$
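As a minimal illustration of the weighting scheme in equation (i), the following Python sketch builds the scaled truncated-exponential weights and checks that they sum to one. The function names `koyck_weights` and `long_rate`, as well as all numerical inputs, are hypothetical and are not taken from Shiller (1979) or from the thesis.

```python
def koyck_weights(gamma, n):
    """Weights (1 - gamma) / (1 - gamma**n) * gamma**k for k = 0, ..., n - 1."""
    scale = (1.0 - gamma) / (1.0 - gamma ** n)
    return [scale * gamma ** k for k in range(n)]


def long_rate(expected_short_rates, gamma, premium=0.0):
    """Weighted average of expected short rates plus a constant premium, as in (i)."""
    weights = koyck_weights(gamma, len(expected_short_rates))
    return sum(w * r for w, r in zip(weights, expected_short_rates)) + premium


# Hypothetical inputs: an average long rate of 5% gives gamma = 1 / 1.05.
gamma = 1.0 / 1.05
weights = koyck_weights(gamma, n=20)

# With a flat path of expected short rates at 4% and a premium of 0.5%,
# the long rate is simply 4% + 0.5%, because the weights add up to one.
flat_path = [0.04] * 20
r_long = long_rate(flat_path, gamma, premium=0.005)
print(sum(weights), r_long)
```

Since $0 < \gamma < 1$, the weights decline geometrically with the horizon $k$, so expectations about the near future count more than those about the distant future.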
In this way, different weights are attributed to the expectations of short-term rates at different horizons. In particular, a lower weight is given to rates further in the future, and a greater one to those nearer to $t$. Additionally, Shiller shows how, thanks to the scaling factor, this formula applies not only to the more standard zero-coupon bonds, but also to coupon-bearing bonds. This explanation is beyond the scope of this master's thesis; the interested reader can consult Shiller (1979). A further look is then taken at the one-period holding yield of a bond with maturity $n$ during period $t$, $H_t^{(n)}$. In general terms it can be represented as:

$$H_t^{(n)} = \frac{P_{t+1}^{(n-1)} - P_t^{(n)} + C}{P_t^{(n)}},$$

where $C$ is the coupon payment at the end of the period and $P_t^{(n)}$ is the price at time $t$ of a bond with maturity $n$. It is worth noting that the same bond at time $t + 1$ will have a remaining maturity of $n - 1$ periods. The relation between the expected one-period holding yield and the short-term rate is then described as:

$$E_t\left(H_t^{(n)}\right) = r_t + \phi^{(n)},$$

where $\phi^{(n)}$ is again the liquidity premium for maturity $n$. In line with a CAPM-like approach, the liquidity premium is taken as $\beta^{(n)}[E(R_m) - r]$, with $R_m$ being the return on the market portfolio, and $\beta^{(n)} = \operatorname{cov}(H_t^{(n)}, R_m)/\operatorname{var}(R_m)$. As already assumed in past literature,¹ this premium is taken as constant over time intervals of limited length, although it is not necessarily so in general.
After these definitions, the paper considers the ex-post rational long-term interest rates $R_t^*$, which apply to a perpetual bond ($n = \infty$) and are computed as the value that long-term rates should have taken, under the rational expectations model, given the realized short-term rates. It is essential to compare the smoothness of ex-post rational long-term rates ($R_t^*$) with the highly volatile behaviour of long-term interest rates ($R_t$). The paper studies this observable difference in volatility. It is additionally worth noting that all of the data of the study refer exclusively to the U.S. market.² The short-term rate is the 4–6-month prime commercial paper rate from 1966 (first quarter) to 1977 (second quarter). The long-term rate ($R_t$) is the Federal Reserve recently offered AAA utility bond yield series. Both rates are collected at quarterly frequency, in the first week of the quarter.
¹ Black/Scholes (1972), Friend/Blume (1970).
² Shiller (1979) p. 1191.
From equation (i) and from the definition of $R_t^*$, it follows that $R_t = E_t(R_t^*) + \phi$. This in turn means that the expectation of the forecast error $\varepsilon_t = R_t^* + \phi - R_t$, conditional on information known at time $t$, must be zero. Consequently, $\varepsilon_t$ is uncorrelated with all information known at time $t$, including past interest rates. From the definitions of correlation and covariance, it follows that:

$$E\left[(R_t^* + \phi - R_t)\, R_{t-\tau}\right] = E\left[\varepsilon_t R_{t-\tau}\right] = 0, \qquad \tau \ge 0,$$

$$E\left[(R_t^* + \phi - R_t)\, r_{t-\tau}\right] = 0, \qquad \tau \ge 0,$$

where the expectation no longer has the subscript $t$, as it is unconditional. From econometric arguments (for more detail please refer to Shiller 1979), the author then derives the following bounds for the volatility of interest rates:

$$\operatorname{var}(R_t) \le \operatorname{var}(R_t^*) \le \operatorname{var}(r).$$

The restrictions are then analysed empirically in terms of bond returns. The sample volatility is revealed to violate all of the restrictions imposed by the model. Concerning the predictive power of the expectations model, Shiller concludes that the sample regression shows that movements of long-term rates tend to be in the direction opposite to the one predicted by the model. That is the essence of the paper: traditional models based on rational expectations severely underestimate the actual volatility of long-term rates. From here onwards, traditional models will begin to be challenged based on their ability to correctly predict the volatility of the dependent variables. The next notable example that this analysis will consider is Shiller (1981a), which involves stock prices.
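To make the bounds $\operatorname{var}(R_t) \le \operatorname{var}(R_t^*) \le \operatorname{var}(r)$ discussed in this section more concrete, the sketch below simulates a world in which the expectations model holds by construction. It is only an illustration under stated assumptions and is not Shiller's procedure: the short rate is assumed to follow an AR(1) process, the bond is a perpetuity, the premium is set to zero, and every parameter value is hypothetical.

```python
import random
import statistics

# Hypothetical parameters, chosen only for illustration.
GAMMA = 0.95    # discount factor gamma
RHO = 0.90      # persistence of the assumed AR(1) short rate
MU = 0.05       # unconditional mean of the short rate
SIGMA = 0.002   # standard deviation of the short-rate shock
N_OBS = 20_000  # number of simulated periods
TRIM = 500      # final observations dropped to remove the terminal-value effect

rng = random.Random(42)

# Short rate: r_t = MU + RHO * (r_{t-1} - MU) + shock_t.
short = [MU]
for _ in range(N_OBS - 1):
    short.append(MU + RHO * (short[-1] - MU) + rng.gauss(0.0, SIGMA))

# Ex-post rational rate for a perpetuity, built backwards from
# R*_t = (1 - GAMMA) * r_t + GAMMA * R*_{t+1}, with terminal value MU.
ex_post = [MU] * N_OBS
for t in range(N_OBS - 2, -1, -1):
    ex_post[t] = (1.0 - GAMMA) * short[t] + GAMMA * ex_post[t + 1]

# Model long rate R_t = E_t(R*_t) with a zero premium; under the AR(1)
# assumption this conditional expectation is linear in r_t.
loading = (1.0 - GAMMA) / (1.0 - GAMMA * RHO)
long_rate = [MU + loading * (x - MU) for x in short]

var_long = statistics.pvariance(long_rate[:-TRIM])
var_ex_post = statistics.pvariance(ex_post[:-TRIM])
var_short = statistics.pvariance(short[:-TRIM])
print(var_long, var_ex_post, var_short)
```

In this artificial setting the long rate is the least volatile of the three series. The empirical finding reported above is that actual long-term rates do not respect this ordering.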
3.2 Findings of Excess Volatility in Long-maturity Stock Prices
As mentioned at the end of the previous section, in Shiller (1981a) the author extends his analysis of excess volatility to the universe of stock prices. In the same year, Stephen F. LeRoy and Richard D. Porter published the paper “The Present-Value Relation: Tests Based on Implied Variance Bounds”, where they independently derived and conducted tests of expectations models of stock prices. These tests involve confidence intervals for volatility and are broadly comparable to those used by Shiller, which are explained hereafter.
Chapter I: Literature on the Subject of Excess Volatility
The starting point in Shiller (1981a) is again the efficient markets model. In this context, the efficiency theory postulates that the real price at the beginning of period t is given by the sum of expected future dividends ($D_t$), each discounted by a constant discount factor $\gamma$:

\[ P_t = \sum_{k=0}^{\infty} \gamma^{k+1} E_t D_{t+k}, \qquad 0 < \gamma < 1, \]

where $\gamma = 1/(1 + r)$, with r the discount rate estimated from the historical data as average dividends divided by average price. The fact that the discount factor is assumed to be constant is worth highlighting. In fact, it will be the limitation that Giglio and Kelly (2017) will seek to overcome. The expectation is also conditioned on the information available at time t. Additionally, $\gamma$ is restricted to lie strictly below one, which rules out zero or negative discount rates; more modern studies have, in general, evolved to allow for them. At the time of the analysis, however, this phenomenon had not yet been observed. The one-period return (from t to t + 1) is denoted by $H_t \equiv (\Delta P_{t+1} + D_t)/P_t$, with $\Delta P_{t+1} = P_{t+1} - P_t$, from which it can readily be shown that $E_t(H_t) = r$:

\[ E_t(H_t) = E_t\left[(\Delta P_{t+1} + D_t)/P_t\right] = E_t(P_{t+1})/P_t - 1 + D_t/P_t, \]
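where:

\[ E_t(P_{t+1}) = E_t\left[\sum_{k=0}^{\infty} \gamma^{k+1} E_{t+1} D_{t+1+k}\right]. \]

By the tower property of conditional expectations, $E(E(X \mid \mathcal{M}) \mid \mathcal{F}) = E(X \mid \mathcal{F})$ if $\mathcal{F} \subseteq \mathcal{M}$. Since the information set at time t is a subset of the information set at time t + 1, we obtain:

\[ E_t\left[\sum_{k=0}^{\infty} \gamma^{k+1} E_{t+1} D_{t+1+k}\right] = \sum_{k=0}^{\infty} \gamma^{k+1} E_t D_{t+1+k} = P_t/\gamma - D_t. \]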
By substituting $E_t(P_{t+1})$ into the formula for the period return, we obtain:

\[ E_t(H_t) = E_t(P_{t+1})/P_t - 1 + D_t/P_t = 1/\gamma - D_t/P_t - 1 + D_t/P_t = 1 + r - 1 = r. \]

The model is then restated allowing for a growth rate (g) in the real value of the portfolio: we take $\lambda^{t-T} = (1 + g)^{t-T}$, with T as the base year, $p_t = P_t/\lambda^{t-T}$, $d_t = D_t/\lambda^{t+1-T}$, and $\bar{\gamma} = \lambda\gamma = 1/(1 + \bar{r})$. The value of $\bar{r}$ is defined so that the last equality holds, and it is positive. The formula now becomes:

\[ p_t = \sum_{k=0}^{\infty} \bar{\gamma}^{k+1} E_t d_{t+k}. \tag{ii} \]
The scaling factor lambda has the effect of eliminating the heteroscedasticity that could originate from the gradually increasing size of the market.
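The result that the one-period return equals r, and the definition of the rescaled rate, can be checked numerically in a deliberately simple case. The sketch below assumes perfect foresight with dividends growing at a constant rate g (so that expectations coincide with realisations) and truncates the infinite sum; the function name `price` and all numerical inputs are hypothetical choices made for illustration only.

```python
def price(t, d0, g, r, n_terms=5000):
    """P_t = sum_{k>=0} gamma^(k+1) * D_{t+k}, truncated after n_terms terms.

    Assumes known dividends D_s = d0 * (1 + g)^s, so E_t D_{t+k} = D_{t+k}.
    """
    gamma = 1.0 / (1.0 + r)
    return sum(
        gamma ** (k + 1) * d0 * (1.0 + g) ** (t + k)
        for k in range(n_terms)
    )


# Hypothetical inputs: dividend level, growth rate and discount rate.
d0, g, r = 1.0, 0.02, 0.05

p0 = price(0, d0, g, r)
p1 = price(1, d0, g, r)
holding_return = (p1 - p0 + d0) / p0   # H_0 = (delta P_1 + D_0) / P_0

gamma_bar = (1.0 + g) / (1.0 + r)      # gamma_bar = lambda * gamma, lambda = 1 + g
r_bar = 1.0 / gamma_bar - 1.0          # r_bar defined by gamma_bar = 1 / (1 + r_bar)

print(f"H_0 = {holding_return:.6f}, r = {r}")
print(f"r_bar = {r_bar:.6f}")
```

In this special case the rescaled rate reduces to (r - g)/(1 + g), which is positive only when the discount rate exceeds the growth rate.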
3.2.1
Ex Post Rational Prices
From this point onwards, the procedure mirrors that of Shiller (1979): as previously done for the ex post rational long-term rates, the focus now shifts to the ex post rational price series $p_t^*$, which is related to the actual price series $p_t$ through the relationship $p_t = E_t(p_t^*)$, where:

\[ p_t^* = \sum_{k=0}^{\infty} \bar{\gamma}^{k+1} d_{t+k}. \]
We now assume that the terminal value of the ex post rational price series is known and denote it by $p^*$. In this case, the series can be decomposed into:

\[ p_t^* = \sum_{k=0}^{\infty} \bar{\gamma}^{k+1} d_{t+k} = \bar{\gamma} d_t + \bar{\gamma}^2 d_{t+1} + \bar{\gamma}^3 d_{t+2} + \dots + p^*. \tag{iii} \]

We substitute the expression of $p_{t+1}^*$,

\[ p_{t+1}^* = \sum_{k=0}^{\infty} \bar{\gamma}^{k+1} d_{t+1+k}, \tag{iv} \]

into (iii):

\[ p_t^* = \bar{\gamma} d_t + \bar{\gamma} p_{t+1}^*, \]

\[ p_t^* = \bar{\gamma}\left(p_{t+1}^* + d_t\right), \tag{v} \]

and obtain the recursive formula (v), which can be used to evaluate the entire ex post series by working backwards from the terminal date. Given that $\bar{\gamma} < 1$, the impact of the choice of $p^*$ on $p_t^*$ is negligible for dates sufficiently far from the end of the sample. The real modified Dow Jones Industrial Average is compared with the ex post rational price. The data are annual, from 1928 to 1979. Some adjustments were made to ensure that the data referred to the same 30 companies from the beginning of the sample period to the end, although the composition of the DJIA changed over time.
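The backward recursion (v) is straightforward to implement. The following is a minimal sketch, not Shiller's own code: the dividend series is a hypothetical flat series, the rate of 5% is an arbitrary assumption, and the function name `ex_post_rational_prices` is introduced here for illustration. It also shows how the effect of the chosen terminal value shrinks as one moves back in time.

```python
def ex_post_rational_prices(dividends, gamma_bar, terminal_value):
    """Apply p*_t = gamma_bar * (p*_{t+1} + d_t) backwards from the terminal value."""
    prices = [0.0] * len(dividends)
    next_price = terminal_value
    for t in reversed(range(len(dividends))):
        next_price = gamma_bar * (next_price + dividends[t])
        prices[t] = next_price
    return prices


dividends = [1.0] * 50      # hypothetical detrended dividend series
gamma_bar = 1.0 / 1.05      # assumes r_bar = 5%

low = ex_post_rational_prices(dividends, gamma_bar, terminal_value=10.0)
high = ex_post_rational_prices(dividends, gamma_bar, terminal_value=30.0)
flat = ex_post_rational_prices(dividends, gamma_bar, terminal_value=20.0)

gap_start = high[0] - low[0]    # effect of the terminal value at the first date
gap_end = high[-1] - low[-1]    # effect of the terminal value at the last date
print(f"gap at start: {gap_start:.3f}, gap at end: {gap_end:.3f}")
```

With a flat dividend of 1 and a rate of 5%, a terminal value of 20 is the perpetuity value, so the series `flat` stays constant at that level.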
3.2.2
The Dividend Series and Variance Bounds
One final item to be defined is the so-called innovation operator $\delta_t = E_t - E_{t-1}$, which is applied to a variable to represent the change in the conditional expectation of its future values. The model in (ii) is then restated as:

\[ \delta_t p_t = \sum_{k=0}^{\infty} \bar{\gamma}^{k+1} \delta_t d_{t+k}. \]
Next, the focus is shifted to the dividends. The premise of the efficient markets model is that sudden changes in stock prices can be attributed to changes in expectations of future dividends. For this reason, the computations that were made up to this point are continued and extended, one final time, to the series of dividends $d_t$. In terms of the innovation operator $\delta_t$, the dividend series at time t can be expressed as its unconditional expectation plus the sum of its subsequent innovations:

\[ d_t = E(d) + \sum_{k=0}^{\infty} \delta_{t-k} d_t. \]
After having defined the model through all of the previous steps, the results are finally put together in a procedure of maximum-likelihood optimization that
parallels the one of subparagraph 3.1.2. Upper bounds are thus identified for the standard deviation of the innovation in prices for a given level of volatility in dividends. The formulas and intuitions behind those inequalities are explored in paragraph 3.4 of this master's thesis, where they are generalised in order to highlight the link between the universe of interest rates (as seen in paragraph 3.1) and that of stock prices, as in the present section.
3.2.3
A Small Excursus on the Distributions of Prices and Information
The normal distribution occurs very commonly in nature, as it serves, for example, as an extraordinarily accurate representation of human metrics, such as height or weight. Under the central limit theorem (CLT from here onwards), the normalised sum of independent identically distributed random variables with finite variance converges in distribution to a standard normal, even if the variables themselves are not normally distributed. This allows statisticians to apply tests that are based on the standard normal distribution to frequently occurring phenomena that satisfy the conditions of the CLT, provided that the sample size is sufficiently large (in the literature, traditionally at least 30 observations). Additionally, the normal distribution is entirely described by only two parameters, the mean and the standard deviation, which are independent of one another. This implies that errors in the estimation of one parameter do not affect the accuracy of the other. Risk can thus be encompassed by one metric, the variability (volatility) of the values that the distribution can take around the mean. For all of these reasons, the normal distribution is usually the preferred choice in modelling.

However, there exists an extensive literature³ that studies the empirical distribution of returns. The overarching issue concerns the frequency of extreme (“tail”) negative events, which appears to be inconsistent with the level of kurtosis of the Gaussian distribution: over the last few decades, the number of financial crises that have occurred seems to be inconsistent with the infinitesimal probability that the normal model would attribute to them. A simple enumeration suffices to highlight this: the debt crisis of Latin American countries in the 1980s (Argentina, Mexico, and Brazil), the “Black Monday” stock market crash of 1987, the savings and loan crisis in the U.S. from 1989 to 1991, the Russian default crisis and the LTCM hedge fund crisis of 1998, the “dot-com” bubble burst of 2000–2001, the sub-prime mortgage crisis of 2007–2008, and the European sovereign debt crisis of the 2010s.

³ Amihud/Mendelson (1987), Oldfield/Rogalski (1980), Ané/Geman (2000).

Shiller (1981a) also highlights the non-normality of returns and links it to the flow of information as a way of providing an intuitive explanation of the upper variance bound. If new information comes “in big lumps infrequently”, then changes in the time series of prices will mostly take small values, with occasional very large ones. This will, in turn, imply high kurtosis. However, the variance of the process δp will not be particularly large. The author notes that the long time spans characterised by rather small changes (corresponding to periods in which no information is revealed) have a stronger effect in decreasing variance than the large changes have in increasing it. On the other hand, through an analysis of the maximisation procedure, the author observes that variance is maximised when information is revealed smoothly.
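The link between lumpy information and high kurtosis can be illustrated with a small simulation. This is a stylised sketch and not Shiller's maximisation argument: it assumes that price changes are either standard normal every period ("smooth") or standard normal only in the 5% of periods in which news arrives and zero otherwise ("lumpy"). The arrival probability, the seed, and the helper names are arbitrary assumptions.

```python
import random
import statistics


def central_moment(values, order):
    """Sample central moment of the given order."""
    mean = statistics.fmean(values)
    return statistics.fmean([(v - mean) ** order for v in values])


def kurtosis(values):
    """Fourth central moment over squared variance (equal to 3 for a normal)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


rng = random.Random(0)
n_periods = 100_000
news_probability = 0.05   # hypothetical: news arrives in 5% of periods

smooth = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
lumpy = [
    rng.gauss(0.0, 1.0) if rng.random() < news_probability else 0.0
    for _ in range(n_periods)
]

kurt_smooth, kurt_lumpy = kurtosis(smooth), kurtosis(lumpy)
var_smooth, var_lumpy = central_moment(smooth, 2), central_moment(lumpy, 2)
print(f"kurtosis: smooth {kurt_smooth:.2f}, lumpy {kurt_lumpy:.2f}")
print(f"variance: smooth {var_smooth:.3f}, lumpy {var_lumpy:.3f}")
```

Under these assumptions the lumpy series has far heavier tails, while its variance is pulled down by the many periods with no change.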
3.2.4
Other Interpretations of the Efficient Markets Model
Other interpretations of the efficient markets model might see prices as the present value of expected earnings instead of dividends. Earnings are an accounting entry that includes both funds that are retained by the company and dividends that are paid out to shareholders. Retained earnings are reinvested with the goal of generating further earnings. For this reason, considering the time series of earnings as the determinant of prices would involve a form of double counting. Modigliani and Miller (1961) elaborated a model for efficient markets that represented prices as the discounted value of future earnings, with adjustments made to solve the double-counting problem. That formula can be shown to be equal to (ii)⁴.

⁴ Shiller (1981a) p. 429.

3.2.5
Empirical Results of Shiller (1981a)
Shiller performs the analysis on two datasets: the first is the real Standard and Poor’s Composite Stock Price Index and the associated dividend series, annual from 1871 to 1979; the second is the Dow Jones Industrial Average (modified to reflect the performance of a portfolio composed of the same 30 stocks) and the associated dividend series, annual from 1928 to 1979. In both samples, all restrictions on price volatility, derived from a model that links changes in price to changes in expected real dividends, are violated by a factor of at least 5, and up to 13. This means that the arrival of new information about future real dividends cannot explain the volatility of stock prices. The size of the mismatch is so great that it cannot be attributed to reasons other than the failure of the efficient markets model itself. Issues such as data collection errors or changes in tax laws would not, according to Professor Shiller, be able to lead to such a strong rejection of the model.

The author then tries to understand whether changes in the expected real interest rate could reasonably provide an explanation for such high volatility. He computes what value the standard deviation of real discount rates would have to take in order to account for the discrepancy in the results. The conclusions are quite unrealistic, as the obtained levels (4.36% and 7.36% for the two indices respectively) imply ranges (at a 97.5% significance level) from −3.91 to 13.52% and from −8.16 to 17.27% for the two data sets. These ranges differ greatly from the historical variability shown over the century. However, it is worth noting that real interest rates cannot be directly observed, which precludes a direct statistical test of the relationship between them and price variations. This last point is of particular importance in the light of Giglio and Kelly (2017), a paper which will be analysed in the next chapter of this master's thesis. The study of the two authors manages to work around the impact of changes in discount rates by analysing price variations of derivatives, which occur under the risk-neutral measure.
3.3
Criticisms of Shiller (1981a)
The final argument brought forward by Shiller is of what he describes as an “academic”⁵ nature. He argues that the efficient markets model could be saved under the assumption that the sample standard deviation of the movements of real dividends around their long-run growth path is not an accurate measure of the uncertainty of future dividends. In fact, the market could historically have feared much larger changes than those actually realised over time. This seems to be an unreliable explanation, as even during the Great Depression dividends went substantially below their long-run growth path only for a handful of years. It would in turn mean that, in order to account for the excessive variability observed across the overall time horizon of the study, the general market consensus must have been that movements in real dividends would, at any time, be much larger than even those observed in the 1930s.

⁵ Shiller (1981a) p. 434.
A comment on this last point was made by Basil L. Copeland, Jr. (Copeland 1983). In his concise argument, he describes how changes in dividends could have a larger effect than anticipated. Investors might see them as fundamental, as opposed to transitory, changes in a company’s retention policy, where changes in funds available for reinvestment would in turn affect the growth potential of the firm and the overall dividend growth rate. In that case, the effect on price volatility would be different, and greater, as Copeland goes on to illustrate, because even a small deviation of the real dividend from its expectation can have a very large effect on prices if it is interpreted as a fundamental change in the dividend growth rate. High volatility of prices would therefore be consistent with the efficient markets hypothesis if market expectations of the future dividend growth rate change frequently.

Professor Shiller’s reply⁶ was published in the same issue of The American Economic Review as the above comment. In it, he traced Copeland’s argument back to an observation that he had apparently heard many times since the publication of his 1981a paper: the possibility that the dividend process is non-stationary, meaning that its mean and variance change over time (in that argument, due to investors’ belief that changes affect the fundamentals of a company). Shiller acknowledges that this possibility may hold in reality, and stresses that he does not assert that all the model assumptions literally hold true. The alternative tests for excess volatility would, however, need to consider a nonlinear measure of the uncertainty of dividends, which could complicate the analysis. This last point is expanded with a more careful look at the real dividend series since 1871, which is described as appearing to be dominated by a growth trend and a mean-reversion tendency. Additionally, it is visible to the naked eye that deviations from this trend tend to offset one another over time. This leads Shiller to infer more stationarity around the trend than either the “transitory” or the “fundamental” explanations of Copeland would suggest, where the mathematical translation of the “transitory” feature of changes is the assumption of log dividends behaving as a random walk.

⁶ Shiller (1983).

In a random walk, the path is built from a series of independent, identically distributed increments, meaning that each movement at time t takes a value independently of the preceding ones. The dividend process can be represented in the following way:

\[ \log(d_t) = \log(d_{t-1}) + a_t, \]

with $d_0 > 0$ as the initial value of dividends and where $\{a_t\}$ is a white noise series. The usual assumption is for $\log(d_t)$ to have normal increments and for $a_t$ to be centred around zero. The variance of a random walk increases linearly with time, as it is the sum of the variances of the single increments. In fact, if we denote by $S_t$ the current position of the random walk,

\[ \operatorname{var}(S_t) = \operatorname{var}(a_1 + a_2 + \dots + a_t) = \operatorname{var}(a_1) + \operatorname{var}(a_2) + \dots + \operatorname{var}(a_t) = t\sigma^2, \]

because the variance of the initial point $S_0$ is zero, and because the increments are independent. The fact that variance changes with time differentiates a random walk from a stationary process, as the latter is characterised by a finite mean and variance that do not change over time. Furthermore, it is possible to consider a random walk as a special AR(1) (autoregressive of order one) process. In that case, the linear regression coefficient associated with $\log(d_{t-1})$ is one. This last condition was tested on historical data, using the Monte Carlo results of David Dickey reported by Fuller (1976), and rejected at the 5 percent significance level. Dividends thus do not appear to follow a random walk, but simply to fluctuate in a stationary manner around a trend. This is, for all intents and purposes, the theoretical basis behind Shiller’s (1981) assumption of stationary dividends. He does not consider a direct test of stationarity feasible, although he does not specify why, but shows how the forms of non-stationarity suggested by Copeland are directly rejected by the data.
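The linear growth of the variance of a random walk described above can be checked by simulation. The sketch below assumes normal increments with zero mean; the increment volatility, the number of paths, the checkpoints, and the seed are arbitrary illustrative choices.

```python
import random
import statistics

rng = random.Random(42)
sigma = 0.1            # hypothetical standard deviation of each increment a_t
n_paths = 10_000
horizon = 50
checkpoints = (10, 50)

# Record the position S_t of every simulated path at each checkpoint.
positions = {t: [] for t in checkpoints}
for _ in range(n_paths):
    s = 0.0                              # S_0 = 0, so var(S_0) = 0
    for t in range(1, horizon + 1):
        s += rng.gauss(0.0, sigma)       # independent increment a_t
        if t in positions:
            positions[t].append(s)

variances = {t: statistics.pvariance(v) for t, v in positions.items()}
for t in checkpoints:
    print(f"t = {t}: sample var = {variances[t]:.4f}, t * sigma^2 = {t * sigma ** 2:.4f}")
```

The cross-sectional variance at each checkpoint should be close to t times the increment variance, which is what distinguishes the random walk from a stationary process with constant variance.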
3.4
Volatility Measures
It is now essential to take a closer look at the volatility measures employed in the studies mentioned in the above paragraphs. This is done for two reasons: the first, to understand why they were used—also for the benefit of the analysis of the next chapters—and the second, to highlight possible shortcomings in their construction. In this portion, a handful of other papers—centred around the same topics discussed up to this point—are included. This is done in order to provide as comprehensive an account as possible, and to highlight the commonalities in the literature of the time. The starting point of the literature of paragraphs 3.1 and 3.2 is the same, the already mentioned efficient markets model, which can be stated one final time in terms that can apply to both interest rate modelling and share price modelling:
\[ y_t = \sum_{j=0}^{n} \beta^{j} x_t^e(j), \tag{vi} \]

where $\{x_t\}$ is a scalar time series that is generated jointly with the vector time series $\{z_t\}$ as a “stationary multivariate linear stochastic process”⁷. The sequence $\{z_t\}$ includes all variables other than past values of x that are used to predict $x_t$; $\{y_t\}$ is also a time series of scalars; $x_t^e(j)$ denotes $E(x_{t+j} \mid I_t)$, the expectation conditioned on all currently available information; and $I_t$ is the information known at time t, which includes the realisations of $\{x_t\}$ and $\{y_t\}$ up to and including time t. Additionally, $\beta < 1$ takes on the role of the discount factor, where the corresponding rate is denoted by r.

In the context of interest rates (see Findings of excess volatility in long-term interest rates), the series $x_t^e(j)$ becomes the series of expected future one-period short-term rates conditional on information known at t, maturing at all points in time leading up to t + n. In the context of share prices (see Findings of excess volatility in long-maturity stock prices), the summation goes to infinity. In LeRoy and Porter (1981), $x_t^e(j)$ is the conditional expectation of a corporation’s real earnings and, in Shiller (1981), of its real dividends. In all cases, shares or interest rates, $\beta$ is the discount factor with a constant discount rate. Next, the variables referring to nominal amounts are scaled by a growth factor (previously denoted as $\lambda$), which equals one plus the long-term growth rate of the real value of the portfolio. Finally, a “perfect-foresight” or ex-post series of $\{y_t\}$, $y_t^*$, is defined, which is computed from the actual realised values of $\{x_t\}$ and $\{z_t\}$. Both the long-term rate and the share price models have implications concerning the “innovation terms” $\delta_t$, which represent the change in the expectation of a certain variable from one period (t − 1) to the next (t). These terms are not very forecastable, as their correlation with available information in the initial period is very small and not statistically significant.

The authors then attempt to derive formal expressions of variance bounds. In order to derive testable restrictions, the models are re-stated in terms of likelihood functions. The underlying assumption of the models is that $\{y_t\}$ and $\{x_t\}$ are jointly covariance stationary, meaning that the mean of $[y_t, x_t]$ and the covariance between $[y_t, x_t]$ and $[y_{t+k}, x_{t+k}]$ are finite and do not depend on t. This assumption excludes an explosive path for $y_t$ as t approaches infinity.

⁷ LeRoy/Porter (1981) p. 555.
3.4.1
The Variance Inequalities
The intuition behind the volatility tests is the same in both the interest rate and the price context. In the two cases, the ex-post series $y_t^*$ is computed as a weighted moving average of $\{x_t\}$, in which values closer to the present are assigned a heavier weight than those further away in time. Concerning interest rates, in Shiller (1979) expected short-term rates in the near future are given greater weights than expected short-term rates in the more distant future. Concerning the share price environment, discount factors ensure a differential weighting calibrated to the time dimension corresponding to each dividend. By virtue of its status as a weighted moving average, $y_t^*$ can be expected to present a “smoother”⁸ behaviour than the $\{x_t\}$ series. Under spectral theory, this intuition can be translated into restrictions on the weighted integrals of the spectra of $y_t^*$ and $x_t$. Alternatively, and more straightforwardly given the nature of the data, it can be expressed as a collection of inequalities stated in terms of standard deviations.

⁸ Shiller (1981b) p. 295.

The first inequality is:

\[ \sigma(y) \le \sigma(y^*) \]

and is present in Shiller (1979), LeRoy and Porter (1981) as well as Singleton (1980). The latter paper involves a test of the rational expectations model of long-term U.S. Treasury bond yields. The intuition behind this inequality is made explicit in Shiller (1981b) in the context of share prices: it is a condition ensuring that $p_t$ is an unbiased forecast of $p_t^*$. If the inequality were inverted, in periods of particularly large (or small) values of $p_t$, $p_t$ would systematically exceed (fall short of) $p_t^*$, due to its greater volatility, thus introducing an upward (downward) bias in the estimation.

The second inequality bounds from above the volatility of the one-period change in y, $\Delta y$, with the volatility of x:

\[ \sigma(\Delta y) \le \sigma(x)/\sqrt{2r} \]

and it is analogous to the inequalities present in Shiller (1979), (1981) and Singleton (1980). The intuition behind this inequality is again provided in Shiller (1981b): if $\sigma(\Delta y)$ is large, then either $\sigma(y)$ is very large or $\Delta y$ is highly forecastable. In fact, changes in y that are not highly forecastable would contribute to the high $\sigma(\Delta y)$, but in order for y to remain within the range implicitly defined by its volatility, the subsequent movements in y would need to show strong negative correlation with the preceding ones, and thus be forecastable. If those movements were also unforecastable, then the only other way to account for a large $\sigma(\Delta y)$ would be to also have a very large $\sigma(y)$. Excess returns can be derived from equation (v), which can be stated, in the more general terms of this section, as:

\[ y_t = \beta\,(y_{t+1} + x_t). \]

With $\beta = 1/(1 + r)$, excess returns are then equal to:

\[ (\beta y_{t+1} - y_t + \beta x_t)/\beta = y_{t+1} - y_t/\beta + x_t = y_{t+1} - (1 + r)\,y_t + x_t = \Delta y_{t+1} + x_t - r y_t. \]

From this intuition Shiller argues that, if $\sigma(\Delta y)$ surpasses the bound, then excess returns are forecastable, either because $\Delta y_{t+1}$ is highly forecastable or because $r y_t$ varies greatly relative to $x_t$ and to the predictable component of $\Delta y_{t+1}$, meaning that $\sigma(y)$ is very large.

The third inequality is the general representation of the bounds present in Shiller (1979) and Huang (1981). It bounds from above the standard deviation of the one-period change in y, $\Delta y$, with the volatility of the one-period change in the independent variable x, $\Delta x$:

\[ \sigma(\Delta y) \le \sigma(\Delta x)\Big/\sqrt{2r^3/(1 + 2r)}. \]

As highlighted by Shiller (1981), this last expression has the advantage of applying also in the case in which y and x are integrated processes with no existing variance, as are, for example, random walks. This is a consequence of the fact that the inequality holds in terms of the increments $\Delta y$ and $\Delta x$, which have a defined variance although the entire process might not. The intuition behind this final inequality is that, if $\sigma(\Delta y)$ is very large with respect to $\sigma(\Delta x)$, then excess returns would be forecastable, either because $\sigma(r y_t - x_t)$ is very large or because $\Delta y_{t+1}$ is highly forecastable, along the same line of reasoning as the second inequality.
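The three inequalities can be verified in closed form for a simple process. The sketch below is an illustration under stated assumptions rather than any of the cited authors' tests: it assumes that x is a zero-mean stationary AR(1) process with coefficient rho, and that y is the discounted sum of expected future x with the timing of equation (v), so that y is proportional to x. The function name and parameter values are hypothetical.

```python
import math


def ar1_volatilities(rho, r, sigma_x=1.0):
    """Closed-form standard deviations when x_t = rho * x_{t-1} + e_t (mean zero).

    Assumes y_t = sum_k beta^(k+1) E_t x_{t+k} with beta = 1 / (1 + r),
    which gives y_t = beta * x_t / (1 - beta * rho).
    """
    beta = 1.0 / (1.0 + r)
    sd_y = beta * sigma_x / (1.0 - beta * rho)
    # Perfect-foresight series y*_t = sum_k beta^(k+1) x_{t+k}.
    sd_y_star = beta * sigma_x * math.sqrt(
        (1.0 + beta * rho) / ((1.0 - beta ** 2) * (1.0 - beta * rho))
    )
    sd_dx = sigma_x * math.sqrt(2.0 * (1.0 - rho))   # sd of x_t - x_{t-1}
    sd_dy = sd_y * math.sqrt(2.0 * (1.0 - rho))      # sd of y_t - y_{t-1}
    return sd_y, sd_y_star, sd_dy, sd_dx


r = 0.05   # hypothetical discount rate
results = {}
for rho in (0.0, 0.5, 0.9, 0.99):
    sd_y, sd_y_star, sd_dy, sd_dx = ar1_volatilities(rho, r)
    results[rho] = (
        sd_y <= sd_y_star,                                            # first
        sd_dy <= 1.0 / math.sqrt(2.0 * r),                            # second (sigma_x = 1)
        sd_dy <= sd_dx / math.sqrt(2.0 * r ** 3 / (1.0 + 2.0 * r)),   # third
    )
    print(rho, results[rho])
```

In this setting the second bound becomes tight when 1 − rho equals r, which is why that value of rho is left out of the loop to avoid a floating-point tie.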
3.4.2
Possible Shortcomings of the Models
Now that the reasoning behind the choice of the volatility inequalities has been explained, at least on an economic and mathematically intuitive level, it is essential to discuss three possible shortcomings that have been identified over time by other scholars and by the author of this master's thesis.

The first weakness of the model could derive from the choice of the variables. It is interesting to note, for example, that in LeRoy and Porter (1981) the chosen independent variable x is identified as company earnings. This measure includes a possible double-counting issue, since a portion of earnings is retained by the firm, reinvested and contributes to the generation of future earnings. A straightforward solution could be subtracting retained earnings from earnings, which would result in isolating the dividends. In the aforementioned paper, however, the authors chose to divide both the price and the earnings series by the accumulated undepreciated real retained earnings. The authors identify this last amount as a proxy for the capital stock, and the resulting series appear stationary over the period that they analyse, so the approximation can be seen as valid insofar as it reaches the goal of ensuring that the model assumptions hold.

While on the topic of share price predictability and of the choice of x, it is perhaps worth noting that shareholders can be compensated for their investment in a company not only through the receipt of dividends, but also through the appreciation of the company shares and stock buybacks. The latter form of wealth transfer has been employed more and more frequently since the financial crisis, and is well illustrated by Apple’s capital return programme, which has seen the number of outstanding Apple shares decrease from a peak of 6.631 billion at the end of 2012 to approximately 4.773 billion at the end of 2018. Boudoukh et al. (2007) carried out an analysis to further refine the choice of the independent variable x from the simple series of dividends to the pay-out and the net pay-out, where the former is described as being composed of dividends and repurchases, and the latter of dividends and repurchases reduced by any new issuances of shares. The authors highlight the complexity of measuring the variables composing these series, for example identifying the fraction of share repurchases that is meant by the management to act as a substitute for dividends. For this reason, they choose to show the results of the two measures and leave the selection of the most appropriate one to the reader and to further research. Their analysis shows that once repurchases are taken into account, both the pay-out and the net pay-out are statistically significant in predicting expected stock returns, and thus should be preferred to dividends in asset-pricing models. They identify the turning point in the relative relevance of dividends versus pay-out measures as the institution of SEC rule 10b-18 in 1982, which provided a safe harbour for firms from price-manipulation charges if they decided to repurchase their shares in accordance with the rule’s provisions.

A second point worth mentioning is that the generic formulation of the efficient markets model involves the assumption of discount factors being lower than one. This is an important point of detachment from current market experience: at the time in which the papers were written, the phenomenon of negative rates had not yet manifested itself. This opens the door for possible discussion on the validity of the inequalities reported in the previous paragraph under a regime of negative short-term rates. In fact, such a novelty would not only impact the element $\beta$, but also the independent variable series $\{x_t\}$ in the context of long-term rate forecasting, as expected future one-period short-term rates could take negative values. This would be an interesting area of further analysis.

The third and most analytical critique was made by Flavin (1983) on the topic of long-term interest-rate modelling. The analysis focusses on the small sample properties of the tests based on volatility bounds when the volatility is expressed in deviations from the sample mean rather than the population mean. The conclusions indicate that, if variances are computed as deviations from the sample mean, the variance-bound tests are often biased towards rejection of the hypothesis of efficient markets. If the tests are adjusted for the bias, the evidence of excess volatility is not as drastic as in the literature that preceded Flavin’s publication. Upper bounds are occasionally not violated, or violations are smaller than under the unadjusted models, or even statistically insignificant. If the population variance of short-term rates is sufficiently larger than the sample variance, Flavin concludes that the efficient markets hypothesis can still hold. This would be the case under at least three scenarios that had already been listed in Shiller (1981b): a short-term rate process that is non-stationary, that is stationary but “inappropriately detrended”, or that suffers from the “peso problem” (which will be analysed in subparagraph 3.5.4).
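The small-sample issue can be made concrete with a stylised calculation, which is not Flavin's own procedure. The sketch assumes a stationary AR(1) series with autocorrelation rho observed over a hypothetical sample of 100 periods, and computes the share of the population variance that the sample variance (taken around the sample mean) recovers in expectation. The function name is introduced here for illustration.

```python
def expected_variance_share(rho, n_obs):
    """E[(1/T) * sum (x_t - x_bar)^2] / var(x) for a stationary AR(1) series.

    Uses E[sample variance] = var(x) - var(x_bar), with
    var(x_bar) / var(x) = (T + 2 * sum_{k=1}^{T-1} (T - k) * rho^k) / T^2.
    """
    weighted = sum((n_obs - k) * rho ** k for k in range(1, n_obs))
    var_of_mean_share = (n_obs + 2.0 * weighted) / n_obs ** 2
    return 1.0 - var_of_mean_share


n_obs = 100   # hypothetical sample length
shares = {rho: expected_variance_share(rho, n_obs) for rho in (0.0, 0.5, 0.9)}
for rho, share in shares.items():
    print(f"rho = {rho}: expected share of true variance = {share:.3f}")
```

The downward bias grows with the persistence of the series. If, as the smoothing argument of subparagraph 3.4.1 suggests, the ex post rational series is the more persistent of the two, its variance would be understated by more than that of the actual series, which would tilt a variance-bound comparison towards rejection.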
3.5
Alternatives to—or Possible Explanations in Line With—The Efficient-markets Hypothesis
In parallel with the development of the body of literature focussed on the phenomenon of excess volatility, a set of possible explanations has been explored, concerning either possible alternative models or elaborations that would allow academia to still consider the efficient-markets hypothesis as valid. These do not specifically focus on the possible shortcomings of the volatility-analysis models employed in the studies that have been discussed in the thesis up to this point, as these shortcomings have already been explored in subparagraph 3.4.2. Rather, they involve potential market phenomena that offer a substantially new perspective on investor behaviour and may or may not have concrete econometric translations in terms of market efficiency models.
3.5.1
“Fads” Models
Paraphrasing the Cambridge Dictionary, a “fad” can be described as a trend, be it an activity or, in general, an interest, characterised by high levels of widespread popularity and enthusiasm over a short period of time. Allowing for fads as investment motivators drifts away from traditional pricing models and introduces an element of irrationality into the market. It involves the occurrence of waves of optimism or pessimism in the “market psychology”⁹, which in turn results in a low correlation between available information and innovation in price. The main rationale behind this behaviour and its relation to excess volatility is that “naïve”¹⁰ investors, vulnerable to fads, would introduce noise in transactions and thus wide variations in expected returns. A general idea for its formalisation is provided in West (1988): in it, the author suggests the addition of a noise term, generating slowly decreasing deviations, to the share price obtained with the traditional efficient-markets model. The relevant equations would therefore be (vi) combined with the following dynamics for dividends that follow a logarithmic random walk:

\[ \log(D_t) = \mu + \log(D_{t-1}) + \varepsilon_t, \]

\[ \log(P_t) = \tau + \log(D_t) + a_t, \]

\[ a_t = \varphi a_{t-1} + \nu_t, \]

where $|\varphi| < 1$, $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$, $\nu_t \sim N(0, \sigma_\nu^2)$, and $E(\varepsilon_t \nu_s) = 0$ for all t, s.

The random variable $a_t$ is stationary and AR(1), and it is the “fad” that perturbs the relationship between log-price and log-dividend. Without it, the model would be exclusively determined by (vi).

⁹ Shiller (1981b) p. 294.
¹⁰ West (1988) p. 639.
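The way in which a fad term adds volatility to prices can be illustrated by simulating the three equations of the fads model. The sketch below is only an illustration under assumed parameter values (drift, AR coefficient, and the two shock volatilities are hypothetical); it compares the simulated variance of log-price changes with the value implied by independent dividend and fad shocks.

```python
import random
import statistics


def variance(values):
    """Population variance computed with floating-point means."""
    mean = statistics.fmean(values)
    return statistics.fmean([(v - mean) ** 2 for v in values])


rng = random.Random(7)
n_periods = 100_000
mu, tau, phi = 0.01, 3.0, 0.9        # hypothetical drift, intercept, AR coefficient
sigma_eps, sigma_nu = 0.05, 0.05     # hypothetical shock volatilities

log_d, fad = 0.0, 0.0
prev_log_d, prev_log_p = log_d, tau + log_d + fad
d_log_d, d_log_p = [], []
for _ in range(n_periods):
    log_d += mu + rng.gauss(0.0, sigma_eps)       # log-dividend random walk with drift
    fad = phi * fad + rng.gauss(0.0, sigma_nu)    # stationary AR(1) fad term a_t
    log_p = tau + log_d + fad                     # log-price = log-dividend + fad
    d_log_d.append(log_d - prev_log_d)
    d_log_p.append(log_p - prev_log_p)
    prev_log_d, prev_log_p = log_d, log_p

var_dividend_growth = variance(d_log_d)
var_price_growth = variance(d_log_p)
# With independent shocks: var = sigma_eps^2 + 2 * sigma_nu^2 / (1 + phi).
implied = sigma_eps ** 2 + 2.0 * sigma_nu ** 2 / (1.0 + phi)
print(f"var(dlog D) = {var_dividend_growth:.5f}")
print(f"var(dlog P) = {var_price_growth:.5f}, implied = {implied:.5f}")
```

Under these assumptions log-price changes are more volatile than log-dividend changes even though the fad itself is stationary.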
Another possibility is introduced by Campbell and Kyle (1986), who theorise that trading can be divided into two portions: one carried out by the “naïve” traders mentioned above, who are sensitive to fads, and one by experienced investors, who ensure that expected returns reach an equilibrium level. This equilibrium takes into account the risk generated by the movements of naïve investors. The authors then conduct an empirical study that isolates a noise process in the movements of the S&P 500 index between 1971 and 1984; the process is found to account for over 25% of the overall volatility. It is worth noting, however, that no explicit answer is obtained on whether such noise is primarily a result of market actions by naïve investors.
3.5.2
Time-varying Real Discount Factors
The efficient-markets model could accommodate the empirically observed excess volatility under the hypothesis that discount factors vary with time. This possibility is explored in the subparagraph “Empirical results” of Shiller (1981a), where the discount rate is treated as unknown and the analysis determines how large its changes would need to be in order to account for the observed excess volatility. The conclusion is that the standard deviation required to account for the variance is substantially in excess of that of nominal short-term interest rates over the sample period. It is therefore unclear whether this explanation could be seen as sufficient by itself or merely as one of the elements painting a larger picture.
3.5.3
Time-varying Term Premia
A term premium is the difference between the yield earned by holding a bond up to a long maturity and that obtained from the theoretically equivalent strategy of holding a sequence of shorter-maturity bonds, one at a time, until the same final date. It can be seen as the extra return demanded by investors to compensate for the additional risk linked to longer maturities. This phenomenon, which violates the market efficiency conditions assumed so far, could affect both the discount rates entering the β term in the EMH models and the liquidity premium φ for maturity n in expression (i). This liquidity premium has been assumed to be constant in the models; in reality, however, it could vary through time as a result of changes in the term premium required by the market. As for the consistency of this possibility with the efficient-market models, existing studies on bond returns suggest that EMH models could imply time-varying term premia “if the time frame for which the Expectations Hypothesis holds differs from the return measurement period”11.
3.5.4
The “Peso” Problem
The term “Peso problem” was first employed in writing in the 1977 unpublished dissertation “Rational expectations in the foreign exchange market revisited” by Kenneth Rogoff; as noted in Lewis (1991), however, it had already been used by Milton Friedman in the 1970s. In both cases, it referred to the infringement of no-arbitrage relationships between the returns on deposits in Mexican Pesos and in U.S. Dollars at a time when the exchange rate between the two currencies was pegged. Statistical analysis led to the conclusion that the difference between the model-implied results and the empirically derived ones could be explained by a widespread market belief that the Mexican Peso was overvalued with respect to its fundamentals. Consistently, the Peso was devalued in August 1976. The consequences of decades12 of what can in hindsight be defined as inaccurately pegged rates and of U.S. dollar-denominated loans would drive the country into the dramatic debt crisis of 1982.
Today the term has come to refer to the market expectation, embedded in asset prices, of a future discrete change in asset fundamentals, which causes rational forecast errors to be correlated with current information and to have a non-zero mean in finite samples, contrary to standard assumptions. Such a change is of the infrequent and often unprecedented kind and accounts for violations of both the rational expectation assumptions and the hypothesis of market efficiency. In the context of the analysis of this master's thesis, the “Peso problem” could explain at least some of the observed excess market volatility: the market may have been concerned with potential unprecedented events that would dramatically affect share value or future interest rates, which in reality either did not happen or did not affect the growth path of dividends. However, this does not mean that, at the time, accounting for the potential adversity was unreasonable.
The “Peso problem” does not necessarily imply that the efficient-market hypothesis does not hold. Shiller (1981b) proposes an extension of the model of efficient markets (vi) accounting for a possible disastrous event. The probability of this event is considered to follow a stochastic process with values $0 \le \pi_t \le 1$ in each discrete period t. If the disaster happens, the share is assumed not to pay out any dividend, whereas if it does not, the expected dividend remains the same as in the unrestricted model (vi). The expectation taken at t of the dividend to be paid at time t + j will therefore depend on the probability of no disaster occurring from time t to each subperiod t + k until time t + j.
\[ y_t = E_t \sum_{j=0}^{\infty} \left( \prod_{k=0}^{j} \beta (1 - \pi_{t+k}) \right) x_t(j). \]
11 Longstaff (1990) p. 1307.
12 Banco De Mexico (2009) p. 2.
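As a purely numerical illustration of the disaster-adjusted valuation, the sketch below evaluates a truncated version of the sum. It assumes, beyond what the text states, that the path of disaster probabilities and the expected dividends are known numbers, so that the expectation can be dropped; the function name and all parameter values are hypothetical.

```python
def disaster_adjusted_value(x_expected, beta, pi_path):
    """Truncated disaster-adjusted present value.

    x_expected[j] is the dividend expected for t + j absent a disaster and
    pi_path[k] is the disaster probability for period t + k. Both are treated
    as known numbers, which is a simplifying assumption of this sketch.
    """
    value, weight = 0.0, 1.0
    for x_j, pi_k in zip(x_expected, pi_path):
        weight *= beta * (1.0 - pi_k)   # running product over k = 0..j
        value += weight * x_j
    return value

horizon = 2000                           # long enough for the truncation error to vanish
beta = 0.95                              # hypothetical discount factor
flat_dividends = [1.0] * horizon

no_disaster = disaster_adjusted_value(flat_dividends, beta, [0.0] * horizon)
with_disaster = disaster_adjusted_value(flat_dividends, beta, [0.02] * horizon)

print(no_disaster, with_disaster)
```

With a constant probability π and a constant expected dividend x, the sum is geometric and equals x·q/(1 − q) with q = β(1 − π), so even a small disaster probability lowers the valuation noticeably.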
3.5.5
Some Final Observations
The above enumeration of possible explanations and model extensions is not intended to be exhaustive, as there is no academic consensus yet on whether a solution has been found. Even if one had been declared “found”, the history of revolutionary developments in economic research and the unpredictability of market events teach us not to become too attached to existing models. Academic conclusions should, in the opinion of the author of this thesis, be seen as yet another starting point rather than as a final goal. In this specific context, possible future developments could revolve around the elaboration of a model comprising several, if not all, of the aforementioned explanations. Whether such a model could still be considered of the EMH kind at the end of the process cannot be said yet. However, it seems to be a route worthy of further exploration.
4
Chapter II: Excess Volatility Beyond Discount Rates
Among the alternatives to (or possible explanations in line with) the efficient-markets hypothesis mentioned above, possibly the most widely studied is the case of time-varying discount rates. The next academic paper to be analysed is Giglio and Kelly (2017). The paper studies the violation of internal consistency conditions implied by standard models for the prices of financial assets of different maturities. The authors define as “standard” any model in which, under the risk-neutral pricing measure, the cash flows are determined by a vector auto-regression. In the following paragraphs this concept will be further elaborated and illustrated in econometric terms. The framework rests on no-arbitrage relations between claims, which in this setting imply specific restrictions on the level of covariation between prices with different maturities. This setting differs from the EMH models used so far mainly in that the problem is shifted under the risk-neutral pricing measure. This approach brings specific benefits, which will also be discussed in the following pages.
4.1
The Risk-neutral Focus
The risk-neutral measure is mentioned at the beginning of this thesis, and its description is provided in section 1.5 Definitions. For the sake of completeness, and above all to ground the theoretical claims made in describing the model, some additional information is given here.
The risk-neutral measure (Q) can be recovered indirectly or directly. In the former case, it can be looked at as the probability ensuring that the price of a claim on a future stream of cash flows equals the expectation of those cash flows discounted at the risk-free rate. In this case, it is the unknown in an equation taking the market price as a given input.
© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2022 A. Santini, Excess Volatility in the Term Structure of Interest Rates, in Share Prices and in Eurozone Derivatives, BestMasters, https://doi.org/10.1007/978-3-658-37450-1_4
In the latter case, the direct one, the definition of numeraire becomes useful. A numeraire is “any positive non-dividend-paying asset”1, which acts as a reference to normalise the market prices of other assets, which are then expressed relative to its value. In the case of the risk-neutral probability measure, the chosen numeraire is the “bank account”, which equals2:
\[ B(t) = \exp\left( \int_0^t r_s \, ds \right), \]
which is related to the stochastic discount factor $D(t, T)$ through the following relationship:
\[ D(t, T) = \frac{B(t)}{B(T)} = \exp\left( -\int_t^T r_s \, ds \right). \]
The equivalent martingale measure (or risk-neutral measure) can then be seen as the probability measure associated with the bank account numeraire, under which the price $V(t)$ at time t of a claim with payoff $V(T)$ and maturity T becomes:
\[ V(t) = E_t^{Q}\left[ D(t, T) V(T) \mid \mathcal{F}_t \right]; \]
then, at t = 0:
\[ V(0) = E^{Q}\left[ \exp\left( -\int_0^T r_s \, ds \right) V(T) \right]. \]
The standard assumption in pricing is for $r_t$ to be a deterministic function of time, which implies that the bank account and the stochastic discount factor are both known at time 0. The reason for the viability of this assumption lies in the fact that changes in interest rates have an impact on derivative prices of a smaller order of magnitude than changes in the underlying3. This assumption cannot be maintained in the case of interest-rate derivatives, for which the underlying is the rate itself. Since the research focusses on the prices of non-interest-rate derivatives, however, the assumption does not need to be relaxed.
1 Brigo/Mercurio (2006) p. 27.
2 Brigo/Mercurio (2006) p. 2.
3 Brigo/Mercurio (2006) p. 3.
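Under a deterministic short rate, the bank account and the discount factor can be computed by simple numerical integration. The sketch below is only illustrative: the linear rate function, the trapezoidal rule and the function names are assumptions introduced here, not elements of the cited references.

```python
import math

def integrate(rate, a, b, steps=10_000):
    """Trapezoidal approximation of the integral of rate(s) over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (rate(a) + rate(b))
    for i in range(1, steps):
        total += rate(a + i * h)
    return total * h

def bank_account(rate, t):
    """B(t) = exp(integral of r_s over [0, t])."""
    return math.exp(integrate(rate, 0.0, t))

def discount_factor(rate, t, T):
    """D(t, T) = B(t) / B(T) = exp(-integral of r_s over [t, T])."""
    return math.exp(-integrate(rate, t, T))

def short_rate(s):
    """Hypothetical deterministic short rate, rising linearly with time."""
    return 0.01 + 0.002 * s

# With deterministic rates the discount factor can be taken out of the expectation,
# so V(0) = D(0, T) * E^Q[V(T)]; the expected payoff below is a made-up number.
expected_payoff_under_q = 10.0
price_today = discount_factor(short_rate, 0.0, 5.0) * expected_payoff_under_q
print(price_today)
```

The same functions also confirm the identity D(t, T) = B(t)/B(T) for any pair of dates.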
Additionally, the equivalent martingale measure also incorporates movements in the risk premia that would otherwise affect the discount rate of each product. Risk premia account for the uncertain nature of the future stream of cash flows and depend on the nature of the underlying asset; they are also the main drivers of changes in discount rates.4 This ensures that findings of excess volatility in the behaviour of derivative prices cannot be accounted for by non-constant discount rates, as had been considered possible in the previous research that focussed on stock prices and on the term structures of interest rates.
Firstly, it is essential to define the physical, real-world probability P as the measure that accounts for a world in which investors are not risk-neutral but require an additional payoff for holding risky assets. In terms of dynamics, it would mean modelling assets as:
\[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \]
with the drift μ equal to the risk-free rate plus the risk premium, with σ (diffusion) representing the volatility of the asset price and $W_t$ being a Brownian motion. If a unique martingale measure exists, then the market is complete, arbitrage-free, and frictionless. In that case, it will be possible to simulate the dynamics of the underlying price as:
\[ dS_t = r S_t \, dt + \sigma S_t \, dW_t, \]
with r as the risk-free rate.5
The following econometric argument from Giglio and Kelly (2017) illustrates why the Q measure integrates possible drivers of risk-premium variation. Let us consider an economy that is driven by two factors represented by the vector $H_t$, whose P-dynamics are:
\[ H_{t+1} = c + \rho H_t + \Sigma u_{t+1}, \]
where $\Sigma$ contains the diffusions and $u_{t+1}$ is a vector of independent standard normal factor shocks6 with dimension two. The real-world probability P induces a density function φ that is multivariate normal with mean $\mu_t = E_t^{P}[H_{t+1}]$ and covariance matrix $\Sigma\Sigma^{\top}$; it also has a stochastic discount factor (SDF) $M_{t+1}$ that represents preferences and that depends on a vector $\lambda_t$ giving a price to risk. The expectation of $H_{t+1}$ is linear:
\[ \mu_t = E_t^{P}[H_{t+1}] = c + \rho H_t. \]
The corresponding pricing recursion is:
\[ P_t(H_t) = E_t^{P}\left[ P_{t+1}(H_{t+1}) M_{t+1} \right]. \]
The risk-neutral probability ensures that the above equation can be rewritten as:
\[ P_t(H_t) = E_t^{Q}\left[ P_{t+1}(H_{t+1}) \right]. \]
The corresponding probability density function is $M_{t+1}\varphi$ and has mean $\mu_t^{Q} = \mu_t - \lambda_t$ and covariance matrix $\Sigma\Sigma^{\top}$. The stochastic discount factor, which includes the risk premium, does not explicitly enter the above pricing equation, but it acts on the density function through the linear shift of $-\lambda_t$. The factor dynamics are linear also under Q.
4 Giglio/Kelly (2017) p. 3.
5 Giordano/Siciliano (2013) p. 7.
6 Hamilton/Wu (2012) p. 320.
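The mean shift induced by the stochastic discount factor can be checked numerically in a one-factor toy case. The sketch below assumes a scalar factor with unit diffusion, a zero risk-free rate and an exponential-affine SDF; these choices, the payoff function and all numbers are hypothetical and serve only to show that pricing with M under P agrees with taking a plain expectation under the shifted density.

```python
import math

def normal_pdf(x, mean, sd=1.0):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def expectation(g, density, lo=-12.0, hi=12.0, steps=24_000):
    """Midpoint-rule approximation of the integral of g(h) * density(h)."""
    step = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        h = lo + (i + 0.5) * step
        total += g(h) * density(h)
    return total * step

mu, lam = 0.5, 0.3     # hypothetical conditional mean under P and price of risk

def sdf(h):
    """Assumed SDF: M = exp(-lam * u - lam^2 / 2) with shock u = h - mu (unit diffusion)."""
    return math.exp(-lam * (h - mu) - 0.5 * lam ** 2)

def payoff(h):
    """Hypothetical next-period price as a function of the factor."""
    return max(h, 0.0)

# Pricing with the SDF under the physical density ...
price_p = expectation(lambda h: payoff(h) * sdf(h), lambda h: normal_pdf(h, mu))
# ... equals a plain expectation under the density with mean shifted by -lam.
price_q = expectation(payoff, lambda h: normal_pdf(h, mu - lam))
mean_q = expectation(lambda h: h, lambda h: normal_pdf(h, mu - lam))

print(price_p, price_q, mean_q)
```

The agreement is exact in this setting because the product of the assumed SDF and the normal density with mean μ is itself the normal density with mean μ − λ.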
4.2
The Methodology
The null model for tests of excess volatility is thus chosen to be affine under the Q-measure. This kind of model has been the subject of analysis in past literature, particularly in Duffie et al. (2000), where it is illustrated for the pricing of financial assets. The widely employed interest rate models elaborated by Vasicek and Cox-Ingersoll-Ross are also affine. The underlying assumption is that the state vector X follows an affine jump-diffusion (AJD). In order to explain this last statement, it is necessary to borrow some terminology from physics and engineering. Firstly, a state space representation is a mathematical model of a physical system as a set of input, output and state variables related by first-order differential equations.7 The state of a dynamical system is a minimal set of n state variables $x_1(t), x_2(t), \dots, x_n(t)$ taking values $x_i(t_0)$ at time $t_0$. The state variables are such that the knowledge of their initial values $\{x_i(t_0)\}_{i=1}^{n}$, together with the knowledge of the deterministic set of inputs $u_1(t), u_2(t), \dots, u_m(t)$, is sufficient to determine the behaviour of the system for all periods after $t_0$.8 Finally, the corresponding state vector takes as elements the n state variables $\{x_i(t)\}_{i=1}^{n}$.
7 Pang et al. (2011) p. 15.
8 Houpis et al. (2003) p. 38.
On the other hand, a jump diffusion is a stochastic process including features of a diffusion process and discrete movements with random arrival times (jumps). In order to define a diffusion process, it is necessary to describe the Markov process. A Markov process is a random process (a family of random variables) which is characterised by a lack of memory, meaning that the probability of each event only depends on the state of the immediately preceding event, and not on the entirety or even on a portion of the past path leading up to the present. The knowledge of its history is no more useful for making future predictions than the knowledge of the current state, meaning that future and past states are independent of each other if conditioned on the knowledge of the present state. Markov processes exist in discrete and in continuous time, examples of the latter case being Brownian motions and Poisson processes.9
A diffusion process is also a continuous-time Markov process which is the solution to a stochastic differential equation. A stochastic differential equation differs from a regular differential equation in that one of its terms is a stochastic process, such as a white noise variable or a jump process. A jump process, on the other hand, is a stochastic process characterised by discrete movements happening at random times. While traditional financial stochastic models, such as Black-Scholes, represent price movements as diffusion processes, allowing for elements of a jump process generates the hybrid model of a jump diffusion, wherein prices evolve with small continuous movements and random large jumps.
Within the discipline of financial mathematics, this mixture process has been used in its more specific form of the Affine Jump Diffusion (AJD). Its usefulness comes from the fact that its moment generating function and characteristic function are known in closed form. In the valuation of securities, Duffie et al. (2000) consider an affine jump-diffusion process, meaning that its drift (μ), its covariance matrix and its jump intensities are affine functions of the state vector.10 When such an affine relationship holds under the risk-neutral measure Q, the model at hand is affine-Q, as in Giglio and Kelly (2017).
9 Hou et al. (2013) p. 75.
10 Duffie et al. (2000) p. 1343.
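A jump diffusion can be simulated by adding occasional random jumps to an otherwise continuous path. The sketch below uses constant coefficients (a trivial special case of the affine class) and approximates Poisson arrivals by at most one jump per time step; the discretisation, the function name and all parameter values are assumptions chosen for illustration and are not taken from Duffie et al. (2000).

```python
import math
import random

def simulate_jump_diffusion(s0, mu, sigma, jump_intensity, jump_mean, jump_sd,
                            horizon=1.0, steps=252, seed=0):
    """Simulate a price path whose logarithm follows a diffusion plus random jumps.

    Jumps arrive with probability jump_intensity * dt in each step (a Bernoulli
    approximation of a Poisson process) and have normally distributed log-sizes.
    """
    rng = random.Random(seed)
    dt = horizon / steps
    log_s = math.log(s0)
    path, jumps = [s0], 0
    for _ in range(steps):
        # small continuous movement (diffusion part)
        log_s += (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # occasional discrete movement (jump part)
        if rng.random() < jump_intensity * dt:
            log_s += rng.gauss(jump_mean, jump_sd)
            jumps += 1
        path.append(math.exp(log_s))
    return path, jumps

path, jumps = simulate_jump_diffusion(100.0, 0.05, 0.20, jump_intensity=5.0,
                                      jump_mean=-0.10, jump_sd=0.15)
print(len(path), jumps, path[-1])
```

Setting `jump_intensity=0.0` switches the jump part off and leaves a pure diffusion of the Black-Scholes type.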
4.2.1
A Simplified Model for Measuring Excess Volatility
The first equation to be illustrated describes a cash flow process $x_t$, whose realisations determine the payoff of a derivative contract with price $f_{t,n}$. For simplicity, this process is made dependent on only one factor and is, for example, the observed variance of the aggregate market return in a given period t. Under the risk-neutral measure, the variance process $x_t$ is taken as autoregressive of order one, which is a frequent form of modelling in the literature of derivative pricing (as in Andersen et al. 2003):
\[ x_t = \rho^{Q} x_{t-1} + \epsilon_t^{Q}, \]
where $\epsilon_t^{Q}$ is a shock having mean zero and no correlation with the lagged value of x, $x_{t-1}$. The price of the derivative $f_{t,n}$, valued at time t and with maturity n, is:
\[ f_{t,n} = E_t^{Q}[x_{t+n}]. \]
The term structure of the derivatives is then represented by the sequence of prices at time t of all forward claims with different maturities n = 1, 2, …, N, which can be recovered through simple substitutions:
\[ f_{t,1} = \rho^{Q} x_t, \quad f_{t,2} = (\rho^{Q})^{2} x_t, \quad \dots, \quad f_{t,N} = (\rho^{Q})^{N} x_t. \]
In order to build a test for excess volatility, it is then necessary to specify a restricted and an unrestricted model. The key criterion for the former is that the price coefficients of the elements of the term structure follow a geometric progression in $\rho^{Q}$, which can be immediately seen in the above equations. In fact, the cumulative claim to the set of all forward derivatives in the term structure can be written as:
\[ p_{t,N} = E_t^{Q}[x_{t+1} + x_{t+2} + \dots + x_{t+N}] = \left( \rho^{Q} + (\rho^{Q})^{2} + \dots + (\rho^{Q})^{N} \right) x_t. \]
The next logical step is attempting to estimate $\rho^{Q}$ by exploiting its relationship with derivative prices. If $f_{t,1}$ is taken from the market as a substitute for $\rho^{Q} x_t$, then $\rho^{Q}$ can be obtained as the coefficient $b_2$ in the regression of $f_{t,2}$ onto $f_{t,1}$:
\[ f_{t,2} = b_2 f_{t,1}. \]
It is now time to distinguish between the restricted and the unrestricted model. The sequence of coefficients recovered directly by regressing the claim price for each maturity onto $f_{t,1}$ takes values $b_2, b_3, \dots, b_N$ and is unrestricted, because it is not built conditional on the relationship of each claim to $\rho^{Q}$. On the other hand, the restricted versions of the same coefficients can be represented as functions of the $\rho^{Q}$ estimate $b_2$. They are recovered through this simple model, with no need to perform any additional regression beyond that of $f_{t,2}$ onto $f_{t,1}$:
\[ b_N = (b_2)^{N-1}. \]
The Variance Ratio (VR) statistic is then constructed for each maturity, by placing in the numerator the explained variance of the unrestricted model, and in the denominator the explained variance of the restricted model. The null hypothesis is for the ratio to equal one, meaning that the restricted model captures the behaviour of the term structure well. If, however, the ratio is significantly greater than one, the implication is that prices for each maturity vary more than can be explained by the model. In this case, the interpretation of the finding is essential: since both components of the variance ratio are explained variances, they refer to the portion of fluctuations in the price of long-maturity claims that can be explained by the behaviour of the short-end prices. This means that wide differences between the two variances imply wide differences in the way in which long-end prices react to short-end prices in the data as opposed to under the (restricted) affine model. It will be illustrated in later pages that the variance ratio exceeds one in all asset classes, meaning that in the empirical data the prices of claims with longer maturities overreact to changes in the prices of claims with a short maturity, relative to affine dynamics.
This methodology thus excludes the impact on excess volatility of other variables, such as discount rates, as their effect is included in the affine structure. It therefore becomes immaterial for the calculations to identify which factors are driving the unexplained portion of the movement of prices. Additionally, the $R^2$ values of the resulting unrestricted model reach almost 100% for all term structures, meaning that the data are extremely well described by such a setting, while the excessive variance ratio implies that the actual coefficients differ from those recovered from the restricted affine model.
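The logic of the one-factor test can be reproduced on simulated prices. The sketch below is an illustration under stated assumptions and not the procedure of Giglio and Kelly (2017): the cash flow series, the persistence values and the "overreacting" long-maturity price (built with a higher persistence than the short end) are all hypothetical, and the regressions are run without intercept as in the simplified model.

```python
import random

def ols_slope(y, x):
    """Slope of a no-intercept regression of y on x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def variance_ratio(f1, f2, f_long, maturity):
    """Variance ratio for one long maturity in the one-factor model."""
    b2 = ols_slope(f2, f1)                   # estimate of rho^Q from the short end
    b_restricted = b2 ** (maturity - 1)      # b_N = (b_2)^(N-1)
    b_unrestricted = ols_slope(f_long, f1)   # direct regression of f_N on f_1
    # Both explained variances are b^2 * Var(f1); Var(f1) cancels in the ratio.
    return (b_unrestricted / b_restricted) ** 2

# Hypothetical cash flow series; only its variability matters for the regressions.
rng = random.Random(1)
rho, x, xs = 0.8, 0.0, []
for _ in range(2000):
    x = rho * x + rng.gauss(0.0, 1.0)
    xs.append(x)

N = 12
f1 = [rho * v for v in xs]
f2 = [rho ** 2 * v for v in xs]
f_long_model = [rho ** N * v for v in xs]   # long end consistent with the model
f_long_over = [0.9 ** N * v for v in xs]    # long end priced as if persistence were 0.9

vr_model = variance_ratio(f1, f2, f_long_model, N)
vr_over = variance_ratio(f1, f2, f_long_over, N)
print(vr_model, vr_over)
```

When the long end is priced consistently with the short end the ratio equals one, whereas pricing the long end with a higher persistence pushes it well above one, which is the pattern described as overreaction.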
4.2.2
The Affine-Q Model
After the previous analysis of a simplified version of the model, under which the meaning of the tests to be performed is more intuitively understood and explained, it is now essential to define the model that is at the basis of Giglio and Kelly (2017) and of the empirical study of this master's thesis. The following equations are indeed taken directly from Giglio and Kelly (2017). There are three underlying assumptions upon which the model specification rests. The first is that the cash flow process, previously defined as $x_t$, is driven by K unobservable factors $H_t$ according to the following linear relationship:
\[ x_t = \delta_0 + \delta_1^{\top} H_t, \]
where $\delta_0 \in \mathbb{R}$ and the vector $\delta_1$ has dimension K × 1. The second assumption is that, under the risk-neutral probability measure, $H_t$ behaves as:
\[ H_t = c^{Q} + \rho^{Q} H_{t-1} + \epsilon_t^{Q}, \]
where the vector of uncorrelated shocks $\epsilon_t^{Q}$ has dimension K × 1 and expected value equal to zero. Each $\epsilon_t^{Q}$ is also uncorrelated with the previous values of $H_t$. The covariance matrix of the shocks is positive definite and is therefore invertible. As a result of this assumption, prices of cash flow claims are linear functions of the factors in the vector $H_t$, and $H_t$, in turn, also has linear dynamics under the martingale measure Q. The final assumption is that the risk factors are uncorrelated with each other, because the matrix $\rho^{Q}$ is set to be diagonal, that $\delta_1$ is a vector of ones, and that the constant $c^{Q}$ has value zero. This assumption normalises the model and reduces its redundancy, so that the econometric parameter identification problem is avoided. It is worth noting that this model only imposes linear Q-dynamics, whereas non-linear P-dynamics are in principle allowed for. From the above equations we obtain that the price of a linear cumulative claim $p_{t,n}$ for the maturity n has the following formula:
\[ p_{t,n} = E_t^{Q}[x_{t+1} + x_{t+2} + \dots + x_{t+n}] = n\delta_0 + \mathbf{1}^{\top} \left( \rho^{Q} + (\rho^{Q})^{2} + \dots + (\rho^{Q})^{n} \right) H_t + \nu_{t,n}. \]
The last term of the equation accounts for the noise possibly introduced by errors in price measurement under the real-world probability, whereas the geometric progression of the factor loadings in $\rho^{Q}$ implies restrictions on the movement of prices of different maturities.
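The cumulative-claim formula can be verified numerically by comparing the closed-form loadings with a brute-force sum of expected cash flows. The sketch below assumes a diagonal persistence matrix, a vector of ones for the loadings on the factors, a zero constant in the factor dynamics and no measurement error; the function names and the numerical values are hypothetical.

```python
def cumulative_loading(rho_diag, n):
    """For each factor k: rho_k + rho_k^2 + ... + rho_k^n."""
    return [sum(r ** i for i in range(1, n + 1)) for r in rho_diag]

def cumulative_price(rho_diag, h, n, delta0=0.0):
    """Price of the cumulative claim: n * delta0 + loadings' * H_t (no measurement error)."""
    loadings = cumulative_loading(rho_diag, n)
    return n * delta0 + sum(b * hk for b, hk in zip(loadings, h))

def brute_force_price(rho_diag, h, n, delta0=0.0):
    """Sum the expected cash flows delta0 + sum_k rho_k^i * H_k for i = 1..n."""
    total, factors = 0.0, list(h)
    for _ in range(n):
        factors = [r * f for r, f in zip(rho_diag, factors)]   # one-step-ahead factor forecast
        total += delta0 + sum(factors)                          # delta1 is a vector of ones
    return total

rho_diag = [0.9, 0.5]      # hypothetical diagonal of the persistence matrix (K = 2)
h_t = [1.0, 2.0]           # hypothetical current factor values
closed_form = cumulative_price(rho_diag, h_t, 3, delta0=0.1)
recursive = brute_force_price(rho_diag, h_t, 3, delta0=0.1)
print(closed_form, recursive)
```

Both routes give the same price, which confirms that the loadings of each maturity are partial sums of a geometric progression in the persistence parameters.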
4.2.3
The Variance Ratio Test
The tests for excess volatility in Giglio and Kelly (2017) are performed on the parameters estimated via OLS regression with a different number of regressors depending on the nature of the instrument. Ordinary least squares (OLS) regressions have the advantage of simplicity, requiring no distributional assumptions about the dynamics of the physical probability measure or about the distribution of the shocks $\epsilon_t^{Q}$. The only case-specific additional assumptions concern regularity conditions of the term structure of prices under the real-world probability (which are taken as the same as in Hayashi (2000), proposition 2.3). These include the stationarity of the term structure and the satisfaction by residuals of the central limit theorem, which generally represent the conditions necessary for consistency under the Wald test. In fact, the variance ratio test is shown to behave asymptotically as the Wald test.11 Additionally, it is required that the prices of the instruments with the short-end maturities that are used as the K regressors in the model of long-term maturities (the number K will change together with the nature of the derivative instrument) have no measurement error under the physical probability measure P, meaning that $\nu_{t,n} = 0$ for n = 1, …, K. The model then continues with the definition of the factor loading of the price with maturity j on the vector of latent factors:
\[ b_j = \left( \rho^{Q} + (\rho^{Q})^{2} + \dots + (\rho^{Q})^{j} \right) \mathbf{1}, \]
and with the definition of the OLS regression model in which observed short-maturity prices $P_{t,1:K}$ are used as proxies for the K latent factors $H_t$:
\[ p_{t,K+1} = \alpha_{K+1} + \beta_{K+1} P_{t,1:K} + \nu_{t,K+1}. \]
The relationship between the latent factors and prices is the following:
\[ H_t = B_{1:K}^{-1} \left( P_{t,1:K} - \delta_0 [1, \dots, K]^{\top} \right), \]
where $B_{1:K} = [b_1, \dots, b_K]$ is a K × K matrix of the stacked factor loadings $b_j$ of the short maturities. Under the risk-neutral measure, the coefficients of the short-maturity prices follow the relationship below:
\[ \beta_{K+1} = b_{K+1}^{\top} B_{1:K}^{-1} = \mathbf{1}^{\top} \left( \rho^{Q} + \dots + (\rho^{Q})^{K+1} \right) \left[ \rho^{Q} \mathbf{1}, \dots, \left( \rho^{Q} + \dots + (\rho^{Q})^{K} \right) \mathbf{1} \right]^{-1}. \quad \text{(vii)} \]
11 Giglio/Kelly (2017) p. 12.
Next, the restricted and the unrestricted models are derived. The first required input is the diagonal matrix $\hat{\rho}^{Q}$, which is estimated by inserting the OLS coefficient $\hat{\beta}_{K+1}$ into (vii) and solving for $\rho^{Q}$. Once this value is obtained, the long-end prices, with maturity j > K + 1, are stated as functions of the same K short-maturity prices that were used to recover $\hat{\rho}^{Q}$. K does not change, but the regression coefficient $\hat{\beta}_j$ for each j does.
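For K = 2 the inversion of (vii) can be written in closed form. The sketch below relies on a derivation made here for illustration, not on a formula given in Giglio and Kelly (2017): with two factors and diagonal persistence, forward prices satisfy a second-order recursion, so that the coefficient on the one-period price equals −(s + q) and the coefficient on the two-period price equals 1 + s, where s is the sum and q the product of the two persistence parameters. The restricted coefficients for longer maturities are then obtained by solving the two-equation system factor by factor, which avoids any ambiguity about transposition conventions. Function names and numbers are hypothetical.

```python
import math

def rho_from_beta3(beta_31, beta_32):
    """Invert the K = 2 restriction: beta_32 = 1 + s and beta_31 = -(s + q),
    with s = rho_1 + rho_2 and q = rho_1 * rho_2 (derivation assumed in this sketch)."""
    s = beta_32 - 1.0
    q = -beta_31 - s
    disc = s * s - 4.0 * q
    if disc < 0:
        raise ValueError("no real diagonal persistence matches these coefficients")
    root = math.sqrt(disc)
    return (s + root) / 2.0, (s - root) / 2.0

def loading(rho, n):
    """rho + rho^2 + ... + rho^n."""
    return sum(rho ** i for i in range(1, n + 1))

def restricted_beta(rho_1, rho_2, j):
    """Solve loading(rho_k, 1) * beta_1 + loading(rho_k, 2) * beta_2 = loading(rho_k, j)
    for both factors k by Cramer's rule; requires distinct, non-zero persistences."""
    a11, a12, c1 = loading(rho_1, 1), loading(rho_1, 2), loading(rho_1, j)
    a21, a22, c2 = loading(rho_2, 1), loading(rho_2, 2), loading(rho_2, j)
    det = a11 * a22 - a12 * a21
    return (c1 * a22 - a12 * c2) / det, (a11 * c2 - a21 * c1) / det

# Hypothetical OLS coefficients of the three-period price on the one- and two-period prices
rho_1_hat, rho_2_hat = rho_from_beta3(-1.85, 2.4)
beta_24_restricted = restricted_beta(rho_1_hat, rho_2_hat, 24)
print(rho_1_hat, rho_2_hat, beta_24_restricted)
```

Applying `restricted_beta` at maturity three returns the coefficients that were fed into `rho_from_beta3`, which provides a simple consistency check of the inversion.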
\[ p_{t,j} = \alpha_j + \beta_j P_{t,1:K} + \nu_{t,j}. \quad \text{(viii)} \]
The restricted estimate of $\beta_j$ is denoted as $\hat{\beta}_j^{R}$. It is recovered from the model, by substituting the estimate $\hat{\rho}^{Q}$ into (vii), after having extended the equation to the required maturity j. It represents the relationship implied in the term structure under the affine-Q specification and it ensures consistency of each price with the previous ones. The unrestricted estimate of $\beta_j$ is denoted as $\hat{\beta}_j^{U}$ and is not obtained through the model. It is the OLS slope coefficient recovered by directly computing the regression in (viii). The denominator of the variance ratio, which is the price variation explained by the restricted model, is then recovered as:
\[ V_j^{R} = \hat{\beta}_j^{R\top} \hat{V}(P_{t,1:K}) \hat{\beta}_j^{R}, \]
with $\hat{V}(P_{t,1:K})$ being the estimate of the variance-covariance matrix of short-end prices under the physical probability measure. The numerator of the variance ratio, which is the price variation explained by the unrestricted model, is recovered as:
\[ V_j^{U} = \hat{\beta}_j^{U\top} \hat{V}(P_{t,1:K}) \hat{\beta}_j^{U}, \]
with $\hat{V}(P_{t,1:K})$ being the same estimate of the variance-covariance matrix as in $V_j^{R}$. Finally, the corresponding variance ratio statistic is:
\[ VR_j = \frac{V_j^{U}}{V_j^{R}}. \]
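Once the two coefficient vectors and the covariance matrix of the short-end prices are available, the statistic reduces to a ratio of two quadratic forms. The sketch below shows the arithmetic only; the coefficient vectors and the covariance matrix are made-up numbers, and the function names are assumptions of this illustration.

```python
def quadratic_form(beta, cov):
    """beta' * cov * beta, written as a double sum over the K short maturities."""
    k = len(beta)
    return sum(beta[i] * cov[i][j] * beta[j] for i in range(k) for j in range(k))

def variance_ratio(beta_unrestricted, beta_restricted, cov_short):
    """Explained variance of the unrestricted model over that of the restricted model."""
    return quadratic_form(beta_unrestricted, cov_short) / quadratic_form(beta_restricted, cov_short)

# Hypothetical inputs for K = 2
cov_short = [[1.0, 1.8],
             [1.8, 3.5]]             # covariance matrix of the two short-end prices
beta_restricted = [-1.0, 2.0]        # coefficients implied by the affine model
beta_unrestricted = [-1.5, 2.5]      # coefficients from the direct OLS regression

vr = variance_ratio(beta_unrestricted, beta_restricted, cov_short)
print(vr)
```

Because each coefficient enters weighted by the covariances of the corresponding short-maturity prices, maturities that covary more strongly receive a larger weight in the statistic.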
The interpretation of this variance ratio is relatively straightforward: it relates the unconditional covariation between long- and short-maturity prices observed in the data to the covariation that is consistent with the restricted model. The ideal value is one, which is also the null hypothesis. In this case, the model is perfectly able to capture the data. Variance ratios exceeding one indicate an overreaction with respect to the affine model. An important feature of the test is that it compares factor loadings scaled by the covariance terms of the corresponding maturity, meaning that the coefficients of prices that covary more strongly are given larger weights. In fact, the explained price variation under the restricted and the unrestricted model can be rewritten as:
\[ \sum_{k=1}^{K} \sum_{l=1}^{K} \tilde{b}_{j,k} \tilde{b}_{j,l} \hat{\sigma}_{k,l} \quad \text{and} \quad \sum_{k=1}^{K} \sum_{l=1}^{K} \hat{b}_{j,k} \hat{b}_{j,l} \hat{\sigma}_{k,l}. \]
4.3
The Findings
4.3.1
Variance Swaps
The first derivative instruments to be analysed in Giglio and Kelly (2017) are the variance swaps on the S&P 500 index. A variance swap is “a financial instrument that allows investors to trade future realized variance against current implied volatility (the “strike”)”12. The payoff of the instrument at maturity T is given by the difference between the realised variance of the underlying (in this case the S&P 500 index) over the life of the swap and the strike. By denoting as $RV_t$ the sum of squared daily log index returns during each month t, the swap price can be rewritten as the expectation under the risk-neutral measure of the future stream of realised variances:
\[ p_{t,n} = E_t^{Q}\left[ \sum_{j=1}^{n} RV_{t+j} \right]. \]
12 Hilpisch (2016) p. 235.
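The realised variance entering the payoff is simple to compute from a series of daily index levels. The sketch below is a minimal illustration: the price series, the strike and the notional are invented numbers, and practical conventions such as annualisation are deliberately left out.

```python
import math

def realised_variance(prices):
    """Sum of squared daily log returns over the period covered by `prices`."""
    return sum(math.log(b / a) ** 2 for a, b in zip(prices, prices[1:]))

def swap_payoff(prices, strike, notional=1.0):
    """Payoff at maturity: realised variance minus the strike, times the notional."""
    return notional * (realised_variance(prices) - strike)

# Hypothetical daily index levels over the life of a short swap
index_levels = [100.0, 101.0, 99.0, 100.0, 102.0]
rv = realised_variance(index_levels)
print(rv, swap_payoff(index_levels, strike=0.0005, notional=1_000_000))
```

The buyer of realised variance gains whenever the index moves more than the strike anticipated, irrespective of the direction of the moves.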
The relationship between this model and the affine framework of paragraph 4.2 is straightforward whenever $x_t = RV_t$ is taken. The number of latent factors is chosen to be two, as is standard in the literature considering variance swap price movements.13 Concerning the data for the analysis, Giglio and Kelly (2017) use daily prices for cumulative claims with maturities of one, two, three, six, twelve, and twenty-four months for the period 1996 to 2013. The variance ratio of the 24-month maturity is shown to be more than twice the amount allowed for under the affine model, with a value of 2.15 under the OLS estimation of the parameters. The unrestricted $R^2$ of the model is also reported, and it is 99.7% with only two regressors. This is worth noting, because a potential vulnerability of this measure of goodness-of-fit is that it never decreases when a new regressor is added, even if the co-movements happened by chance, meaning that a model involving more terms could be mistaken for a better-fitting one purely on grounds of its size. The underlying phenomenon is called “model overfitting”, which happens when an excessive number of terms pushes a model to ascribe the random noise in the data to its elements. A potential alternative has canonically been taken to be the adjusted R-squared, where the bias towards larger models is corrected. It increases only if the new term has a greater explanatory power than expected by chance. Giglio and Kelly (2017) do not include those figures; the analysis performed in the next chapter, however, will. In the case of inflation swaps, however, the $R^2$ figure can be taken at face value because of the reduced number of factors (K = 2). The meaning behind a high $R^2$ (explanatory power of the model) and such a large variance ratio in the long end of the curve is that the regressor coefficients must be widely different from those implied by the affine specification. Loadings obtained from the unrestricted regression are shown to be, for both factors, much larger than implied by the null (affine-Q) model. It is additionally worth remembering that the excessive volatility cannot be ascribed, as done in the previous literature (Shiller 1979, 1981a, 1981b), to changes in discount rates. A final analysis conducted in the study concerns the behaviour of the latent factor loadings inside $\hat{\rho}^{Q}$, which are taken as a proxy of factor persistence along the term structure. Under the affine model, they should not change along the term structure, because of internal consistency. However, they are shown to be more persistent in higher maturities. A behavioural explanation of this phenomenon is suggested by the authors14: investors might consider the factors to be more persistent in the valuation of longer-maturity claims. This leads on to the topic of natural expectations, which is discussed in the next chapter.
13 Giglio/Kelly (2017) p. 17.
14 Giglio/Kelly (2017) p. 21.
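The correction applied by the adjusted R-squared is a one-line formula. The sketch below uses the textbook definition for a regression with an intercept; the sample size of 100 observations is a made-up number, and only the 99.7% figure and the two regressors come from the text above.

```python
def r_squared(y, fitted):
    """Share of the variance of y captured by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, fitted))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_regressors):
    """1 - (1 - R^2) * (n - 1) / (n - k - 1), for k regressors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)

# Same R^2 with two or with ten regressors on a hypothetical sample of 100 observations
print(adjusted_r_squared(0.997, 100, 2), adjusted_r_squared(0.997, 100, 10))
```

For a given fit, the penalty grows with the number of regressors, so with only two regressors the adjusted figure stays very close to the unadjusted one.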
4.3.2
Equity Options
Implied Black and Scholes volatility is one of the essential descriptors of an option price, as the instrument can in fact be used as a mean to trade on the term structure of equity volatilities. One of the ways in which this can be done is by “riding the volatility surface”15 , which is a trading strategy made possible by the limitations of the Black and Scholes model in the face of reality. The standard Black Scholes formula relies on the assumption of constant volatility across strikes, which is however regularly violated by market data. In fact, in reality, the so-called “smile effect” is observed. It is a term that describes the phenomenon according to which in-the-money and out-of-the-money options have higher implied volatilities than at-the-money ones. Therefore, in a graph representing implied option volatilities as a function of strikes, the curve happens to take the shape of a smile. Traders can then exploit changes in the skewness and in the kurtosis of the volatility surface, which is a plot that represents implied volatility as a function of different strikes and of different exercise dates. The intuition behind the effect on prices of changes in volatility is that a higher standard deviation has the effect of raising prices for both call and put options, whereas a decrease in standard deviation does the opposite. This creates profit and loss potential whenever betting on the future behaviour of this essential pricing variable. A point worth noting while continuing the analysis is that options do not directly fall into the affine-Q framework as neatly as variance swaps do. Giglio and Kelly (2017), however, relate a series of studies16 which show that variance and volatility swaps can be approximated rather accurately by options, with Black and Scholes at-the-money implied volatility replicating the price of a volatility swap. 
In particular, a more intuitive link between the two instruments can be seen in the fact that the VIX index is computed from a weighted average of the prices of S&P 500 index options across strikes, and its squared value approximately corresponds to the price of a variance swap on that same index. For this reason, the analysis of excess volatility in the term structure of options is not conducted on the option prices, but on their implied volatilities. They have the benefit of a direct relationship to the option price, and also of falling within the affine framework. Giglio and Kelly consider the two most liquid equity names and the two most liquid stock indices on the option market, in order to maximise liquidity and improve the quality of the observed implied volatilities. These are Apple (AAPL), Citigroup (C), STOXX 50 and DAX. The data is taken from IVY DB US and IVY DB Europe, from 1996 to 2014 and from 2002 to 2013 respectively. The longest maturity considered is 24 months, for which option trading is still liquid. Due to the comparability with variance swaps, the selected number of factors is also two. The most interesting results of the variance ratio calculations arise again for the longer-maturity instruments. The variance ratio of Apple implied volatility exceeds two, with a value of 2.01. The ratios of Citigroup take large values for the 18-month and 24-month maturities with OLS estimation (3.17 and 4.68 respectively), and concerning the two indices, STOXX 50 and DAX, the variance ratios always surpass one in the longer maturities. The largest ratios are obtained again for the eighteen-month and twenty-four-month maturities (1.68 and 2.27 for STOXX 50, and 1.68 and 2.31 for DAX).
15 Cherubini/Della Lunga (2007) p. 7.
16 Britten-Jones/Neuberger (2000), Jiang/Tian (2005), Carr/Lee (2009).
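The intuition stated earlier in this section, that a higher standard deviation raises the price of both calls and puts, can be checked numerically. The following is a minimal sketch using the standard Black and Scholes formula for European options on a non-dividend-paying asset; the spot, strike, interest rate, and maturity are illustrative assumptions and are not taken from the data of the study.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def black_scholes(spot, strike, rate, sigma, maturity):
    """Return (call, put) prices of European options without dividends."""
    vol_sqrt_t = sigma * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    discount = math.exp(-rate * maturity)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


# Illustrative (assumed) inputs: at-the-money option, one year to expiry.
for sigma in (0.10, 0.20, 0.30):
    call, put = black_scholes(100.0, 100.0, 0.01, sigma, 1.0)
    print(sigma, round(call, 4), round(put, 4))
```

Under these assumptions both prices increase with each step in volatility, while the difference between call and put stays fixed by put-call parity.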
4.3.3
Currency Options
The next objects of study of Giglio and Kelly (2017) are currency options. These options allow the holder to purchase from the issuer at the strike exchange rate a specified amount of call currency, or to sell to the issuer a specified amount of put currency, again at the strike exchange rate.17 They can thus be seen as instruments that allow one to trade exchange-rate volatility, just as equity options allow one to trade equity implied volatility. The two selected types of options are chosen on grounds of their high liquidity: they are written on EUR/USD and JPY/USD. These options are traded on the Chicago Mercantile Exchange (CME), but the data is collected from “OptionMetrics”. The longest liquid maturity considered is twenty-four months. The resulting variance ratios exceed one for both options under the estimated model at the twelve-month, eighteen-month, and two-year maturities. The highest values are again recorded at the end of the term structure, suggesting that the phenomenon of excess volatility grows together with time to expiry. The JPY/USD ratios take the values 1.67, 2.85, and 4.57 for the three maturities mentioned above, whereas the EUR/USD ratios are 1.22, 1.65, and 2.14. Finally, it is worth noting that the R² value of the JPY/USD model is the lowest among all instruments, at 98.5%, while that of the EUR/USD options is 99.8%, which is among the highest observed.
17 Wystup (2017) p. 4.
4.3.4
Government Bond Yields
U.S. Treasury bills are among the most liquid assets on the market. They are used very frequently as collateral in repurchase agreements (“repos”), traditionally with a very low haircut (a median of 2% according to Morgan Stanley (2018)). For this reason, they are one of the closest financial instruments, if not the closest, to the asset representing pure liquidity: cash. By altering the overnight rate, the Federal Reserve is able to impact the short end of the Treasury yield curve. The remaining part is a result of the equilibrium of market forces: demand and supply. The securities do not follow the affine model that has been used until now, but they are shown18 to be exponential-affine. In terms of model restrictions, it means that log-prices are linear functions of the underlying cash flows. As a result, the same estimation and testing methodology that was used up to this point can be applied to government bonds. Giglio and Kelly (2017) retrieve nominal daily zero-coupon bond yield data from 1985 to 2014, with maturities up to thirty years, and estimate a standard homoscedastic exponential-affine model for yields with K = 3. The variance ratios are reported for the maturities of fifteen, twenty, twenty-five, and thirty years. They are shown to exceed one in all cases, but are relatively small: 1.07, 1.20, 1.39, and 1.64. The R² value is the highest among all instruments, at 99.9%, meaning not only that the regression is able to fit the data accurately, but also that the price coefficients take values relatively similar to the ones estimated from the affine model, except for the very end of the term structure.
4.3.5
Inflation Swaps
An inflation swap is an agreement between two parties “to exchange known fixed interest rate payments for unknown payments linked to inflation […] over an agreed period”19. Inflation swaps are analysed for both the European and the U.S. market, and in both cases the payoffs are linked to the realised Consumer Price Index (CPI) inflation. The data has daily frequency, is retrieved from Bloomberg for the period from 2004 to 2014, and refers to the term structure with maturities up to 30 years. The number of factors is set to four, chosen so that the R² reaches at least 99%. This is the largest number of factors in the OLS models explored in the paper across derivative classes, and since the value of the adjusted R² is not known, the reader should interpret the results with a critical mind. The ratios could therefore have a more limited meaning than equal values obtained in a model with a lower, sometimes halved, number of regressors. Nevertheless, the variance ratios for both US and EU inflation swaps are large in absolute terms: 1.84, 3.37, 5.54, and 7.47 for US inflation swaps with maturities of fifteen, twenty, twenty-five, and thirty years respectively; 1.22, 1.74, 2.45, and 2.89 for the euro-linked instruments.
18 Giglio/Kelly (2017) p. 54.
19 Marcaillou (2016) p. 276.
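The concern about the unreported adjusted R² can be made concrete with the standard textbook adjustment for the number of regressors. The snippet below is only a sketch: the sample size of 2500 daily observations is a hypothetical figure, chosen as a rough order of magnitude for about ten years of daily data, and is not taken from the paper.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R-squared for a regression that includes an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


n_obs = 2500  # hypothetical number of daily observations
for k in (2, 4):
    print(k, adjusted_r2(0.99, n_obs, k))
```

With a sample of this assumed size, moving from two to four regressors lowers the adjusted measure by less than 0.0001, so the penalty for the additional factors is small relative to an R² of 99%; with much shorter samples the penalty would be correspondingly larger.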
4.3.6
Commodity Futures
A futures contract is a legal agreement for the future sale or purchase of a certain asset at a pre-specified price and at a pre-set date (delivery). A commodity future has, as can be inferred from the name, a commodity as its underlying, where a commodity is a hard asset that is recognised as such under the 1936 Commodity Exchange Act in the United States. The price of a futures contract is determined as the expectation under the risk-neutral measure of the future price of the underlying at the delivery date. This makes the modelling of the asset price fall neatly within the affine-Q framework. The number of factors necessary to reach a minimum R² of 99% is K = 2. The selected commodities are the most liquidly traded: crude oil and gold. The estimated variance ratios reach the highest values at the end of the term structure: at six months, they amount to only 1.01 for crude oil and 1.04 for gold futures, whereas they increase to 1.19 for both crude oil and gold at the twelve-month maturity, and to 1.63 and 1.53 respectively at the twenty-four-month maturity. This result is indicative of the comparable size of the restricted and unrestricted coefficients of both futures prices up until the very end of the term structure.
4.3.7
Credit Default Swaps
A Credit Default Swap (CDS) is a contract in which a protection buyer “pays a fee, the swap premium, to the protection seller in return for the right to receive payment”20 in the case of a credit event of a reference name (entity) with respect to a specific obligation. CDS are used to hedge the default risk of corporations and countries, and multiple instances of speculative behaviour manifested themselves during the European sovereign crisis. This phenomenon prompted a regulatory response in November 2012, with Regulation (EU) No 236/2012, which banned the ownership of naked sovereign CDS. Additionally, naked short selling of sovereign debt was banned, and a requirement was introduced for the disclosure of significant net short positions in sovereign debt.21 From the “MarkIt” platform, Giglio and Kelly (2017) recover daily CDS prices, which, like U.S. Treasury bond yields, are exponentially affine under the risk-neutral measure. In fact, the authors emphasise that CDS spreads can be approximately expressed as a linear function of bond yields. The choice of the names is made on the basis of liquidity criteria, with the two most liquid corporate names being Bank of America (BAC) and General Electric (GE). The data spans the years 2007 to 2014, and refers to maturities of one, two, three, five, seven, and ten years. The number of factors is K = 2, which is based on existing literature modelling the term structure of credit spreads and is sufficient to reach an R² of at least 99% for all names. The variance ratios obtained are most notable at the ten-year maturity for both names: 1.45 for GE and 1.38 for BAC. Two sovereign names are considered, again selected on the basis of liquidity: Russia and Brazil. The highest variance ratios are obtained for the ten-year maturity, 2.18 and 3.08 for the two countries respectively. The relatively long maturity of five years, on the other hand, only presents ratios of 1.14 and 1.19 respectively, indicating that the short-end prices used in the regression probably refer to maturities of multiple years.
The overall results of the analysis provide strong evidence for excess volatility in long-maturity claims in the OLS framework for the testing of the affine model.
20 Focardi/Fabozzi (2004) p. 680.
21 Retrieved from https://www.esma.europa.eu/regulation/trading/short-selling, last update 2019-05-02, accessed May 2nd 2019.
5
Chapter III: Evidence of Excess Volatility in the Eurozone Market
5.1
The Data
Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-658-37450-1_5.
The current chapter attempts to perform the analysis of Giglio and Kelly (2017) for instruments linked to the European market and, more specifically, to the Eurozone. The first step of the analysis revolves around variance swaps on two indices: Euro Stoxx 50 and DAX. The price data was collected from the Bloomberg terminal with the Bloomberg Data History (BDH) Excel function. The security name inputs are “SX5E Index” for the Euro Stoxx 50 and “DAX Index” for the DAX, whereas the fields are: VAR_SWAP_1M_LV, VAR_SWAP_2M_LV, VAR_SWAP_3M_LV, VAR_SWAP_5M_LV, VAR_SWAP_6M_LV, VAR_SWAP_9M_LV, VAR_SWAP_12M_LV, VAR_SWAP_18M_LV, and VAR_SWAP_24M_LV. Daily closing prices are recovered from the earliest available date, which is 4th November 2008 for both underlying indices. The last collected set of prices is for 15th April 2019.
The second set of derivative instruments is equity options. Again, the underlying indices with reference to the Eurozone market are the Euro Stoxx 50 and the DAX. The data collected consists of daily at-the-money Black-Scholes implied volatilities, from the earliest available date, 2nd January 2006, to 15th April 2019. The corresponding function fields are: 30DAY_IMPVOL_100.0%MNY_DF, 60DAY_IMPVOL_100.0%MNY_DF, 3MTH_IMPVOL_100.0%MNY_DF, 6MTH_IMPVOL_100.0%MNY_DF, 12MTH_IMPVOL_100.0%MNY_DF, and 18MTH_IMPVOL_100.0%MNY_DF. Data for longer maturities is not available, presumably because of a lack of liquidity. The name inputs are the same as for variance swaps.
The third derivative instrument is currency options, and the most liquid ones with respect to the euro are EUR/USD options. The name input for the BDH formula is “EURUSD Curncy”, and data availability starts from different dates depending on the maturity. For example, data for most of the shorter maturities (one day, one week, two weeks, two months, three months, six months, nine months, and one year) starts from January 1999, whereas the data for the four-month and eighteen-month maturities starts from January 2006. A set of common dates was retrieved for the widest possible set of maturities, starting from 4th January 2016 and ending 15th April 2019. The one-year maturity had to be excluded from the sample because a large set of dates at the end of the sample period was unavailable. The final set of considered maturities is one, two, three, four, nine, eighteen, and twenty-four months. Giglio and Kelly (2017) also analyse currency options, but their dataset spans from 2007 to 2014. The set of maturities considered in the paper is also different: six, twelve, eighteen, and twenty-four months. For this reason, the analysis is not redundant.
The final instrument considered is the only non-derivative: Treasury bonds. Instead of considering the U.S. government bond, this chapter focusses on the nearest European equivalent: the German Bund. Bund yields are also collected from Bloomberg for the entirety of the yield curve.
The corresponding “name” entries for the BDH function change with each maturity and are: GTDEM1Y govt, GTDEM2Y govt, GTDEM3Y govt, GTDEM4Y govt, GTDEM5Y govt, GTDEM6Y govt, GTDEM7Y govt, GTDEM8Y govt, GTDEM9Y govt, and GTDEM10Y govt. The field is “YLD_ANNUAL_MID” for all maturities, which corresponds to the daily, nominal mid-quote of the annualised yield on zero-coupon bonds. As for the currency options, different maturities have different available data time spans. The oldest recovered yields start from August 1990; however, for the sake of completeness in the analysis, the dataset goes from 2nd July 1997 to 15th April 2019, so as to encompass the largest possible number of maturities. The amount of available data is more limited than that for U.S. Treasury bonds because the Federal Republic of Germany in its current, reunified form did not exist until the autumn of 1990. However, the U.S. Treasury bond yields in Giglio and Kelly only go from 2002 to 2014, so the selected sample size for Bunds still seems more than sufficiently large.
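Because the maturities of an instrument become available on different dates, the series have to be restricted to a common set of dates before any regression is run. The following is a minimal sketch of that alignment step; the dictionary layout and the quoted values are hypothetical and do not reproduce the Bloomberg data or the spreadsheet procedure actually used.

```python
def common_dates(series_by_maturity):
    """Return the sorted dates on which every maturity has an observation."""
    date_sets = [set(series) for series in series_by_maturity.values()]
    return sorted(set.intersection(*date_sets))


def align(series_by_maturity):
    """Restrict every series to the common dates, in chronological order."""
    dates = common_dates(series_by_maturity)
    aligned = {
        maturity: [series[d] for d in dates]
        for maturity, series in series_by_maturity.items()
    }
    return dates, aligned


# Hypothetical implied volatilities keyed by ISO date (not Bloomberg data).
quotes = {
    "1M": {"2019-04-10": 5.1, "2019-04-11": 5.0, "2019-04-12": 4.9},
    "4M": {"2019-04-11": 5.6, "2019-04-12": 5.5},
    "18M": {"2019-04-11": 6.4, "2019-04-12": 6.3, "2019-04-15": 6.2},
}

dates, aligned = align(quotes)
print(dates)
print(aligned["18M"])
```

ISO-formatted date strings sort chronologically, which is why a plain `sorted` call is enough here. A maturity with many missing dates shrinks the common sample for all the others, which is the trade-off behind excluding the one-year currency option maturity.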
5.1.1
The EURO STOXX 50 Index
The Euro Stoxx 50 index enters the following analysis twice: as the underlying of both variance swaps and equity options. It was chosen because of its relevance in the Eurozone, as it includes fifty blue-chip stocks from eleven Eurozone countries: Austria, Belgium, Finland, France, Germany, Ireland, Italy, Luxembourg, the Netherlands, Portugal, and Spain. The index provider, STOXX, is owned by Deutsche Börse Group and describes the included companies on its website as leaders of the 19 economic “Supersectors” it selected. The index was first quoted in 1991 with a base value of 1000 points and is computed and updated every 15 seconds for the variants in EUR and in USD. It is also quoted in CAD, GBP and JPY, but these versions are only calculated at the end of each trading day. The composition is updated in September of every year, to ensure that it accurately represents the most liquid shares and those with the largest market capitalisation. In fact, futures and options written on the Euro Stoxx 50 are among the most liquid not only in Europe, but also in the world. Additionally, the fifty companies alone capture about 60% of the free-float market capitalisation of the Euro Stoxx Total Market Index, which is also trademarked by STOXX, and which accounts for approximately 95% of the free-float market capitalisation of the Eurozone. Within the index, the weights are attributed mainly on the basis of the market capitalisation of each company, but they also account for the sector and the country of incorporation, in order to provide as accurate a snapshot as possible of the economic structure of the Eurozone. Currently, nineteen companies are French, fourteen German, six Spanish, five Italian, four Dutch, one Belgian, and one Finnish.
5.1.2
The DAX Index
The DAX index is also used twice in the following analysis: as the underlying of variance swaps and of equity options. It was chosen because of its relevance in the European economy. If there is a need for data with the highest possible level of liquidity, derivatives written on the DAX are among those that best satisfy this requirement in western Europe. Germany is currently the largest and most powerful economy of the Eurozone, and this index is composed of the thirty largest blue-chip German companies that are traded on the Frankfurt Stock Exchange. The companies are selected on the basis of order book volume and market capitalisation and currently include BMW, Deutsche Bank, Allianz, and Lufthansa. Furthermore, they make up 75% of the overall turnover in equities on the Frankfurt Stock Exchange1 and 80% of the market capitalisation listed in Germany2. The DAX was created in 1988 with a base value of 1000 points and has been computed and updated every second since 2006 by the electronic trading venue Xetra, which is operated by Deutsche Börse. It is most comparable to the American Dow Jones Industrial Average due to the limited size of the selection it encompasses. Many of the companies included in the DAX are multinational in scope, and their behaviour can also impact global trends. Some of them also trade on North American stock exchanges, such as Allianz and BASF on the NASDAQ stock market, and Deutsche Bank on the New York Stock Exchange. Since the different time zones between Germany and the U.S. mean that the trading session on one exchange finishes before the other, the index is used by some to predict the direction that the U.S. market will take after the European ones have closed.
5.2
The Computations
The first step of the analysis is the estimation of ordinary least squares linear regressions for the prices, implied volatilities or yields (depending on the instrument) of the different claim maturities. For this purpose, the open-source software “Gretl” is used. For two-factor models, the first three maturities of the term structure are considered: the longest one is regressed on the shortest two and the model is tested for heteroscedasticity with the White test. Heteroscedasticity does not cause bias or inconsistency in the OLS estimators of the coefficients. It also does not affect the goodness-of-fit measures R² and adjusted R², as they depend on unconditional variances and not on conditional ones. On the other hand, t statistics, F statistics, confidence intervals, and the Lagrange Multiplier statistic are affected by heteroscedastic errors, that is, errors with non-constant conditional variance.3 The White test is based on the following equation, illustrated for the case of two regressors:4

$$\hat{u}^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \delta_3 x_1^2 + \delta_4 x_2^2 + \delta_5 x_1 x_2 + \text{error},$$

in which the null hypothesis is that all of the $\delta_i$ are zero, except for $\delta_0$. Low p-values lead to the rejection of the null hypothesis of homoscedasticity. The data rejects homoscedasticity for all the maturities of all aforementioned instruments, with p-values consistently lower than 5%. For this reason, the models are estimated with heteroscedasticity-robust standard errors, in order to recover a more reliable estimate of the p-value of each coefficient.
1 Eatwell et al. (1992) p. 585.
2 Deutsche Börse AG (2018) p. 6.
3 Wooldridge (2012) p. 268.
4 Wooldridge (2012) p. 279.
The following steps of the process are described for variance swaps, with a number of factors K = 2. They also serve as a good illustration for the other instruments, where the regression is run on the specific descriptors (implied volatilities or yields) mentioned in paragraph 5.1. The prices of the three-month maturity swaps are regressed on the variance swap prices of maturities one and two months. The coefficients thus obtained enter the vector $\beta_3 = (\beta_{3,1}, \beta_{3,2})$, which is then plugged into the following equation:

$$\beta_3 = b_3 B_{1:2}^{-1} = \mathbf{1}'\left[\rho^Q + (\rho^Q)^2 + (\rho^Q)^3\right]\left[\rho^Q \mathbf{1},\ \left(\rho^Q + (\rho^Q)^2\right)\mathbf{1}\right]^{-1}.$$
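To make the last equation concrete for K = 2, it can be written out in terms of the two diagonal entries of $\rho^Q$, denoted here $x_1$ and $x_2$. The expansion below is a sketch that assumes the matrix layout implied by the explicit equations of the script reported in this section (one row per factor, one column per regressor maturity); it is not taken from the original paper.

```latex
\begin{aligned}
b_3 &= \begin{pmatrix} x_1 + x_1^2 + x_1^3 & x_2 + x_2^2 + x_2^3 \end{pmatrix}, \qquad
B_{1:2} = \begin{pmatrix} x_1 & x_1 + x_1^2 \\ x_2 & x_2 + x_2^2 \end{pmatrix}, \\[1ex]
B_{1:2}^{-1} &= \frac{1}{x_2 - x_1}
\begin{pmatrix} \dfrac{1 + x_2}{x_1} & -\dfrac{1 + x_1}{x_2} \\[2ex] -\dfrac{1}{x_1} & \dfrac{1}{x_2} \end{pmatrix}, \\[1ex]
\beta_{3,1} &= \frac{(1 + x_2)(x_1 + x_1^2 + x_1^3) - (x_2 + x_2^2 + x_2^3)}{x_1 (x_2 - x_1)}, \\[1ex]
\beta_{3,2} &= \frac{(x_2 + x_2^2 + x_2^3) - (1 + x_1)(x_1 + x_1^2 + x_1^3)}{x_2 (x_2 - x_1)}.
\end{aligned}
```

These are two nonlinear equations in the two unknowns $x_1$ and $x_2$, which is why a numerical root finder is used to invert them.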
The only known parameter of the equation is $\beta_3$; $\rho^Q$, instead, is the diagonal matrix of unobservable factor loadings. The matrix has dimension K × K = 2 × 2, and the entries of the diagonal can be denoted as $\rho_1^Q$ and $\rho_2^Q$, or sometimes as x1 and x2. The equation is then reverse engineered to recover these two values. The computation was made in “Python” 2.7.15, with the indispensable use of the “NumPy”, “SymPy”, and “SciPy” packages. A few elements of the script are reported below.

    from scipy.optimize import fsolve
    import math

Next, the entries of the beta vector are defined:

    b1 = -0.360531  # case-specific values
    b2 = 1.35397
    ig = b1 - 1  # the initial guess for the value of the first rho

A function is then defined, with “fe” and “se” representing the explicit equations that the factor loadings (here x1 and x2) solve:

    def f(variables):
        (x1, x2) = variables
        fe = (((1 + x2)*(x1 + x1**2 + x1**3) - (x2 + x2**2 + x2**3))/(x1*(x2 - x1))) - b1
        se = ((x2 + x2**2 + x2**3 - (1 + x1)*(x1 + x1**2 + x1**3))/(x2*(x2 - x1))) - b2
        return (fe, se)

The function is then called with two initial guesses for the values, in order to aid the software in finding the correct solution.

    x, y = fsolve(f, (ig, -0.05))

The same procedure for K = 3 is reported in the next subsection. It is considerably more complex, as the inversion of the three-dimensional matrix $\left[\rho^Q \mathbf{1},\ \left(\rho^Q + (\rho^Q)^2\right)\mathbf{1},\ \left(\rho^Q + (\rho^Q)^2 + (\rho^Q)^3\right)\mathbf{1}\right]^{-1}$ requires the use of symbols within the programme syntax. The sub-matrix $\rho^Q$ also has dimension three. A manual computation of the explicit equations, and then their simplification, is prone to mistakes. It is therefore made with the use of the software. To continue the procedure for the two-factor setting, let us now consider equation (viii):
$$p_{t,j} = \alpha_j + \beta_j P_{t,1:K} + \nu_{t,j}.$$

The dimension of $P_{t,1:K}$ remains the same for all subsequent maturities, as it corresponds to the K-dimensional vector of prices chosen as regressors (K = 2). The vector $\beta_j$, on the other hand, changes with each maturity. In the unrestricted model, it is recovered directly through an OLS regression of $p_{t,j}$ on $P_{t,1:K}$. In the restricted model, it is in turn obtained by substituting the entries of $\rho^Q$, recovered in the previous computations, into equation (vii). The equation is extended to maturities j above K + 1 = 3. For j = 3, the restricted and the unrestricted coefficients coincide, as we are simply reversing the equation used to recover the diagonal elements of $\rho^Q$. For the following maturities, the number of terms in each entry of the vector $b_j$ increases with j, while the matrix $B_{1:K}^{-1}$ maintains its dimension. A particular detail adopted, not explicitly specified in the original paper but inferable, is to consider the maturities as consecutive, meaning that if the initial regressors are chosen as months one and two, all following values of j are specified in months, regardless of the existence of the corresponding time series. On the other hand, if they are chosen as weeks, the following maturities need to be stated in terms of weeks. For this reason, the vector for the inference on a maturity of one year corresponds to j = 12 months, j = 1 year, or j = 52 weeks depending on the selected short-end independent variables. Fortunately, the lack of data for one or more maturities is not a problem for the analysis, since the extrapolating algorithm is not recursive. A function is defined in “Python” to recover the vector $b_j$ for different maturities. It takes as input x1 and x2, the entries of the diagonal of $\rho^Q$ recovered in the previous step from the short-end maturities, and j, the desired maturity for the regression. It returns a vector of dimension 1 × 2 in the form $[x_1 + x_1^2 + x_1^3 + \dots + x_1^j,\ x_2 + x_2^2 + x_2^3 + \dots + x_2^j]$.

    import numpy as np

    def b_j(x1, x2, j):
        x1 = float(x1)
        x2 = float(x2)
        i = 1
        b1 = 0
        b2 = 0
        while i <= j:  # accumulate x + x**2 + ... + x**j for each loading
            b1 = b1 + x1**i
            b2 = b2 + x2**i
            i = i + 1
        return np.matrix([b1, b2])

The function is then called for the case-specific values of x1, x2, and j. Another function is defined in order to recover the matrix $B_{1:K}^{-1}$. It does not change with j and is then called for the specific values of x1 and x2.

    def matr(x1, x2):
        x1 = float(x1)
        x2 = float(x2)
        # entries of the closed-form inverse of the 2 x 2 matrix B
        a = (1 + x2)/(x1*(x2 - x1))
        b = (-1 - x1)/(x2*(x2 - x1))
        c = -1/(x1*(x2 - x1))
        d = 1/(x2*(x2 - x1))
        return np.matrix([[a, b], [c, d]])

The last step necessary in order to recover the vector of restricted coefficients is the multiplication of $b_j$ and $B_{1:K}^{-1}$:

    vector = b_j(x1, x2, j)
    matrix = matr(x1, x2)
    coefficient = vector*matrix

The final step of the analysis is the computation of the variance ratio. In order to find it, two components are necessary: the explained price variation under the restricted model and that under the unrestricted model. In both cases, the required formula is:

$$\sum_{k=1}^{K} \sum_{l=1}^{K} \tilde{b}_{j,k}\, \tilde{b}_{j,l}\, \hat{\sigma}_{k,l},$$

where the $\hat{\sigma}_{k,l}$ entries are the elements of the sample variance-covariance matrix of the short-end maturity prices that represent model regressors. They are taken under the physical probability measure P, so they are computed in Excel with the formulas =VAR.S and =COVARIANCE.S, which correspond to the unbiased sample estimators. The unrestricted explained price variation is recovered as:

    VR_U = beta_abs*VarCov*beta_abs.transpose()

where “beta_abs” is the row vector of the absolute values of the j-maturity coefficients obtained directly through linear regression. The restricted explained price variation, on the other hand, is obtained through the same formula but by taking as input the absolute value of the vector of “bootstrapped” restricted coefficients estimated from $\rho^Q$ through the affine model. The final variance ratio statistic is simply the unrestricted model explained price variation divided by the restricted model one. The script for the procedure with K = 3 is reported in Appendices A and B.
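The variance ratio described above can be sketched in a few lines without any matrix library. The code mirrors the double sum and the use of absolute values of the coefficients as stated in the text; the covariance matrix is a hypothetical placeholder, since the sample covariances are not reported, while the two coefficient vectors are the six-month Euro Stoxx 50 variance swap values from Table 5.1.

```python
def explained_variation(coefficients, cov):
    """Quadratic form sum_k sum_l b_k * b_l * sigma_kl."""
    size = len(coefficients)
    return sum(
        coefficients[k] * coefficients[l] * cov[k][l]
        for k in range(size)
        for l in range(size)
    )


def variance_ratio(beta_unrestricted, beta_restricted, cov):
    """Unrestricted over restricted explained variation, computed on the
    absolute values of the coefficients as described in the text."""
    abs_u = [abs(b) for b in beta_unrestricted]
    abs_r = [abs(b) for b in beta_restricted]
    return explained_variation(abs_u, cov) / explained_variation(abs_r, cov)


# Hypothetical sample covariance matrix of the one- and two-month prices.
cov = [[100.0, 95.0],
       [95.0, 100.0]]

# Six-month Euro Stoxx 50 variance swap coefficients from Table 5.1.
beta_u = [-0.81661, 1.76238]
beta_r = [-0.35569, 1.386]

print(variance_ratio(beta_u, beta_r, cov))
```

Because the covariance matrix is assumed, the printed number will not match the ratio reported in the table; the sketch only illustrates the mechanics. When the two coefficient vectors coincide, as they do at j = K + 1, the ratio equals one by construction.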
5.3
The Results
5.3.1
Variance Swaps
The heteroscedasticity-robust regressions, for both Euro Stoxx 50 and DAX variance swaps, yield statistically significant results at the 1% level for the coefficients on one-month and two-month prices. This result holds across the entire term structure, and it indicates that the null hypothesis of those coefficients taking value zero can be rejected. Naturally this has a positive effect on the reliability of the overall analysis, which is additionally strengthened by adjusted R² values of 0.995053 and 0.993906 for the three-month regressions of the two indices respectively. The analysed maturities are three, five, six, nine, twelve, eighteen, and twenty-four months. Along the term structure, the two coefficients maintain opposite signs: the first one negative and the second one positive. The same relationship holds for the restricted coefficients. The underlying matrix entries x1 and x2, on the other hand, are both negative. The variance ratios are larger for the Euro Stoxx 50 index, and overall slightly above those obtained in Giglio and Kelly (2017) with reference to the S&P 500 index. The results are reported in Table 5.1 and Table 5.2.
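One feature of the tables is that the restricted coefficients barely change beyond the six-month column. A possible explanation, sketched below, is that each entry of the vector b_j is a geometric sum in a loading whose absolute value is far below one, so the sum converges after a handful of terms, and the matrix that multiplies it does not depend on j. The loading used is the x1 value reported for the Euro Stoxx 50 variance swaps; the helper names are illustrative and not part of the original script.

```python
def loading_sum(x, j):
    """x + x**2 + ... + x**j, accumulated term by term."""
    return sum(x ** i for i in range(1, j + 1))


def loading_sum_closed(x, j):
    """Closed form of the same geometric sum (valid for x != 1)."""
    return x * (1.0 - x ** j) / (1.0 - x)


def loading_sum_limit(x):
    """Limit of the sum as j grows, valid for |x| < 1."""
    return x / (1.0 - x)


x1 = -0.0749909  # loading reported for the Euro Stoxx 50 variance swaps

for j in (3, 6, 12, 24):
    print(j, loading_sum(x1, j), loading_sum_closed(x1, j))
print("limit", loading_sum_limit(x1))
```

With a loading of this size the six-term sum already agrees with the limit to about six decimal places, which is consistent with restricted coefficients, and hence restricted explained variations, that are almost flat across the longer maturities.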
5.3.2
Equity Options
The regressions in the case of equity options are made on the term structure of Black-Scholes implied volatilities with a number of factors K = 2. The underlying names in the equity case are the two most liquid Eurozone indices: Euro Stoxx 50 and DAX. The heteroscedasticity-robust regressions for both options yield statistically significant results at the 1% level for the coefficients of one-month and two-month option implied volatilities. This result holds across the entire term structure, and it indicates that the null hypothesis of those coefficients taking value zero can be rejected. The adjusted R² values of the three-month regressions are 0.994309 and 0.996936 for the two names. The analysed maturities are three, six, twelve, and eighteen months. As for the variance swaps, along the term structure, the two coefficients maintain opposite signs: the first one negative and the second one positive. The same relationship holds for the restricted coefficients. The underlying matrix entries x1 and x2, on the other hand, are both negative. The variance ratios are larger for the DAX index implied volatilities. The overall results are reported in Table 5.3 and Table 5.4.
Table 5.1 Euro Stoxx 50 variance swap results

| EURO STOXX 50 | 3M | 5M | 6M | 9M | 12M | 18M | 24M |
|---|---|---|---|---|---|---|---|
| Unrestricted coefficients | | | | | | | |
| VAR_SWAP_1M_LV | −0.360531 | −0.694166 | −0.81661 | −0.949515 | −1.03848 | −1.17561 | −1.23754 |
| VAR_SWAP_2M_LV | 1.35397 | 1.66121 | 1.76238 | 1.85047 | 1.88295 | 1.95483 | 1.97277 |
| R² | 0.995057 | 0.974504 | 0.97156 | 0.935621 | 0.92639 | 0.90196 | 0.86435 |
| Adjusted R² | 0.995053 | 0.974484 | 0.97154 | 0.935573 | 0.92633 | 0.901885 | 0.86425 |
| x1 | −0.0749909 | | | | | | |
| x2 | −0.0695764 | | | | | | |
| Restricted coefficients | | | | | | | |
| VAR_SWAP_1M_LV | −0.360531 | −0.35556151 | −0.35569 | −0.35568 | −0.35568 | −0.35568 | −0.35568 |
| VAR_SWAP_2M_LV | 1.35397 | 1.38586784 | 1.386 | 1.385993 | 1.38599 | 1.385993 | 1.38599 |
| VR_U | 207.985019 | 396.5110988 | 476.638 | 563.3673 | 614.529 | 707.3542 | 744.826 |
| VR_R | 207.985019 | 214.415377 | 214.481 | 214.4762 | 214.476 | 214.4762 | 214.476 |
| VR ratio | 1 | 1.84926615 | 2.22228 | 2.626712 | 2.86526 | 3.298054 | 3.47277 |
Table 5.2 DAX variance swap results

| DAX | 3M | 5M | 6M | 9M | 12M | 18M | 24M |
|---|---|---|---|---|---|---|---|
| Unrestricted coefficients | | | | | | | |
| VAR_SWAP_1M_LV | −0.308081 | −0.474648 | −0.56804 | −0.72793 | −0.82275 | −0.87817 | −0.92553 |
| VAR_SWAP_2M_LV | 1.29816 | 1.41793 | 1.49202 | 1.6026 | 1.6537 | 1.63737 | 1.64502 |
| R² | 0.99391 | 0.977381 | 0.969778 | 0.947287 | 0.922751 | 0.884935 | 0.838433 |
| Adjusted R² | 0.993906 | 0.977364 | 0.969755 | 0.947247 | 0.922692 | 0.884848 | 0.83831 |
| x1 | −0.137837082 | | | | | | |
| x2 | −0.120242985 | | | | | | |
| Restricted coefficients | | | | | | | |
| VAR_SWAP_1M_LV | −0.308081 | −0.31360953 | −0.31485 | −0.31468 | −0.31468 | −0.31468 | −0.31468 |
| VAR_SWAP_2M_LV | 1.29816 | 1.36700372 | 1.36837 | 1.368184 | 1.368185 | 1.368185 | 1.368185 |
| VR_U | 165.5441824 | 231.5855991 | 275.2829 | 354.0139 | 400.7902 | 414.4722 | 433.4325 |
| VR_R | 165.5441824 | 180.5743641 | 181.143 | 181.0658 | 181.066 | 181.066 | 181.066 |
| VR ratio | 1 | 1.282494335 | 1.515196 | 1.949378 | 2.206948 | 2.282288 | 2.386693 |
Table 5.3 Euro Stoxx 50 option results

| Euro Stoxx 50 implied vol | 3M | 6M | 12M | 18M |
|---|---|---|---|---|
| Unrestricted coefficients | | | | |
| 30DAY_IMPVOL_100 | −0.30501 | −0.67438 | −0.82159 | −0.87959 |
| 60DAY_IMPVOL_100 | 1.2706 | 1.55064 | 1.57511 | 1.57269 |
| R² | 0.994313 | 0.975769 | 0.935464 | 0.89948 |
| Adjusted R² | 0.994309 | 0.975755 | 0.935426 | 0.899421 |
| x1 | −0.08018 | | | |
| x2 | −0.07397 | | | |
| Restricted coefficients | | | | |
| 30DAY_IMPVOL_100 | −0.30501 | −0.28897 | −0.28895 | −0.28895 |
| 60DAY_IMPVOL_100 | 1.2706 | 1.316602 | 1.316587 | 1.316587 |
| VR_U | 150.6702 | 308.1288 | 360.8219 | 379.15 |
| VR_R | 150.6702 | 155.9694 | 155.9636 | 155.9636 |
| VR ratio | 1 | 1.975572 | 2.313501 | 2.431016 |
Table 5.4 DAX option results

| DAX implied vol | 3M | 6M | 12M | 18M |
|---|---|---|---|---|
| Unrestricted coefficients | | | | |
| 30DAY_IMPVOL_100 | −0.35784 | −0.77516 | −1.01801 | −1.0947 |
| 60DAY_IMPVOL_100 | 1.32896 | 1.66286 | 1.79765 | 1.81767 |
| R² | 0.996938 | 0.982989 | 0.951588 | 0.921489 |
| Adjusted R² | 0.996936 | 0.982979 | 0.951559 | 0.921442 |
| x1 | −0.08561 | | | |
| x2 | −0.07859 | | | |
| Restricted coefficients | | | | |
| 30DAY_IMPVOL_100 | −0.35784 | −0.34283 | −0.34281 | −0.34281 |
| 60DAY_IMPVOL_100 | 1.32896 | 1.376805 | 1.376784 | 1.376784 |
| VR_U | 159.7468 | 342.3293 | 461.4237 | 495.3758 |
| VR_R | 159.7468 | 165.5165 | 165.5086 | 165.5086 |
| VR ratio | 1 | 2.068249 | 2.787913 | 2.993051 |
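Since the variance ratio is defined as the unrestricted explained variation divided by the restricted one, the last three rows of the option tables can be cross-checked against each other. The short script below does this for the figures of Table 5.3 and Table 5.4 as printed; the tolerance is an arbitrary choice that allows for the rounding of the published values.

```python
# (VR_U, VR_R, reported VR ratio) per maturity, copied from Tables 5.3 and 5.4.
reported = {
    "Euro Stoxx 50": [
        (150.6702, 150.6702, 1.0),
        (308.1288, 155.9694, 1.975572),
        (360.8219, 155.9636, 2.313501),
        (379.15, 155.9636, 2.431016),
    ],
    "DAX": [
        (159.7468, 159.7468, 1.0),
        (342.3293, 165.5165, 2.068249),
        (461.4237, 165.5086, 2.787913),
        (495.3758, 165.5086, 2.993051),
    ],
}


def max_discrepancy(rows):
    """Largest absolute gap between VR_U / VR_R and the reported ratio."""
    return max(abs(vr_u / vr_r - ratio) for vr_u, vr_r, ratio in rows)


TOLERANCE = 1e-5  # arbitrary allowance for rounding in the printed values

for name, rows in reported.items():
    gap = max_discrepancy(rows)
    print(name, gap, gap < TOLERANCE)
```

The same check can be applied to any of the other result tables by adding its rows to the dictionary.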
5.3.3
Currency Options
In the case of currency options, the regressions are made on the term structure of Black-Scholes implied volatilities with a number of factors K = 2, as in the case of equity options. The analysed options are those written on the most liquid currency pair concerning the European market: EUR/USD. The model is tested for heteroscedasticity (White test) and homoscedasticity is rejected. The coefficients are highly statistically significant for all maturities, with p-values computed on heteroscedasticity-robust standard errors and all below 1%. The adjusted R² value is 0.992044 for the three-month regression, and it remains above 0.9 for all maturities. The relevant points of the term structure are three, four, nine, eighteen, and twenty-four months. The absence of some of the usually considered maturities is explained in paragraph “The Data”, and is due to data collection issues. Once again, the coefficients of the one-month and two-month maturities are consistently negative and positive respectively, while the x1 and x2 values are negative. The results are closely comparable in size to those obtained by Giglio and Kelly with the data referring to the period from 2007 to 2014. With respect to their study, two additional points of the term structure, four and nine months, are considered. The overall results are reported in Table 5.5.

Table 5.5 EUR/USD option results

| EUR/USD | 3M | 4M | 9M | 18M | 24M |
|---|---|---|---|---|---|
| Unrestricted coefficients | | | | | |
| m1_IMPVOL_100 | −0.30969 | −0.49936 | −0.94318 | −0.98921 | −0.97346 |
| m2_IMPVOL_100 | 1.28772 | 1.45566 | 1.81582 | 1.78643 | 1.73143 |
| R² | 0.992049 | 0.98571 | 0.948387 | 0.91937 | 0.906494 |
| Adjusted R² | 0.992044 | 0.985702 | 0.948357 | 0.919323 | 0.90644 |
| x1 | −0.14707 | | | | |
| x2 | −0.12718 | | | | |
| Restricted coefficients | | | | | |
| m1_IMPVOL_100 | −0.30969 | −0.32136 | −0.31356 | −0.31356 | −0.31356 |
| m2_IMPVOL_100 | 1.28772 | 1.379343 | 1.370785 | 1.370786 | 1.370786 |
| VR_U | 28.3395 | 42.63505 | 85.45872 | 86.58828 | 82.25203 |
| VR_R | 28.3395 | 32.11212 | 31.49089 | 31.49096 | 31.49096 |
| VR ratio | 1 | 1.327694 | 2.71376 | 2.749624 | 2.611925 |
5.3.4 German Bunds
Giglio and Kelly perform the analysis on U.S. Treasury yields with a three-factor regression. The following regression, on the other hand, uses only two factors, for two reasons. The first is the high explanatory power that the two-factor model already possesses: its adjusted R² is 0.997856, whereas in the three-factor model it would increase to 0.998806. The difference is extremely small, while introducing the additional parameter reduces the number of available model restrictions. For this reason, as will be further illustrated in subparagraph 5.4.2, the two-factor model is preferred. The second reason is purely numerical. Neither “Python” nor “Matlab” is able to retrieve precise estimates of the elements x1, x2, and x3. The error message is “RuntimeWarning: The iteration is not making good progress, as measured by the improvement from the last five Jacobian evaluations”. The solvers stop after a given number of iterations, and the resulting estimates change widely with every alteration of the initial guesses, however minimal. The resulting coefficients thus swing from extremely large to extremely small values compared to their unrestricted counterparts, yielding unreliable variance ratios.

The two-factor model is then tested for heteroscedasticity (White test), which rejects homoscedasticity for all maturities. The coefficients are highly statistically significant for all maturities, with p-values computed on heteroscedasticity-robust standard errors and all below 1%. The adjusted R² remains above 0.95 for all maturities. The relevant points of the term structure are three, four, five, six, seven, eight, nine, and ten years. The variance ratios reach extremely high values. It is important to interpret them in the light of the much longer maturities with respect to the other instruments analysed thus far. The values are still less extreme than some of the results previously obtained by Giglio and Kelly (2017), although for different instruments, as they did not analyse Bunds. It is also worth noting, however, that they exceed the variance ratios of U.S. Treasuries. This is an important result in terms of the extent of excess volatility in the yields of what are considered the safest instruments on the European market. Further research on this point would be of interest. The overall results are reported in the following table (Table 5.6).
Table 5.6 German Bund results

Bund | 3Y | 4Y | 5Y | 6Y | 7Y | 8Y | 9Y | 10Y
Unrestricted coefficient, 1Y | −0.53154 | −0.96055 | −1.27395 | −1.51852 | −1.66899 | −1.77296 | −1.81448 | −1.82871
Unrestricted coefficient, 2Y | 1.54 | 1.96115 | 2.24578 | 2.48446 | 2.62033 | 2.70351 | 2.71597 | 2.70042
R² | 0.997857 | 0.993861 | 0.989164 | 0.982653 | 0.974828 | 0.966944 | 0.959697 | 0.955205
Adjusted R² | 0.997856 | 0.993859 | 0.989161 | 0.982647 | 0.974819 | 0.966933 | 0.959682 | 0.955189
x1 | −0.06215 | | | | | | |
x2 | −0.05844 | | | | | | |
Restricted coefficient, 1Y | −0.53154 | −0.52879 | −0.52796 | −0.52802 | −0.52802 | −0.52802 | −0.52802 | −0.52802
Restricted coefficient, 2Y | 1.54 | 1.563539 | 1.562672 | 1.562738 | 1.562733 | 1.562733 | 1.562733 | 1.562733
VR_U | 14.48155 | 28.77022 | 41.73007 | 53.9604 | 61.94668 | 67.46291 | 69.09495 | 69.05121
VR_R | 14.48155 | 14.77474 | 14.75083 | 14.75264 | 14.75251 | 14.75252 | 14.75251 | 14.75251
VR ratio | 1 | 1.947257 | 2.828997 | 3.657678 | 4.199061 | 4.572977 | 4.683605 | 4.68064
5.4 Potential Explanations
The results of the analysis of the Eurozone market confirm the findings of excess volatility in the literature of previous years, especially in Giglio and Kelly (2017). Additionally, in a number of the instruments highlighted in the previous paragraph, they show variance ratios that are larger in absolute size than those of the closest American equivalents. Before Giglio and Kelly (2017), the most widespread explanation had been to attribute the excess variance to an element that did not enter traditional models for share prices or interest rates based on the efficient markets hypothesis: time-varying discount rates. This is reasonable, as the assumption of constant discount rates was at the time the greatest vulnerability of those studies, since discount rates are not deterministic in real life. However, as mentioned in the subparagraph on the empirical results of Shiller (1981a), the behaviour of short-term interest rates would not have been able to explain variations of such magnitude. In fact, the ranges he obtains when computing how large the variations would have had to be in order to account for the observed volatility (at a 97.5% significance level, from −3.91 to 13.52% and from −8.16 to 17.27% for the two data sets) differ greatly from the historical variability shown over the century. This was nonetheless an interesting starting point, from which a comparison with Giglio and Kelly (2017) becomes necessary. The sheer size of the discount rate movements mentioned above is unrealistic, but it should not lead a reader to discard the explanation altogether. More likely, it points to a larger picture, one in which excess volatility and time-varying discount rates coexist.

Once the model is changed, and the object of analysis with it, discount rates can be filtered out of the analysis through the careful selection of derivatives whose prices can be traced back to an affine-Q specification. Under the risk-neutral measure, the main driver of changes in the discount rate, namely risk premia, is implicitly accounted for, meaning that its fluctuations do not impact the term structure of prices. For this reason, the results can be much more reliably ascribed to other sources of excess volatility. The following potential explanations are taken from, and then expanded upon, Giglio and Kelly (2017), who explored potential sources of violation of the affine-Q model. These sources range from potential misspecifications to new models of investor behaviour altogether. They are not intended to be exhaustive, as the literature on this subject can reasonably be expected to expand in the coming years. They are, nonetheless, interesting starting points for future research.
5.4.1 Long-memory Cash Flow Dynamics
The outcome of the tests for excessive volatility in long-term derivative prices suggests that the affine-Q specification systematically underestimates the factor loadings estimated from data regressions, meaning that the impact of changes in short-end prices is much greater than expected. A natural question is whether this impact is not only greater in size than implied by the model, but also longer-lasting in its effect on prices than otherwise considered. Long-memory processes are also referred to as exhibiting long-range dependence. The intuition behind their formal definition is that a process characterised by long memory exhibits a high level of dependence between points in a time series that are far apart. The benchmark used to decide whether the level is high is exponential decay: if the statistical dependence declines at a slower pace than exponential decay, then the dynamics are said to have long-range dependence, or long memory.

The data appear to be stationary under the risk-neutral measure, meaning that their mean and variance do not change over time. This can, at first glance, seem incompatible with long-memory cash flow dynamics, and thus lead to an exclusion of this hypothesis. However, the two are not necessarily incompatible. The cash flows could be stationary under the martingale measure but revert to the mean more slowly than a traditional autoregressive process would allow. In that case, it would be necessary to test whether the estimated affine-Q model provides a sufficiently close approximation of the real cash flow behaviour. Giglio and Kelly (2017) test this possibility by specifying an autoregressive fractionally integrated moving average (ARFIMA) model, which is able to capture the slow mean reversion of a cash flow process with long memory. They refer to Granger and Joyeux (1980), in which the ARFIMA class was first proposed. The models allow for the choice of a parameter quantifying long-range dependence; if its value falls in the range (0, 0.5), then the process is both fractionally integrated and stationary. The ARFIMA setting does not allow for a tractable expression of derivative prices, so Giglio and Kelly simulate their term structure for different values of the dependence parameter and the autoregression coefficient. A variance ratio test is then constructed to analyse the behaviour, under the affine-Q framework, of the simulated prices exhibiting the long-memory feature. The authors conclude that the affine model with K = 2 is an “accurate enough approximation of the ARFIMA process that the misspecification can go undetected”.⁵

5 Giglio/Kelly (2017) p. 28.
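The contrast between exponential decay and long-range dependence can be made concrete with a short Python sketch. It compares the autocorrelation function of a stationary AR(1) process with that of fractionally integrated noise, ARFIMA(0, d, 0), for which a closed form in terms of gamma functions is known. The parameter values (0.9 for the autoregressive coefficient, 0.3 for the dependence parameter) and the function names are assumptions chosen for illustration only; this is not the simulation design of Giglio and Kelly (2017).

```python
import math


def ar1_autocorrelation(lag, phi):
    """Autocorrelation of a stationary AR(1) process: exponential decay phi**lag."""
    return phi ** lag


def arfima_0d0_autocorrelation(lag, d):
    """Autocorrelation of fractionally integrated noise, ARFIMA(0, d, 0).

    Closed form: Gamma(1-d) * Gamma(lag+d) / (Gamma(d) * Gamma(lag+1-d)),
    evaluated through log-gamma to stay numerically stable at large lags.
    Valid here for 0 < d < 0.5 (stationary long-memory range).
    """
    if not 0.0 < d < 0.5:
        raise ValueError("d must lie in (0, 0.5)")
    log_rho = (
        math.lgamma(1.0 - d)
        + math.lgamma(lag + d)
        - math.lgamma(d)
        - math.lgamma(lag + 1.0 - d)
    )
    return math.exp(log_rho)


# Hypothetical parameter values, for illustration only.
phi, d = 0.9, 0.3
for lag in (1, 10, 50, 100):
    print(lag, ar1_autocorrelation(lag, phi), arfima_0d0_autocorrelation(lag, d))
```

At short lags the AR(1) process is the more persistent of the two, while at long lags its autocorrelation becomes negligible and that of the fractionally integrated process decays only hyperbolically.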
5.4.2 Absence of Relevant Factors
A critical point of the analysis of this and the previous chapter is the choice of the number of factors to include in the regression. Wherever feasible, that choice was made based on pre-existing literature. When no pertinent precedent was found, the number was selected so as to reach the target explanatory power (an R² of at least 99%, as set in Giglio and Kelly (2017)) with the minimum possible number of factors. By construction of the unrestricted model, the inclusion of all term structure prices would inevitably bring all variance ratios to one, but that would render the entire analysis meaningless. Naturally, adding parameters to a model always raises its explanatory power, at times even just by chance, but in this analysis the increase in factors provides a minimal accuracy benefit in percentage terms. For example, the R² of the variance swap price regression would rise from 99.6% with K = 2 to 99.9% with K = 3. The size and significance of the resulting variance ratios would essentially remain the same, corroborating the authors’ belief that the benefit is minimal. As an increase in regressors would reduce the number of valuable cross-equation restrictions, the trade-off is not deemed worthwhile.
5.4.3 Non-linearities
By its very definition, the affine framework requires the assumption of cash flow linearity under the risk-neutral measure. This is, however, not necessarily true in reality. Reality rarely adheres perfectly to models, but academia continually needs to seek a balance between the accuracy of a model and its tractability. The affine-Q specification allows the development of explicit, although not straightforward, formulas for the derivation of variance ratios. Models for non-linear cash flow dynamics do not. It is nonetheless worthwhile to study whether, if the affine model were indeed misspecified, the difference between real and model-implied variance ratios would be substantial. In order to represent this scenario, Giglio and Kelly (2017) make use of smooth transition autoregressive (STAR) models, in particular in the so-called “logistic” specification. Therein, cash flows follow the non-linear process:

$$x_t = \rho\, x_{t-1}\left[1 - \left(1 + e^{-\gamma (x_{t-1} - c)}\right)^{-1}\right] + (1 - \rho)\, x_{t-1}\left(1 + e^{-\gamma (x_{t-1} - c)}\right)^{-1} + \varepsilon_t,$$

which is determined by the parameters ρ and γ, which govern the non-linearity, together with the location parameter c of the logistic transition. Prices of claims on STAR model-driven cash flows are not tractable.⁶ They are therefore computed via simulation by adhering to the no-arbitrage conditions. Variance ratios are then constructed using the affine restricted and unrestricted processes on the simulated prices, in a procedure that parallels that of subparagraph 5.4.1. The results show that the variance ratios are almost identical to one in all cases, which indicates that the test built on affine restrictions is also able to represent correctly the prices of claims on non-linear cash flows. In fact, since the prices are obtained by imposing no-arbitrage conditions, the variance ratio computed on restrictions obtained directly from a non-linear model, instead of the affine model, would be one. The same analysis is finally performed on even more complex STAR models, but the accuracy of the affine variance ratios is still high. For this reason, even if the cash flow dynamics of the market data used in the study were proven to be drastically non-linear in the forms accounted for by STAR models, the inference based on the affine variance ratios would remain valid.
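A minimal simulation of the logistic STAR recursion above may help to visualise how the persistence of the process shifts between ρ and 1 − ρ depending on the position of the previous cash flow relative to c. The sketch below is only an illustration under assumed parameter values (ρ = 0.9, γ = 2, c = 0, Gaussian shocks with standard deviation 0.1); the function names are hypothetical, and the code does not reproduce the pricing simulation of Giglio and Kelly (2017).

```python
import math
import random


def logistic(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def star_step(x_prev, rho, gamma, c, eps):
    """One step of the logistic STAR recursion.

    The weight g moves smoothly from 0 to 1 as x_prev rises above c, so the
    effective autoregressive coefficient moves from rho towards (1 - rho).
    """
    g = logistic(gamma * (x_prev - c))
    return rho * x_prev * (1.0 - g) + (1.0 - rho) * x_prev * g + eps


def simulate_star(n_steps, x0, rho, gamma, c, sigma, seed=0):
    """Simulate a path of length n_steps + 1 with Gaussian shocks (assumed)."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        path.append(star_step(path[-1], rho, gamma, c, rng.gauss(0.0, sigma)))
    return path


# Hypothetical parameter values, for illustration only.
path = simulate_star(n_steps=200, x0=1.0, rho=0.9, gamma=2.0, c=0.0, sigma=0.1)
```

With γ = 0 the transition weight is constant at one half and the recursion collapses to a linear autoregression, which is a convenient sanity check on the implementation.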
5.4.4 Potential Mispricing
If the affine model is assumed to be correct, then the existence of such high variance ratios implies that the market temporarily misprices instruments, as substantial increases in short-maturity prices lead to higher prices in long-maturity claims than predicted by the model. Conversely, downward swings in the short end of the term structure bring prices in the long end down to lower values than the affine model would forecast. The link between mispricing and excess volatility can be confirmed if it is possible to show that a trading strategy which exploits deviations from the affine specification results in large trading profits relative to the risk that is taken on. A possible strategy is based on the following affine model representation:⁷

$$p_{t+n,N} = \alpha_N + \beta_N^{\top} P_{t+n,1:K}.$$

The left-hand side of the equation can be recovered by purchasing a claim with maturity N + n, which at time t + n has value $p_{t+n,N}$ and has produced the cash flows $x_{t+1}, \dots, x_{t+n}$. The right-hand side of the equation can be replicated by investing the present value of $\alpha_N$ at time t at the risk-free rate, so that at time t + n it has value $\alpha_N$; additionally, the claims corresponding to the vector $P_{t,n+1:n+K}$ should be bought according to the proportions in $\beta_N$, so that their value at time t + n is $\beta_N^{\top} P_{t+n,1:K}$. Finally, the cash flows $x_{t+1}, \dots, x_{t+n}$ can be ensured by purchasing $1 - \beta_N^{\top}\mathbf{1}$ shares of the cumulative claim with maturity n, which has price $p_{t,n}$ and zero residual value at time t + n. Model violations would lead to different values between the two sides already at time t, which could be exploited by taking a long position in the undervalued side and a short position in the overvalued one.

This strategy allows investors to recognise temporary deviations from the model, but there is an important element of uncertainty linked to whether the deviation will revert to the model-implied value, and whether it will temporarily increase even further before doing so. This limitation is highlighted by the authors, who also underline the role played by transaction fees and margin requirements. By accounting for the necessary collateral amount, bounds are provided on the size of the arbitrage position in terms of traded units. The authors proceed to perform the analysis on historical data and obtain Sharpe ratios consistently higher than one. However, they conclude that the persistence of high variance ratios may be explained by limits to arbitrage in the form of transaction fees, infrequent profit opportunities, and the long holding periods required to cover the cost of trading. Finally, the opportunity cost to traders would be substantial, as capital would have to be kept uninvested while awaiting sufficiently profitable arbitrage opportunities.

6 Giglio/Kelly (2017) p. 31.
7 Giglio/Kelly (2017) p. 32.
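The three legs of the replicating portfolio described above can be written down as a small Python sketch. All inputs are hypothetical numbers chosen for illustration, the function and argument names are assumptions, and transaction fees and margin requirements, which the authors stress, are ignored here.

```python
def replication_cost(alpha_n, beta_n, claim_prices_t, p_t_n, discount_factor):
    """Time-t cost of replicating the long-maturity claim under the affine model.

    alpha_n          intercept of the affine representation
    beta_n           loadings on the K reference claims
    claim_prices_t   time-t prices of the claims with maturities n+1, ..., n+K
    p_t_n            time-t price of the cumulative claim with maturity n
    discount_factor  time-t value of one unit paid at t+n (assumed known)
    """
    riskless_leg = discount_factor * alpha_n
    claims_leg = sum(b * p for b, p in zip(beta_n, claim_prices_t))
    cash_flow_leg = (1.0 - sum(beta_n)) * p_t_n
    return riskless_leg + claims_leg + cash_flow_leg


def trade_direction(market_price, cost, tolerance=1e-9):
    """Go long the undervalued side and short the overvalued one."""
    gap = market_price - cost
    if gap > tolerance:
        return "short claim, long replicating portfolio"
    if gap < -tolerance:
        return "long claim, short replicating portfolio"
    return "no trade"


# Hypothetical inputs, for illustration only.
cost = replication_cost(
    alpha_n=0.0,
    beta_n=[-0.5, 1.5],
    claim_prices_t=[10.0, 12.0],
    p_t_n=5.0,
    discount_factor=1.0,
)
signal = trade_direction(market_price=13.5, cost=cost)
print(cost, signal)
```

In this toy case the long-maturity claim trades above the cost of its replication, so the sketch signals a short position in the claim against a long position in the replicating portfolio.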
5.5 Natural Expectations
Traditional models of investor behaviour have moved away from psychological drivers such as the Keynesian “animal spirits” (Keynes, 1936) and towards a more neatly and elegantly describable assumption of rational actors who behave exclusively on the basis of available information. This is justified by a preference for frameworks that can be represented formally, as opposed to less tractable models of irrational behaviour that are more exposed to critique. However, the necessary trade-off is the inability of rational models to explain empirically observable phenomena such as “bubbles”, the cyclical nature of credit and investment availability, and the excess volatility documented in the paragraphs above.

An interesting evolution, on the other hand, consists of quasi-rational models, which allow for a form of “extrapolative bias”. According to it, actors are excessively reliant on the most recent performance of the market whenever forecasting the future, instead of making use of all information available to them at a given point in time. One such example is the class of “natural expectations models”, which have been the subject of only a limited number of studies and which represent investor pricing models as an average of a rational and an intuitive component. This twofold intuition is documented in multiple economic studies across the recent literature, showing that, for example, stock market performance drives the equity allocation of an investor with weights that decline linearly from the present to their year of birth, as documented in Malmendier and Nagel (2011). Additionally, the extrapolative bias does not affect small investors only: wealthy and experienced individuals also appear to base their forecasts on their past record of performance far more than is rationally justifiable, as shown in Vissing-Jørgensen (2003). Furthermore, during “bubbles” many market players exhibit trend-chasing behaviours. This was documented both during the “dot com” bubble of the early 2000s and the U.S. housing bubble leading up to the 2007 crash. Excessive extrapolation does not only concern asset prices, but also expectations about inflation and unemployment rates, which appear too optimistic at the onset of a recession and excessively pessimistic at its end, as shown in Tortorice (2012).

The dynamics of many macroeconomic time series, however, tend to be hump-shaped over long time horizons,⁸ meaning that they tend to mean-revert in the long run while exhibiting high reactivity to shocks in the short run. In this case, natural expectations will lead agents to underestimate the mean reversion in the series, as estimates based on the short end will fail to identify it. On the other hand, they will rely excessively on an optimistic or pessimistic outlook depending on whether recent news has been positive or negative. The main idea is thus that investors form their expectations, and therefore price assets, in a way that is partly inconsistent with the true data generating process. If the cash flow dynamics are assumed to be autoregressive of order two, they can be represented as:

$$x_{t+1} = \alpha x_t + \beta x_{t-1} + \nu_{t+1}.$$

An investor with rational expectations will adhere to this model. By contrast, an investor with completely intuitive expectations could represent those same dynamics in terms of growth,⁹ $\Delta x_{t+1} = x_{t+1} - x_t$:

$$\Delta x_{t+1} = \phi\, \Delta x_t + \varepsilon_{t+1}.$$

8 Fuster et al. (2010) p. 72.
9 Fuster et al. (2010) p. 71.

The natural expectations model can then be computed as an average, weighted by a parameter λ to be calibrated on the data, of the rational and the intuitive expectations models. Whereas in the affine framework the formulas employed to recover expectations do not change across the term structure, in the natural setting they do, by virtue of the averaging process. They are the result of balancing two models which involve contradictory estimates of persistence and which, taken individually, both fall within the affine framework. The inconsistency arises from the fact that forecasts are averaged after expectations have been formed under the two models with different degrees of persistence. If the parameters α, β, φ, λ are calibrated on the data by minimising the distance between the factor loadings estimated from unconstrained regressions and those implied by the natural expectations model, the parameter values obtained in Giglio and Kelly (2017) are similar to those obtained in Fuster et al. (2010). This happens although they are obtained from different datasets: variance swaps in the former case, and macroeconomic variables such as real U.S. GDP and unemployment in the latter. When the model is specified with those calibrated parameters, the variance ratios it generates, in a procedure equivalent to that of this and the previous chapter, are very close to those observed in the affine setting. In essence, if the true data generating process is assumed to be the natural expectations one instead of the rational one, meaning that cash flow dynamics are expected to behave with an excessive degree of persistence assigned to values closer in time to the present, the variance ratios obtained with the same process as in the affine model are consistent with the data.
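The averaging of the two forecasting rules can be sketched in a few lines of Python. The code assumes that λ is the weight on the rational forecast, so that λ = 1 is purely rational and λ = 0 purely intuitive; this convention, the parameter values and the function names are assumptions made for illustration and are not taken from the calibrations in Giglio and Kelly (2017) or Fuster et al. (2010).

```python
def rational_forecast(x_t, x_tm1, alpha, beta, horizon):
    """h-step forecast under x_{t+1} = alpha * x_t + beta * x_{t-1} + nu_{t+1}."""
    prev, curr = x_tm1, x_t
    for _ in range(horizon):
        prev, curr = curr, alpha * curr + beta * prev
    return curr


def intuitive_forecast(x_t, x_tm1, phi, horizon):
    """h-step forecast under Delta x_{t+1} = phi * Delta x_t + eps_{t+1}."""
    growth = x_t - x_tm1
    level = x_t
    for _ in range(horizon):
        growth = phi * growth  # expected growth decays geometrically
        level += growth        # expected level accumulates expected growth
    return level


def natural_forecast(x_t, x_tm1, alpha, beta, phi, lam, horizon):
    """Weighted average of the two forecasts (lam = weight on the rational one, assumed)."""
    rational = rational_forecast(x_t, x_tm1, alpha, beta, horizon)
    intuitive = intuitive_forecast(x_t, x_tm1, phi, horizon)
    return lam * rational + (1.0 - lam) * intuitive


# Hypothetical parameter values, for illustration only.
for h in (1, 2, 5, 10):
    print(h, natural_forecast(1.0, 0.0, alpha=1.2, beta=-0.3, phi=0.5, lam=0.5, horizon=h))
```

Because the averaging is applied to the forecasts at each horizon, the implied loading on current cash flows is no longer a single geometric progression across horizons, which is the feature the text identifies as the source of the inconsistency along the term structure.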
This framework is able to produce excess volatility consistent with that found in the affine models, although the idea behind this behavioural approach was not conceived specifically for application to term structures. Furthermore, if λ = 1 or λ = 0, corresponding to purely rational or purely intuitive expectations, the variance ratios for variance swaps with a maturity of 24 months take value one, as under the affine setting. These are ground-breaking results for the understanding of how expectations are formed and for the question of whether this class of models could become a credible alternative, or at least addition, to rational expectations in the modern academic literature.

One final point worth mentioning is the breadth of the scope of extrapolative expectations, as they can apply to frameworks different from the natural one, in which they are also able to fit the pattern of excess volatility. However, not all of them are able to do so, as three features in particular are essential. First, the framework must satisfy a linear factor structure; secondly, it must diverge from the affine representation of factor loadings at different maturities as a geometric progression; and thirdly, it must result from the averaging of different models. It is by virtue of this averaging that the model for the calculation of expectations changes along the term structure and generates internal inconsistencies in the data. The shorter maturities are predominantly determined by the true, rational cash flow dynamics, whereas the longer ones are increasingly subject to the influence of the persistent, intuitive dynamics. It is finally worth noting that the spectrum of potential extensions includes a variety of objects of extrapolation: from cash flows, as in the example above, to consumption, productivity, risk, asset prices, and returns. This enables a widespread expansion of the application of natural expectations to a multitude of fields and phenomena.
6 Conclusions
© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2022. A. Santini, Excess Volatility in the Term Structure of Interest Rates, in Share Prices and in Eurozone Derivatives, BestMasters, https://doi.org/10.1007/978-3-658-37450-1_6

The rational expectations model is the theoretical background that acts as a basis for the findings of excess volatility in studies revolving around both share prices and interest rates. It describes prices at any point in time as reflecting all the information that is available to the public, and long-term rates as dependent on the series of all future expected one-period short-term rates, conditioned on the information known at a given moment. Such information includes macroeconomic factors, such as changes in the monetary base, in fiscal policy, in factor prices and in expectations about inflation. Generalisations are also made to identify the many commonalities between the formulas used in the universe of interest rates and in that of stock prices. Then, in the context of rates, different weights are attributed to the expectations of short-term rates with different maturities, with a lower weight being given to those with a more distant maturity and a greater one to those with a nearer maturity. When the ex-post rational long-term interest rates are computed, which correspond to the value that long-term rates should have taken under the rational expectations model given the realised short-term rates, there is an observable difference in smoothness and thus in volatility. Bounds are then derived linking the standard deviations of real and ex-post rates, and the restrictions are analysed empirically in terms of bond returns. The sample volatility is revealed to violate all of the model-imposed restrictions. Concerning the predictive power of the rational expectations model, Shiller (1979) concludes that such traditional models severely underestimate the actual volatility realised in long-term rates.

Once the analysis of excess volatility is extended to the universe of share prices, the starting point is again the efficient markets model. Therein, the real price at the beginning of a period is given by the sum of expected future dividends, each multiplied by a discount factor with a constant rate. Upper bounds are then identified for the standard deviation of the innovation of price expectations for given levels of volatility in dividends. In the empirical samples, all restrictions on price volatility are violated, meaning that the arrival of new information about future real dividends cannot explain the volatility of stock prices. Furthermore, the size of the mismatch is so great that it cannot be attributed to reasons other than the failure of the efficient markets model itself. Weaknesses of the tests of excess volatility employed are identified, and they include the fundamental role played by the choice of the independent variables within the efficient market models, the impact of negative interest rates, and a potential mathematical bias in the variance bounds. They are all interesting areas of potential further study, but they do not prove able to bring the observed excess volatility to zero.

Possible explanations of the phenomenon have been explored, concerning either alternative models or elaborations that would allow academia to still consider the efficient-market hypothesis as valid. Among them are fads models such as Campbell and Kyle (1986), which isolate a noise process in the movement of the S&P 500 index that accounts for over 25% of the overall volatility. However, no explicit answer is obtained on whether such noise is primarily a result of market fads, and the analysis is not extended to the explicit study of excess volatility. One of the most accredited explanations for the phenomenon is the change in expected real interest rates, which are taken as constant in the models. However, it is shown that the standard deviation of real discount rates would need to take historically very unrealistic values in order to account for the discrepancy in the results. The efficient markets model could remain valid if the observed movements of real dividends around their long-run growth path were far smaller than the changes feared by the market across history.
Alternatively, investors might see some of these movements as fundamental, as opposed to transitory, and as the consequence of an alteration in a company’s retention policy. This can be translated, in mathematical terms, into the possibility of the dividend process being non-stationary, with changes in its mean and variance being linked to investors’ beliefs in the variation of a company’s fundamentals. However, from an empirical analysis, Shiller (1983) infers more stationarity in the trend than either the “transitory” or the “fundamental” hypothesis would suggest. Additionally, it is proposed that efficient market models could also allow for time-varying term premia, which might account for the excess volatility. One final potential explanation is the “Peso” problem, which could be compatible with the efficient-market hypothesis, instead of necessarily implying that it does not hold. Shiller (1981b) proposes an extension of a model of efficient markets accounting for a possible disastrous event. Future developments are possible revolving around the elaboration of a framework comprising several, if not all, of the aforementioned explanations.

Then the violation of internal consistency conditions implied by standard models for the prices of financial assets of different maturities is analysed. Giglio and Kelly (2017) define as “standard” any model in which, under the risk-neutral pricing measure, the cash flows are determined by a vector autoregression. The framework rests on no-arbitrage relations between claims, which imply restrictions on the covariation between prices of different maturities. This setting differs from the EMH models as the problem is shifted under the risk-neutral pricing measure, which incorporates movements in the main drivers of changes in discount rates: risk premia. This ensures that findings of excess volatility in the behaviour of derivative prices cannot be ascribed by and large to non-constant discount rates, as had previously been done. In order to build a test for excess volatility, a restricted and an unrestricted model are specified, where in the former the price coefficients follow a geometric progression in ρ^Q. The sequence of unrestricted coefficients is recovered directly by regressing each claim price onto a fixed number of claims with shorter maturities. A variance ratio statistic is then constructed by dividing the explained variance of the unrestricted model by the explained variance of the restricted model. The null hypothesis is for the ratio to equal one, meaning that the restricted model captures the behaviour of the term structure well. However, the variance ratio exceeds one in all asset classes, meaning that the prices of claims with longer maturities overreact to changes in the prices of claims with a short maturity, relative to affine dynamics.
The results of the analysis of the Eurozone market confirm the findings of excess volatility; additionally, they show an absolute size of variance ratios that is larger than that obtained from the most comparable American equivalents. Since the explanatory power of the resulting unrestricted models reaches almost 100% for all term structures, the data are extremely well described in this setting, but the excessive variance ratios imply that the actual coefficients differ from those obtained with the restricted affine model.

Finally, potential sources of violation of the affine-Q model are analysed, as was done for the models of the first chapter. These sources range from possible misspecifications to new frameworks of investor behaviour altogether. The first is the presence of long memory in the cash flow process, which could be stationary under the martingale measure but revert to the mean more slowly than allowed for in a traditional autoregressive process. Giglio and Kelly (2017) test the hypothesis and conclude that the affine model is a sufficiently accurate approximation of the alternative ARFIMA process, because the misspecification goes undetected even in the case of assumed true long-memory dynamics. The second source is the potential insufficiency of factors, but they are chosen in order to balance the trade-off between a higher explanatory power of the model and a reduction in the number of valuable cross-equation restrictions for the variance tests. The third is non-linearity in cash flow dynamics, but affine models are also able to represent correctly the prices of claims on non-linear cash flows. Finally, if the affine model is assumed to be correct, then the existence of high variance ratios would imply that the market temporarily misprices instruments after substantial swings in volatility. The link between mispricing and excess volatility is confirmed by the trading strategy proposed in Giglio and Kelly (2017), which exploits deviations from the affine specification and results in substantial trading profits relative to the risk that is taken on. However, the persistence of high variance ratios may be explained by limits to arbitrage in the form of a number of market frictions.

Finally, the natural expectations framework is explored as a potential solution to the inability of rational models to explain a variety of empirically observable phenomena. Within this setting, actors are excessively reliant on the most recent performance of the market whenever attempting to forecast the future, instead of making use of all information available to them at a given point in time. Investor pricing models then become an average of rational and intuitive components. If the true data generating process is assumed to be of the natural expectations kind, meaning that cash flow dynamics are expected to behave with an excessive degree of persistence assigned to values closer in time to the present, the variance ratios obtained are consistent with the data.
These results revolutionise the understanding of how expectations are formed, and this class of models should be further explored as an attractive potential alternative, or at least addition, to the rational expectations framework in the modern financial literature.
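As an illustration of the long-memory argument discussed earlier in this chapter, the following minimal sketch compares how quickly autocorrelations decay in a traditional AR(1) process and in a fractionally integrated ARFIMA(0, d, 0) process. It is not taken from Giglio and Kelly (2017): the function names, the value d = 0.4 and the choice of matching the two processes at lag 1 are assumptions made purely for illustration. The ARFIMA autocorrelation uses the standard closed form for 0 < d < 0.5.

```python
import math


def ar1_acf(phi: float, lag: int) -> float:
    """Autocorrelation of a stationary AR(1) process at the given lag."""
    return phi ** lag


def arfima_acf(d: float, lag: int) -> float:
    """Autocorrelation of an ARFIMA(0, d, 0) process at the given lag.

    Uses rho(k) = Gamma(1-d) * Gamma(k+d) / (Gamma(d) * Gamma(k+1-d)),
    evaluated in logs for numerical stability. Valid for 0 < d < 0.5.
    """
    if not 0.0 < d < 0.5:
        raise ValueError("d must lie in (0, 0.5) for a stationary long-memory process")
    log_rho = (math.lgamma(1.0 - d) + math.lgamma(lag + d)
               - math.lgamma(d) - math.lgamma(lag + 1.0 - d))
    return math.exp(log_rho)


d = 0.4              # illustrative fractional differencing parameter (assumption)
phi = d / (1.0 - d)  # AR(1) coefficient chosen so both processes agree at lag 1

# Print lag, AR(1) autocorrelation and ARFIMA autocorrelation side by side.
for lag in (1, 5, 20, 50):
    print(lag, ar1_acf(phi, lag), arfima_acf(d, lag))
```

The AR(1) autocorrelation decays geometrically, while the ARFIMA autocorrelation decays at a hyperbolic rate, so the two processes can look alike at short horizons yet differ markedly at long ones. This is the sense in which a long-memory cash flow process reverts to its mean more slowly than an autoregressive specification allows.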
Bibliography
Amihud, Y., and Mendelson, H. (1987): Trading Mechanisms and Stock Returns: An Empirical Investigation, The Journal of Finance, 42(3), 533–553
Andersen, T., Bollerslev, T., Diebold, F., and Labys, P. (2003): Modeling and Forecasting Realized Volatility, Econometrica, 71(2), 579–625
Ané, T., and Geman, H. (2000): Order Flow, Transaction Clock, and Normality of Asset Returns, The Journal of Finance, 55(5), 2259–2284
Banco De Mexico (2009): Exchange Rate Regimes in Mexico since 1954, Mexico City: n.p.
Black, F., Jensen, M., and Scholes, M. (1972): The Capital Asset Pricing Model: Some Empirical Tests, Studies in the Theory of Capital Markets, New York: Praeger
Boudoukh, J., Michaely, R., Richardson, M., and Roberts, M. (2007): On the Importance of Measuring Payout Yield: Implications for Empirical Asset Pricing, The Journal of Finance, 62(2), 877–915
Brigo, D., and Mercurio, F. (2006): Interest Rate Models - Theory and Practice, n.p.: Springer
Britten-Jones, M., and Neuberger, A. (2000): Option Prices, Implied Price Processes, and Stochastic Volatility, The Journal of Finance, 55(2), 839–866
Campbell, J. Y., and Kyle, A. S. (1986): Smart Money, Noise Trading and Stock Price Behavior, Manuscript, n.p.: Princeton University
Carr, P., and Lee, R. (2009): Volatility Derivatives, Annual Review of Financial Economics, 1(1), 319–339
Cherubini, U., and Della Lunga, G. (2007): Structured Finance: The Object-Oriented Approach, n.p.: Wiley & Sons
Copeland, B. L. J. (1983): Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends? Comment, The American Economic Review, 73(1), 234–235
Deutsche Börse Group (2018): The DAX Index Universe, Frankfurt: Deutsche Börse AG, retrieved from https://www.dax-indices.com/documents/dax-indices/Documents/DAX%2030/The%20DAX%20index%20universe.pdf
Dombrecht, M., and Wouters, R. (1996): The Determination of Long-Term Interest Rates and Exchange Rates and The Role of Expectations, n.p.: Bank for International Settlements
Duffie, D., Pan, J., and Singleton, K. (2000): Transform Analysis and Asset Pricing for Affine Jump-Diffusions, Econometrica, 68(6), 1343–1376
Eatwell, J., Milgate, M., and Newman, P. (1992): The New Palgrave Dictionary of Money and Finance, n.p.: Springer
Fama, E. (1965): The Behavior of Stock-Market Prices, The Journal of Business, 38(1), 34–105
Flavin, M. (1983): Excess Volatility in the Financial Markets: A Reassessment of the Empirical Evidence, Journal of Political Economy, 91(6), 929–956
Focardi, S. M., and Fabozzi, F. J. (2004): The Mathematics of Financial Modeling and Investment Management, n.p.: Wiley & Sons
Friend, I., and Blume, M. E. (1970): Measurement of Portfolio Performance under Uncertainty, The American Economic Review, 60(4), 567–575
Fuller, W. A. (1976): Introduction to Statistical Time Series, New York: John Wiley & Sons
Fuster, A., Laibson, D., and Mendel, B. (2010): Natural Expectations and Macroeconomic Fluctuations, The Journal of Economic Perspectives, 24(4), 67–84
Giglio, S., and Kelly, B. (2017): Excess Volatility: Beyond Discount Rates, The Quarterly Journal of Economics, 133(1), 71–127
Giordano, L., and Siciliano, G. (2013): Real-World and Risk-Neutral Probabilities in the Regulation on the Transparency of Structured Products, Roma: Commissione Nazionale per le Società e la Borsa (CONSOB)
Granger, C. W. J., and Joyeux, R. (1980): An Introduction to Long-Memory Time Series Models and Fractional Differencing, Journal of Time Series Analysis, 1(1), 15–29
Guidolin, M., and Thornton, D. L. (2008): Predictions of Short-Term Rates and the Expectations Hypothesis of the Term Structure of Interest Rates, ECB Working Paper Series No. 977
Hamilton, J. D., and Wu, C. (2012): Identification and Estimation of Gaussian Affine Term Structure Models, Journal of Econometrics, 168(2), 315–331
Hayashi, F. (2000): Econometrics, n.p.: Princeton University Press, Section 1: 60–69
Hilpisch, Y. (2016): Listed Volatility and Variance Derivatives, 1st ed., n.p.: Wiley & Sons
Hou, Z., Filar, J. A., and Chen, A. (2013): Markov Processes and Controlled Markov Chains, n.p.: Springer
Houpis, C. S., Sheldon, S. N., and D’Azzo, J. J. (2003): Linear Control System Analysis and Design, 5th ed., n.p.: CRC Press
Huang, R. (1981): The Monetary Approach to Exchange Rate in an Efficient Foreign Exchange Market: Tests Based on Volatility, The Journal of Finance, 36(1), 31–41
Jiang, G. J., and Tian, Y. S. (2005): The Model-Free Implied Volatility and Its Information Content, Review of Financial Studies, 18(4), 1305–1342
Keynes, J. M. (1936): The General Theory of Employment, Interest and Money, London: Macmillan
LeRoy, S., and Porter, R. (1981): The Present-Value Relation: Tests Based on Implied Variance Bounds, Econometrica, 49(3), 555–574
Lewis, K. (1991): Was there a ‘Peso Problem’ in the U.S. Term Structure of Interest Rates: 1979–1982? International Economic Review, 32(1), 159–173
Longstaff, F. (1990): Time Varying Term Premia and Traditional Hypotheses about the Term Structure, The Journal of Finance, 45(4), 1307–1314
Malmendier, U., and Nagel, S. (2011): Depression Babies: Do Macroeconomic Experiences Affect Risk Taking? The Quarterly Journal of Economics, 126(1), 373–416
Marcaillou, P. N. (2016): Defined Benefit Pension Schemes in the UK, n.p.: Oxford University Press
Miller, M. H., and Modigliani, F. (1961): Dividend Policy, Growth and the Valuation of Shares, The Journal of Business, 34(4), 411–433
Morgan Stanley Investment Management (2018): Repurchase Agreements, retrieved from https://www.morganstanley.com/im/publication/insights/education/education_repurchaseagreementprimer_us.pdf
Oldfield, G., and Rogalski, R. (1980): A Theory of Common Stock Returns Over Trading and Non-Trading Periods, The Journal of Finance, 35(3), 729–751
Pang, K. C., Lewis, F. L., Lee, T. H., and Dong, Z. Y. (2011): Intelligent Diagnosis and Prognosis of Industrial Networked Systems, n.p.: CRC Press
Ramanathan, R. (1995): Introductory Econometrics with Applications, n.p.: Dryden Press
Rogoff, K. (1977): Rational Expectations in the Foreign Exchange Market Revisited, unpublished manuscript, Massachusetts Institute of Technology
Shiller, R. (1979): The Volatility of Long-Term Interest Rates and Expectations Models of the Term Structure, Journal of Political Economy, 87(6), 1190–1219
Shiller, R. J. (1980): Do Stock Prices Move Too Much to Be Justified by Subsequent Changes in Dividends? The American Economic Review, 71(3) (a)
Shiller, R. J. (1981): The Use of Volatility Measures in Assessing Market Efficiency, The Journal of Finance, 36(2), 291–304 (b)
Shiller, R. J. (1983): Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends? Reply, The American Economic Review, 73(1), 236–237
Singleton, K. (1980): Expectations Models of the Term Structure and Implied Variance Bounds, Journal of Political Economy, 88(6), 1159–1176
Tortorice, D. (2012): Unemployment Expectations and the Business Cycle, The B.E. Journal of Macroeconomics, 12(1), 1–47
Vissing-Jørgensen, A. (2003): Perspectives on Behavioral Finance: Does ‘Irrationality’ Disappear with Wealth? Evidence from Expectations and Actions, NBER Macroeconomics Annual, 18(1), 139–194
West, K. (1988): Bubbles, Fads and Stock Price Volatility Tests: A Partial Evaluation, The Journal of Finance, 43(3), 639–656
Wooldridge, J. M. (2012): Introductory Econometrics: A Modern Approach, 5th ed., n.p.: South-Western CENGAGE Learning
Wystup, U. (2017): FX Options and Structured Products, 2nd ed., n.p.: Wiley & Sons