English Pages 90 [91] Year 2023
Springer Texts in Business and Economics
Alex Backwell
An Intuitive Introduction to Finance and Derivatives Concepts, Terminology and Models
Springer Texts in Business and Economics
Springer Texts in Business and Economics (STBE) delivers high-quality instructional content for undergraduates and graduates in all areas of Business/Management Science and Economics. The series comprises self-contained books with broad and comprehensive coverage that are suitable for class as well as for individual self-study. All texts are authored by established experts in their fields and offer a solid methodological background, often accompanied by problems and exercises.
Alex Backwell University of Cape Town Cape Town, South Africa
ISSN 2192-4333 ISSN 2192-4341 (electronic) Springer Texts in Business and Economics ISBN 978-3-031-23452-1 ISBN 978-3-031-23453-8 (eBook) https://doi.org/10.1007/978-3-031-23453-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The text that follows was written to serve as the basis for the University of Cape Town course Introduction to Finance and Derivatives (part of the MPhil in Mathematical Finance). The purpose is to give an introduction to the fundamentals of finance that are most relevant to quantitative finance. Many students enter master’s degrees in mathematical finance, financial engineering, etc., or enter financial services, without a background in finance—this text is designed to establish context and knowledge for such a person. Students with a finance background will also benefit from the text, especially through seeing familiar finance concepts in a relevant and unified context. You are advised to pay close attention to the structure of this text (see the table of contents). Chapter 1 gives a broad introduction to finance, particularly academic and theoretical finance, highlighting relevant aspects. Chapters 2, 3, 4 and 5 develop terminology, concepts and theories that are fundamental to finance. Derivatives are then considered in Chaps. 6, 7 and 8. If you are studying derivative pricing, you will likely study derivatives in technical detail elsewhere; these chapters give a basic introduction, some context and background (using the concepts from the earlier chapters), and a brief overview of derivative and arbitrage theory. Cape Town, South Africa
Alex Backwell
Contents
1 Preliminaries
1.1 What Is Finance?
1.2 Returns and Interest Rates
References
2 Risk and Expected Utility
2.1 Risk Measures
2.2 Utility and the Expected Utility Hypothesis
References
3 Market Pricing and Market Efficiency
3.1 The Efficient-Market Hypothesis
3.2 Implications
References
4 Modern Portfolio Theory
References
5 Asset Pricing
5.1 The CAPM
5.2 Factor Models
References
6 Introduction to Derivatives
6.1 Forward Contracts
6.2 Options
References
7 Arbitrage- and Model-Free Pricing Methods
7.1 Arbitrage
7.2 Pricing and Hedging Forwards
7.3 Model-Free Option Analysis
References
8 Modelling, Pricing, and Hedging
8.1 The One-Period Binomial Model
8.2 The Black–Scholes–Merton Model
8.3 Beyond Black–Scholes–Merton
References
1 Preliminaries

Abstract
A working understanding of the first principles of finance is developed. This includes ideas such as investing, assets and liabilities, financial systems and institutions, various financial (sub-)markets, corporate finance versus investment finance, and others. The calculation of returns is addressed in detail.

Keywords
Investments · Financial markets · Financial system · Investment returns

1.1 What Is Finance?
Finance is about money. The word ‘finance’ shares roots with words such as ‘finish’ and ‘finalise’, because, in order to raise money, one usually has to agree to return it somehow, finishing or finalising the arrangement. Finance is about such arrangements. If you have money, you can spend it; this has to do with money, of course, but is not really financial. You could invest your money, meaning that you forgo spending it, and rather do something with it that is likely to give you more money in the future. This is definitely a financial action. Your investment can be described as an asset, something from which future benefits are expected, whereas the party that has taken your money now has a liability, an obligation to pass future benefits to you. Finance is about money, but specifically about assets and liabilities—investing money in assets, raising money by creating liabilities—and managing them over time. A business might make money from selling a certain product and probably needs to spend money to make this happen, but this is known as the business’s
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. Backwell, An Intuitive Introduction to Finance and Derivatives, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-23453-8_1
operations, not its finances. The finances are about how it gets the money to operate in the first place and how it rewards whoever puts up that money (or capital). Finance, especially in this text, can refer to the study of these activities (raising and investing money), rather than the activities themselves. Finance, in this academic sense, seeks to understand these activities and the overall system in which they occur, often with some degree of abstraction. The specific ways in which people and organisations raise and invest money may or may not be interesting or important to know, but the principles or concepts behind these specifics are certainly of interest to academic finance. Finance is built upon economics, a much older academic field. Financial economics refers, simply, to the application of economic ideas and methods of analysis (supply and demand, incentives and rational decision making, resource allocation, market equilibrium, etc.) to financial activity. To a degree, finance has separated from its parent, and finance is often studied in a department of its own. There is no fundamental distinction between financial economics and (academic) finance, although the former has the connotation of being more rooted in economic theory, and the latter of being more applied and practical. The content of the earlier chapters of this text was developed by economists, is financial economic in nature, and is taught out of both economic and finance departments. The study of derivatives, covered in the later chapters, burgeoned in the wake of the famous work of Black, Scholes and Merton in the early 1970s. This has led to a rich theory and (sub)field in its own right, usually taught out of finance or mathematics departments. 
Names applied to this field include mathematical finance, quantitative finance, stochastic finance, financial engineering, arbitrage theory, and others (each with their own connotations; e.g., financial engineering tends to emphasise application more than mathematical finance, which has a more theoretical connotation).
Finance can roughly be divided into two broad areas. The first is the perspective of capital providers: the people and organisations with money to invest. The second is the perspective of the users of capital; individuals, households, businesses and governments often require financing. At a very high level, finance is about connecting the supply from the first group to the demand of the second. The financial system (more on this below) actually does the connecting; academic finance tries to understand how it works (why is it done this way? Could it work better a different way? What would cause the system to break down?). We will be much more concerned with the first perspective (which can be called investment management, including topics such as asset pricing, asset allocation, and risk) than the second, which largely falls under the banner of corporate finance (personal finance or public finance would emphasise that the demander of money is, respectively, an individual or governmental body rather than a business). The major topics of corporate finance are nevertheless worth briefly noting:
• Capital structure, which refers to how a business is financed. Businesses are, in short, financed with a combination of equity (their owners provide the financing) and debt (they find lenders to provide financing), and there is a large field of study
about what sorts of combinations are advantageous in differing circumstances and about how to implement and maintain equity and debt.
• Dividend (or payout) policy, referring to what businesses do with profits from their operations. Essentially, they must decide between reinvesting profits (with the aim of giving their owners a larger, more valuable investment) or returning profits to their owners. Investors may prefer certain dividend decisions and policies and thus be more willing to invest in certain businesses.
• Valuation (or appraisal), which informs how a business conducts its operations in the future. A firm may have the money to open a new store or expand an existing project and would need to choose if this is worthwhile (if not, they should consider paying a dividend instead, so this is related to the dividend policy). Sometimes there are multiple possibilities to compare (opening a store in location A or location B; opening a new store or upgrading equipment), and businesses need to appraise these options and opt for the one that delivers the most value.
The term financial system refers to the ways in which potential investors are connected to those who wish to accept their money. Such a system can vary greatly in how formal, regulated, centralised, transparent, efficient, etc., it is. The term financial markets is an abstraction referring to where these connections (usually called trades or exchanges, with one party receiving money and the other receiving some commitment to receive future money) occur. It need not be a single, physical place; a whole financial system could consist of individuals lending and borrowing money informally to each other (then the financial market is simply the network of potential borrowers and lenders at any given time, and how they relate to one another, make their offers, etc.).
In a modern economy, financial institutions are an essential part of the financial system; these are businesses that facilitate and intermediate trades in the financial markets. Important examples are banks (who connect lenders to borrowers, and who generally facilitate payments for trades), investment banks (who provide assistance with complex financial trades, such as issuing new equity or debt, or merging two businesses), brokers, dealers and market makers (who are in the business of facilitating financial trades; in practice, investment banks often offer these services), pension funds, insurers, and investment funds of many kinds (who have money given to them upfront, which they invest in order to fulfil future obligations).1 So a financial system, or even the whole global financial system, is comprised of financial markets, financial institutions, and the services they provide to facilitate trades in the markets, and perhaps also the norms,
1
This latter group—pension funds, insurers, investment funds; the institutions in control of investment capital—is known as the ‘buy-side’ of the financial system; they have the money and are looking to invest it. The ‘sell-side’ refers to the institutions and activities that cater to the buy-side: investment banks (who sell and promote investments), the brokers, dealers and market makers (who the buy-side approaches to make investments), and also investment advisory and research services sold (or given as a perk) to the buy-side.
types of trades, infrastructure, etc., that allow or assist in the continuation of trading and the resultant assets and liabilities. It is useful to distinguish different types of financial markets, which result from the different ways that money can be invested. Following the above discussion of capital structure, the two main ways to invest money are to: (i) buy a stake in a business (in other words, to purchase its equity), which is done in the equity market (or stock market); or (ii) lend to a business, which is done in the bond market. Equity and debt are often issued in the form of a security, meaning that it is suitable for the original purchaser to resell it (securities must therefore conform to some standards, rather than being completely ad hoc arrangements). Related to this, there is an important difference between primary and secondary trades and markets. Primary here refers to when securities or other investments are originally issued; secondary to subsequent trading that does not involve, or even concern, the issuer. The equity market and bond market are together called the capital market, as they are the main means for businesses to raise the capital they need to finance their operations. Recall the somewhat abstract idea of a market, and do not confuse this with an exchange, which is a specific, organised (sub-)market, managed by a particular institution. The Johannesburg Stock Exchange is a major part of the South African equity market, but it is not the whole thing. The money market refers to the market for short-term debt/loans. A common heuristic is to consider loans with terms of less than one year to belong to the money market (longer-term loans, if in the form of a security, trade in the bond market; if not, in the less formal long-term debt market between borrowers and banks). 
If you deposit money in a bank, you are participating in the money market: a deposit is essentially a loan you are extending to your bank, with a term of one day, with the understanding that the loan is renewed each day it is not withdrawn. You and everyone else looking to deposit money are on one side of this market; on the other are the various banks trying to attract deposits with good interest rates, service, low fees, etc. There are other types of money market transactions, such as term deposits (where the deposit cannot be withdrawn for a certain period) and much more standardised money market securities, such as certificates of deposit and Treasury bills. Other financial markets include currency markets and derivative markets. Derivatives are contracts that derive their value from the value of an underlying asset and are thus separate, and separately traded, from the underlying assets themselves. Note also that money can be invested in the commodities markets (where precious metals, oil and gas, unprocessed agricultural products, and other goods with a high degree of fungibility are traded), and although they are not strictly financial markets in our sense, they are often closely related.
1.2 Returns and Interest Rates
Before proceeding, we should establish an understanding of the notion of return, which refers to the profit made on an investment. The goal of investing is to make a lot of profit, or return, and so it is essential to be able to describe the return of an investment clearly and consistently. Without a suitable way to describe investment profits, all further methods of understanding investments will be compromised. To begin, let us suppose that you invest x dollars and receive y back. Perhaps you lend your money to a friend and they pay you back, or perhaps you buy and then sell a share (there need not be an actual cash flow; you could keep the share, but know that you could sell it for y). There are several ways (often called conventions) to describe the profit/loss made (note that this is a preliminary list that we will improve on shortly):
• \(y - x\) is called the return, or the raw return; this is simply the profit/loss incurred.
• \((y - x)/x\) can be called the rate of return or the return on investment, because it expresses the profit/loss relative to the initial investment. It is extremely useful to express the return this way because it describes the return in a scale-free way; i.e., in a way that does not depend on the size of the investment. A raw return of $10 may be a fantastic return (if, e.g., the initial investment was $2) or a poor return (if, e.g., $10,000 were invested) and is thus not a very informative number on its own. Removing the scale of the investment in this way is so useful that rate of return is very often shortened to return.
• \(\log(y/x)\) is called the log return; this is a slightly different way to remove the scale of the investment, the logic of which will become clear below. Note that log returns are similar in magnitude to rates of return, and in fact the two converge as the raw return becomes small. Note also that this is a natural logarithm, by far the dominant logarithm type used in finance.
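The three conventions in the list above can be sketched in a few lines of Python. This is only an illustration: the function names are our own, and the figures are the $10 examples from the text plus one made-up small return.

```python
import math

def raw_return(x, y):
    """Profit or loss in dollars: y - x."""
    return y - x

def rate_of_return(x, y):
    """Profit or loss per dollar invested: (y - x) / x."""
    return (y - x) / x

def log_return(x, y):
    """Natural logarithm of the growth factor y / x."""
    return math.log(y / x)

# The same raw return of $10 on two very different investment sizes.
print(raw_return(2, 12), rate_of_return(2, 12))
print(raw_return(10_000, 10_010), rate_of_return(10_000, 10_010))

# For a small raw return, the rate of return and the log return nearly agree.
print(rate_of_return(100, 101), log_return(100, 101))
```

The raw return is identical in the first two cases, while the rate of return differs by several orders of magnitude, which is why the scale-free versions are the ones usually quoted.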
As noted above, it is difficult to assess a certain raw return, because the assessment would depend on the size of the investment. Importantly, it would also depend on the length of the investment period. For instance, a (rate of) return of 30%, achieved over a year, would usually be considered excellent. But if $100 were invested for retirement, for say 35 years, and only returned $130, this would be considered very poor. We need to improve our terminology accordingly. Supposing that the investment of x dollars turns into y dollars in T years:
• \((y - x)/(Tx)\) is called the simple rate of return, or simple return. Now we are expressing the profit per unit of original investment and per unit of time. This is what the word rate is all about. But again, adjusting the profit for the investment size and period is so convenient and common that the ‘rate of’ is often suppressed.
This can be confusing, but understanding the logic of all of these definitions will provide the intuition to resolve most ambiguities:
◦ One can of course rearrange \(r_s = (y - x)/(Tx)\) to \(y = x(1 + r_s T)\) or \(x = (1 + r_s T)^{-1} y\). We have established a relationship between three quantities (initial investment, end investment, and return). Any two can imply the other. Instead of calculating the return (which is useful for measuring or comparing investment performance), we can calculate the initial value for a given return (to, e.g., determine the price of an investment one would be willing to pay, if one requires a certain return), or we can forecast the future value of an asset, given its current value and a rate of return it might exhibit.
• \((y/x)^{1/T} - 1\) is called the compound (rate of) return, which is a different way of accounting for the investment length. The logic becomes clear when we rearrange as \(y = x(1 + r_c)^T\). If one dollar is invested for a 10% simple return for two years, you get $1.20 back. If the return is 10% compounded, one gets \((1.1)^2 = 1.21\) dollars back. In both cases, the investment is growing at 10% in some sense, but in the simple case, the growth only applies to the original amount, not the growth from earlier years; in other words, the growth/return is not compounded.
• \(\log(y/x)/T\) is called the log return or the continuous return (or continuously compounded return). Rearranging, we get \(y = x e^{r_{\mathrm{cont}} T}\). The following (nontrivial) mathematical fact should explain the logic of this type of return:
\[ \lim_{n \to \infty} (1 + r/n)^{Tn} = e^{rT}. \]
In the compounded case, the return was applied (or compounded) each year (the example above started at 1, became 1.1 and then \((1.1)^2\)), instead of just once as in the simple case. The above limit shows you what happens when the compounding frequency (denoted above by n) increases; at the limit, when compounding continuously, the natural logarithm and e appear. We will shortly discuss compounding frequency further. Note the use of years as the unit of time. In principle, time units can be defined in any way; T = 1 could refer to a one-month or a one-day or a one-century investment period. It is standard to work in terms of years, but note the intuition of working in another unit: if, for instance, you achieve a compound return of 15% per month, this is a staggering increase in value, multiplying money by a factor of more than 150 in three years. Years is the default, though, so if you hear of a ‘return of X%’, you can safely assume this is annual in nature. If the type of return is not specified, you should assume that the return is compounded (in academia, continuous/log returns
can be the default, but academics tend to specify if the context does not make it clear).2 Another issue is that one can allow the compounding frequency to differ from the time unit. Above, the compounding occurs annually, or continuously. One can, however, compound more or less often than annually. A return of 12% per annum, compounded monthly, implies, using the above notation, that
\[ y = x\left(1 + \frac{0.12}{12}\right)^{12T} = x(1.01)^{12T}. \]
This would be equivalent to an annual compound return greater than 12% (why? Because, like the move from simple to compound returns, growth is applied to earlier growth more often; what annual compound return?). Compounding on a daily basis gives a good approximation to continuous compounding.
Yet another issue is the idea of discount rates. Instead of having \(y = x(1 + rT)\), a different convention uses \(x = y(1 - dT)\) (for compounded instead of simple rates, \(x = y(1 - d)^T\) instead of \(y = x(1 + r)^T\)), with the resultant return/interest rate known as a discount rate. What can be confusing is that people still describe \(x = y(1 + r)^{-T}\) as the discounted version of the future cash flow y (taking place in T years). It is common to say, for instance, ‘discount that cash flow at a rate of 5% continuous’, with no reference to these discount rates we have just introduced. So unless you have specific reason to believe you are dealing with a discount rate, stick to the previous conventions; discount rates are quite unusual and are not necessarily being invoked when there is general talk of discounting.3
An interest rate is simply the promised return to the lender/investor in a loan (loans and bonds are unlike equity investments, in that the future cash flows are specified; note also that the borrower may default on the loan, and the lender not get the return that was agreed), so all of Sect. 1.2 can be thought of as a description of interest-rate conventions. Bonds have fixed future payments. 
For a given bond, the market then haggles over the current price (or, equivalently, the corresponding interest rate) they are willing to bear to receive the future payment; in the above notation, for a given y, the market will determine an x (which implies an interest rate/return, given some convention). Interest rates (the returns promised on an underlying loan) tend to vary with the term of the loan; in other words, interest rates have a term structure. The term structure of interest rates is often plotted visually, usually giving rise to a regular, smooth curve known as the yield curve.
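As a rough numerical companion to the conventions of this section, the sketch below computes the simple, compound and continuous returns for the retirement example ($100 growing to $130 over 35 years), the annual compound return equivalent to 12% per annum compounded monthly, and a price under a simple discount rate. The function names and the bill used at the end (face value 100, 5% discount rate, three months) are illustrative assumptions, not taken from the text.

```python
import math

def simple_return(x, y, T):
    """r_s such that y = x * (1 + r_s * T)."""
    return (y - x) / (T * x)

def compound_return(x, y, T):
    """r_c such that y = x * (1 + r_c) ** T."""
    return (y / x) ** (1 / T) - 1

def continuous_return(x, y, T):
    """r_cont such that y = x * exp(r_cont * T)."""
    return math.log(y / x) / T

def effective_annual_return(nominal, n):
    """Annual compound return equivalent to a nominal rate compounded n times a year."""
    return (1 + nominal / n) ** n - 1

def discount_price(y, d, T):
    """Price today of a payment y due in T years under a simple discount rate d."""
    return y * (1 - d * T)

x, y, T = 100.0, 130.0, 35.0
print(simple_return(x, y, T), compound_return(x, y, T), continuous_return(x, y, T))

# 12% per annum compounded monthly, daily, and continuously.
print(effective_annual_return(0.12, 12))
print(effective_annual_return(0.12, 365), math.exp(0.12) - 1)

# Hypothetical bill: face value 100, 5% simple discount rate, three months to maturity.
print(discount_price(100.0, 0.05, 0.25))
```

All three returns for the 35-year example come out below 1% per year, which makes the poor performance visible in a way that the 30% total does not. The daily and continuous figures agree closely, in line with the remark that daily compounding approximates continuous compounding.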
2
In the absence of information, specified or contextual, compounded returns and interest rates are a good default assumption, but note how the ambiguity can be abused. Someone marketing their investment performance over the last ten years may be tempted to use simple returns. Someone offering a loan may not mention that the interest is compounded continuously (although there is often consumer protection regulation preventing this).
3
The only reason we include the specific notion of discount rates is that Treasury bills (a crucial instrument in short-term debt markets) tend to be quoted in terms of (simple) discount rates. This can be very convenient, as \(y(1 - dT)\) is easier to calculate than, say, \(y(1 + r)^{-T}\).
Finally, note that the above does not treat investments and loans with multiple cash flows: we used y to denote the end value of the investment, perhaps the value of a share or the amount repaid by a borrower. Many investments deliver value in a way that is distributed over time; you may receive dividends before you sell a share, or a loan may involve multiple repayments. In these cases, the investment’s return must describe how the investment value changes between cash flow dates. Suppose you pay x now for an investment that provides \(y_1\) after a year and \(y_2\) after two years; the compound return, for instance, is then the number r that satisfies
\[ x = y_1 (1 + r)^{-1} + y_2 (1 + r)^{-2}. \]
Here, the initial amount x grows, at a rate of r per year, into two separate cash flows; the first only after one year, with one year of growth reflected above, and the second with two years of growth. This issue of multiple cash flows is important for loan/bond calculations, but we will often consider investments over a one-year time horizon, which will make it relatively easy to incorporate any income into the final value amount and simplify the return calculation.
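With more than one cash flow, the compound return generally has no closed form and is found numerically. The sketch below uses bisection, which is our choice of method rather than anything prescribed in the text; it assumes all future cash flows are positive, so that the present value falls as r rises. The function names, the search bracket and the figures (pay 100, receive 10 after one year and 110 after two) are illustrative assumptions.

```python
def present_value(cash_flows, r):
    """Discount a list of (time_in_years, amount) pairs at compound rate r."""
    return sum(amount * (1 + r) ** (-t) for t, amount in cash_flows)

def solve_compound_return(x, cash_flows, lo=-0.99, hi=10.0, tol=1e-12):
    """Find r with x = sum of y_i * (1 + r) ** (-t_i) by bisection.

    Assumes every y_i > 0 (present value is then decreasing in r)
    and that the solution lies between lo and hi.
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if present_value(cash_flows, mid) > x:
            lo = mid  # present value too high, so the rate must be larger
        else:
            hi = mid  # present value too low (or equal), so the rate must be smaller
        if hi - lo < tol:
            break
    return (lo + hi) / 2

# Hypothetical investment: pay 100 now, receive 10 after one year and 110 after two.
flows = [(1, 10.0), (2, 110.0)]
r = solve_compound_return(100.0, flows)
print(r, present_value(flows, r))
```

In this example the solution is 10% per year, since 10/1.1 + 110/1.21 = 100; the printed present value should reproduce the price paid.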
Literature Notes While the material in this chapter is quite general, one could refer to Cvitanic and Zapatero (2004, Ch.1 & Ch.2) or Keown et al. (2008, Ch.2) for similar information and further details. For details about corporate finance, see, e.g., Ross et al. (2015), or the later chapters of Keown et al. (2008).
References
Cvitanic, J., & Zapatero, F. (2004). Introduction to the economics and mathematics of financial markets. MIT Press.
Keown, A., Martin, J., & Petty, J. (2008). Foundations of finance: The logic and practice of financial management. Pearson Prentice Hall.
Ross, S., Jaffe, J., & Westerfield, R. (2015). Corporate finance. McGraw-Hill Education.
2 Risk and Expected Utility

Abstract
Having discussed return, we then turn to the fundamental notion of risk. The general idea is introduced, and then various risk measures are described. Utility theory, a particular method of measuring and incorporating investment risk, is considered.

Keywords
Investment risk · Risk preferences · Risk aversion · Risk measurement · Value at risk · Utility theory
We have seen in Chap. 1 how we can, and often should, describe the performance of an investment in terms of its return. You can only calculate the return retrospectively, though; when considering an investment, you do not know the return that it will deliver. This, in short, is what financial risk is: the possibility that your investments provide poor, potentially negative, returns. Risk is, in other words, simply the fact that many different things can happen in the future, some worse than others (financial risk emphasises that the uncertainty is about the state of assets and liabilities). Let us first consider a special case. Certain investments offer a fixed, known return. (US) Treasury bills, for instance, are a promise of the US government to pay a certain, pre-specified amount at a set date in the coming year (they are thus money market instruments). The chances of the obligation not being fulfilled are extremely small (why?). The current price of a ‘T-bill’ is known, and their future value is known, and so their return is known in advance. They can therefore be said to be a risk-free investment.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. Backwell, An Intuitive Introduction to Finance and Derivatives, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-23453-8_2
There are a few notes to be made here. First, the assertion that T-bills are free of risk needs some caveats. We discuss below how risk is a slippery concept, and, strictly speaking, we should say that T-bills are risk-free in a certain (and important) sense. There is no real uncertainty about the money that will be received, but more subtle uncertainties remain. The rate of inflation during the life of the T-bill is not known, so the fixed amount of money may be less valuable than originally thought. There is also the uncertainty of how the T-bill will suit your investment needs; there were other investment opportunities, and these may have served you better (or perhaps served your competitors better).
Second, note that the return offered by T-bills and other risk-free investments (with the caveats above) is in practice non-zero.¹ This gives rise to a concept called the time value of money: money now is better, all else equal, than money later (make sure you can see how this follows from a positive T-bill return!).² The return offered by risk-free investments is known as the risk-free rate and gives an important benchmark for assessing other risky investments.
Finally, consider an instrument similar to a T-bill (a promise to repay an amount in the future, a loan or a bond in other words) but issued by a less stable organisation than a large government. Common sense tells us that there can be a non-negligible probability that the promise is not fulfilled; the issuer may go out of business, may be unable to pay at the promised date, may not be able to pay in full, etc. Such an instrument is clearly not risk-free; it is said to exhibit credit risk (or default risk). All else equal, this is a less desirable investment than a T-bill. In Chap. 3, we will discuss in detail how forces of supply and demand tend to account for investment risks in the prices that prevail on the market; for now, you should be able to see that a risky loan will have a lower initial value than a risk-free one (assuming the promised future payments are equal), and therefore that the risky loan will offer a higher return, if the issuer fulfils their obligation.
Outside of the special case of a loan to a low-risk borrower, an investment’s return is not known in advance. The return of a generic investment should therefore be treated as a random variable, a quantity that is not known in advance, but about which we can discuss the various possibilities and their associated probabilities. Once an investment has been held for some period, the return is realised, and we can use such realisations to make inferences about the behaviour/distribution of returns. We shall see that, even though investment returns are random, in the sense of not being predictable in advance, their behaviour is not completely haphazard and does exhibit a number of interesting properties. We first consider simpler questions: for a
¹ Inflation is one intuitive reason that lenders may not be willing to lend at zero interest (with the purchasing power of the lent funds declining over time). In practice, however, interest rates (especially short-term ones) are not determined in a fully free-market supply-and-demand manner: monetary authorities tend to control interest rates, by intervening in the lending–borrowing market (often with the goal of controlling inflation).
² ‘All else equal’, more fully ‘all other things being equal’, is a simple but important thinking tool; make sure you understand it. It is common in Economics literature, often given in the Latin ceteris paribus.
given random variable that represents a return of an investment, how can we evaluate the investment? How can we quantify its riskiness? How should we compare it to an alternative investment with another return distribution? That is, before trying to infer details about typical return distributions, or addressing the fact that the returns of available investments are endogenously determined by the market, we start by taking returns to be exogenous. In Sect. 2.1, we discuss the properties of a random variable, representing a return, that admit a financially relevant interpretation. Then, in Sect. 2.2, we present a specific framework that evaluates prospective investments quantitatively.
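The comparison made earlier in this introduction, between a risk-free loan and a risky loan promising the same future payment, can be sketched numerically. The two prices below are hypothetical; the only assumption carried over from the text is that the risky loan trades at the lower price.

```python
def promised_return(price_now: float, promised_payment: float) -> float:
    """Return earned if the promised payment is made in full."""
    return promised_payment / price_now - 1.0


promised_payment = 100.0   # same promised amount for both loans
risk_free_price = 95.0     # hypothetical price of the risk-free loan
risky_price = 90.0         # hypothetical (lower) price of the risky loan

risk_free_return = promised_return(risk_free_price, promised_payment)
risky_promised_return = promised_return(risky_price, promised_payment)

print(f"Risk-free return:            {risk_free_return:.2%}")
print(f"Risky loan, if paid in full: {risky_promised_return:.2%}")
```

The higher figure for the risky loan is earned only if the issuer fulfils the obligation; it is a promised return, not a guaranteed one.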
2.1 Risk Measures
Let a random variable X represent the future value of a certain investment. We shall describe investments in these terms (rather than with their returns) in this section, although we could use the return corresponding to/implied by the future value. What is it about X that should concern us as a potential investor? Consider first one of the most basic properties of a random variable: the mean, or expectation, denoted E[X]. As investors, we would like this to be large (this indicates that X tends to be large, and of course, an investment with a large future value, or a large return, is desirable). This piece of common sense can be formalised with the term non-satiation, the simple principle that investors prefer more to less and are not satiated/satisfied at any particular level of wealth/return.
But the expectation is far from the whole story. Let us compare two investments: the first yields a constant amount 50, i.e., it is a risk-free investment; the second yields a random amount X, which has a 50% chance of equalling zero and a 50% chance of equalling 100. Both investments have the same expected value, but clearly they are not the same. One is risky and one is not. How much should we be willing to pay to enter these investments? For the first, the answer should be clear: it is worth 50, modulo any time-value-of-money adjustments (that is, if the 50 is paid in a year from now, that amount should be discounted by a suitable interest rate to account for the delay). The answer for the second investment is less clear. The discounted expected value (which is equal to that of the first investment) is not a bad starting point, as it does capture some important aspects of the investment. But it does not capture the variability, the fact that you do not know what return will be realised; in short, the investment’s risk.³ A fundamental statistical measure of variability is standard deviation (or,
³ The rest of this section, and this text, will discuss the importance of risk, but let us briefly reemphasise that a discounted expected value valuation method is a great deal better than nothing and does at the very least give a good first approximation of value. Discounting estimated or expected cash flows is probably the dominant method of valuation in a corporate finance context (recall the summary of corporate finance themes in Chap. 1). The idea is known as DCF valuation (discounted cash flows) or NPV (net present value). Often, an adjustment to the discounting rate is made to account for variability in the cash flows; e.g., instead of discounting a constant 50 at the risk-free
equivalently, variance; make sure you know these definitions). Something noteworthy about standard deviation is that it captures both upside and downside variation. This can be viewed as a disadvantage of using standard deviation as a measure of risk, and we will shortly introduce some alternatives. However, recognising that different investments differ in their variability/risk, and using the standard statistical notion of standard deviation to capture this risk, was an important breakthrough made by Harry Markowitz in the 1950s. In Chap. 4, we will study the theory he developed based on characterising investment risk with standard deviation. So the standard deviation of an investment’s future value (or its return, or the profit it delivers) is a way to measure the risk, but it is an incomplete and imperfect summary of the riskiness in the investment. Why? Because information is lost when a complicated thing like a distribution is compressed into a simple thing like a single number (after all, different distributions can have the same standard deviation). An important general principle in finance is that risk is a very general and contextual concept, very difficult to pin down in a single metric or even a single qualitative definition. The field of risk measurement is about developing measures of risk that are informative, readily interpretable, and appropriate for the circumstances (and, although we do not tackle it here, estimating the distributions of the relevant random variables so that these measures can be well estimated). The field of (financial) risk management is about responding to risk measurements over time in a way that is best suited to achieve the relevant goals. Let us take note of some further ways that the risk in an investment can be summarised quantitatively. 
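The two investments compared above (a constant 50 versus an even chance of 0 or 100) make a convenient check on these definitions. The sketch below computes the expectation and standard deviation of each; the discounting step assumes payment in one year and uses 5% and 7% purely as illustrative rates.

```python
import math


def expectation(outcomes, probs):
    """Probability-weighted average of the outcomes."""
    return sum(p * x for x, p in zip(outcomes, probs))


def std_dev(outcomes, probs):
    """Square root of the probability-weighted squared deviation from the mean."""
    mu = expectation(outcomes, probs)
    return math.sqrt(sum(p * (x - mu) ** 2 for x, p in zip(outcomes, probs)))


def discounted(value, rate):
    """Present value of an amount received in one year (assumed horizon)."""
    return value / (1.0 + rate)


constant = ([50.0], [1.0])             # risk-free: always 50
toss_up = ([0.0, 100.0], [0.5, 0.5])   # risky: 0 or 100 with equal probability

for label, (outcomes, probs) in (("constant 50", constant), ("0 or 100", toss_up)):
    print(label, "mean:", expectation(outcomes, probs), "std:", std_dev(outcomes, probs))

# Same expected cash flow, two illustrative discount rates.
pv_at_5 = discounted(expectation(*constant), 0.05)
pv_at_7 = discounted(expectation(*toss_up), 0.07)
print(f"PV at 5%: {pv_at_5:.2f}, PV at 7%: {pv_at_7:.2f}")
```

Both investments have a mean of 50, but only the second has a non-zero standard deviation (it equals 50), which is the single-number summary of variability discussed above.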
To focus specifically on the downside of an investment, one can consider the semi-variance, which is defined as E[(X − μ)² | X < μ], where μ = E[X] and E[·|A] denotes expectation conditional on event A occurring. Another measure that focusses on the downside would be the probability of falling short of some level: P[X < x], where the level in question is denoted x. All else equal, a lower shortfall probability is of course better. A slight variation on this idea gives a very popular risk measure known as the value at risk, often shortened to VaR, which is simply a certain left quantile of the distribution in question (a quantile is a specific value for a random variable that divides the range of possible values according to the probability
interest rate of 5%, we will discount the cash flow with an expectation of 50 at 7%. This does capture and account for the variability in the position, but the issue of how much to adjust the rate remains open.
corresponding to the sub-ranges). If X has a continuous distribution, then, for a certain (small) probability α, the VaR is the number that satisfies⁴ α = P[X < VaR]. The probability of the investment under-performing relative to VaR is, by definition, α (which one chooses upfront; it would be common, for instance, to consider both the 1% VaR and the 5% VaR). Finally, the expected shortfall, also known as the conditional value at risk (CVaR), is given as the conditional expectation within the left tail of the distribution. For a continuous distribution, this is E[X | X < VaR], where VaR is calculated for a certain probability.
Exercise: Explain why the expected shortfall is less than the VaR (assuming the same probability is used in each).
It is not that any particular risk measure is better than any other. It is that they answer slightly different questions about the riskiness of the position. The variance tells you about the average squared deviation from the mean. The VaR answers the question ‘what is the outcome that we have a 100(1 − α)% chance of exceeding?’ or ‘supposing we are in the worst 100α% of the possible future states, what is the best case scenario?’. Some questions are more relevant to a certain situation, and sometimes many questions require answering for a decision to be made. Note that the risk measures above could be applied readily to returns (e.g., variance or semi-variance of return), although the terms VaR and expected shortfall do tend to imply that a monetary value (e.g., rands or dollars) is under discussion (one could absolutely calculate and use the quantile of a return distribution, but it should not be described as VaR).
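To make the downside measures of this section concrete, the sketch below evaluates them for one assumed distribution. The normal distribution, its mean of 100 and standard deviation of 20, the level x = 80, and α = 5% are all illustrative choices rather than anything prescribed by the text; the closed-form expression for the expected shortfall is specific to the normal case.

```python
from statistics import NormalDist

# Assumed future value X: normal with mean 100 and standard deviation 20.
X = NormalDist(mu=100.0, sigma=20.0)
standard_normal = NormalDist()

alpha = 0.05    # tail probability chosen upfront
level = 80.0    # shortfall level x

# Shortfall probability P[X < x].
shortfall_prob = X.cdf(level)

# Value at risk: the left quantile satisfying alpha = P[X < VaR].
var_alpha = X.inv_cdf(alpha)

# Expected shortfall E[X | X < VaR]; for a normal distribution this equals
# mean - stdev * pdf(z) / alpha, where z is the standard normal alpha-quantile.
z = standard_normal.inv_cdf(alpha)
expected_shortfall = X.mean - X.stdev * standard_normal.pdf(z) / alpha

# Semi-variance E[(X - mu)^2 | X < mu]; by symmetry of the normal
# distribution, this conditional expectation equals the variance.
semi_variance = X.variance

print(f"P[X < {level}] = {shortfall_prob:.4f}")
print(f"{alpha:.0%} VaR = {var_alpha:.2f}")
print(f"Expected shortfall = {expected_shortfall:.2f}")
print(f"Semi-variance = {semi_variance:.1f}")
```

For a symmetric distribution such as this one, the semi-variance (as defined above) carries no information beyond the variance; it becomes informative only when the downside and upside differ in shape.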
2.2 Utility and the Expected Utility Hypothesis
We have discussed how an investment has certain risk characteristics, and different investments will differ in these characteristics, such as the specific risk measures (recall also that a risk-free investment, or a good proxy, is a useful benchmark). We have briefly motivated the idea that an investment’s risk should affect its value; we did this by comparing two investments that behave the same on average but have
⁴ In general, the VaR is the smallest number x that satisfies α < P[X ≤ x]. The expected shortfall, introduced next, also has a general definition that is not completely straightforward. It is not essential that you understand these general definitions; it is essential that you understand the idea of a quantile and a conditional expectation.
differing degrees of variability. We have not explicitly stated the following: risk, broadly speaking, is viewed by market participants as undesirable, all else equal. A non-trivial problem follows: how should risk be accounted for when assessing a potential investment? Risk should detract from an investment’s value (all else equal!), but by how much?
Let us start with a simpler problem: if two investments give future values of X₁ and X₂, with different distributions and therefore levels of risk, which one should we prefer? Sometimes the comparison is easy. If it is true that X₁ > X₂, then clearly the first investment is preferable. Note the strength of this condition; it is saying that whatever values are taken on by X₁ and X₂, the first is always larger. This is known as dominance, but note that there are other (weaker) forms of dominance out there.⁵ We will use this kind of relationship in Chap. 7; it is a fairly simple idea that, if X₁ > X₂, then the price of the first investment must be greater, but we will see that an arbitrage can be created if this does not hold.
Dominance is an unusually easy case. In order to generalise, let us return to another useful test case: suppose that E[X₁] = E[X₂], but that, otherwise, X₁ and X₂ have differing risk/variability characteristics. Well, we have established enough now to see that, if one is clearly less risky than the other (if one is a constant, say), it would very likely be preferable.⁶ But what if E[X₁] > E[X₂] and X₁ is riskier than X₂? In other words, what if investment 1 is riskier (more variable) than investment 2, but offers better performance on average?⁷ The question then becomes: is the additional average performance sufficient to outweigh the additional riskiness?
This is a deep and difficult question, affecting not just individual investors contemplating potential investments, but also much more general concerns of how the financial markets allocate capital, and to what extent risky uses of capital are allowed or incentivised. Expected utility gives an interesting way of dealing with such questions. It is a venerable idea in economic theory and literature, and although it has faced significant criticism and has proven somewhat limited in its practical usefulness, it is at the very least a good developer of intuition. First, we need the idea of utility. As above, let X denote the future amount returned by an investment. Except for a risk-free investment, X can take on several possible values. Instead of considering these possible outcomes themselves, this approach involves considering the utility associated with them. Utility refers to the subjective value (perhaps the satisfaction or benefit) that an outcome would provide,
⁵ Ours can be called statewise dominance, meaning that the relationship holds in all states of the future. Another notion is first-order dominance, which requires that one investment/random variable has a larger probability of exceeding x than does the other, for all levels x. This is a weaker condition.
⁶ Even here, there are potential difficulties; different risk measures may give different results, e.g., one could have a higher (worse) standard deviation, but also a higher (better) VaR.
⁷ A preview to the next few chapters: this situation is very common. In fact, in order to induce investors to bear risk, risky investments tend to offer a risk premium: additional average performance to compensate for the additional variability in performance.
specified as a number. A higher utility number indicates more value, i.e., a preferable outcome. Let u(·) denote a function that maps each outcome to the utility it provides; this is called a utility function and allows us to consider the random variable u(X), the (often unpredictable) amount of utility the investment will end up providing.
Let us reflect on what a utility function should be like. We would expect it to be an increasing function; a higher investment value (more money) should deliver more value to the investor. Above, we called this non-satiation, which, in this context, is synonymous with an increasing utility function. An increasing function has a positive slope. A utility function’s slope has a classical economic interpretation: marginal utility, the additional utility gained from a small (marginal) addition to, in this case, investment outcome (in other contexts, the small addition could be to current consumption, or to the amount of bananas one eats). Another venerable idea of economics is the law of diminishing marginal utility. The improvement in utility from zero wealth to $100 wealth, say, is larger than the improvement from $100 to $200, which is larger than the improvement from $100,000 to $100,100. Alternatively, a banana is more valuable to someone with no bananas than it is to a banana farmer. This captures something important about our interaction with the world, and, for utility functions, it implies that the positive slope (from non-satiation) should decrease from left to right. That is, a utility function should slope up but should also be concave.
One can think about utility fairly literally, and be confident people’s preferences can in fact be accurately quantified with a utility function. Or one can think of it more practically, perhaps as a model (an idealised representation) of how people tend to regard different outcomes. 
On this view, one does not need to believe that utility actually exists; it can just be seen as a way of representing and exploring ideas such as non-satiation and diminishing marginal utility.
We can now state the famous expected utility hypothesis: investments (or, more generally, risky prospects with multiple possible outcomes) are valued according to the expectation (i.e., statistical average) of the utility of the possible outcomes. In short, the hypothesis says that decisions are made on the basis of expected utility. Before discussing whether there is any reason to put stock in this, let us discuss its implications. We should begin with the quantity E[u(X)], the expected utility of the investment delivering X. This is the key metric, after all, and is already a hopeful start because we have seen that expected value E[X] is an inadequate metric (remind yourself why). It must be completely clear to you that, in general, E[u(X)] ≠ u(E[X]). In mathematical terms, expectations and functions do not commute. In fact, for concave functions, we have E[u(X)] < u(E[X]).
This is a statement of the famous Jensen’s inequality.⁸ It is essential that you get an intuitive understanding of this mathematical idea.
Exercise: Consider the example above, comparing a constant amount of 50 to a toss-up between 0 and 100, and visualise why the former has a higher expected utility, for any concave utility function.
Initially, this exercise is purely mathematical, but once you are satisfied with the intuition behind Jensen’s inequality, you must see the financial implication (given the expected utility hypothesis): the expected utility metric accounts, quantitatively, for risk. The idea of a certainty equivalent can be useful to see this clearly. For an investment delivering X, we can calculate the expected utility E[u(X)]. Then consider the following: what constant amount (or, how large a risk-free investment) would be equivalent to X? Equivalent here means equally attractive, equally valuable, giving the same amount of utility; we (or whoever the utility function pertains to) should be indifferent between X and this constant amount. We call this amount the certainty equivalent. It is the number x₀ satisfying E[u(X)] = u(x₀). Because x₀ is a constant, this investment delivers utility of u(x₀) always. This is therefore its expected utility, which is, by virtue of it being the certainty equivalent, the same amount of expected utility delivered by X.
An investment’s certainty equivalent is very revealing. Suppose that the certainty equivalent is far lower than E[X]; on average, the investment performs much better but is still viewed as no more attractive than the small but consistent certainty equivalent. In such a case, there is a large aversion to the risk in X: we would be willing to sacrifice a lot of performance on average to avoid the riskiness. This would happen if the utility function is highly concave (as Jensen’s inequality becomes stronger). 
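A numerical sketch may help fix the idea of a certainty equivalent. It uses the toss-up between 0 and 100 from the text; the two utility functions (a square root and a fourth root) are assumptions chosen only because they are increasing, concave, and easy to invert.

```python
import math

# The risky investment from the text: 0 or 100 with equal probability.
outcomes = [0.0, 100.0]
probs = [0.5, 0.5]


def expected_utility(u, outcomes, probs):
    """E[u(X)]: probability-weighted average of the utilities."""
    return sum(p * u(x) for x, p in zip(outcomes, probs))


def certainty_equivalent(u, u_inverse, outcomes, probs):
    """The constant x0 solving u(x0) = E[u(X)]."""
    return u_inverse(expected_utility(u, outcomes, probs))


expected_value = sum(p * x for x, p in zip(outcomes, probs))  # E[X] = 50

# Assumed utility 1: u(x) = sqrt(x), with inverse y -> y**2.
eu_sqrt = expected_utility(math.sqrt, outcomes, probs)
ce_sqrt = certainty_equivalent(math.sqrt, lambda y: y ** 2, outcomes, probs)

# Assumed utility 2 (more concave): u(x) = x**0.25, with inverse y -> y**4.
ce_fourth_root = certainty_equivalent(
    lambda x: x ** 0.25, lambda y: y ** 4, outcomes, probs
)

print("E[u(X)] with sqrt utility:", eu_sqrt)
print("u(E[X]) with sqrt utility:", math.sqrt(expected_value))
print("Certainty equivalent (sqrt):", ce_sqrt)
print("Certainty equivalent (fourth root):", ce_fourth_root)
```

With the square-root utility the certainty equivalent is 25, half of the expected value of 50; with the more concave fourth-root utility it falls to about 6.25, illustrating how stronger concavity corresponds to stronger aversion to the same risk.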
On the other hand, the certainty equivalent might be equal or similar to E[X]; then there is no particular aversion to the riskiness, with only the expected performance of the investment, rather than its other characteristics, being relevant. The fact that any investment can be assigned a certainty equivalent illustrates the following crucial idea: a certain utility function characterises certain risk preferences, that is, a certain attitude towards risk. An investor may be highly risk averse; they don’t like risk, and would be willing to accept a relatively low certain amount instead of a risky investment. An investor can be risk neutral, or close to it, and be largely concerned with performance in expectation; such an investor would think ‘this risky investment may perform well, and may perform badly—these roughly cancel out, and I’m willing to consider the average’. Finally, an investor may be risk seeking, meaning that, all else equal, they prefer risk. For them, a certainty equivalent is relatively high—they would need extra compensation
⁸ Jensen’s inequality is traditionally stated for convex functions, giving the opposite inequality to the above. The logic is identical, however, and one can simply apply a negative to take a convex function to a concave function (or vice versa) and thus translate the result.
to take the constant, no-risk investment over the risky one. If you have ever gambled at a casino, you have exhibited risk-seeking behaviour. Exercise: Suppose you bet R100 on red at a roulette wheel. You have an 18 in 37 chance of winning R200, and a 19 in 37 chance of losing your money. Calculate the certainty equivalent and explain why only a risk seeker would wish to play. In summary, given the expected utility hypothesis, we can calculate certainty equivalents for an investment and thus ascertain how the riskiness of the investment is accounted for. We can also compare investments (by comparing certainty equivalents, or by directly comparing expected utilities) in a way that accounts for this riskiness. More generally, it allows us to represent risk preferences with a utility function, with differing preferences represented by differing functions (with the degree of concavity being linked to the degree of risk aversion). This can be a very useful modelling and theoretical tool. So the expected utility hypothesis has some very interesting and important implications, and it would be very useful if it were true. Is there reason to think that it is? Well, first, there does seem to be some sound basic logic behind it. When assessing an uncertain future, it seems true that we consider the various possibilities and average them in some way, weighing the good ones against the bad ones, and factor in their probabilities, even if not literally averaging them. It seems plausible that we consider the utility, their subjective worth, of the outcomes (after all, the only thing that makes the outcome, money, valuable to me is the fact that I can derive some value from it, by spending it). And it is true that diminishing marginal utility is a robust feature of our preferences. There is a much more rigorous, though controversial, justification. In 1947, John von Neumann and Oskar Morgenstern proved a result known as the expected utility theorem. 
It says that, if (and only if) someone’s preferences conform to four particular axioms, the expected utility hypothesis applies. That is, there exists a utility function such that their preferences agree with the principle of maximising expected utility. If the axioms hold, von Neumann and Morgenstern showed that expected utility is a fully valid basis for preferences and decision making.
The axioms are basic properties of decision making. The first simply says that, when presented with two alternative investments/risky prospects, one must be decisively preferred to the other, or they must be deemed equivalent. The second is the mathematical idea of transitivity: if investment A is preferred to B, and B is preferred to C, A must be preferred to C. The third and the fourth can be formulated in various, equivalent ways, and we will not delve into this issue, except to briefly note what is often called the independence (or substitution) axiom: if I prefer A to B, then I should maintain this preference when a third investment is mixed into A and B, i.e., I still prefer A-combined-with-C to B-combined-with-C. This sounds reasonable enough but, as we now proceed to note, is controversial.
It is useful to distinguish between normative and descriptive questions, models, modes of analysis, etc. ‘Normative’ indicates that we are discussing how things should be, according to some standard. ‘Descriptive’ indicates that we are interested in how things in fact are. In the 1970s, Daniel Kahneman and Amos Tversky famously demonstrated a number of shortcomings of expected utility as a descriptive model of risk preferences and decision making. Amongst many other interesting findings, they and others have shown that the independence axiom is often violated in practice. Consequently, people often behave in a way that does not appear to maximise their expected utility. Kahneman and Tversky did not stop there; they developed a descriptive model of risk preferences, known as prospect theory, and more generally theorised about why and when normative principles are violated. Kahneman was awarded the Nobel Prize, and these ideas have been remarkably seminal for the fields of behavioural economics and behavioural finance, which investigate biases, errors, and other departures from normative decision making in the context of economics and finance.
Finally, an important note on expected utility and risk. Sections 2.1 and 2.2 consider investments in isolation. Investments can be combined into portfolios, and the behaviour of the individual investments is not sufficient to imply the behaviour of the portfolio: the relationships between the investments become relevant (we study this in detail in Chap. 4). When you enter an investment, you inevitably combine it with your existing investments. Often, depending on the application, utility and risk measurement should account for this, by considering the total wealth that would result from an investment, rather than the investment in isolation. For example, when economic models use expected utility, they tend to define utility as a function of total wealth (or consumption, which is what ultimately provides value/satisfaction). Also, an investment can appear risky on its own but in fact have a behaviour that reduces risk when combined with other investments (can you think of an example?); risk measurement would then be more informative if the total portfolio is considered.
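One hypothetical illustration of this last point is sketched below: a new investment that is variable on its own but pays more in the state where existing wealth is low. The two equally likely states and all payoff figures are invented for the example.

```python
import math


def std_dev(outcomes, probs):
    """Standard deviation of a discrete random variable."""
    mu = sum(p * x for x, p in zip(outcomes, probs))
    return math.sqrt(sum(p * (x - mu) ** 2 for x, p in zip(outcomes, probs)))


probs = [0.5, 0.5]         # two equally likely future states (assumed)
existing = [0.0, 100.0]    # value of existing investments in each state
candidate = [60.0, 20.0]   # new investment: pays more in the bad state

# Total wealth in each state if the candidate is added to the portfolio.
total = [e + c for e, c in zip(existing, candidate)]

std_existing = std_dev(existing, probs)
std_candidate = std_dev(candidate, probs)
std_total = std_dev(total, probs)

print("Existing wealth std: ", std_existing)
print("Candidate alone std: ", std_candidate)
print("Combined wealth std: ", std_total)
```

Viewed alone, the candidate has a standard deviation of 20, yet adding it lowers the standard deviation of total wealth from 50 to 30, because its payoffs move against those of the existing investments.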
Literature Notes
There are many textbooks on risk measures and financial risk management; McNeil et al. (2015) and Coleman (2012) are fine examples. For an introduction to risk preferences and risk aversion, see Cvitanic and Zapatero (2004, Ch.4.1). The chapter mentions the expected utility theorem, which was given in Morgenstern and Von Neumann (1947) (we refer to the second edition of this book, which first contained a formal proof), as well as Kahneman and Tversky’s descriptive account of risk preferences; see Kahneman and Tversky (1979) for the original paper, or Kahneman et al. (1982) for an example of a more general text on descriptive decision making.
References
Coleman, T. S. (2012). Quantitative risk management: A practical guide to financial risk. Wiley.
Cvitanic, J., & Zapatero, F. (2004). Introduction to the economics and mathematics of financial markets. MIT Press.
Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases. Cambridge University Press.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–292.
McNeil, A. J., Frey, R., & Embrechts, P. (2015). Quantitative risk management: Concepts, techniques and tools (Revised ed.). Princeton University Press.
Morgenstern, O., & Von Neumann, J. (1947). Theory of games and economic behavior (2nd ed.). Princeton University Press.
3
Market Pricing and Market Efficiency
Abstract
Competitive market pricing and the concept of market efficiency are addressed in this chapter. Risk and risk aversion need to be considered. The famous efficient-market hypothesis is discussed in detail.
Keywords
Risk aversion · Market efficiency · Efficient-market hypothesis · Market outperformance
We have begun to consider the question of how an investment’s risk should be accounted for. All else equal, a less risky, more consistently performing investment is preferable, and we should therefore be willing to pay more for it than a riskier alternative. This makes good sense, but let us extend this thinking by considering the fact that investments trade in the financial markets, and many potential investors are making calculations of this kind. The prices of investments are determined ‘by the market’. We must understand how this works, not just out of academic interest, but because this is the environment in which one must in fact invest. Some basic economics: in a free market, the price of any good or service will adjust until there is equilibrium between supply and demand. If the price is too low, there will be more buyers than sellers, and only some buyers will be able to find a seller (more accurately, if there are more buyers than sellers, then the price is too low). Similarly, only a few willing sellers will be in luck if the price is too high. The market price is the price that clears the market, i.e., balances supply and demand. The real world is more dynamic than this; in practice, market prices are constantly adjusting to find equilibrium between fluctuating supply and demand (the adjustments are not achieved by magic, or some agreement between participants,
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. Backwell, An Intuitive Introduction to Finance and Derivatives, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-23453-8_3
but by buyers increasing their bids when they want more goods than are available at the current bid, or by sellers decreasing their offered prices when they wish to sell more).¹ So when it is said that prices are determined ‘by the market’ or ‘by supply and demand’, a (useful) shorthand is being used to refer to the price that emerges from the complicated, bottom-up actions of self-interested potential buyers and sellers.
A quick note on the phrase ‘asset pricing’, which is a large part of what we are doing in this text. Haven’t we just said that prices are determined in the market, making the project of determining prices trivial? Well, sometimes prices are not established on the market, for an about-to-be-issued security for instance. But, more generally, consulting the market for an asset’s price is not satisfactory. We wish to understand the principles involved, and, more practically, find investments that appear underpriced. If we think an investment is worth x, based on our perceptions of the future value and its riskiness, and the market price is lower, it follows that we should enter that investment. Sometimes the word value is used to indicate what an investment is worth, while price is used to refer to what one must pay.
Here is a crucial definition: a financial market is said to be efficient if market prices reflect available information. When efficiency is discussed in the context of finance, this technical notion of informational efficiency is usually what is meant, rather than efficiency in other senses (a different sense is introduced in the next chapter, and there are others that can be well worth discussing; e.g., is capital/money well allocated by the market? Do the participants, and society at large, benefit from the trading taking place? Is the market conducive to wasteful or unproductive outcomes?). Market efficiency is a central concept in finance theory. 
Indeed, one of the best-known ideas in finance is the so-called efficient-market hypothesis, which states that financial markets are in fact efficient. We begin, in Sect. 3.1, by giving further details on market efficiency and the associated hypothesis. The interest in the hypothesis is mainly due to the implications that would follow, which are discussed in Sect. 3.2.
3.1 The Efficient-Market Hypothesis
Let us first expand on the above definition of efficiency. A simple example is helpful; suppose Company X has shares that are traded on the stock exchange and that a number of trades have just taken place at $100 per share. If the market is efficient, then this $100 price ‘reflects available information’. What does this mean? Well, there are a number of things that would make the share more or less valuable. Share value is ultimately driven by the market’s perception of how profitable the underlying company will be in the future, and how much risk is associated with
¹ The real world is also much more complicated than this, and many factors interfere with this simplified picture of market pricing, but supply–demand equilibrium is nevertheless the fundamental mechanism that keeps prices in check in a free market.
these future profits. The shareholders own these profits and will either receive them as dividends, or have them reinvested in the company with the goal of creating more profit later. Speaking of the ‘market’s perception’ involves the same shorthand as above; the perceptions of individual buyers and sellers manifest in the market price, which is then said to reflect the market’s perception. The market bases this perception on whatever information seems relevant. Examples include the performance reflected in the latest financial statements, the likely performance in the future given the economic outlook and the nature of the company, whether the CEO is any good, and so on. This is information relevant to the value of the share (there is of course a great deal of information that is not relevant to the share at all). If the market is efficient, the $100 price accounts for all of the relevant factors.

Crucially, an efficient market must have prices that reflect information at all times; it must respond to new information. Suppose information comes to light that indicates that Company X’s shares are less valuable than they previously seemed. Perhaps a large corporate fraud has been detected, or a breakthrough has been made in producing an alternative to the company’s main product, or a new type of tax has been announced that applies to the company’s revenue. These are deliberately vivid examples; new information can be more subtle. Perhaps economic conditions are just a bit less suitable to the company’s profitability than they were yesterday. In an efficient market, prices must immediately account for new information; if they did not, then prices would not be reflective in the way efficiency requires. This seems like an onerous property for a market to exhibit: how could a market possibly be efficient? The answer is competition.
In principle, efficiency emerges from a large number of competitive, informed market participants making trades based on new information coming to light. If a large corporate fraud is detected, then market participants should no longer be willing to pay $100 per share (this is a decrease in demand; supply will increase as well; can you see how?). The share was worth $100 given the information at the time, but must now be worth less, and the market will settle on a value that accounts for the new information. The information is then said to be ‘priced in’.

Markets are indeed very competitive. Everyone wishes to make money, not least finance professionals, and so investors tend to account for information relevant to their investments and potential investments. Moreover, there are a large number of investors (often very well informed and resourced) actively seeking mispriced investments. If an investment is indeed underpriced (that is, its price is too low to properly reflect its value as it can be ascertained from current information), investors tend to buy it, which forces the price up. In principle, the price should continue to rise until the investment is no longer underpriced.

The above is the fundamental logic of market efficiency. It is standard for finance textbooks to differentiate three specific forms of the efficient-market hypothesis (EMH):
• The weak EMH: prices reflect all market trading information, i.e., historical price levels, trading volumes, and other market-generated information.
• The semi-strong EMH: prices reflect all publicly available information. This includes market trading information (making this stronger than the weak EMH) and, amongst other things, the examples discussed above (revelation of fraud, economic conditions, technological developments, etc.).
• The strong EMH: prices reflect all information. This expands the set of information further. For example, there may be information known to senior management of a company (of a pending merger, say) but not known by the public; this must be reflected in market prices for the strong EMH to hold.

So is the EMH true? Above, we outlined the basic theoretical argument for why the EMH is plausible, in some form at least. A vast body of academic literature has studied the EMH. In 1970, Eugene Fama introduced what is now the standard definition of efficiency (prices that are informationally efficient, as above) and articulated the EMH (in the three forms above). This built on and consolidated several other attempts at this issue (for instance, the work of Bachelier around 1900), and much work has followed, attempting to assess the EMH rigorously and empirically. The EMH is too abstract to test directly; one usually needs to consider an implication of it being true and test that. This is the topic of the next section.
3.2 Implications
A highly relevant and interesting implication of the EMH is that, if true, it would be impossible to beat the market. It is essential to understand this. To beat the market, one must achieve higher investment returns than the market average, consistently and on a risk-adjusted basis. Consistency is essential; any schmuck can beat the market once or twice by fluke. So is the risk adjustment; the logic is a bit more involved, but crucial, so we now unpack it.

Recall from Chap. 2 that investments differ in how variable their returns are, and how this interacts with other investments; in other words, investments have varying degrees and types of risk (even though these can be difficult to pin down precisely). Recall also that investors in the market have non-trivial risk preferences. The market will be more averse to some investments’ risks than others, and this will be reflected in the prices commanded by the investments. The presence of a risk, all else equal, drives the price of a security down. Note that price and return are inversely related; a lower price implies a higher return, and vice versa. This is a fairly straightforward consequence of the definitions in Sect. 1.2. You invest some money (the price of the investment) and then receive a certain, usually risky, amount later. If the initial amount is lower, all else equal, your money has grown by more to reach the end amount, i.e., the return is higher.

Imagine a Company Y, identical to Company X above in all respects. The share price must be an identical $100; the market recognises that a share in Company X is equivalent to a share in Company Y. Then suppose Company Y makes one change to their business, which the market (the potential investors, on average) perceives as making it slightly more risky in some way. The price will drop, to reflect this
risk. The price can also drop because Company Y is perceived to have a less profitable outlook, but this is a different matter (here, let us say that the average return of Company Y’s share has not changed, only how variable the return will be, based on the market’s current expectation of the future). Now, with the dropped price, the returns of Company Y will be higher. So in a competitive investment market, you can enhance your returns by taking on additional risk. The relationship and tension between risk and return is one of the perennial concepts in finance: everyone wants high returns and low risk. But there is only so much return to go around (in some ways, but certainly not all, the financial markets are a zero-sum game), and financial market risks must be borne by someone.

When you take on risk, though, your investments will not always perform well, so sometimes your return will be poor (if there were not this possibility, the investments in question would not really be risky!). So it is not true to say that more risk implies more return. Rather, more risk implies more expected return, that is, return on average. The important concept of a risk premium follows: the amount of additional expected return gained from taking on risk (or some unit of risk). In certain contexts, the term market price of risk is used.

Why is the impossibility of beating the market an implication of the EMH? Because whatever the basis for one’s investment choices, this and indeed all relevant information is already accounted for in market prices. The EMH implies that no particular investor can have an informational advantage to exploit. If, on the other hand, the market is inefficient, there exist prices that have not responded to updated information about their future; if one has this information, one could exploit the un-updated price. So, is the EMH true?
To begin with, it is certainly true that, generally speaking, there is a good approximation to efficiency in the investment markets; the competitive nature of investors, who direct their money to investments perceived as offering higher and less risky returns, tends to ensure that information is rapidly accounted for in the market prices of investments. Let us take note of the implications relative to the three specific forms of market efficiency. The key implication of the weak EMH is that technical analysis— trading based on price and trading history—should be unable to beat the market, to consistently outperform the market without taking on additional risk. The competitive nature of investors does virtually rule out technical analysis as a profitable strategy. Asset prices tend to be forward looking; they are based on the outlook of the future prospects for the asset in question, regardless and independent of the past behaviour. Prices, therefore, do not mean revert, or have momentum. Note also that even if some kind of technical methodology—trading based on some trend or effect—appears profitable, this performance is unlikely to persist. Other investors tend to learn of any and all ways to make excess profit, after which prices adjust to the point that easy excess profit is not available. The semi-strong EMH is the most contentious, largely because it calls the very idea of fund manager skill into question. Fund managers analyse available investments, seeking good investment opportunities for their clients’ money (they charge a fee and/or a portion of the profits). Fund managers market themselves based
on their ability to generate above average returns through the research they conduct. In an efficient market though, any information they may discover is immediately priced in by the market, making it impossible to profit from it. Of course, this is a little too strict; ‘immediately’ should be replaced by ‘rapidly’. It is not that profit opportunities can never exist; it is that they tend to be fleeting. According to the EMH, fund managers should be fighting an uphill battle, struggling fiercely against competitors to take advantage of any relatively good investment opportunity. The following joke illustrates these important but nuanced concepts. Two financial economists are walking down the street, and they see a $50 bill on the pavement. The one is about to pick it up, when the other stops him, saying ‘Don’t bother. If it were real, someone would have already picked it up’. It is true that easy money is not readily available—there are too many competitive investors searching for and taking any easy opportunities—but it is also not true that mispricing (i.e., opportunities to beat the market) is utterly non-existent. They just tend to be rare and fleeting—if the $50 bill were left on the pavement, it would not remain for long— and it is possible to be in a position to profit off recognising such an opportunity (usually, it requires significant effort or perhaps expenditure; the easy opportunities are extremely fleeting). After all, the markets are not efficient by magic, but by the actions of individual investors competitively seeking profit. To the extent that the EMH is true, one should be sceptical of active investing and fund management. This refers to an approach to investing where investments are researched, compared, and selected; the alternative is a passive approach, where one selects a broad selection of investments reflective of the market as a whole. 
This requires little work and, under the EMH, cannot be outperformed by any active strategy (in the long term, on a risk-adjusted basis). Active fund managers tend to charge relatively large fees, in exchange for their research and analysis; the EMH calls such fees into question. It is also true that the validity and necessity of fund managers do not necessarily depend on the EMH. Fund managers can provide valuable services besides seeking mispriced investments and thus trying to beat the market; they can, for example, advise on the degree and types of risk suitable to a client’s circumstances, or can optimise a client’s tax obligations. Fund managers can also use the scale of their institution to their advantage; it is probably much easier and cheaper for a large fund manager (with established relationships to brokers and market makers) to trade and remain informed than it is for an individual.

Rather than attempting to adjudicate the evidence for and against the (semi-strong) EMH, we shall restate the initial position given above: the investment markets are certainly very competitive and thus very efficient, or very near to the strict condition of efficiency. If it is possible to beat the market, it is not easy. Investment strategies that purport to do this tend, when scrutinised, to perform no better than the market average, or indeed a random selection of investments. One key difficulty in making these comparisons, though, is that it is not clear how risk should be adjusted for. It is clear that it must be adjusted for in some way, but recall that risk is not easy to fully capture quantitatively. This is known as the joint hypothesis problem; to rigorously test the EMH, one must also hypothesise a way to
quantify and adjust for risk. Finally, let us note that markets can adapt. If markets are inefficient in some way, market participants can learn from this (by, for example, copying a successful investment strategy that is trading on the inefficiency), can start incorporating the information that was previously being neglected, and can thus remove, or at least reduce, the inefficiency that was present initially.

It is not plausible that the strong EMH holds in any strict way. It would contradict the logic of insider trading laws, which exist to protect the majority of investors by prohibiting others from profiting through trades based on information that is not yet public. Insider trading does indeed occur (perhaps often without us knowing!), showing that, at least occasionally, there is non-public information that is not yet accounted for by the market.

In this chapter, and throughout this text, we have largely considered the perspective of investors. Let us briefly consider market efficiency from the opposite side; businesses (and indeed all entities, from people to governments) that require financing will have their ability to return money, and the riskiness associated with this ability, examined by the market. If their ability to return money to investors is perceived as strong and low risk, the investment (e.g., a bond or a share) will command a relatively high price on the market and will likely offer a relatively low return for investors. This is a good thing for the issuer. For taking on a given liability, they raise a relatively large amount of money. A low return for investors implies a low cost to raising money. Conversely, if an issuer is perceived as risky, their obligations will be priced relatively low, and higher returns will be possible for investors (not guaranteed, obviously, but the low price, resulting from the aversion to the perceived riskiness, will push expected returns up).
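As a small numerical sketch of the inverse relation between price and return from the issuer's side, consider an obligation that promises a fixed amount at the end of the period. The payoff of 100, the two prices, and the helper `simple_return` are made-up illustrations (not taken from the text), and the return is assumed to be the simple growth rate of the amount invested.

```python
def simple_return(price: float, payoff: float) -> float:
    """Simple return over the period: growth of the amount invested, as a fraction."""
    return payoff / price - 1.0


# Hypothetical obligation promising to pay 100 at the end of the period.
promised_payoff = 100.0

# Issuer perceived as reliable: investors pay a high price, so the issuer
# raises more money today for the same liability and investors earn less.
low_risk_price = 96.0

# Issuer perceived as risky: investors only pay a lower price, so the issuer
# raises less money today and the promised return is higher.
high_risk_price = 90.0

low_risk_return = simple_return(low_risk_price, promised_payoff)
high_risk_return = simple_return(high_risk_price, promised_payoff)

print(f"price {low_risk_price:.0f}: promised return {low_risk_return:.2%}")
print(f"price {high_risk_price:.0f}: promised return {high_risk_return:.2%}")
```

The higher figure is only the promised return; for a risky issuer the expected return is lower than this, because the promised amount may not be paid in full.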
One reason that governments are hesitant to default on their debt is that they will thereafter be perceived as unreliable and risky, pushing their bond prices down, and making it more expensive for them to borrow money from the capital markets.

Let us conclude the section by noting some links to other chapters. In Chap. 2, we viewed investment returns as random variables. Let us reflect on this further, especially how it incorporates the discussions on investor competitiveness and efficiency. Future returns are unpredictable; the current price is based on current information (given the efficient and competitive nature of the markets), and then new information will arrive, and the price will adjust accordingly (according to how the market, the totality of buyers and sellers, perceives the update). This new information cannot be predicted in advance; if it could, it would be implicit in current information and would already be reflected in current prices. So, returns are unpredictable. Also, returns in different periods are independent because these returns are based on different information updates. Whatever new information arrives during February should not be dictated (should not be predictable) by what happened in January; otherwise, it is not really new! This is related to an idea known as the random walk hypothesis, which states that investment prices evolve as per a random walk, a series of independent, unpredictable steps (sometimes up, sometimes down). This will be relevant to the financial modelling we consider in later chapters, particularly Chap. 8.
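The random walk idea can be illustrated with a short simulation. This is only a sketch under stated assumptions: the returns are drawn independently from a normal distribution with a made-up mean and standard deviation (the text does not specify any distribution), and the starting price of 100 is arbitrary. The check at the end shows that, when steps are independent, one period's return carries essentially no information about the next.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible

n_periods = 10_000

# Assumption: each period's return is an independent normal draw
# (mean 0.05% and standard deviation 1% per period are illustrative values).
returns = rng.normal(loc=0.0005, scale=0.01, size=n_periods)

# Price path obtained by compounding the returns from a starting price of 100.
prices = 100.0 * np.cumprod(1.0 + returns)

# Sample correlation between each period's return and the following one.
lag1_autocorr = np.corrcoef(returns[:-1], returns[1:])[0, 1]

print(f"final price: {prices[-1]:.2f}")
print(f"lag-1 autocorrelation of returns: {lag1_autocorr:.4f}")
```

The autocorrelation is not exactly zero in any finite sample, but it should be small relative to one; a persistently large value in real data would be evidence against independent steps.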
Literature Notes
The chapter mentions the early efforts of Bachelier (1900) (it could have mentioned others such as Mandelbrot, 1963) and also the now-standard characterisation of efficiency due to Fama (1970). A modern, post-crisis take on market efficiency (with further ideas) can be found in Lo (2017).
References
Bachelier, L. (1900). Théorie de la spéculation. In Annales scientifiques de l’École normale supérieure (Vol. 17, pp. 21–86).
Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. Journal of Finance, 25(2), 383–417.
Lo, A. W. (2017). Adaptive markets: Financial evolution at the speed of thought. Princeton University Press.
Mandelbrot, B. (1963). The variation of certain speculative prices. Journal of Business, 36(4), 394–419.
4 Modern Portfolio Theory
Abstract
Modern portfolio theory is rigorously introduced. This formalises the benefits of diversification and leads to a quantitative method of optimising the tradeoff between risk and return for a particular investor.

Keywords
Modern portfolio theory · Efficient frontier · Investment correlation · Diversification · Risk versus return
Modern portfolio theory (MPT), introduced by Harry Markowitz in 1952, is based on a fairly simple model of the investment market. It supposes that $n$ different investments are available, each offering a return with a known mean and variance over some fixed time period. Furthermore, the covariances between all returns are known. In other words, the investment landscape is given exogenously, with all first- and second-order moments, including cross moments, being known. Let $R_i$ denote the return of the $i$th investment, and let $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of $R_i$. To denote the covariance between $R_i$ and $R_j$, we write $\sigma_{ij} = \rho_{ij}\sigma_i\sigma_j$, where $\rho_{ij}$ is the (Pearson) correlation coefficient. It is then assumed that portfolios (sets/combinations of individual investments) can be formed. A portfolio is characterised by a set of weights $\{w_i\}$, each representing the proportion of the portfolio invested in a particular asset, and together summing to one: $\sum_{i=1}^{n} w_i = 1$. It is assumed that the market is perfect (or frictionless), meaning that investments can be combined in any quantities, without
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. Backwell, An Intuitive Introduction to Finance and Derivatives, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-23453-8_4
transaction costs, taxes, or other hidden costs or effects. The return of a portfolio is then given by

$$R_p = \sum_{i=1}^{n} w_i R_i.$$
Make sure, perhaps with simple examples, that you agree with the above expression as a portfolio’s return. Note that there are different ways we could set up a model of the investment world; for example, we could consider the amount of money invested rather than the weight/proportion of each investment. Once you have agreed with the above equation, you need to be able to show that $R_p$ has the following mean and variance:

$$\mu_p = \sum_{i=1}^{n} w_i \mu_i, \quad \text{and}$$

$$\sigma_p^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j \sigma_{ij} = \sum_{i} w_i^2 \sigma_i^2 + \sum_{i} \sum_{j \neq i} w_i w_j \rho_{ij} \sigma_i \sigma_j.$$
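One way to convince yourself of the return, mean, and variance expressions above is a quick numerical check. The sketch below assumes a hypothetical three-investment landscape (every number is illustrative and not from the text): it derives weights from money amounts, confirms that the weighted-return expression agrees with the actual growth of the money invested, and confirms that the double sum for the variance equals its split into own-variance terms and cross terms.

```python
import numpy as np

# Hypothetical landscape of n = 3 investments (illustrative numbers only).
mu = np.array([0.05, 0.08, 0.12])      # mean returns mu_i
sigma = np.array([0.10, 0.15, 0.25])   # standard deviations sigma_i
rho = np.array([[1.0, 0.3, 0.1],
                [0.3, 1.0, 0.5],
                [0.1, 0.5, 1.0]])      # correlations rho_ij
cov = rho * np.outer(sigma, sigma)     # sigma_ij = rho_ij * sigma_i * sigma_j

# Weights as proportions of the money invested; they sum to one.
amounts = np.array([2000.0, 5000.0, 3000.0])
w = amounts / amounts.sum()

# One made-up realisation of the returns R_i, to check R_p = sum_i w_i R_i.
realised = np.array([0.02, -0.01, 0.10])
end_value = (amounts * (1.0 + realised)).sum()
r_p_from_money = end_value / amounts.sum() - 1.0
r_p_from_weights = w @ realised

# Portfolio mean and variance from the moments.
mu_p = w @ mu
var_p = w @ cov @ w                    # double sum over i and j

# The same variance split into own-variance terms and cross terms (j != i).
n = len(w)
own_terms = np.sum(w**2 * sigma**2)
cross_terms = sum(w[i] * w[j] * rho[i, j] * sigma[i] * sigma[j]
                  for i in range(n) for j in range(n) if j != i)

print(f"R_p from money: {r_p_from_money:.6f}, from weights: {r_p_from_weights:.6f}")
print(f"mu_p = {mu_p:.4f}, sigma_p = {np.sqrt(var_p):.4f}")
print(f"variance via split form: {own_terms + cross_terms:.6f}")
```

A numerical agreement like this is not a proof, but it is a useful sanity check before working through the algebra with the linearity of expectation and the bilinearity of covariance.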
One can form any portfolio one desires with the original n investments; this is the investment landscape. Crucially, MPT assumes investment risk to be fully characterised by the standard deviation/variance of return. Recall from earlier chapters that, in general, no single measure can fully capture the risk of an investment, so MPT certainly makes a simplification in this regard. Even though the standard deviation does not capture everything about an investment’s risk characteristics, it certainly does give a good indication of the degree of variability in the return (and has the huge advantage that the standard deviation of a portfolio is implied, as above, by the standard deviations and covariances of the individual investments). Finally, MPT assumes that high, low-risk returns are preferable. One could describe this as an assumption of non-satiation and risk aversion. The following list consolidates the assumptions made by adopting the MPT view of the investment world:
• There is a fixed time horizon over which investments are considered and assessed.
• The available investments are exogenously given, including the first- and second-order moments of their returns.
• Portfolios can be formed in a perfect-market environment.
• Return mean and variance are the only relevant features of an investment, with larger mean returns, and smaller return variances, being preferable.

The available investments are represented by plotting their mean returns (on the vertical axis) against their return standard deviations (on the horizontal axis); this can be called risk-expected return space or σ-μ space. Figure 4.1 gives an example. What we have learned in Chap. 3 should give some expectations for how the investments should appear plotted in risk-expected return space. Investors tend to be averse to risk, so high-risk (high-σ) investments tend to have a lower price and a higher expected return. Conversely, low-risk investments are unlikely to give a high expected return; if one did, investors would be eager to buy it, pushing the price up and the prospective return down. So there should be no or very few investments in the upper-left quadrant of risk-expected return space, although the goal should be to end up with investments that are as far up and to the left as possible. The bottom right is to be avoided (but we shall see in Chap. 5 that the market does not reward all risks with additional expected return, and so the bottom-right quadrant may be possible).

Fig. 4.1 Risk-expected return space, or σ-μ space, with an example set of investments (vertical axis μ, horizontal axis σ)

Figure 4.1 only shows the initial, given investments of this example. A crucial part of MPT is the ability to form portfolios by combining investments. To develop some intuitions around this, consider initially just two assets. One can construct a portfolio by investing a proportion w of one’s investment capital into the first asset, and 1 − w into the second. When w = 1, you are just holding the first asset, but w can be decreased gradually to zero, including an increasing amount of the second asset while holding less of the first. As w is varied, a curve within σ-μ space is traced. Crucially, this curve depends on the correlation between the two investments. It is essential that you are able to use the above equations to arrive at the situation illustrated in Fig. 4.2. Note that (and ensure you see how) the variance expression reduces to

$$\sigma_p^2 = w^2 \sigma_1^2 + (1 - w)^2 \sigma_2^2 + 2w(1 - w)\rho_{12}\sigma_1\sigma_2.$$

If two investment returns are perfectly correlated, then portfolios combining the two investments trace a straight line between the two investments themselves.
But this is an extreme case; although investment returns are often positively correlated,
Fig. 4.2 Each line represents the portfolios that can be formed by combining two given investments, for different possible correlations between the investments’ return (curves shown for ρ = −1, ρ = 0, and ρ = 1; vertical axis μ, horizontal axis σ)
they are not perfectly so. Different investments depend on different things and therefore have different, non-perfectly correlated returns. If the correlation is less than perfect, the portfolios, roughly speaking, move to the left (relative to the investments individually, or the portfolios in a perfect-correlation world). Moving leftwards and upwards is desirable; we are seeing the benefits of diversification. This is not an especially abstruse idea—‘don’t put all your eggs in one basket’—but MPT gives a very clear formalisation and quantification of it. MPT suggests that you diversify your investments; because investment returns are imperfectly correlated, diversification improves the available risk–return combinations. Also note the perfect negative correlation scenario (ρ = −1) in Fig. 4.2. This is also an extreme case that is unlikely in practice, but it has the interesting mathematical feature of touching the vertical axis; that is, with perfect negative correlation, one can construct a portfolio with zero variability (one can construct a synthetic risk-free investment). Now let us reconsider the more realistic investment landscape, with several investments rather than just two. If the investments plotted in Fig. 4.1 are available, one can combine them in myriad ways. When combining two securities, there is a single degree of freedom in controlling the combination (w above). When combining three, there are two degrees of freedom, and the possible portfolios cover a region in σ -μ space, not merely a one-dimensional curve. It is important to develop some intuition about the potential portfolios in an example like Fig. 4.1. One can choose any pair of investments and consider the curve of portfolios between the pairs (note that you cannot tell how much diversification benefit would be enjoyed because the correlation is not specified). After considering these portfolios for all pairs, one can consider larger combinations. 
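The two-investment curves discussed above can be traced numerically. The following sketch uses made-up means and standard deviations for the two investments, and the function name `two_asset_portfolio` is a hypothetical helper. It evaluates the two-asset mean and variance expressions over a grid of weights for ρ = −1, 0, and 1, and locates the zero-risk combination in the perfect negative correlation case, which (setting the standard deviation to zero and solving) sits at w = σ2/(σ1 + σ2).

```python
import numpy as np


def two_asset_portfolio(w, mu1, mu2, sigma1, sigma2, rho):
    """Mean and standard deviation with weight w in asset 1 and 1 - w in asset 2."""
    mu_p = w * mu1 + (1.0 - w) * mu2
    var_p = (w**2 * sigma1**2 + (1.0 - w)**2 * sigma2**2
             + 2.0 * w * (1.0 - w) * rho * sigma1 * sigma2)
    # Guard against tiny negative values caused by floating-point rounding.
    return mu_p, np.sqrt(np.maximum(var_p, 0.0))


# Illustrative (made-up) investments.
mu1, sigma1 = 0.06, 0.10
mu2, sigma2 = 0.12, 0.20

weights = np.linspace(0.0, 1.0, 101)
curves = {rho: two_asset_portfolio(weights, mu1, mu2, sigma1, sigma2, rho)
          for rho in (-1.0, 0.0, 1.0)}

# With rho = -1 the standard deviation is |w*sigma1 - (1 - w)*sigma2|,
# which vanishes at w = sigma2 / (sigma1 + sigma2).
w_zero_risk = sigma2 / (sigma1 + sigma2)
mu_zero_risk, sigma_zero_risk = two_asset_portfolio(
    w_zero_risk, mu1, mu2, sigma1, sigma2, rho=-1.0)

for rho, (mu_p, sigma_p) in curves.items():
    print(f"rho = {rho:+.0f}: smallest sigma_p on the grid = {sigma_p.min():.4f}")
print(f"rho = -1: zero-risk weight w = {w_zero_risk:.4f}, "
      f"return {mu_zero_risk:.4f}, sigma {sigma_zero_risk:.2e}")
```

Plotting each pair `(sigma_p, mu_p)` from `curves` would give a picture of the same kind as Fig. 4.2: a straight line for ρ = 1 and curves further to the left as the correlation falls.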
Assuming the correlations are not all perfect, combining investments tends to result in lower risk and portfolios towards the left of the space. The plotted line in Fig. 4.3 is intended to indicate the boundary as to how far upwards and leftwards one can go; one can invest in any particular
Fig. 4.3 An example set of investments in σ-μ space, along with a plausible efficient frontier (vertical axis μ, horizontal axis σ)
investment itself and can create combinations that give diversification benefits. These benefits can only go so far (unless there is perfect negative correlation), and the line indicates the extent to which one can explore the upper-left part of σ-μ space.

The line in Fig. 4.3, constructed as per the above, is known as the efficient frontier. For a given standard deviation, consider the portfolio that maximises expected return; such a portfolio is said to be efficient (efficient in a very different sense than considered in Chap. 3). Considering this for all standard deviations, one gets all of the efficient portfolios, which constitute the efficient frontier. Equivalently, one can specify a desired expected return and look to minimise standard deviation/risk. Attaining efficiency is thus a constrained optimisation problem. There is no reason, within MPT, to invest off the efficient frontier: below/right of the frontier is inefficient, in the sense that one can increase one’s expected return without taking additional risk (or, similarly, decrease risk without losing expected return), while above/left of the frontier is not possible.

In summary, MPT shows the benefits of diversification. It shows that the relationships/correlations between investments are essential: a security’s risk/return cannot be fully assessed by itself; one needs to think about how it contributes to the portfolio. It shows that an investor should optimise their portfolios, to enjoy as much diversification benefit as possible, and to avoid being needlessly inefficient.

A simple but important extension we can make is to make sure that we include a risk-free investment. The real-world proxy of such an investment is a T-bill or similar debt instrument, maturing in line with the fixed time horizon of the MPT model. Note that without a risk-free asset, the efficient frontier is concave; it bulges upwards and leftwards.
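The constrained optimisation mentioned above can be sketched in a few lines. This is one possible approach, not necessarily the method the text has in mind: it minimises portfolio variance for a target expected return by solving the first-order (Lagrangian) conditions as a linear system. It assumes weights may be negative (no long-only constraint) and that the covariance matrix is invertible; the three-investment landscape and the function name `min_variance_weights` are hypothetical.

```python
import numpy as np


def min_variance_weights(mu, cov, target_return):
    """Weights minimising w' cov w subject to w' mu = target_return and sum(w) = 1.

    Solves the stationarity conditions 2*cov*w + l1*mu + l2*1 = 0 together
    with the two constraints as a single linear system.
    """
    n = len(mu)
    ones = np.ones(n)
    kkt = np.zeros((n + 2, n + 2))
    kkt[:n, :n] = 2.0 * cov
    kkt[:n, n] = mu
    kkt[:n, n + 1] = ones
    kkt[n, :n] = mu
    kkt[n + 1, :n] = ones
    rhs = np.zeros(n + 2)
    rhs[n] = target_return
    rhs[n + 1] = 1.0
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n]  # drop the two Lagrange multipliers


# Hypothetical landscape (illustrative numbers only).
mu = np.array([0.05, 0.08, 0.12])
sigma = np.array([0.10, 0.15, 0.25])
rho = np.array([[1.0, 0.3, 0.1],
                [0.3, 1.0, 0.5],
                [0.1, 0.5, 1.0]])
cov = rho * np.outer(sigma, sigma)

# Trace a few points: the smallest achievable risk for each target return.
for target in np.linspace(0.06, 0.12, 4):
    w = min_variance_weights(mu, cov, target)
    sigma_p = np.sqrt(w @ cov @ w)
    print(f"target mu = {target:.3f}: sigma_p = {sigma_p:.4f}, weights = {np.round(w, 3)}")
```

Only the upper branch of the resulting curve (targets at or above the return of the overall minimum-variance portfolio) is efficient; the points below it have the same risk as a portfolio with a higher expected return. If short positions were ruled out, a numerical optimiser with inequality constraints would be needed instead.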
We will not prove the concavity of the frontier, but it is well motivated by the two-investment example in Fig. 4.2, where you can see the concave curve (in fact a hyperbola in σ-μ space, or a parabola if variance rather than standard deviation is plotted) that is formed when there is less than perfect correlation. An important feature of MPT calculus is that, when a risk-free investment is
Fig. 4.4 The dashed line represents a plausible efficient frontier for an example set of risky assets; the solid line represents the (necessarily) improved efficient frontier obtained when a risk-free investment is made available (vertical axis μ, horizontal axis σ)
included, the efficient frontier improves and, in particular, becomes the ray (line) that emanates from the risk-free investment and is tangent to the efficient frontier formed from the risky investments. An illustration is helpful; see Fig. 4.4. Ensure you follow the logical steps on the way to this feature: the risk-free investment lies on the vertical axis; you can combine the risk-free investment with any other investment or portfolio; this forms a straight line between the risk-free investment and the investment/portfolio you are combining with (this is non-trivial, but the math works out like the perfect-correlation case of Fig. 4.2); the best such line you can form is the one that is tangent to the pre-existing frontier; this line is nowhere below or to the right of the original efficient frontier; in other words, it is everywhere efficient and so becomes the new efficient frontier.

Note the final implication: with a risk-free investment, efficient portfolios (portfolios for which it is impossible to increase the expected return without taking more risk, or to decrease the return variance without giving up expected return) are formed by combining the risk-free investment with a particular portfolio of risky assets, the portfolio corresponding to the point of tangency between the original and improved efficient frontiers. An important principle follows: to attain efficiency, we must first identify and construct this portfolio at the point of tangency (this mix of risky assets is involved, in some degree, on every point of the efficient frontier); and second, we must decide where along the efficient frontier we wish to be. The fact that efficiency can be broken into a two-step procedure is known as the separation theorem (or the monetary separation theorem or Tobin’s separation theorem).

Note that, in Fig. 4.4, the improved efficient frontier extends beyond the point of tangency. Two important financial concepts, leverage and shorting, are at work here.
To understand this, imagine combining the risk-free investment and the risky tangent portfolio, and gradually adjusting the combination from 100% in one to 100% in the other. This is like Fig. 4.2, where combinations of two risky investments were considered. At all times, the two weights w and (1 − w) add up to one (by
construction). Now consider what it would mean to continue increasing one of the weights beyond 100%, and allowing the other weight to become negative to maintain the summation to one. You can have smaller or bigger investments, but what about negative investments? Holding a negative amount is known as shorting the investment, the opposite of a standard, long position. The terms long and short are widely used in finance, in a variety of different contexts; as a general guide, ‘short’ refers to circumstances where either you are not laying down the initial cash flow, and/or where you are betting against a quantity going up (both of these are the opposite of a typical, ‘long’ investment where you put money into an asset, and hope that it increases). If you ‘go long’ on a cash/risk-free investment, you are effectively investing money by lending/depositing it; to ‘go short’ is the opposite: you borrow money, which reverses all the cash flows of the investment. Instead of handing your money over at the start (when lending), you receive money upfront (when borrowing), and the later cash flows are similarly opposite. In principle, when you short a risky asset, you receive its current value now (the opposite cash flow to that of buying it), and you must pay the future value later (you therefore hope its value falls). In practice, it is not necessarily easy or even possible to achieve an investment position like this; we discuss this further in Chap. 7.
Suppose you invest your money in a risky asset. Suppose that your friend does exactly the same, with the same amount of money. In addition, let us suppose, your friend borrows some further money, which they also invest in the risky asset. They had the same initial capital as you, but they now face increased risk (because their initial capital and borrowed capital are invested in the risky asset); alternatively put, they face increased exposure to the risk source. They have more at stake, on both the upside and downside.
This is known as leverage: multiplying one’s exposure to some risk/investment relative to one’s initial investment amount, usually with the use of debt.1 In Fig. 4.4, the part of the improved efficient frontier that extends beyond the point of tangency involves leverage: a short, negative amount of the risk-free asset is held (in other words, money is borrowed), which allows more than 100% of your investment capital to be invested in the risky portfolio of assets.
Recall the central optimisation problem of MPT: one should always minimise risk at a desired expected return, or maximise expected return for given risk; in other words, one should always ensure efficiency. Note that shorting may be involved in the solution. In mathematical terms, the problem is to find weights {wi} that sum to one and give rise to a random variable Rp with mean μp and standard deviation σp that lie on the efficient frontier. Negative weights may be part of the solution. If one wants, short positions can be excluded (by including the constraint that wi ≥ 0 for every i); this may worsen the efficient frontier but would make sense if one was unsure of the ability to obtain short positions. With shorting allowed, a closed-form expression for all points on the efficient frontier can be derived with a Lagrange multiplier; without shorting, numerical optimisation is generally needed.
1 This term is based on an analogy to physical leverage, where a lever can multiply a force.
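The effect of leverage on the improved efficient frontier can be sketched numerically. The snippet below is only an illustration under assumed inputs: the risk-free return, and the mean and standard deviation of the tangent portfolio, are hypothetical numbers, and the function name `combine_with_risk_free` is made up for this example. It shows that a risky weight above 100% (funded by a negative risk-free weight) moves you further along the same straight line.

```python
def combine_with_risk_free(w, r, mu_t, sigma_t):
    """Return (mean, std) for weight w in a risky portfolio and 1 - w risk-free.

    w > 1 means the risk-free weight 1 - w is negative, i.e. money is borrowed
    (a short position in the risk-free investment) to lever the risky holding.
    """
    mean = (1 - w) * r + w * mu_t
    std = abs(w) * sigma_t  # the risk-free leg adds no variance
    return mean, std


# Hypothetical inputs, chosen only for illustration (not taken from the text):
# 5% risk-free return; tangent portfolio with 11% mean and 15% std.
r, mu_t, sigma_t = 0.05, 0.11, 0.15

for w in (0.0, 0.5, 1.0, 1.5, 2.0):
    mean, std = combine_with_risk_free(w, r, mu_t, sigma_t)
    print(f"risky weight {w:4.1f}, risk-free weight {1 - w:+5.1f}: "
          f"mean {mean:.3f}, std {std:.3f}")
```

Every point produced this way has the same ratio of excess mean return to standard deviation, which is why the levered positions lie on the extension of the line through the point of tangency.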
Literature Notes
The chapter is based on the seminal paper of Markowitz (1952). Much literature has gone on to describe and develop these ideas; a prime example is Elton et al. (2009).
References
Elton, E. J., Gruber, M. J., Brown, S. J., & Goetzmann, W. N. (2009). Modern portfolio theory and investment analysis. Wiley.
Markowitz, H. (1952). Portfolio selection. Journal of Finance, 7(1), 77–91.
5 Asset Pricing
Abstract
Models of the investment market are considered, given that individual investors are competitively seeking optimal risk–return profiles. The famous Capital Asset Pricing Model and Single-Index Model are introduced, as well as extensions.
Keywords
CAPM · Market portfolio · Systemic and specific risk · Regression
We have discussed the idea that investors tend to account for risk in the prices they are willing to pay for investments. Market prices are set by investors’ willingness to trade, and thus prices reflect risk (and because investors are competitive, there is a high degree of efficiency in how risk perceptions are reflected by prices). Investors are averse to risk, generally speaking, and so risk forces prices down, and subsequent returns up. This is very sound and fundamental economics applied to finance (i.e., fundamental financial economics): risk is an economic bad (the opposite of a good), and the market will settle on a price to avoid this bad (or, equivalently, a price/premium to bear it). So the investment market accounts for investment risk, but can we say anything more specific about this process? The way it works in practice is that many individual investors make up their minds about which investments, and which risks, they are prepared to accept and at what price, and an aggregate pricing of investment risks emerges. Can we abstract any principles governing this complicated process? We begin the chapter with the CAPM, the capital asset pricing model, which is the canonical financial model of how risk is priced into investments, of how the market sets prices and thus expected returns that are reflective of risk. It is called an asset pricing model not because it will determine a numerical price for a
specific asset, but because it solves the conceptual problem at the heart of risky asset pricing: how large an adjustment should one make to account for a certain risk? Or, put differently, how large a return can be expected (on average), to compensate an investor for bearing a certain degree of riskiness (that will spread outcomes around the average)? We then consider factor models—these involve modelling investment returns in a statistical rather than a fundamental or economic way and are useful alongside and in conjunction with the CAPM.
5.1 The CAPM
The CAPM starts with MPT, as described in Chap. 4, specifically with the inclusion of a risk-free asset. The CAPM then extends MPT by assuming it applies across the whole investment market. MPT models the investment options facing an individual investor. The CAPM models the investment market, supposing that all investors face these options; that is, all investors:
• Share a fixed time horizon over which their investments are considered and assessed
• Have the same investments available to them and all know the first- and second-order moments of the available returns
• Can form portfolios in a perfect-market environment
• Are only concerned with finding larger mean returns and smaller return variances
One way you hear this described is to say that, in the CAPM, the investment market consists of homogeneous MPT investors (they are all the same, all have the same information, and all seek to be mean–variance efficient).
Recall the notion of the efficient frontier within the MPT and that with a risk-free investment present, it is the straight line joining the risk-free investment and a special risky portfolio (represented by the point of tangency in Fig. 4.4). In the context of the CAPM, the efficient frontier is called the capital market line, and the special portfolio is called the market portfolio. These two names are very important and very related to how MPT is extended in the CAPM. Here is the logic: all investors in the CAPM face the same linear efficient frontier and will select some point along it, implying that all investors hold a combination of the risk-free asset and the special risky portfolio (any other strategy would be inefficient: it would mean that an increase in mean return and/or a decrease in return standard deviation would be possible). The only way that any risky investment is held, according to the CAPM, is through a holding of this special portfolio, which is called the market portfolio because it includes all risky investments in the market.
MPT formalises the benefits of diversification; the CAPM goes further by exploring the effect of this principle being applied across the investment market. In the CAPM, all investors diversify their exposures in a satisfyingly simple way: if any risk is to be taken on, it is done via holding, to whatever desired degree, a fully and optimally diversified portfolio. Any other risky investment would be less diverse
(you cannot get more diverse than the portfolio consisting of all risky investments in the market) and would be inefficient. An essentially equivalent term for the ‘investment market’ is the ‘capital market’ (recall that the bond and equity markets, the two ways that money can be invested in businesses, are together called the capital markets) or the ‘capital asset market’ (with bonds or equity being examples of capital assets, from the perspective of the holder), and you should now be able to see how the CAPM is a model of the whole market, while MPT, on its own, is just a tool to model the perspective of an individual investor.
Recall also that MPT does not stipulate where on the efficient frontier one should invest; MPT simply concludes that it would be irrational to invest off the frontier but leaves the final optimality decision open. One must choose a specific degree of risk (standard deviation) and can then find the optimal investment (alternatively, one could also choose a desired mean return, and let MPT find the lowest risk investment compatible with that desire). The various efficient portfolios are described by
$$\mu_p = r + \frac{\mu_m - r}{\sigma_m}\,\sigma_p,$$
where r is the risk-free return, and where μp and σp are defined in Chap. 4 as the mean and standard deviation of a portfolio’s return (μm and σm pertain to the market portfolio).1 Consider the following informal reading of the above equation: ‘expected return equals the price of time plus the market price of risk times the risk taken’. Expected return in the CAPM can indeed be broken into two components: a risk-free component that is shared by all investments regardless of their risk (the price of time, of losing the use of your money for some time, but without taking any risk) and a risk premium component, driven by the risks of the investment in question. If σp = 0, no risk is taken, and no premium above the risk-free return can be earned. As σp increases, expected return increases in a linear fashion (ensure that you can see from Fig. 4.4 that the linear relation is as above).
The above equation tells us about the efficient portfolios. What about a particular investment? Any individual investment is represented by a point, in σ-μ space, somewhere below the efficient frontier.2 What does the CAPM, and the resultant sharing by all investors of a market portfolio of diversified risky investments, imply about such points? To answer this, we must consider the following portfolio: suppose the market portfolio is held, as well as a holding in some generic investment, where this additional investment is funded by a short holding in the risk-free investment. Letting w denote the weight of the additional investment relative to the market portfolio holding, the portfolio can then be described in terms of its weights as follows: 100% in the market portfolio, w in the particular investment, and −w in the risk-free investment. When w = 0, this is simply the market portfolio with no additional holding, giving a mean return μm and return standard deviation σm. When w moves away from zero, mean and standard deviation of return become (make sure you agree!)
$$\mu_p = \mu_m + w(\mu_i - r) \quad\text{and}\quad \sigma_p = \sqrt{\sigma_m^2 + w^2\sigma_i^2 + 2w\sigma_{im}},$$
where μi and σi denote the mean and standard deviation of the investment return in question, and σim its covariance with the return of the market portfolio. As w changes, the above mean and standard deviation change, and the rate of change can be calculated by differentiating μp and σp. One can also calculate the change of expected return relative to standard deviation as w is moved:
$$\frac{\partial \mu_p}{\partial \sigma_p} = \frac{\partial \mu_p / \partial w}{\partial \sigma_p / \partial w} = \frac{\mu_i - r}{\tfrac{1}{2}\left(\sigma_m^2 + w^2\sigma_i^2 + 2w\sigma_{im}\right)^{-1/2}\left(2w\sigma_i^2 + 2\sigma_{im}\right)}.$$
1 Given the contents of Sect. 1.2, all of these returns should of course be given under a consistent compounding convention, to ensure a fair comparison.
2 It is possible that an individual investment lies on the frontier, but it is unlikely. Why? Unless there are highly atypical parameters, efficiency is achieved with diversification (i.e., the combination of multiple, imperfectly correlated investments), not the holding of individual assets.
As w changes, a curve in σ-μ space is traced. The above quantity gives the slope of this curve. When w = 0, the slope must equal the slope of the efficient frontier; that is, we must have
$$\frac{\mu_i - r}{\sigma_{im}/\sigma_m} = \frac{\mu_m - r}{\sigma_m}.$$
Why? If the slope was greater than that of the efficient frontier, it would imply that one could adjust the market portfolio (by adding a little of the individual investment) in a way that would make an improvement to the efficient frontier (as increasing w from zero would do better, in σ-μ space, than riding up the linear efficient frontier). This is a contradiction; by definition, the efficient frontier cannot be improved. A slope less than the efficient frontier’s is also a contradiction, as one could then short a little of the investment in question (i.e., decrease w from zero) and land up above the efficient frontier. In summary, the curve that is formed in σ-μ space by adjusting w must be tangent to the efficient frontier. The above equation rearranges to
$$\begin{aligned}\mu_i &= r + \frac{\sigma_{im}}{\sigma_m^2}(\mu_m - r)\\ &= r + \beta_i(\mu_m - r),\end{aligned}$$
where the second line defines $\beta_i = \sigma_{im}/\sigma_m^2$. This relationship, and its graphical representation, is known as the security market line; it describes the expected returns of all securities in the market (corresponding to various values of i; note that the above portfolio and tangency argument is applicable to all investments in the market) and is the central result of the CAPM. It specifies how an investment’s risk premium is determined, saying that it is a linear function of the investment’s covariance with the market portfolio. Expected returns are driven by what can be called systemic risk, risk based on the whole market/system, measured by covariance with the market portfolio. The converse is equally important: any variance that does not correlate with the market does not result in an increased mean return. An investment return can have a large standard deviation, but if it is not correlated with the market return, this standard deviation does not factor into the security market line. That is, investments can and do carry risk that is unrewarded; this non-market risk can be called specific or idiosyncratic risk. But it is not rewarded with a risk premium; according to the CAPM, only systemic risk is rewarded.
Let us reflect on this result, which is an answer to our original fundamental question of how markets account for investment risk. An individual investment is represented by some point in σ-μ space. Under the CAPM assumptions, all investors will hold some combination of the risk-free investment and the market portfolio, which, being the totality of all risky investments in the market, includes all of the individual investments. It turns out that investors’ desire to be efficient (investors’ desire for large mean returns and small return variances) results in this fully diversified strategy.
It also turns out that the resultant pricing of investments leads to expected returns that lie on the security market line: they attract a risk premium only insofar as they covary with the market portfolio, or, more succinctly, as they exhibit market risk. Why is specific risk not rewarded? Because it can be diversified away. With investors’ desire to be efficient, all opportunities to gain good returns are snapped up; the only reliable way to make excess returns is to take on risks the market is truly averse to (so that the price is low and the return relatively high). The market is averse to systemic risk (the size of the market price of risk, (μm − r)/σm, quantifies how averse) but is not especially averse to specific, diversifiable risks. Investors, in the CAPM, are willing to take on specific risks, and they just ensure that these are fully diversified in the complete portfolio of available risky investments.
Here is an important check on whether you understand the derivation of the security market line: suppose a new investment is added to the market, and it sits above the line (i.e., has too large a mean return, or too low a price); then investors should add some of it to their risky portfolios, and they will experience an improvement (the tangency argument shows that no improvement can be possible, once the market has found the true efficient frontier, once the model is ‘in equilibrium’). If it sits below the line, investors should short some of it and will improve their position in σ-μ space. This buying (or shorting) will raise (lower) the price of the investment, causing the benefit offered by it to taper off, until some new equilibrium (new market portfolio, efficient frontier, etc.) is reached.
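If you would like to convince yourself of the tangency argument numerically, the following sketch may help. It is an illustration only: all parameter values are hypothetical, and the helper names (`portfolio_point`, `slope_at_zero`, `sml_mean`) are invented for this example. It builds the portfolio of 100% market, w in investment i and −w risk-free, estimates the slope of its curve in σ-μ space at w = 0 by a finite difference, and compares it with the slope of the capital market line.

```python
import math


def portfolio_point(w, r, mu_m, sigma_m, mu_i, sigma_i, sigma_im):
    """(std, mean) of: 100% market portfolio, w in investment i, -w risk-free."""
    mean = mu_m + w * (mu_i - r)
    variance = sigma_m**2 + w**2 * sigma_i**2 + 2 * w * sigma_im
    return math.sqrt(variance), mean


def slope_at_zero(r, mu_m, sigma_m, mu_i, sigma_i, sigma_im, h=1e-6):
    """Central finite-difference estimate of d(mean)/d(std) at w = 0."""
    s_up, m_up = portfolio_point(h, r, mu_m, sigma_m, mu_i, sigma_i, sigma_im)
    s_dn, m_dn = portfolio_point(-h, r, mu_m, sigma_m, mu_i, sigma_i, sigma_im)
    return (m_up - m_dn) / (s_up - s_dn)


def sml_mean(r, mu_m, sigma_m, sigma_im):
    """Expected return on the security market line: r + beta * (mu_m - r)."""
    beta = sigma_im / sigma_m**2
    return r + beta * (mu_m - r)


# Hypothetical market and investment parameters (illustration only).
r, mu_m, sigma_m = 0.05, 0.11, 0.15
sigma_i, rho_im = 0.30, 0.60
sigma_im = rho_im * sigma_i * sigma_m

cml_slope = (mu_m - r) / sigma_m
fair_mu_i = sml_mean(r, mu_m, sigma_m, sigma_im)

print("CML slope:", cml_slope)
print("slope at w = 0, investment on the SML:",
      slope_at_zero(r, mu_m, sigma_m, fair_mu_i, sigma_i, sigma_im))
print("slope at w = 0, investment above the SML:",
      slope_at_zero(r, mu_m, sigma_m, fair_mu_i + 0.02, sigma_i, sigma_im))
```

When the investment's mean return sits on the security market line, the two slopes agree; when it sits above the line, the curve is steeper than the capital market line at w = 0, which is exactly the situation in which adding a little of the investment would improve on the supposed frontier.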
So the key conceptual insights of the CAPM are that, under the simplified but plausible assumptions of the model, investors diversify their risky investments fully, and the associated market results in only systemic (market, non-diversifiable) risk being rewarded with a risk premium, in a linear fashion. In addition to providing a theory for how risk is accounted for in the investment markets, the CAPM can be used to calculate required returns on prospective investments. That is, the security market line implies an expected return for a given degree of market risk, which can be used as a benchmark for investments that one is assessing. As a simple example, suppose that you can invest in a new mining venture that is expected to produce R1 million in one year. This is only the expectation, however; the amount is risky and will vary with market conditions. The security market line will suggest an expected return for such an investment, depending on its degree of market risk. The R1 million can be discounted at this return, and the resultant price is one that accounts for the riskiness of the position.3 It is crucial you understand the intuition of this answer to the question posed at the beginning of the chapter. If you discount the R1 million at the risk-free rate r, you will likely be overpricing the investment. The CAPM suggests a higher rate (assuming there is positive covariance with the market) and therefore a lower price. It answers the key question of how to calculate fair discounts for bearing investment risk. In terms of practical advice, given the riskiness of investing in the mining venture, you should not pay more than the amount discounted at the required return (the return required to bear the riskiness). Recall also, though, that specific risk is not met with a risk premium (if the mining venture were risky but independent of the market, CAPM suggests the risk-free return as the appropriate mean return). 
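The mining-venture calculation can be sketched in a few lines. The numbers below (risk-free return, expected market return and the candidate values of β) are hypothetical and are not given in the text; the sketch also assumes a simple one-year, annually compounded return, which is only one of the conventions discussed in Sect. 1.2. The function names are invented for this illustration.

```python
def capm_required_return(r, mu_m, beta):
    """Required (expected) return from the security market line."""
    return r + beta * (mu_m - r)


def present_value(expected_payoff, annual_return, years=1.0):
    """Discount an expected payoff at an annually compounded return."""
    return expected_payoff / (1 + annual_return) ** years


# Hypothetical inputs (illustration only): 5% risk-free return, 11% expected
# market return, and an expected payoff of R1 million in one year.
r, mu_m = 0.05, 0.11
expected_payoff = 1_000_000

for beta in (0.0, 0.8, 1.5):
    required = capm_required_return(r, mu_m, beta)
    price = present_value(expected_payoff, required)
    print(f"beta {beta:.1f}: required return {required:.3f}, "
          f"maximum price R{price:,.0f}")
```

With β = 0 the venture is discounted at the risk-free return, which gives the highest price; as β rises, the required return rises and the price you should be willing to pay falls.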
In practice, you must decide if this is an appropriate basis for your required return. You might wish to be compensated for non-market risk. The CAPM says you should diversify this risk thoroughly with other investments, so that its idiosyncratic risk becomes extremely minor relative to the average of all investments, i.e., the market-related part of the portfolio. But you are an investor in the real world, not in the perfect CAPM world, and you are considering a real investment that might pose significant idiosyncratic risks. A shallow understanding of the CAPM says that you should only require the risk-free rate for an investment uncorrelated with the market. A deeper understanding of the CAPM is well aware of why investors only require the risk-free rate in the model, which informs but does not need to dictate how things go in the real world. In this example, the following caveat is often issued with the CAPM, or more specifically with the security market line suggested return: it is applicable to investments being added to an already well-diversified portfolio. It is possible, however, that you wish to require a risk premium for all types of risks
3 There is a minor but common approximation here. It is slightly (but only slightly) different to say: (i) that the mean return is μ versus (ii) that the return from the mean future value is μ. In continuously compounded terms, (i) corresponds to μ = E[log(X/X0)/T], and (ii) to μ = log(E[X]/X0)/T, where X is the future, time-T investment value, and X0 is the current price.
Fig. 5.1 The security market line, i.e., expected returns as a function of β, according to the CAPM (horizontal axis: β; vertical axis: μ; marked levels: μ = r and μ = μm, the latter at β = 1)
(perhaps you do not have a well-diversified portfolio, with which you can diversify the prospective investment); you are free to do this. A consequence of this choice is that you will be prevented from entering some investments (some investments will be acceptable at a lower required return, but not at your additional premium for all risks). In summary, you can use the CAPM as a guide or as a benchmark, rather than a strict criterion.
The definition of β above, which standardises the covariance by the market portfolio variance, is useful. The market portfolio becomes a convenient reference point, corresponding to β = 1. The risk-free investment, or any investment uncorrelated with the market, has a β of zero. The security market line is plotted in terms of β; see Fig. 5.1 for a stylised example. Ensure you are comfortable with the following representation of β:
$$\beta_i = \frac{\sigma_{im}}{\sigma_m^2} = \frac{\rho_{im}\,\sigma_i\,\sigma_m}{\sigma_m^2} = \frac{\rho_{im}\,\sigma_i}{\sigma_m}.$$
This allows us to see that there are two determinants of an investment’s β (and therefore its expected return and risk premium, according to the CAPM): correlation with the market, ρim, and standard deviation/variance, σi. These are two separate features of an investment return that determine its covariance with the market, its beta, and, with the security market line, its mean return under the CAPM.
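The two equivalent expressions for β can be checked on a small sample. The returns below are made-up numbers used purely for illustration, and the function names are invented for this sketch; sample (n − 1) estimators stand in for the true moments.

```python
import math


def sample_cov(xs, ys):
    """Sample covariance with the usual n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def beta_from_covariance(r_i, r_m):
    """beta = covariance with the market / variance of the market."""
    return sample_cov(r_i, r_m) / sample_cov(r_m, r_m)


def beta_from_correlation(r_i, r_m):
    """beta = correlation with the market * sigma_i / sigma_m."""
    sigma_i = math.sqrt(sample_cov(r_i, r_i))
    sigma_m = math.sqrt(sample_cov(r_m, r_m))
    rho_im = sample_cov(r_i, r_m) / (sigma_i * sigma_m)
    return rho_im * sigma_i / sigma_m


# Made-up return observations (illustration only, not market data).
r_m = [0.04, -0.02, 0.03, 0.01, -0.05, 0.06]
r_i = [0.07, -0.03, 0.02, 0.03, -0.08, 0.09]

print("beta via covariance: ", beta_from_covariance(r_i, r_m))
print("beta via correlation:", beta_from_correlation(r_i, r_m))
print("beta of the market with itself:", beta_from_covariance(r_m, r_m))
```

The first two numbers agree up to floating-point rounding, and the market's β with respect to itself is one, matching the reference point described above.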
5.2 Factor Models
In this section, we cover a very different approach to understanding how the investment market allocates risk premiums. Instead of trying to model the investment market, one can model investment returns directly. Returns vary in an unpredictable way, but perhaps some of that variation can be attributed to variation in some
other quantity/quantities. In other words, relationships can be proposed, and their details estimated. In more empirical terms, one can statistically correlate returns (i.e., realised risk premiums) to any factors that may be driving them. If the variation in a large number of returns can be attributed to a small number of factors, much about return behaviour can be learnt. Note that the correlation between two variables is a summary statistic from the linear regression of one onto the other. If linear regression in one variable can explain a large proportion of the variation of the other, the correlation is high (close to one or negative one). So factor models are essentially linear regression models of returns. Return is the variable of interest—called the dependent variable or response variable in regression terminology—which is to be modelled, linearly, in terms of one or many explanatory or independent variables. The constants governing the modelled relationship are then estimated, based on a sample where the dependent and independent variables are co-observed. In fact, excess returns are used rather than returns. An excess return is the return above the risk-free return; the risk-free return can be earned without taking any risk, so it is the excess over this that we wish to understand. Our first example of such a model is the single-index model (SIM), a famous finance model that regresses excess returns on contemporaneous excess returns of the market portfolio. The regression equation is linear, as follows: Ri − r = αi + βi (Rm − r) + ei , where Ri is the return of the investment in question, r is the risk-free return, αi and βi are the regression (intercept and slope) coefficients, Rm is the return of the market portfolio, and ei is an error or residual term that captures the component of the excess return on the left-hand side that cannot be expressed in terms of the explanatory variables. 
Values for the coefficients αi and βi are chosen based on a sample of the dependent and independent variables, i.e., of the investment’s historical returns and corresponding market portfolio returns. If the returns are well correlated with the market, βi will be high, reflecting the commonality between Ri − r and Rm − r. For returns uncorrelated with the market, βi will be close to zero, reflecting a weak or non-existent relationship. Whatever relationship emerges, it is known as the security characteristic line, for the security/investment in question. It is the line of best fit through co-observed excess returns of the investment and the market. Figure 5.2 shows two examples, both with an α of zero. The slope of the line of best fit, β, is given by the ratio of the covariance and the variance of the market: $\sigma_{im}/\sigma_m^2$, the same expression as in the CAPM.4 This is no coincidence; in both cases, β is constructed to be the proportion of return that can be linked to market return on average. Indeed,
the regression-based labelling of α and β is why the CAPM β is denoted as it is. There is a strong affinity between CAPM and factor models, and they are often used in conjunction. In fact, the SIM can be seen as a natural extension of the CAPM relationship: the CAPM identifies covariance with the market as key, and the SIM examines this relationship statistically. The SIM is born out of the CAPM but can also be seen as a separate simple regression model in its own right.
Fig. 5.2 Security characteristic lines for two investments, as well as the underlying excess return observations (two panels, with Ri − r and Rj − r on the vertical axes and Rm − r on the horizontal axes). The plot suggests that $\beta_i > \beta_j$ and that $\sigma_{e_i}^2 < \sigma_{e_j}^2$
4 For the SIM, you will often see hats, e.g., $\hat{\sigma}_{im}$, indicating that quantities are estimated from a sample and are therefore approximations. The resultant estimate of β is denoted $\hat{\beta}$. Much of the theory of estimating regressions and parameters in general deals with the difference between a true parameter, σim say, and an imperfect estimate of it, $\hat{\sigma}_{im}$. This is important when studying regression or statistical inference in detail but can be glossed over for our purposes.
Something striking about the SIM is that it gets at many of the CAPM concepts, in a very direct way. By using the market portfolio as an explanatory factor, the SIM equation suggests a decomposition of returns into market/systemic and specific/idiosyncratic components. The systemic part is controlled by β, a measure of covariance with the market. The systemic part is uncorrelated with the residual, specific part (if there were any correlation, it would imply that β is incorrectly estimated and should be adjusted to absorb the correlated part of the residuals). Taking the variance of either side of the SIM equation gives
$$\begin{aligned}\mathrm{V}[R_i] &= \mathrm{V}[R_i - r] = \mathrm{V}[\alpha_i + \beta_i(R_m - r) + e_i] = \mathrm{V}[\beta_i(R_m - r) + e_i]\\ &= \beta_i^2\,\mathrm{V}[R_m - r] + \mathrm{V}[e_i] = \beta_i^2\,\mathrm{V}[R_m] + \mathrm{V}[e_i] = \beta_i^2\sigma_m^2 + \sigma_{e_i}^2.\end{aligned}$$
As usual, make sure you agree. In Fig. 5.2, one investment has a relatively high systemic risk (a high $\beta_i$) but a low specific risk (a low $\sigma_{e_i}^2$); the other has low systemic risk (a low $\beta_j$) but higher specific risk (a high $\sigma_{e_j}^2$). Instead of
taking variances of the SIM equation, consider taking expectations: one finds a generalised form of the security market line, μi = r + αi + βi(μm − r). The CAPM suggests that one will find αi = 0; that is, the CAPM suggests that only exposure to market risk, only covariance with the market portfolio, is rewarded with a risk premium (this rewarding is consistent across models, with β being equal). If αi ≠ 0, the risk–return relationship implied by the CAPM (and visualised by the security market line) is violated.
Estimating the SIM based on a sample of historical returns gives one an estimate of the covariance of each investment with the market portfolio (note that some real-world proxy for the market portfolio must be chosen; more on this in a moment), and a decomposition of the historical variation of returns into systemic and specific components (which you can react to, knowing that markets tend to reward systemic risk, but not specific risk, at some prevailing market price of risk that is also estimated with the model). One also obtains estimates of each αi, indicating whether any additional average return was earned during the sample, over and above the appropriate market risk premium. Whether this apparent over- or under-performance will continue is not clear. Note also that, for certain statistical reasons slightly beyond our scope here, the estimates of αi are very poor, in that they suffer greatly from sampling bias (with long samples being necessary to counter this).
A practical issue when implementing the SIM is to decide on a proxy for the market portfolio (which, being the totality of all risky investments, cannot be observed directly). One can construct a combination of risky investments, but usually this is best done by using a broad index. For example, the JSE ALSI (the ‘all-share index’ on the Johannesburg Stock Exchange) is usually used in South African applications of the SIM.
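As a rough sketch of how the SIM might be estimated, the snippet below fits the regression by ordinary least squares and splits the sample variance into systemic and specific parts. Everything here is an assumption made for illustration: the excess returns are simulated (with a true β of 1.2 and a true α of zero), not taken from any index, and the function names are invented. The residual variance uses the n − 1 divisor (rather than the n − 2 often used in regression output) so that the decomposition holds exactly in the sample.

```python
import random


def sample_var(xs):
    """Sample variance with an n - 1 divisor."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def fit_sim(excess_i, excess_m):
    """Least-squares fit of excess_i = alpha + beta * excess_m + e.

    Returns (alpha, beta, residual_variance).
    """
    n = len(excess_m)
    mean_i = sum(excess_i) / n
    mean_m = sum(excess_m) / n
    cov_im = sum((y - mean_i) * (x - mean_m)
                 for x, y in zip(excess_m, excess_i)) / (n - 1)
    beta = cov_im / sample_var(excess_m)
    alpha = mean_i - beta * mean_m
    residuals = [y - alpha - beta * x for x, y in zip(excess_m, excess_i)]
    return alpha, beta, sample_var(residuals)


# Simulated monthly excess returns (synthetic, for illustration only).
random.seed(0)
excess_m = [random.gauss(0.005, 0.04) for _ in range(240)]
excess_i = [1.2 * x + random.gauss(0.0, 0.03) for x in excess_m]

alpha, beta, resid_var = fit_sim(excess_i, excess_m)
systemic_var = beta**2 * sample_var(excess_m)

print("alpha estimate:", alpha)
print("beta estimate: ", beta)
print("total variance:           ", sample_var(excess_i))
print("systemic + specific parts:", systemic_var + resid_var)
```

The last two printed values coincide (up to rounding), which is the sample counterpart of the variance decomposition of the SIM equation. The α estimate will generally not be exactly zero even though the simulated α is, a small reminder of how noisy such estimates are.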
An extremely appealing feature of factor models is that they are flexible and easy to adjust and extend. If the CAPM were completely realistic, though, there would be no need to look beyond the SIM—with αi = 0 indicating market equilibrium, the SIM regression equation captures the CAPM pricing results (i.e., market risk is rewarded in a linear way). So, to what extent should we believe in the CAPM as realistic? Even though the diversification effects implied by the abstract assumptions of the model will not be fully reflected in practice, it is very probably true that easily diversifiable risks are not met with significant risk premiums—investors are happy to accept such risks, paying slightly higher prices and accepting slightly lower mean returns, with the knowledge that they can diversify them away.5 But can every risk besides covariance with the market be diversified away? An interesting way in which SIM results can be analysed is to examine the α estimates; even though these
5 There are many other critiques that can be made, such as the fact that investors are heterogeneous in practice (not homogeneous, as assumed), or that risk is not in fact as simple as return standard deviation.
are imprecise, consistently positive values do suggest that a risk premium is being earned beyond the one earned through bearing market risk.
Eugene Fama and Kenneth French built a model based on empirical findings throughout the 1980s that: (i) ‘small-cap stocks’ (stocks with a relatively small market capitalisation) and (ii) ‘value stocks’ (stocks with a relatively high book-to-price ratio, as opposed to ‘growth stocks’, with a low book-to-price ratio) outperformed the market average, after accounting for market risk. They therefore included the additional return of small-cap stocks over large-cap ones (denoted SMB, for small minus big) and the additional return of value stocks over growth stocks (denoted HML, for high minus low) in the SIM regression:
$$R_i - r = \alpha_i + \beta_i(R_m - r) + \gamma_i\,\mathrm{SMB} + \delta_i\,\mathrm{HML} + e_i.$$
Although the above equation is not derived from a set of economic assumptions, it does fare better empirically than the basic SIM one.6 For whatever reason, the so-called size effect and value effect are real and can explain a significant portion of historical returns (with estimates of γi and δi being positive). In addition to being averse to market risk, as the CAPM suggests, presumably investors are also averse to the risk posed by small-cap stocks, all else equal. Even with standard deviation and market covariance constant, investors in practice are not indifferent between investments with differing sizes of market capitalisation (while CAPM investors are indifferent). Market pricing, and subsequent returns, reflects this. Similarly, the market appears to price a value factor into investments.
This is a nice point to conclude the pure-finance chapters. To understand investing in the financial markets from a theoretical perspective, one should start with the fundamental notions of risk and return (Chaps. 1 and 2). Then market efficiency (Chap. 3) is an essential idea.
Easy, high-return investments with low risk are not easy to come by; investors compete to enter high-return, low-risk investments, and the prices and returns that result account for the risks perceived by the market (in theory at least, and the understanding of the theory’s mechanisms that we covered goes a long way in understanding when the theory can break down). So how should one invest? Modern portfolio theory (Chap. 4) gives the fundamental ideas: the risk–return tradeoff in one’s investment options should be ascertained, as best as possible, and one should make an efficient decision, i.e., one should not accept an unnecessarily small (large) mean return (return standard deviation), given all of the opportunities to diversify. One might also be curious about what the investment options are likely to look like; in other words, about what prices and investment opportunities result from typical, efficient financial markets (Chap. 5). Because all investors see and take the benefits of diversification (by investing some portion of their wealth in the market portfolio), the market as a whole is
⁶ We will not cover the famous arbitrage-pricing theory, which is an extension of the CAPM and proposes a mechanism by which risk premia for general factors, like the examples above, can be priced into investments.
not averse to diversifiable (specific) risk but can be, and almost always is, averse to non-diversifiable (systemic) risk. This is the basic theory, but the Fama–French three-factor model shows that, while market covariance is a major factor driving risk premia, other effects can prevail. Because the markets are efficient, finding mispriced securities is not easy. Unless there are specific reasons you have privileged access to investment opportunities, it is unlikely that you will consistently be able to find counterparties willing to trade at ill-informed prices. If this sounds too strong, note that, in practice, things like having a good network, or occupying a central and visible place in the sector, can put you in a privileged position, to an extent. If modern investment management is not only about hunting for market inefficiencies, what is it about? It is about understanding, and estimating, what risks are being rewarded by the market, and how much; it is about diversifying away risks that can be diversified, as these are presumably not going to earn a risk premium, and doing this as effectively and cheaply as possible, and about taking on an optimal amount of the various risks, depending on investor preferences, market conditions, etc. It is also about a few things just, but not far, beyond the scope of this text. The standard MPT, CAPM, and SIM models can be extended in many ways. Different factors can be proposed, more complicated regressions can be implemented, different risk measures can be incorporated into the model, etc. These can give rise to smart betas, a generic term for betas that are adjusted in some way, and can inform adjusted and hopefully improved investment strategies. The term alpha is used in finance not to refer specifically to the SIM regression constant, but to refer generally to extraordinary return, expected return not associated with risk. 
Investment managers police the capital markets for any suspect pricing of risk; their desire for alpha shifts capital around the market in a way that dynamically accounts for the perception of risk.
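To make the three-factor regression of Sect. 5.2 concrete, the following is a minimal sketch of estimating $\alpha_i$, $\beta_i$, $\gamma_i$ and $\delta_i$ by ordinary least squares. Everything in it is an assumption made for illustration: the return series are simulated rather than real market data, the "true" coefficients are arbitrary, and the helper names (`solve_linear_system`, `fit_three_factor`) are hypothetical. It is not the procedure used by Fama and French, only an instance of the regression equation above.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_three_factor(excess_asset, excess_market, smb, hml):
    """Least-squares estimates (alpha, beta, gamma, delta) via the normal equations."""
    rows = [[1.0, m, s, h] for m, s, h in zip(excess_market, smb, hml)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, excess_asset)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Simulated (not real) data: all parameter values below are arbitrary assumptions.
random.seed(0)
n_obs = 5000
excess_market = [random.gauss(0.005, 0.04) for _ in range(n_obs)]  # R_m - r
smb = [random.gauss(0.002, 0.03) for _ in range(n_obs)]            # small minus big
hml = [random.gauss(0.003, 0.03) for _ in range(n_obs)]            # high minus low
TRUE_ALPHA, TRUE_BETA, TRUE_GAMMA, TRUE_DELTA = 0.0, 1.1, 0.4, 0.3
excess_asset = [
    TRUE_ALPHA + TRUE_BETA * m + TRUE_GAMMA * s + TRUE_DELTA * h + random.gauss(0.0, 0.02)
    for m, s, h in zip(excess_market, smb, hml)
]

alpha, beta, gamma, delta = fit_three_factor(excess_asset, excess_market, smb, hml)
print(f"alpha={alpha:.4f} beta={beta:.3f} gamma={gamma:.3f} delta={delta:.3f}")
```

With simulated data the estimates should land close to the coefficients used to generate it. With real returns, the interesting questions are whether the estimated $\gamma_i$ and $\delta_i$ are reliably positive and whether the estimated $\alpha_i$ is distinguishable from zero.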
Literature Notes
The CAPM originated with Sharpe (1964), Lintner (1965), Mossin (1966), and unpublished work of Jack Treynor. The single-index model was developed in Sharpe (1963), and the three-factor extension in Fama and French (1992). Elton et al. (2009), for example, treats all of this material in a unified setting. Note that the chapter has glossed over the details of how an equilibrium market portfolio is formed; see, e.g., Cvitanic and Zapatero (2004, Ch.12) for further information.
References
Cvitanic, J., & Zapatero, F. (2004). Introduction to the economics and mathematics of financial markets. MIT Press.
Elton, E. J., Gruber, M. J., Brown, S. J., & Goetzmann, W. N. (2009). Modern portfolio theory and investment analysis. Wiley.
Fama, E. F., & French, K. R. (1992). The cross-section of expected stock returns. Journal of Finance, 47(2), 427–465.
Lintner, J. (1965). The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. Review of Economics and Statistics, 47(1), 13–37.
Mossin, J. (1966). Equilibrium in a capital asset market. Econometrica, 34(4), 768–783.
Sharpe, W. F. (1963). A simplified model for portfolio analysis. Management Science, 9(2), 277–293.
Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance, 19(3), 425–442.
6 Introduction to Derivatives

Abstract
Financial derivatives are introduced, in terms of both broad concepts and specific examples. Option payoff functions are constructed, interpreted, and combined into more complicated strategies.

Keywords
Financial derivatives · Forward contracts · Option contracts · Payoff functions
The purpose of this chapter is to introduce financial derivatives, including their practical applications and theoretical interest. A derivative, in the context of finance, is a contractual agreement that depends on another asset or financial variable. The value of a derivative (the value, positive or negative, of being committed to the specified terms of the agreement) is thus derived from the value of the asset on which the derivative depends. The asset is called the underlying. Some examples of underlying assets are a particular share, share index, or a commodity; a non-asset variable such as an interest rate can also serve as an underlying. The entities that enter a derivative (whether individuals or organisations) are known as parties to the agreement in question. To enter a derivative, you need a counterparty to sign the agreement. Derivatives are sometimes referred to as claims, with one party being legally entitled to claim, from their counterparty, whatever the contract specifies.¹
¹ A derivative is only one type of claim, though. A share is also a claim, a legal entitlement to a stake in a company. Another term you may hear is contingent claim, emphasising that derivatives are contingent on (dependent on) their underlying asset.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. Backwell, An Intuitive Introduction to Finance and Derivatives, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-23453-8_6
Just as individuals and entities create, negotiate, and enter financial agreements (such as loans or equity investments) and thus invest and raise money, they can create, negotiate, and enter derivatives. In other words, a new type of financial agreement, beyond those discussed in the previous chapters, becomes available as soon as one realises that agreements can be made that refer to (i.e., depend on, are derivative of) other financial agreements and investments. The financial markets realised this a long time ago. Why? What is the motivation for such agreements? At the general level, they offer different ways to invest, with different properties, some of which can be preferable to their standard alternatives. We will explore this in detail and see many examples. If an asset is risky, a derivative depending on it is risky as well, and so derivatives provide exposure to the risk associated with their underlyings. We have seen that risk is a central concept in investing, so it should not be surprising that investors are interested in the various ways of acquiring and manipulating exposure to risk.

My clients loved risk. Risk, I had learned, was a commodity in itself. Risk could be canned and sold like tomatoes. Different investors place different prices on risk. If you are able, as it were, to buy risk from one investor cheaply, and sell it to another investor dearly, you can make money without taking any risk yourself. And this is what we did.
Michael Lewis, Liar’s Poker
6.1 Forward Contracts
An excellent first example of a derivative is a forward contract, or a forward for short. A forward is an agreement between two parties to exchange an asset at a specified future time, at a specified price. This is pretty much the simplest type of derivative out there (ensure you are happy that it conforms to our definition for derivatives) but will lead, we will see, to a rich set of implications.

Some terminology is essential for a theoretical discussion. The party agreeing to buy the asset is called the long party; the party agreeing to sell is called the short party. The pre-agreed time of exchange is called the maturity, and the pre-agreed price is called the delivery price. Putting this together, a forward contract requires the short party to deliver the underlying to the long party at maturity. In practice, however, forwards (and derivatives in general) are not necessarily physically settled; they are often cash settled, meaning that the long party receives the value of the underlying in cash, rather than the underlying itself. This can be important, logistically and legally, but not theoretically or conceptually. To abstract from the settlement details, we consider the forward’s payoff: the value of the position specified by the contract, from the perspective of one of the parties, at maturity. Letting $T$ denote the maturity time, $S_T$ the time-$T$ underlying asset price, and $K$ the delivery price, the payoff to the long party is $S_T - K$.
Fig. 6.1 The payoff functions of the two sides of a forward contract (panels: Long Payoff and Short Payoff, each plotted against $S_T$, with $K$ and $-K$ marked on the axes)
This is because the long party receives the asset, or the value of the asset in cash, but must pay the delivery price. The payoff for the short party is the exact opposite, $K - S_T$, given the short party’s position in the contract. The two payoffs sum to zero, reflecting that the contract and resultant cash flows are self-contained, not involving anyone beyond the two parties. Qualitatively, derivatives derive their value from their underlying. Quantitatively, derivative payoffs are functions of the underlying asset value. Payoff functions for a generic forward are plotted in Fig. 6.1. These are a common and useful way to visualise a derivative’s dependence on its underlying; make sure you can connect the details of the plots to the original definition.

Let us also ensure we make the basic economic interpretations of a forward-contract situation. The long party, having agreed to buy something at a fixed price, would prefer that something to be more rather than less valuable. This would give them more value, at no extra cost. This is reflected by their upward-sloping payoff function and is the reasoning behind the long–short terminology (recall the discussion of short positions in Chap. 4). The long party has a positive exposure to the underlying: they want it to go up in value, just as if they were invested in it directly. The short party wants the opposite of a standard investor; they have a negative exposure and would prefer that the underlying falls in value. This would allow them to sell something cheap at a relatively high price (which was fixed in the contract, before the underlying fell in value); this is a good trade, as the underlying can be purchased from the open market at maturity, cheaply, and delivered to the long party in exchange for a net profit.
This story would need to change if the short party holds the underlying alongside the forward (and therefore does not turn to the market to deliver it)—more on this below, but for now we are viewing the forward contract in isolation. So, the underlying’s value is important. So is the delivery price K. Looking at the payoff functions, we can see that: (i) the long party wants the underlying to
rise in value, and the short party the opposite and (ii) the long party would want a low delivery price, and the short party the opposite. The underlying value, which will determine (i), will be revealed at maturity. The delivery price, driving (ii), is negotiated and fixed before maturity, by the two parties themselves. This has a crucial implication: the delivery price is like a market price in that it fluctuates over time in a way that balances the sellers (short parties) and buyers (long parties) in the market. The market-prevailing delivery price is not exactly a price, though. It is free to enter a forward (it is simply an agreement to trade in the future), so the price is zero.² But it is like a price in that it is the key variable (that is under control at that time) driving value to either party, in a zero-sum fashion, and therefore having to settle at a market equilibrium. By the end of Sect. 7.2, you will have a full understanding of how this works, in an idealised setting at least.

In the real markets, large volumes of forwards are traded every business day. In fact, the various types of forwards (including interest-rate forwards, known as FRAs) and their extensions (e.g., swaps, which are essentially several forwards grouped in a single contract, or futures, an interesting exchange-traded, marked-to-market variant of forwards) are the most popular derivatives in the world. Why? Why do market participants enter forward contracts? The most general answer is that forwards (and derivatives in general) offer a way to obtain exposure to an asset or investment; more fully, derivative returns are driven by underlying returns, so investing in derivatives gives a way, different to the standard direct method of investing, to get access to the return (and associated risk) of the underlying. Moreover, the exposure offered by derivatives can offer greater customisation, added convenience, efficiency, and/or flexibility in achieving a desired end. Let us unpack this.
Entering a long forward is similar to buying the underlying asset, in that a positive exposure to the value/returns of the underlying is obtained in both cases. If you think the price of an asset is going to increase, if you think above-average returns are going to be exhibited, you can do either one and will earn a profit if your prediction is correct (with the forward, you have to assume that the price increase was not anticipated in the delivery price; more on this in Chap. 7). This is sometimes called speculation: attempting to profit off views about future, usually short-term, performance of investments. Speculation can also be negative, an expectation of a price drop, in which case a short forward would be a suitable position. Speculative investing may be easier to implement in the derivative markets than in the underlying market. It may be easier to enter and obtain an exposure to an asset by entering a forward (or futures) than by buying into it directly. Especially for investing based on short-term preferences, forwards and other derivatives can be very convenient. Markets can be very liquid, and forward and derivative positions are nicely customisable (although there is a tradeoff between these two things!).
² In practice, it may not be perfectly free. Fees would often be incurred, credit risk measures may require an initial deposit to secure the transaction, etc. Although these effects are not covered in our idealised theory, many have been studied in considerable detail.
Costs associated with entering the underlying can be avoided, and the desired type of exposure obtained in the derivative market. Long-term investors tend to prefer holding the underlying directly, generally speaking, but they often add derivatives to adjust their exposure over time. Another aspect of the convenience of derivatives in general and forwards in particular: the ability to obtain a short exposure with relative ease. In fact, it is hard to explain a simpler way of attaining a short exposure than entering a short forward. Another feature of forwards, noteworthy for their speculative application, is that they offer leverage: they provide exposure to the underlying without the investor/speculator providing capital. The standard way of obtaining exposure is to enter the underlying directly, which requires money upfront. Forwards are free to enter, in theory at least; this might not hold exactly in practice, but much less capital would be required than purchasing the underlying.

This takes us to another great adage in the derivative literature: in addition to speculation, derivatives are useful for hedging. Hedging implies that there is an existing exposure that you are looking to offset with another exposure. That is, you add a negatively correlated investment, known as a hedge, to your existing investments, offsetting their variation to some extent and lowering the overall risk. The extent can vary; hedging can be approximate (e.g., ‘let’s add this stock to our portfolio, it tends to perform well when the rest of the portfolio is flat’) or very precise (e.g., a specific derivative is structured and entered to manipulate the risk characteristics of an underlying asset). Perfect hedging (when investment positions offset each other’s variation exactly, removing risk completely) will prove to be an important theoretical tool in Chaps. 7 and 8. So, be aware that the specific meaning of the word hedging can vary with context. Back to forwards specifically.
At the abstract level, any investor can sign up to receive either the long or short payoff above, assisting investors with their speculative and hedging needs. Let us consider a concrete example. A farmer will produce 100 tons of maize at the end of the season. The maize price could drop, so they enter a short forward contract to fix the price of the 100-ton crop they will sell.³ To see this reflected in the quantitative terms we are developing, recall that the short forward gives a negative exposure to the maize price, with maturity value $K - S_T$. The farmer has a positive exposure, by virtue of holding the maize crop, with future market value $S_T$. The two positions offset one another, leaving the constant delivery price that was fixed in advance: $(K - S_T) + S_T = K$. The farmer has (perfectly) hedged their exposure to the maize price. Just as a short forward offers protection to a seller in a very natural way, a long forward can protect a buyer from a price increase. Suppose a retailer buys maize each season from farmers, to sell in their stores. They can enter a long forward on maize, fixing the future price they will pay. In principle, the retailer could make the forward agreement with the farmer directly. Even though the cash flows of such a
³ Although a cash-settled derivative is often convenient, this is a case where a physical settlement might be the easier option.
forward would sum to zero, and one of the parties will lose money, it is important to see the sense in which both parties can benefit. Both parties had to forgo a potential upside (e.g., the farmer could have taken their chances and hoped for a high maize price), but, in exchange, they have removed the risk on the downside. They can then plan their business affairs more easily, manage their risk of cash flow problems, avoid having to follow and worry about the global maize market, etc. In practice, the farmer and the retailer would likely enter their forwards with an investment bank as the counterparty. Farmers and retailers are not really in the business of drawing up financial contracts, finding and vetting counterparties, etc. Banks will enter forward contracts both long and short, using their central position in the markets, and also their expertise and infrastructure in implementing derivative contracts, to attract counterparties. They will include fees in the contract, but, in exchange for this, they offer the most convenient route for anyone interested in entering a forward.⁴ Essentially, the financial system will sit as an intermediary between farmers and retailers, between long parties and short parties, facilitating derivative trades like this one.
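The forward payoffs and the farmer's hedge from this section can be sketched in a few lines of code. The sketch below assumes a purely illustrative delivery price of 200 per ton (the text gives no number) and ignores fees, credit risk and settlement details; the function names are hypothetical.

```python
def long_forward_payoff(s_t, k):
    """Payoff at maturity to the party who agreed to buy at delivery price k."""
    return s_t - k


def short_forward_payoff(s_t, k):
    """Payoff at maturity to the party who agreed to sell at delivery price k."""
    return k - s_t


def hedged_crop_value(s_t, k, tons):
    """Crop valued at the market price, plus a short forward on the same quantity."""
    crop_value = tons * s_t
    forward_value = tons * short_forward_payoff(s_t, k)
    return crop_value + forward_value


DELIVERY_PRICE = 200.0  # hypothetical delivery price per ton (assumption)
TONS = 100              # crop size from the farmer example

for maize_price in (150.0, 200.0, 250.0):
    both_sides = (long_forward_payoff(maize_price, DELIVERY_PRICE)
                  + short_forward_payoff(maize_price, DELIVERY_PRICE))
    hedged = hedged_crop_value(maize_price, DELIVERY_PRICE, TONS)
    print(f"S_T={maize_price:6.1f}  long+short={both_sides:4.1f}  hedged value={hedged:9.1f}")
```

Whatever maize price is fed in, the long and short payoffs cancel, and the hedged crop is worth the same fixed amount (100 tons at the delivery price). The farmer gives up the upside at a price of 250 in exchange for protection at a price of 150.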
6.2 Options
Options are a type of derivative, similar to forwards, but with a simple twist that is central to much mathematical finance theory and literature. The twist is this: instead of a firm agreement to trade, as in a forward contract, an option contract specifies an agreement that is optional to one party; that party has the right but not the obligation to trade at the specified price and date. Specifically, a call option is the right to buy an asset at a fixed, pre-specified price and date, and a put option is the right to sell an asset at a specified price and date. Together, call and put options are known as vanilla options, distinguishing them from more complicated variants known as exotic options, which we briefly discuss later. Vanilla option terminology is slightly different to that for forward contracts. First, whichever party has the optionality is called the long party. This party has the right to buy (if a call option) or sell (if a put) the underlying. The short party is contractually obliged to comply with the long party if they decide to exercise their right. Presumably, the long party will only do this if it is beneficial to them. For this reason, an option cannot have a negative value—if there is positive value available, the long party exercises; if not, they do not, and they make no loss. This is an important concept. The short party’s position is therefore a liability, something of negative value, because they will either need to complete the transaction when it is favourable to the long party, which means it is costly for them, or in the best-case scenario, they make no profit or loss, when the long party does not exercise. For this reason, the
⁴ Often this fee is fully or partially in the form of a spread: a difference between the bank’s long and short offerings, so that the bank positions do not exactly cancel, but leave them with a profit.
long party needs to pay the short party to induce them to enter the option agreement in the first place. An upfront payment is made, known as the option premium (this has the slight connotation of the option being newly issued, with option price, which is also valid, suggesting trading on the secondary market). Because of this payment, in exchange for a future right to trade, the long party is also described as the buyer or the holder of the option, and the short party as the seller. The pre-specified date of the optional trade is called the expiry, and the fixed price the strike price or just the strike (as opposed to maturity and delivery, for forwards, although maturity is sometimes used for options as well). Upon expiry, the long party may exercise their option and make the pre-specified trade with the short party. If the asset price at expiry exceeds the strike price of a call option, the holder should exercise their option and benefit from the difference between the market price of the underlying asset and the strike price, as per the long forward payoff $S_T - K$ above. If, however, the asset price does not exceed the strike price, the holder should not exercise, and the option will expire worthless. Although the holder ends up not reaping any benefit from the option, the situation is crucially different from how it would be in a long forward contract, where a loss is made from low underlying prices. The option protects the holder from the downside. However, basic economics tells us that this downside protection cannot be free: a market-commanded premium will be set from the free trading of any option.

Consider the payoff function, the value at expiry, of vanilla options. Assuming that the option is always rationally exercised (a safe assumption in practice), the payoff of the long forward must be adjusted as follows to arrive at a call option:

$$\max(0, S_T - K) = (S_T - K)^+,$$

where similar notation as above applies, with $K$ representing the strike price.
The function $(\cdot)^+$ is known as the positive part function. If $S_T < K$, it is not beneficial to exercise the option, and the value is zero. For a put option, the payoff is

$$\max(0, K - S_T) = (K - S_T)^+.$$

These are plotted in Fig. 6.2. In addition to the payoff function, a profit function is plotted, which accounts for the upfront option premium. If an option expires and exercise is not rational for the holder, it is said to expire out-the-money, and an overall loss is made by the holder. If the option expires in-the-money, the value of exercise may outweigh the initial premium, resulting in an overall profit. The terms in-the-money, at-the-money (meaning underlying value is equal to the strike), and out-the-money are applied before expiry as well. An option can, for example, be at-the-money initially, meaning that the strike equals the underlying price at the time of initiation of the contract, and then may expire in- or out-the-money depending on what happens to the underlying price. Once you are mathematically happy with Fig. 6.2, you should return to the economic interpretation. Call options give a positive exposure to the underlying, reflected by the payoff function being an increasing function of $S_T$. If you have the
Fig. 6.2 Payoff and profit functions for vanilla option contracts (panels: call payoff and put payoff, each plotted against $S_T$ with $K$ marked; the profit is drawn as a dashed line)
right to buy something at a fixed price, it would be great for you if that something was valuable. Like a long forward, call options are a vehicle to speculate on a view that an underlying will appreciate in value (they also offer leverage, with the option premium being a relatively low amount of capital to obtain full exposure to the underlying’s upside). Recall also that there is a short side to the call option. For a call option to come into existence, someone must agree to make the payment described by the payoff function. The premium must compensate the short party for assuming this liability; they bear the risk that the underlying price increases, and they face a potentially unlimited liability. For put options, the payoff is bounded (because most, though not all, underlying values are non-negative). With the high values on the left of the plot, it reflects a negative or short exposure to the underlying. Speculation pertaining to price drops can be implemented with put options. An extreme case of such a speculation paying off would be the underlying value dropping to zero, giving the put option holder the whole strike price (from our math, $K - S_T = K$ if $S_T = 0$; from first principles, if you can sell something that is actually worthless for a price $K$, you profit by $K$).

So far, we have considered the options in isolation. Such positions (entering options without entering the underlying) are sometimes described as naked. Often, however, one’s motivation for entering the derivative market is that one already has positions in the underlying that one is looking to adjust or hedge. For example, if you hold a share, you can buy a put option on the share, giving you the right to sell it. If the share value turns out to be low, you can exercise your right (at a price that was fixed before the value drop) and thereby put a limit on your loss.
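The vanilla payoffs and the share-plus-put position just described can be sketched as follows. The strike and premiums are arbitrary illustrative numbers, the function names are hypothetical, and the profit calculation makes the simplifying assumption that no interest accrues on the premium between initiation and expiry.

```python
def call_payoff(s_t, k):
    """Value at expiry of a long call: the positive part of s_t - k."""
    return max(0.0, s_t - k)


def put_payoff(s_t, k):
    """Value at expiry of a long put: the positive part of k - s_t."""
    return max(0.0, k - s_t)


def holder_profit(payoff, premium):
    """Payoff minus the upfront premium (interest on the premium is ignored)."""
    return payoff - premium


def share_plus_put_payoff(s_t, k):
    """Holding the share itself (worth s_t) together with a long put struck at k."""
    return s_t + put_payoff(s_t, k)


STRIKE = 100.0        # illustrative strike (assumption)
CALL_PREMIUM = 8.0    # illustrative premiums (assumptions)
PUT_PREMIUM = 6.0

for s_t in (0.0, 80.0, 100.0, 120.0):
    call = call_payoff(s_t, STRIKE)
    put = put_payoff(s_t, STRIKE)
    print(
        f"S_T={s_t:6.1f}  call={call:6.1f} (profit {holder_profit(call, CALL_PREMIUM):6.1f})"
        f"  put={put:6.1f} (profit {holder_profit(put, PUT_PREMIUM):6.1f})"
        f"  share+put={share_plus_put_payoff(s_t, STRIKE):6.1f}"
    )
```

The holder's profit is never worse than minus the premium, the put pays the whole strike when the underlying is worthless, and the share held together with the put is never worth less than the strike at expiry.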
This is known as a protective put, and the payoff of this strategy (the put plus the underlying itself, which has a payoff simply of $S_T$) is plotted in the left panel of Fig. 6.3. Other strategies involve combining options with other options. Combining a long call and a long put, for instance, with equal strike prices and expiry dates, is known as a straddle; the left panel of Fig. 6.4 plots the payoff of this strategy. The payoff is large when there are large movements in the underlying either upwards or downwards (as either the call or put option goes into the money). A straddle can
Fig. 6.3 Payoff functions for a protective put (underlying plus long put option) and for a covered call (underlying plus short call option)
Fig. 6.4 Payoff functions for a straddle (long call plus long put) and for the combination of long call and a short put. The call and put must share the same underlying asset, strike price, and expiry time
therefore be said to be a bet on volatility. Next, consider a long call plus a short put. The resulting payoff is plotted in the right panel of Fig. 6.4. This is identical to the payoff of a long forward; this is the fact behind the important put–call parity relation explored in Chap. 7. We have already noted that forward contracts, although simple in nature, allow for several applications. It should be clear that options—also relatively simple, involving just one conceptual leap from forwards, that of optionality—similarly give rise to a rich set of applications and implications. Risk from the underlying can be accessed via options, with the twist of optionality allowing the position to be adjusted and customised. Things can be generalised further. Variations on the standard vanilla option contract can be applied, giving rise to the so-called exotic options. For instance, Asian options are like vanilla options but depend on the average level of the underlying over a period, not only at a particular expiry date. Barrier options are vanilla options with an additional contractual clause: if the underlying crosses some barrier level during the life of the option, the payoff is cancelled (or, for knock-in
barrier options, payoff is set to zero if the barrier level is not crossed). Compound options involve the right to buy or sell another vanilla option. Basket options depend on a combination of underlying assets. Asian and barrier options are excellent examples of path-dependent derivatives. Vanilla options depend on the underlying price $S_T$, but they do not depend on the path that the underlying price takes leading up to expiry time $T$. Another variation is a binary option; these are like vanilla options, but they pay a fixed amount if the option ends up in-the-money (e.g., a binary call option pays zero if $S_T \le K$, but pays some fixed amount if $S_T > K$).

The derivatives we have discussed up to now are said to be of European style. Option contracts can contain an additional clause saying that they can be exercised not just at the expiry date, but also at any time before that. Options with this early exercise feature are called American options. For an American call option, for instance, the holder can wait until expiry and then has the right to claim value of $S_T - K$, but they can also choose to claim value $S_t - K$ at any earlier time $t$, where $S_t$ is the earlier underlying value.

Recall from Sect. 1.2 that part of our goal is to develop a conceptual understanding of finance (the investing and raising of money, through the creation and maintenance of assets and liabilities). The contents of this chapter are part of this endeavour: derivatives offer a way to invest, a way for assets and liabilities to arise, and we should therefore seek to understand how they work. A major aspect of the financial markets is how investments are priced by market participants. We have made an attempt at understanding this in earlier chapters. We will consider the pricing of derivatives in the following chapters, and this will have a very different feel to the preceding ones. The concept of relative asset pricing is the reason for this.
Derivatives, as we know, depend on an underlying, and this opens the possibility for a different approach to pricing: perhaps derivatives can be priced in terms of their underlying price (in other words, can be priced relative to their underlying).⁵ The underlying price can be observed from the market, and whether or not we understand the principles behind how that price arose, perhaps it can be used to infer a derivative price. One does not necessarily need to understand absolute pricing (of non-derivative assets, where there is no underlying to refer/relate to) to undertake relative pricing. Black, Scholes, and Merton showed that the relative pricing of derivatives is, in an interesting and satisfactory way, indeed possible.⁶ Their fundamental idea is
5 Note that relative pricing (or relative valuation) can also, but not here, refer to a pricing approach that compares similar assets to each other (rather than comparing/relating a derivative to its underlying asset). For instance, to price the share of a medium-sized mining company, one can look at the price-to-earnings ratios of other medium-sized mining companies trading on the market and use these to estimate a price based on the company’s earnings.
6 The 1973 paper of Black and Scholes gets a tremendous amount of credit, as it should, but the understanding added by Merton (in his paper later that year) was utterly essential. Although the phrase Black–Scholes model is common (introduced, in fact, by Merton), the Black–Scholes–Merton model is a more appropriate designation.
as follows. One can (under certain conditions) trade in the underlying asset in a way that perfectly hedges the risk in a derivative. The risk-free portfolio is then easily priced, with market-prevailing risk-free returns/interest rates (recall that the non-trivial aspect of pricing investments is how one incorporates risk). Importantly, one does not need to implement this hedge to make the resultant price valid; it is sufficient to note that you, or any investor, could implement it if there were any deviation between the market price and the hedging-based price. Any deviation would create an arbitrage opportunity, an important idea of the next chapter.
So, while derivatives are an extended part of the financial economists’ project to understand the financial markets, they are also the basis of a self-contained field regarding their pricing relative to underlying asset prices, seeded by Black, Scholes, and Merton in 1973. This field is extremely rich and interesting, both financially and mathematically, and is extremely useful in practice. It is the main subject matter of master’s degrees in mathematical finance/quantitative finance/financial engineering, and now you have some context for how these degrees would differ from programmes in financial economics, or from programmes in finance that do not emphasise this mathematically complex field of derivative pricing.
Literature Notes The chapter cites the famous book of Lewis (1989). One could refer to Hull (2015, Ch.2 & Ch.10) or Wilmott (2013, Ch.2) for respected textbooks that address forward and option contracts. The chapter ends by mentioning the seminal paper of Black and Scholes (1973) and also recognises the importance of Merton (1973).
References
Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637–654.
Hull, J. C. (2015). Options, futures, and other derivatives (9th ed.). Pearson.
Lewis, M. (1989). Liar’s Poker. W.W. Norton & Company.
Merton, R. C. (1973). Theory of rational option pricing. The Bell Journal of Economics and Management Science, 4, 141–183.
Wilmott, P. (2013). Paul Wilmott on quantitative finance. Wiley.
7 Arbitrage- and Model-Free Pricing Methods

Abstract
The essential idea of arbitrage is discussed. Using the principle of the absence of arbitrage, some model-free derivative pricing results are attained. Forward contracts are priced, and bounds on possible option premia are derived.

Keywords
Arbitrage · Model-independent finance · Robust finance · Put–call parity · Option strategies

7.1 Arbitrage
In this chapter, we shall give an overview of how the idea of arbitrage can be used to make deductions regarding derivative prices. The approach will be model-free in an important sense to be clarified below. A good working definition of an arbitrage is a costless, riskless trading profit. If you can find a way to make a profit, a true net profit without any hidden costs, and with no chance of loss, i.e., without risk, you have identified an arbitrage opportunity. Depositing money at the bank and earning interest do produce a profit, without any real risk, but this is not costless; you need money to deposit in the first place. An arbitrage takes you from nothing—there must be no cost—to profit. Arbitrage is too good to be true. We know that the investment markets are very competitive, and if there were an easy way to make profits from nothing, traders would flock to this opportunity, causing prices to adjust. An arbitrage opportunity is a stark example of a market inefficiency, in the informational sense of Chap. 3— market prices, by allowing arbitrage, are not properly reflective of the nature of the investments. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. Backwell, An Intuitive Introduction to Finance and Derivatives, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-23453-8_7
So, in practice, arbitrage opportunities should be non-existent, or at least very rare. If they arise at all, standard market forces tend to prevent them from persisting. The absence of arbitrage therefore seems a plausible feature of real financial markets and therefore a plausible assumption to make in models of financial markets. The absence of arbitrage turns out to provide important constraints on derivative prices. The constraints (the results of what can be called arbitrage-pricing theory) differ in nature depending on the type of derivative in question. The contents of this chapter will show us how.
To do some quantitative analysis, we need a slightly more formal and concrete definition of an arbitrage. Below, we shall work in a basic model of a financial market, where an underlying asset can be traded alongside a cash account (in which money can be deposited, or from which money can be borrowed). A portfolio can be built by holding the underlying in some quantity, alongside some cash-account holding. Derivatives can be added; for example, a possible portfolio is the protective put strategy illustrated in Fig. 6.3. More underlying assets can be included in the model, of course, allowing richer portfolios to be formed. The standard way to define arbitrage formally is with reference to portfolios: an arbitrage is any portfolio with an initial value of zero (so that it represents an investment that is costless to enter), zero probability of obtaining a loss in the future (so that it represents a risk-free investment), and a non-zero probability of making a profit.1
We now specify the simple model to be used in Sects. 7.2 and 7.3 below. Let $S_T$ denote the price of some underlying asset at time $T$, which we take to be the maturity/expiry time of some derivative under consideration. We assume the underlying can be traded in perfect-market conditions: any amount of the underlying can be purchased, including fractional and negative amounts, without fees, taxes, etc.
Any amount can be deposited or borrowed with a cash account, at a continuous rate r. We also assume that the underlying has a non-negative value (and that any strike or delivery prices associated with the underlying are similarly non-negative). Unless otherwise stated, the underlying pays no income (such as dividends) and incurs no costs (such as storage costs). We will consider relaxing this income/cost assumption later, but, as usual, it is helpful to start without these distractions. The absence of arbitrage is often listed as an additional assumption. This is fine, but you can think about it in a slightly more open-ended way: we will explore whether our model can be free of arbitrage, and what conditions, if any, can ensure this. The idea, or assumption, of a lack of arbitrage will be discussed further below. Generally speaking, our assumptions are idealisations of the real-world financial markets, in which there are trading costs, shares can only be traded in discrete blocks, shorting may be expensive or impossible, etc. These real-world effects put
1 A further requirement for a portfolio to qualify as an arbitrage (not important in this chapter, but important in general) is for it to be self-financing. The initial value of an arbitrage portfolio must be zero, but it is also crucial that costs do not arise later in the life of the portfolio (i.e., it must finance itself as time goes on). Here is a challenge that you can attempt later: explain why self-financing is not important in this chapter.
a limit on the applicability of the idealised theory (a theoretical arbitrage profit, especially if small, might be drowned by transaction costs or liquidity effects). The idealised theory is, nevertheless, definitely the best place to start; abstracting from the multitude of complicating factors often helps to see the key principles more clearly. A perennial theme of financial modelling, and mathematical modelling in general, is the tradeoff between tractability and complexity (or plausibility because the world tends to be complex). Simple models like the one we have outlined are highly tractable and will provide a good foundation to consider more complexity. Crucially, however, we make no assumption about the random variable ST . Usually, in quantitative finance, a ‘model’ refers to the specification of a stochastic process to model an underlying asset price. The Black–Scholes–Merton model and other examples will be seen in Chap. 8. Any particular model, in this specific sense, is an imperfect representation of how the underlying price actually evolves. If you rely on a particular specification to derive results, those results are described as model dependent. This is an important theme in quantitative finance. Model dependence is an important caveat; it says that someone else could have used a different model to me, and different results would have been obtained. My results lose their claim to be objective, in a sense. In this chapter, however, the results are model-free (or model independent)—they rely on the assumptions above, which do constitute a simple model of a financial market, but these do not include a specification for how the underlying asset price can vary. The following results therefore apply whatever distribution ST follows. The boundary between the results that can be derived with model-free methods and those requiring model dependence is an important aspect of financial modelling.
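To make the portfolio-based definition of arbitrage from earlier in this section concrete, here is a minimal Python sketch. It assumes a finite list of possible future scenarios, each with non-zero probability; this is purely an illustrative simplification and not something the chapter requires. The function name `is_arbitrage` and the tolerance `tol` are hypothetical.

```python
def is_arbitrage(initial_value, future_values, tol=1e-12):
    """Check the three conditions of the definition on a finite scenario set.

    initial_value: cost of entering the portfolio today.
    future_values: portfolio value in each future scenario; every scenario
                   is assumed to have non-zero probability (sketch assumption).
    """
    costless = abs(initial_value) <= tol                 # initial value of zero
    no_loss = all(v >= -tol for v in future_values)      # zero probability of loss
    some_profit = any(v > tol for v in future_values)    # non-zero probability of profit
    return costless and no_loss and some_profit


# A bank deposit earns a riskless profit but is not costless, so it fails the test.
print(is_arbitrage(100.0, [105.0, 105.0]))
# A zero-cost position that never loses and sometimes gains passes the test.
print(is_arbitrage(0.0, [0.0, 2.5]))
```

The check says nothing about how likely each scenario is, only which outcomes are possible, which is in the spirit of the model-free approach described above.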
7.2 Pricing and Hedging Forwards
Recall that a long forward has a payoff, at maturity, of $S_T - K$, where $K$ denotes the delivery price. Consider a portfolio consisting of one unit of the underlying asset, and a short holding in the cash account (i.e., a loan, a liability that grows at some interest rate and must be repaid). Suppose the cash-account balance reaches exactly $K$ at the maturity time $T$. Then the portfolio has a time-$T$ value of $S_T - K$, identical to the forward. The portfolio is said to replicate the forward. A slightly different terminology says that the forward’s payoff is attained by the portfolio.
Replicating derivative payoffs is the basis of relative asset pricing. The logic is simple: if a derivative can be replicated, its price must be given by the price of the replicating portfolio. This principle is sometimes known as the law of one price: the derivative and the replicating portfolio behave identically, so they must have identical prices. There can only be one price for one particular investment/portfolio, even if there are multiple ways to construct it. With the assumption of perfect-market trading conditions, the law of one price follows immediately from the absence of arbitrage. Exercise: show how any violation of the law of one price implies an arbitrage opportunity.
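The replication claim can be checked numerically. The sketch below is only an illustration under the chapter's assumptions (continuous rate r, no income or costs on the underlying); the function names and the sample numbers are hypothetical. The loan is sized today so that its balance grows to exactly K at maturity.

```python
import math

def cash_and_carry_value_at_T(s_T, K, r, T):
    """Time-T value of one unit of the underlying plus a loan that grows to K."""
    loan_today = K * math.exp(-r * T)            # amount borrowed at time zero
    loan_at_T = loan_today * math.exp(r * T)     # balance owed at maturity (equals K)
    return s_T - loan_at_T

def long_forward_payoff(s_T, K):
    """Payoff of a long forward with delivery price K."""
    return s_T - K

# Whatever the underlying does, the two time-T values coincide.
for s_T in [0.0, 50.0, 100.0, 180.0]:
    gap = cash_and_carry_value_at_T(s_T, 100.0, 0.05, 0.5) - long_forward_payoff(s_T, 100.0)
    print(s_T, abs(gap) < 1e-9)
```

No assumption about the distribution of the terminal price is used; the loop simply tries a few arbitrary values.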
So, the value of the forward is equal to the value of the replicating portfolio. Let us say we are standing at time zero, with maturity $T$ years away (recall that annual time units are the standard, so $T = 0.5$, for instance, represents six months). Then the value of the replicating portfolio, the cost of investing in the portfolio, is $S_0 - Ke^{-rT}$. Make sure you can see why. The underlying, at time zero, has a certain prevailing price, naturally denoted $S_0$. The replicating portfolio involves borrowing money, which offsets the cost of the underlying. If an amount $Ke^{-rT}$ is borrowed at time zero, continuous interest until time $T$ is charged, resulting in a final loan balance of $K$, as desired. By the law of one price, the value of the long forward is therefore $S_0 - Ke^{-rT}$. This method of replication and therefore valuation is known as a cash-and-carry strategy, with you carrying (or holding or entering) the underlying, alongside a cash balance.
What is the relevance of this? Recall from Chap. 6 that the delivery price $K$ is a crucial variable of the forward, controlling how the value of the forward is balanced between parties. We have now characterised precisely how. The long side of the forward contract is worth $S_0 - Ke^{-rT}$. The short side is worth the exact negative of this.2 If the forward is to be fair to both parties, in the sense that entering the forward does not confer value to one party (which would come at the cost of the other), it must have a value of zero. If the value were non-zero, it would be positive to one party and therefore unfair in their favour. To ensure a value of zero, the delivery price must be given by $K = S_0 e^{rT}$. This value for the delivery price is known as the fair forward price. If the delivery price differs from the fair forward price, the contract should not be attractive to prospective parties on one side of the market.
A forward contract requires willing parties on both sides, so the delivery price would adjust towards an equilibrium value that is, on aggregate, equally attractive from long and short perspectives. And, using the idea of arbitrage, we have now determined what that value is.
An important concept is that the above analysis applies at the initiation of a forward, when the delivery price is negotiated or set by the market, but not after that. Once the contract is signed, the delivery price is fixed. It is fixed so that the forward has zero value (i.e., is fair) initially, but after that, things tend to change. To see this more clearly, note that the above argument regarding the fair forward price can be generalised to apply to any time $t < T$, rather than only at time zero. Letting $F_t^T$ denote the fair forward price at time $t$, for forwards maturing at time $T$, one concludes that $F_t^T = S_t e^{r(T-t)}$.
2 This can be seen by going through the short equivalent of the above first-principles replication, or by simply noting that the long and short positions must sum to zero, as no other parties contribute to the settlement of the forward.
Table 7.1 Summary of the cash-and-carry replication strategy for a long forward contract

                                 Time $t$             Time $T$
Long forward (entered at $t$)    ?                    $S_T - K$
Underlying (carry, long)         $S_t$                $S_T$
Loan (cash, short)               $-Ke^{-r(T-t)}$      $-K$
Exercise: Adjust the replication argument to begin at time $t$, and derive the above fair forward price. See Table 7.1 for an illustration of the argument (the law of one price allows you to determine the unknown value of the forward, and you will see that this value is zero if $K = F_t^T$, in other words, if the delivery price is fair at time $t$).
Suppose at time zero you think an asset will increase in value. You enter a long forward, available at the initial fair forward price $F_0^T$, in an attempt to profit from this view, as discussed in Chap. 6. Initially, the forward has zero value by construction, but suppose that your view is correct: the asset value increases significantly after some time. At that point, forwards are available at the new fair forward price, in accordance with the above formula. A new forward at that delivery price has no value, but you are locked into the old delivery price, which is lower (why?). Your position is better than the zero-value fair forward positions available in the market (again, make sure you can explain why!); in other words, your position is of positive value. What value specifically? We have deduced that the time-$t$ value of a forward maturing at $T$ is $S_t - Ke^{-r(T-t)}$ (Table 7.1 summarises the argument). In this case, the delivery price was the fair one at time zero, namely $F_0^T = S_0 e^{rT}$. The time-$t$ value of the forward is therefore $S_t - F_0^T e^{-r(T-t)} = S_t - S_0 e^{rt}$.
Let us take stock. By replicating the forward, we have priced it (i.e., we have determined its value before maturity). This allows us to ensure that newly issued forwards are fair (we do this by pricing/valuing the forward and then ensuring this price is zero). It also allows us to value forward positions after issuance. For instance, with the long forward above, profit is made according to how much the underlying asset outperforms the risk-free return (because we deduced a value of $S_t - S_0 e^{rt}$).
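The pricing results of this section fit in a few lines of Python. This is a minimal sketch under the chapter's assumptions (continuous rate r, no income or costs on the underlying); the function names, the argument `tau` (time to maturity, T minus t, in years) and all numbers are hypothetical choices made for illustration.

```python
import math

def fair_forward_price(s_t, r, tau):
    """Fair forward price: the spot price grown at the risk-free rate over tau years."""
    return s_t * math.exp(r * tau)

def long_forward_value(s_t, K, r, tau):
    """Value of a long forward with delivery price K: spot minus the discounted K."""
    return s_t - K * math.exp(-r * tau)

# Enter a fair forward at time zero: its initial value is zero by construction.
s0, r, T = 100.0, 0.05, 1.0
K = fair_forward_price(s0, r, T)
print(long_forward_value(s0, K, r, T))

# Half a year later the underlying has risen; the position now has positive value,
# equal to the amount by which the asset has outperformed the risk-free return.
t, s_t = 0.5, 110.0
value = long_forward_value(s_t, K, r, T - t)
print(value, s_t - s0 * math.exp(r * t))   # the two expressions agree
```

Keeping the delivery price `K` as an explicit argument reflects the point made above: the delivery price is fixed at initiation, while the fair forward price keeps moving with the underlying.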
The pricing of forwards, and derivatives in general, is useful (to, e.g., banks that issue forwards, or speculators wanting to determine their profits or losses, etc.), but there is more to it than that. First, in addition to determining a specific numerical price, the pricing expression can teach us something about the derivative in question. We know that the value of derivatives such as forwards depends on the underlying asset value, and now we have determined, for forwards, precisely how the dependence works.3 For example, one could differentiate the obtained pricing expression with respect to any variable of interest and learn about how sensitive (and therefore how risky) the derivative is to various factors (this is known as Greeking; more on this in Chap. 8).
3 Note the word ‘before’ in the previous paragraph. The value of a derivative at maturity (i.e., the payoff) is relatively easy to deduce from the terms of the derivative agreement. The pre-maturity value is much less obvious, requiring a replication argument.
Second, replication arguments allow you to (perfectly) hedge the derivative in question. Hedging is the exact opposite of replication. If you short the replicating portfolio, you obtain a portfolio that behaves in the exact opposite manner to the derivative, cancelling all the cash flows away to zero. To simplify this, replication can be thought of as equating two things: $A = B$. Hedging can be thought of as cancelling two things: $A - B = 0$. To make sure this is clear, the perfect hedging of a long forward is shown in Table 7.2.

Table 7.2 Summary of a fully hedged long forward contract. The contract is assumed to be fair, with the delivery price given by $F_t^T$; however, one could replace $F_t^T$ with a general delivery price $K$, which would cause the time-$t$ value of the forward to be $S_t - Ke^{-r(T-t)}$

                                        Time $t$                          Time $T$
Long forward (entered fairly at $t$)    0                                 $S_T - F_t^T$
Underlying (short)                      $-S_t$                            $-S_T$
Deposit (long)                          $F_t^T e^{-r(T-t)}$               $F_t^T$
Total portfolio                         $F_t^T e^{-r(T-t)} - S_t = 0$     0

The ability to hedge is extremely useful in practice. If a farmer approaches a bank wanting to enter a forward contract, the bank can do business with the farmer and then hedge away the risk of their position (note the bank then removes their potential to make large profits, as the variability of their position has been hedged).4 A speculator may have entered a forward but later change their mind about their view on the underlying asset; at that point, they can hedge their forward position, removing their exposure to the underlying.
We have covered the very foundation of the pricing and hedging of forwards. It is not too hard to extend this to accommodate some complexities that arise in the real world. For example, if the underlying asset pays dividends, one can usually adjust the replication strategy to account for this.5
Before moving on, let us also briefly reflect on the open-endedness of the no-arbitrage assumption. Usually, when one makes an assumption, one becomes vulnerable to that assumption not holding in practice, and the results derived from that assumption not necessarily holding. Modern portfolio theory, for example,
4 The way banks make money off such business is essentially to charge fees (often in the form of a spread, as mentioned in Chap. 6). Several factors enable them to get away with this: their central position in the market, their trading infrastructure, their stability and reliability as a counterparty, etc.
5 If dividends are assumed to be proportional to the underlying value, then one should look to adjust the carry side of the replicating portfolio. Usually, one should invest in fewer units of the underlying and then reinvest dividends such that the required number of units is attained by maturity. If dividends are absolute amounts, then one should adjust the cash side of the strategy, to ensure that all cash components (including dividends) result in the constant $K$ in the forward payoff.
assumes that investors only care about return mean and variance; this is not a bad approximation, but cannot be exactly true, and you could identify instances in which the theory breaks down as a result.6 A lack of arbitrage can be different. If this condition does not hold in practice, it is often a good thing. If you find a violation of this assumption, you should stop worrying about financial theory and start implementing the arbitrage opportunity in the largest quantities you can muster. In other words, a violation of the absence of arbitrage is actionable. Because this is also true for the masses of competitive investors out there, arbitrages tend not to occur in the first place, but, either way, we do not need to lose sleep over the market not conforming to our theory in this way.
Often, however, an apparent arbitrage might not be a true arbitrage in practice because the other assumptions we have made do not hold. We noted (towards the end of Sect. 7.1) that real-world imperfections put a limit on the applicability of our results. Transaction costs, for instance, will often overwhelm small apparent arbitrages. There are well-developed literatures studying arbitrage and arbitrage-based pricing in the presence of transaction costs, constraints on the shorting or divisibility of underlying assets, etc. Again, this does not make our simplified analysis worthless. For one thing, you have not much hope of doing a more involved analysis if you do not understand the simpler one. For another (and we will leave it here), even if you cannot exploit a small deviation from the theoretical fair forward price, say, you would likely not be able to exploit the opposite deviation either; there is often a rough symmetry between exploiting an overpricing and an underpricing, and the theoretical price remains a good middle benchmark, even if you cannot enforce it down to the last cent.
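The actionability of a mispriced forward, and the way trading costs can erase it, can be sketched as follows. This is an illustration only: the single lump-sum cost `cost_at_T` is a hypothetical stand-in for real-world frictions (not a model taken from the text), and the function name and numbers are likewise assumptions.

```python
import math

def forward_mispricing_trade(quoted_K, s0, r, T, cost_at_T=0.0):
    """Return the strategy and the net time-T profit locked in per unit.

    quoted_K:  delivery price at which zero-cost forwards can be entered.
    cost_at_T: hypothetical total trading cost, expressed as a time-T amount.
    """
    fair = s0 * math.exp(r * T)          # fair forward price at time zero
    if quoted_K > fair:
        # Delivery price too high: sell forward, buy the underlying with a loan.
        strategy = "short forward, long underlying, borrow cash"
        locked_in = quoted_K - fair
    else:
        # Delivery price too low: buy forward, short the underlying, deposit proceeds.
        strategy = "long forward, short underlying, deposit cash"
        locked_in = fair - quoted_K
    return strategy, locked_in - cost_at_T

print(forward_mispricing_trade(106.0, 100.0, 0.05, 1.0))                 # frictionless
print(forward_mispricing_trade(106.0, 100.0, 0.05, 1.0, cost_at_T=1.5))  # costs dominate
print(forward_mispricing_trade(104.0, 100.0, 0.05, 1.0, cost_at_T=1.5))  # opposite deviation
```

With the cost switched on, deviations on either side of the fair forward price become unprofitable, which illustrates the rough symmetry described above: the theoretical price sits in the middle of a band that cannot be exploited.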
7.3 Model-Free Option Analysis
We have seen that, assuming away some market imperfections, forwards can be priced and hedged with model-free, cash-and-carry arguments. What about vanilla options? It turns out that a model-free approach has its limitations, and in particular, it is not possible to specify a replicating portfolio for a vanilla option, that is, a portfolio of the underlying and cash account that results in the same value as the option. This is another important fact in derivative pricing theory (with the boundary between model-free and model-dependent results being interesting in itself). In Chap. 8, we will give an overview of how an option can be replicated under the assumptions of a particular model of how underlying value behaves. All is not lost, however, for model-free analysis of options. First, although options cannot be replicated in a model-free way, they can be super-replicated, or
6 This is not to say that modern portfolio theory is fully undermined. It is not clear how this approximation error will manifest. Sometimes this can be studied formally, sometimes empirically, and sometimes the knowledge and intuition of the modeller must be relied upon to balance model tractability and plausibility.
sub-replicated. That is, in the case of super-replication, one can construct a portfolio that is guaranteed to be greater than the derivative payoff (this is essentially the dominance idea from Sect. 2.2). This is not as powerful as guaranteed equality (replication), but it is something. If you can super-replicate a derivative, you can conclude that the current value of the super-replicating portfolio must exceed the current value of the derivative (exercise: if this did not hold, describe the easy arbitrage opportunity). Similarly, a sub-replicating portfolio puts a lower bound on the derivative value. So, although a specific price cannot be pinned down in a model-free way, a range (between some lower and upper bound) can be deduced. Mathematicians and modellers make it their business to track how differing assumptions give rise to differing implications; this is a great example.
Second, an important, model-free relationship, put–call parity, can be derived. Let us start here. Recall the derivative portfolio illustrated in the right panel of Fig. 6.4; this is given by a long call and a short put and is equal to the payoff of a long forward (where the delivery price is equal to the strike price of the options). If a future payoff is replicated, current values must be equal. Using the value of a long forward that we derived in Sect. 7.2, we therefore have $c_t - p_t = S_t - Ke^{-r(T-t)}$, where $c_t$ and $p_t$ denote the time-$t$ values/premia of call and put options, respectively, expiring at time $T$, struck at $K$, written on an underlying with time-$t$ price $S_t$. Let us make sure we do not miss the significance of this. We have noted that we cannot pin down an option premium without committing ourselves to an inevitably flawed model. We can, however, pin down the difference between the call and put premia. If you know the call value, you can deduce the put value, and vice versa.
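Put–call parity translates directly into code. The sketch below assumes European options on an underlying that pays no income, as in the text; the function names, the argument `tau` (time to expiry in years) and the sample premium are hypothetical.

```python
import math

def put_from_call(c_t, s_t, K, r, tau):
    """Put premium implied by parity: call minus spot plus the discounted strike."""
    return c_t - s_t + K * math.exp(-r * tau)

def call_from_put(p_t, s_t, K, r, tau):
    """Call premium implied by parity: put plus spot minus the discounted strike."""
    return p_t + s_t - K * math.exp(-r * tau)

# Hypothetical market data: a call premium is observed, the put is deduced.
s_t, K, r, tau = 100.0, 95.0, 0.05, 0.5
call_premium = 9.0
put_premium = put_from_call(call_premium, s_t, K, r, tau)
print(put_premium)

# Striking at the fair forward price makes the call and put equally valuable.
K_fair = s_t * math.exp(r * tau)
print(put_from_call(call_premium, s_t, K_fair, r, tau))   # equals the call premium
```

Note that nothing here prices either option on its own; the observed call premium is an input, and only the difference between the two premia is determined.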
If the strike $K$ is such that the right-hand side is zero (i.e., when the strike is equal to the current fair forward price), the call and put must be of equal value (i.e., have a difference of zero). And if this relationship fails to hold, it points to a potential arbitrage profit: if two portfolios behave the same in the future, they should have the same price now, but if they do not, you should buy/long the cheap one, sell/short the expensive one, pocket the difference, and let the identical behaviour cancel away at no risk or cost to you.
Now to bounding option prices with super/sub-replication arguments. Starting with calls, although the payoff $(S_T - K)^+$ cannot be attained exactly, note that it cannot exceed $S_T$ (recall that we have assumed non-negative asset and strike prices). A simple long holding in the underlying asset, which provides time-$T$ value of $S_T$, therefore super-replicates a call option. It follows that call option values must be lower than the value of their underlying assets. Next, consider the cash-and-carry portfolio that would replicate a long forward, giving a time-$T$ value of $S_T - K$. This sub-replicates the call option, because $(S_T - K)^+ \geq S_T - K$. The call option price is therefore bounded below by the value of this portfolio, which we have seen is, at time $t$, given by $S_t - Ke^{-r(T-t)}$. Note, in addition, that the call option price is bounded below by zero. Intuitively, options must have a non-negative value; formally, an empty portfolio sub-replicates
[Figure: two panels plotting option premium against $S_t$, Call Premium (left) and Put Premium (right), with $Ke^{-r(T-t)}$ marked on the axes]
Fig. 7.1 An illustration of the model-free pricing bounds on vanilla option premia
any vanilla option payoff. Exercise: show the arbitrage that can be arranged if an option premium is negative. The two lower bounds can be combined into an overall lower bound (if lower bound A and lower bound B both hold, then the maximum of the two must also be a lower bound), and in the left panel of Fig. 7.1, the final lower and upper bounds are shown in red and blue, respectively. You must be comfortable with all the steps in the derivation. Note that this figure is quite different to those in Chap. 6, where we plotted payoff in terms of $S_T$; here, we are plotting pre-expiry prices, in terms of pre-expiry underlying values $S_t$ (see Footnote 3).
The right panel of Fig. 7.1 plots the corresponding put option bounds. A put option is super-replicated by a cash holding that is equal to the strike price at expiry. More fully, a time-$t$ deposit of $Ke^{-r(T-t)}$ grows to $K$ at time $T$, and, with our assumptions, $K \geq (K - S_T)^+$. This gives the blue line. A put option is sub-replicated by a short forward (with delivery price equal to the strike) or, equivalently, the replication portfolio for a short forward. This is because $K - S_T \leq (K - S_T)^+$. The time-$t$ cost of replicating the short forward is $Ke^{-r(T-t)} - S_t$. This can be combined with the zero lower bound, giving the red line.
As you would expect, these arguments can be extended in a number of ways. For instance, incomes or costs associated with the underlying can be accommodated. Some of the results can be generalised to pertain to American, rather than standard European, options. It is worth mentioning one very interesting and famous model-free result about American options: American call options, assuming the underlying does not pay income, are no more valuable than their European counterparts.
In other words, there is no early exercise premium for call options (the ability to exercise early, which the American option allows, turns out not to have any value; it is not obvious, but it would be irrational to exercise your call option early). American options cannot be worth less than otherwise equivalent European options because they confer all the same benefits and then some, but, in this case, they are not worth any more.
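The model-free bounds on European option premia derived in this section can be collected in a short sketch. It assumes, as the section does, a non-negative underlying and strike and no income on the underlying; the function names, the argument `tau` (time to expiry in years) and the sample numbers are hypothetical.

```python
import math

def call_bounds(s_t, K, r, tau):
    """(lower, upper) model-free bounds on a European call premium."""
    lower = max(s_t - K * math.exp(-r * tau), 0.0)   # forward replication or empty portfolio
    upper = s_t                                      # holding the underlying super-replicates
    return lower, upper

def put_bounds(s_t, K, r, tau):
    """(lower, upper) model-free bounds on a European put premium."""
    pv_strike = K * math.exp(-r * tau)
    lower = max(pv_strike - s_t, 0.0)                # short forward replication or empty portfolio
    upper = pv_strike                                # a deposit growing to K super-replicates
    return lower, upper

print(call_bounds(100.0, 95.0, 0.05, 1.0))
print(put_bounds(100.0, 95.0, 0.05, 1.0))
```

Evaluating these functions over a grid of underlying prices would trace out the kind of lower and upper lines that Fig. 7.1 illustrates.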
Literature Notes The model-free arbitrage-pricing results of this chapter are developed in many respected textbooks, such as Hull (2015, Ch.5 & Ch.11), Cvitanic and Zapatero (2004, Ch.6), and Björk (2004, Ch.7 & Ch.9).
References
Björk, T. (2004). Arbitrage theory in continuous time. Oxford University Press.
Cvitanic, J., & Zapatero, F. (2004). Introduction to the economics and mathematics of financial markets. MIT Press.
Hull, J. C. (2015). Options, futures, and other derivatives (9th ed.). Pearson.
8 Modelling, Pricing, and Hedging

Abstract
Derivative pricing and hedging are tackled, in the context of stochastic models. The canonical binomial model and Black–Scholes–Merton model are introduced and discussed. Limitations, extensions, and tradeoffs are explored.

Keywords
No-arbitrage modelling · Binomial model · Black–Scholes–Merton model · Fundamental theorems of asset pricing
Although model-free results are important, in their own right and for their place in the general theory, the bread and butter of quantitative finance is the stochastic modelling of asset prices and the resultant arbitrage theory. We attempt an overview of this theory in this chapter. We start with the simplest possible model of asset prices, where the future price can take one of two values. Like with the forward contract, there is great usefulness in examining a simple example. We will see the essential ideas and applications of stochastic financial modelling and be in a good position to enrich the setting to the pre-eminent Black–Scholes–Merton model and beyond.
8.1 The One-Period Binomial Model
Let us model the future price of an asset, $S_T$, with the simplest possible (nontrivial) distribution: let there be two possibilities for the future price, with the larger possibility labelled $S_T^u$ and the smaller $S_T^d$. These are constants, but the future asset price $S_T$ is random, taking on one of these two values. The model can ‘jump up’
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. Backwell, An Intuitive Introduction to Finance and Derivatives, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-23453-8_8
(if $S_T = S_T^u$, which is typically higher than the initial, time-0 share price, $S_0$) or ‘jump down’ (where $S_0$ evolves to the downward possibility $S_T^d$). We let $p$ denote the probability of an up jump. It will be useful to define two variables $u$ and $d$ such that $S_0 u = S_T^u$ and $S_0 d = S_T^d$. These multiplicative up and down factors are an alternative (in fact, a scale-free) way of describing the distribution of $S_T$ (i.e., you can specify $S_T^u$ and $S_T^d$ directly, or you can specify the up–down factors $u$ and $d$).
In addition to this random (i.e., risky) asset, we model a risk-free asset (also known as the cash account). In particular, we set $B_0 = 1$ and $B_T = e^{rT}$, with $r$ denoting the continuously compounded risk-free interest rate (i.e., risk-free return). A portfolio can include a holding in any number of units of the risky asset (one unit of which will grow to $S_T$) and any number of units of the risk-free asset (each unit growing to $B_T$). These holdings can be fractional and/or negative (a negative cash-account holding, for instance, represents a borrowing, which would leverage any holding in the risky asset). For the risk-free asset, it is slightly artificial to talk about a number of units because there is no natural unit to a cash deposit in practice. One would say ‘I have $x$ dollars in the cash account’ but not ‘I have $x$ units in the cash-account asset’. But units are helpful for portfolio construction because the risk-free asset can then be defined independently of any particular portfolio, and all portfolios can be described in terms of units in each asset. Figure 8.1 illustrates the two assets in this model, the one-period binomial model, often shortened to the one-period model.
As usual, let us not lose sight of the basic economics. There are two investments available, a little like the efficient portfolios in Chaps. 4 and 5 (the asset, however, is not the market portfolio; it is some particular asset, as derivatives are linked to particular underlying assets or variables).
Some risk-free return is obtainable, but risk would need to be taken on to enhance one’s expected return. The risk premium of the asset (the expected return it offers above the risk-free rate) is a function of $u$, $d$, $p$, and $r$ (it would be a good exercise to calculate the risk premium in terms of these parameters).
Now that we have an underlying asset, we can characterise derivatives written on it. A derivative depends on its underlying, and we have seen this can be captured mathematically by specifying a suitable payoff function, giving the derivative value at its maturity in terms of the underlying. We let $\Phi(\cdot)$ denote a generic payoff function. In the case of a put option, for example, we set $\Phi(x) = (K - x)^+$, for the relevant strike price $K$.
[Figure 8.1: the risky asset starts at $S_0$ and moves to $uS_0 = S_T^u$ with probability $p$ or to $dS_0 = S_T^d$ with probability $1 - p$; the risk-free asset grows from $B_0 = 1$ to $B_T = e^{rT}$.]
Fig. 8.1 An illustration of the one-period binomial model of a risky and risk-free asset
Now consider a portfolio with the following number of units held in the underlying and cash account, respectively:
$$\frac{\Phi(S_T^u) - \Phi(S_T^d)}{S_0 (u - d)} \quad \text{and} \quad e^{-rT}\,\frac{u\,\Phi(S_T^d) - d\,\Phi(S_T^u)}{u - d}.$$
The value of this portfolio at time $T$ is given by
$$\frac{\Phi(S_T^u) - \Phi(S_T^d)}{S_0 (u - d)}\, S_T + e^{-rT}\,\frac{u\,\Phi(S_T^d) - d\,\Phi(S_T^u)}{u - d}\, B_T.$$
Exercise: Show that the above quantity equals $\Phi(S_T^u)$ if $S_T = S_T^u$ and equals $\Phi(S_T^d)$ if $S_T = S_T^d$.
Crucially, this shows that the derivative with payoff function $\Phi(\cdot)$ has been replicated: we do not know what value $S_T$ will take on, but we do know that the above portfolio will take on a value of $\Phi(S_T)$, i.e., will replicate the derivative. Unlike in Chap. 7, however, this replication is in the context of a specific model, where $S_T$ behaves in a specific (in this case, binary, or binomial) way. This is noteworthy. As theorists, we are ever mindful of the relationship between the assumptions made and results obtained; adding the assumption of the distribution of the underlying asset has allowed us to pin down prices for vanilla options, say, while this is not possible in a model-free setting.
We have seen the usefulness of replicating derivative contracts by trading in more fundamental assets, for the specific case of forwards in Sect. 7.2, and can now extend this to other derivatives. The usefulness is quite profound, starting with the ability to price derivatives with the principle of no arbitrage. Let $X_0$ denote the derivative price, which must, if the model is to be arbitrage-free, be given by the cost of the replicating portfolio in the above exercise. Ensure you are comfortable that the following is the correct implication:
$$X_0 = \frac{\Phi(S_T^u) - \Phi(S_T^d)}{u - d} + e^{-rT}\,\frac{u\,\Phi(S_T^d) - d\,\Phi(S_T^u)}{u - d}.$$
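The replication argument above can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values (S0 = 100, u = 1.2, d = 0.8, r = 5%, T = 1) and a put with strike 100; none of these numbers come from the text, and the function name is hypothetical.

```python
import math

def replicating_portfolio(S0, u, d, r, T, payoff):
    """Units of the underlying and of the cash account that replicate `payoff`."""
    phi_u, phi_d = payoff(S0 * u), payoff(S0 * d)
    units_asset = (phi_u - phi_d) / (S0 * (u - d))
    units_cash = math.exp(-r * T) * (u * phi_d - d * phi_u) / (u - d)
    return units_asset, units_cash

# Illustrative (assumed) parameters and a put payoff with strike K.
S0, u, d, r, T, K = 100.0, 1.2, 0.8, 0.05, 1.0, 100.0
put = lambda x: max(K - x, 0.0)

units_asset, units_cash = replicating_portfolio(S0, u, d, r, T, put)
B_T = math.exp(r * T)  # one unit of the cash account grows from B0 = 1 to e^{rT}

# Portfolio value at time T in each state; should match the put payoff.
value_up = units_asset * S0 * u + units_cash * B_T
value_down = units_asset * S0 * d + units_cash * B_T

# Time-0 cost of the portfolio (B0 = 1), i.e. the no-arbitrage price X0.
X0 = units_asset * S0 + units_cash * 1.0

print(units_asset, units_cash, value_up, value_down, X0)
```

The holding in the underlying is negative here, as one would expect when replicating a put, and the cash holding is positive.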
Is there anything further to be noted about the above price? More generally, are there broad principles central to replication-based pricing of general derivatives that we can learn? The answer is an emphatic yes. Consider rearranging the generic derivative price above, and representing it as follows:
$$X_0 = e^{-rT}\left[\frac{e^{rT} - d}{u - d}\,\Phi(S_T^u) + \frac{u - e^{rT}}{u - d}\,\Phi(S_T^d)\right] = e^{-rT}\left[q\,\Phi(S_T^u) + (1 - q)\,\Phi(S_T^d)\right],$$
where
$$q = \frac{e^{rT} - d}{u - d}.$$
The interest in this representation is that the derivative price appears as if it were a discounted expectation/average of the future derivative value (the two possibilities
being $\Phi(S_T^u)$ and $\Phi(S_T^d)$). The expectation, however, is not taken under the proper probabilities; $q$, as defined above, is used rather than $p$, the actual probability associated with an up jump in the underlying. The probability $p$ is known as a real-world probability. This name distinguishes it from $q$, which is not real, in the sense that we introduced it artificially, and is not the probability of any event in the model. The artificial probability $q$ is known as a risk-neutral (or risk-adjusted) probability. Recall that a risk-neutral investor prices an asset as its discounted expected value, being indifferent (neutral) to any variation around the expectation. Risk-neutral probabilities are the probabilities one must use to make the market seem risk neutral. ‘Seem’ is crucial: the market is not risk neutral, typically, and so prices are not given by discounted expectations. The market settles on some adjustment to the discounted expectation, depending on investors’ risk preferences.1 And we have shown that this adjustment can be captured by adjusting the probabilities of the model.
In one way, the above ideas are unsurprising: we know that investors tend to have non-neutral risk preferences and that market prices do not tend to be given by discounted expectations of their future value. What is surprising, though, is that the real-world probability $p$ does not appear in the derivative price at all. Whether the underlying has a 99.9% chance of jumping up, or a 99.9% chance of jumping down, or anything else, derivative prices are unaffected. How can this be? Well, we have proven it above. But the intuitive explanation is that the proof is based on the ability to hedge away the risk of the underlying completely: the replicating portfolio behaves like the derivative regardless of whether you move up or down. What is going to happen and how likely the possibilities are just do not matter; what matters is the cost of guaranteeing this indifference.
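To make the irrelevance of the real-world probability concrete, here is a small sketch under assumed parameters (the same illustrative S0 = 100, u = 1.2, d = 0.8, r = 5%, T = 1 put example; the function names are hypothetical). It computes the price as a discounted expectation under q, and contrasts it with naive discounted expectations under two very different choices of p, neither of which is the no-arbitrage price.

```python
import math

def risk_neutral_probability(u, d, r, T):
    """q = (e^{rT} - d) / (u - d), the risk-neutral up probability."""
    return (math.exp(r * T) - d) / (u - d)

def discounted_expectation(S0, u, d, r, T, payoff, prob_up):
    """Discounted expected payoff under a given up probability."""
    expected = prob_up * payoff(S0 * u) + (1.0 - prob_up) * payoff(S0 * d)
    return math.exp(-r * T) * expected

# Illustrative (assumed) parameters and a put payoff with strike K.
S0, u, d, r, T, K = 100.0, 1.2, 0.8, 0.05, 1.0, 100.0
put = lambda x: max(K - x, 0.0)

q = risk_neutral_probability(u, d, r, T)
price = discounted_expectation(S0, u, d, r, T, put, q)  # no-arbitrage price

# Discounted real-world expectations depend heavily on p; the price does not.
naive_high_p = discounted_expectation(S0, u, d, r, T, put, 0.999)
naive_low_p = discounted_expectation(S0, u, d, r, T, put, 0.001)

print(q, price, naive_high_p, naive_low_p)
```

The function that produces `price` never receives p as an input, which is the point of the passage above.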
Conceptually, the risk-neutral probabilities account for risk preferences. More practically, they are a computational tool to determine no-arbitrage derivative prices. This tool is very efficient. One does not need to explicitly calculate replicating portfolios to determine a replication-based derivative price: you just need to identify the risk-neutral probabilities of the model once, and you can price any number of derivatives with the above formula. The one-period model is very simple though; can this idea be implemented in more general models? Can the effect of risk preferences always be captured by an adjustment to the probabilities of the model? There is a simple and beautiful answer to this question, which we will state towards the end of the chapter.
Just as discussed in Sect. 7.2, calculating derivative prices is just the beginning. For reasons discussed in Chap. 6, there is a demand for derivative positions, not just from investors but from all over the economy. Banks need to hedge derivatives as well as price them because, when they are approached by a party wishing to enter a derivative position, they need to: (i) price the derivative appropriately, at the outset, and (ii) manage the risk in the position, as things evolve. If a fund manager
1 This is not to say that the discounted expectation is observable, or that market participants think in these terms. But this is a conceptually clear way of thinking about a market price.
approaches a bank wishing to buy a protective put, it is possible that the bank is willing to face the risk that the underlying falls in value and the put goes into the money, but it is much more likely that the bank wishes to hedge this risk (or, more realistically, the net risk that remains from the multitude of positions the bank will adopt).
Exercise: Suppose a client approaches a bank wanting a long position in a certain derivative with payoff function $\Phi(\cdot)$. Show that if the bank buys
$$\frac{\Phi(S_T^u) - \Phi(S_T^d)}{S_0 (u - d)}$$
units of the underlying, there is no variability in their total position.2
This number (the number of units of underlying needed to offset the risk in the short position, or replicate the risk in the long position) is informative. It is known as the derivative’s delta. If positive, the derivative involves a positive exposure to the underlying; it would be suitable for a speculator predicting that the underlying will appreciate, or for a hedger concerned about an increase in value. Once the bank has added delta units of the underlying asset to their books, their position is said to be delta neutral: they had an exposure to the underlying (measured by the negative of delta, because they are short), but they have offset (or neutralised, or hedged) it by taking a suitable position in the underlying.
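The delta-neutral position can be illustrated with a short sketch. It assumes the same illustrative one-period parameters as before and a call with strike 100 (all assumed values, not from the text), and checks that a short derivative position combined with delta units of the underlying has the same terminal value in both states.

```python
def one_period_delta(S0, u, d, payoff):
    """Units of the underlying that offset a short position in the derivative."""
    return (payoff(S0 * u) - payoff(S0 * d)) / (S0 * (u - d))

# Illustrative (assumed) parameters and a call payoff with strike K.
S0, u, d, K = 100.0, 1.2, 0.8, 100.0
call = lambda x: max(x - K, 0.0)

delta = one_period_delta(S0, u, d, call)

# Bank is short the derivative and long `delta` units of the underlying.
hedged_up = delta * S0 * u - call(S0 * u)
hedged_down = delta * S0 * d - call(S0 * d)

print(delta, hedged_up, hedged_down)
```

The two terminal values coincide, so the remaining position is riskless within the model; the premium received and any cash-account holding only shift this common level.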
8.2
The Black–Scholes–Merton Model
Because of the limited scope of this overview, we largely have to skip over a very interesting topic: the multi-period binomial model. Consider starting with a one-period model and then appending two additional one-period model distributions to the end values, or nodes, as shown in Fig. 8.2. This can be extended as far as desired. Instead of one period representing the whole life of the model, it now represents one step in the discretisation of the life of the model. The modeller is free to choose the step size (which we denote $\Delta t$) and the terminal time of the model.
Multi-period binomial models, known simply as binomial models, can be seen as an intermediate case between the one-period binomial model and the continuous-time Black–Scholes–Merton (BSM) model. As the step size is decreased, and the number of steps over a fixed total time increases, one approaches a continuous-time model, with continuous distributions for the asset price. In particular, if $u$ and $d$ are constant across steps, the binomial model converges to the BSM model, where the risky asset price follows a process called geometric Brownian motion. In the binomial model, as seen in Fig. 8.2, one obtains a sequence of random variables, $S_0, S_{\Delta t}, S_{2\Delta t}, \ldots$, which represents the risky asset over time. As the step time
2 In other words, the bank puts $\frac{\Phi(S_T^u) - \Phi(S_T^d)}{u - d}$ rands/dollars into the underlying.
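[Figure 8.2: a two-step tree in which $S_0$ moves to $S_{\Delta t}^{(u)}$ or $S_{\Delta t}^{(d)}$, and then to one of $S_{2\Delta t}^{(u,u)}$, $S_{2\Delta t}^{(u,d)}$, or $S_{2\Delta t}^{(d,d)}$.]
Fig. 8.2 An illustration of a multi-period binomial model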
decreases, one approaches a continuous-time stochastic process, where there is a random variable $S_t$ for all times $t \geq 0$.3 Geometric Brownian motion (GBM) satisfies the following property:
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t,$$
where $\mu$ and $\sigma$ are constants, and the stochastic process $\{W_t\}_{t \geq 0}$ is a (standard) Brownian motion. We are suppressing a great deal of mathematical detail, which you will cover elsewhere. Not only does GBM satisfy the above, it does so uniquely, so the above equation is sufficient to define GBM.
Although we are glossing over details, the intuitive meaning of the above equation is not too hard to see. GBM is not a constant, trivial stochastic process; it changes over time. The above equation specifies how the changes behave (for this reason, the equation can be called the dynamics of the process). Infinitesimal changes to the process, changes at the smallest possible time scale, $dS_t$, are given by a deterministic part $\mu S_t\,dt$ and a random part $\sigma S_t\,dW_t$. The random part has a mean of zero, so the deterministic part determines the mean. It is sometimes called the drift, as it dictates the average level of the process as it changes over time: on average, the process tends to drift in this direction. In particular, for GBM, $E[S_T] = S_0 e^{\mu T}$. A continuously compounded return of $\mu$ is thus earned when an asset modelled with the above GBM dynamics takes its average.4
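The mean property of GBM can be checked by simulation. The sketch below samples terminal values using the standard closed-form solution $S_T = S_0 \exp((\mu - \sigma^2/2)T + \sigma\sqrt{T}\,Z)$ with $Z$ standard normal; this solution is not derived in the text and is taken here as a known result. The parameter values, sample size, and seed are illustrative assumptions.

```python
import math
import random

def sample_gbm_terminal(S0, mu, sigma, T, n, seed=0):
    """Draw n samples of S_T from the closed-form GBM solution."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    drift = (mu - 0.5 * sigma ** 2) * T
    scale = sigma * math.sqrt(T)
    return [S0 * math.exp(drift + scale * rng.gauss(0.0, 1.0)) for _ in range(n)]

# Illustrative (assumed) parameters.
S0, mu, sigma, T = 100.0, 0.08, 0.2, 1.0
samples = sample_gbm_terminal(S0, mu, sigma, T, n=200_000)

sample_mean = sum(samples) / len(samples)
theoretical_mean = S0 * math.exp(mu * T)  # E[S_T] = S0 e^{mu T}

# The average log return is close to (mu - sigma^2/2) T rather than mu T.
log_returns = [math.log(s / S0) for s in samples]
mean_log_return = sum(log_returns) / len(log_returns)

print(sample_mean, theoretical_mean, mean_log_return)
```

The gap between the mean of the log return and the log of the mean price is the Jensen's-inequality adjustment mentioned in the footnote.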
3 A time-indexed set of random variables is called a stochastic process. Each random variable represents a value at a particular point in time. Geometric Brownian motion is an example of a continuous-time stochastic process, where the time-index set is an interval of the real numbers. It is also continuous in the sense that particular realisations are continuous functions of time, i.e., they do not jump.
4 Sometimes $\mu$ is referred to as the mean return, but this is not quite correct, or at least a little ambiguous. The mean of the distribution of returns is in fact $\mu - \frac{\sigma^2}{2}$. The adjustment can be seen as a result of Jensen’s inequality.
The other key parameter $\sigma$ is called the volatility, as it controls how variable, or how volatile, the changes (and therefore returns) are. In particular, $V[\log(S_T/S_0)] = \sigma^2 T$. By contrast, the risk-free asset (or cash account) has zero volatility, following dynamics of
$$dB_t = rB_t\,dt.$$
It simply drifts up in a deterministic way, according to the risk-free rate $r$, representing a non-risky cash deposit. In particular, $B_T = E[B_T] = B_0 e^{rT}$.
You will learn to derive the above properties of GBM and show that cross-sections of GBM (GBM at specific time points) are log-normally distributed. The key tool is Itô’s Lemma, a generalisation of the chain rule needed to handle the non-differentiable processes based on Brownian motion. Financial modelling in continuous time relies heavily on the work of the Japanese mathematician Kiyosi Itô, who in the 1940s showed that standard integration can be extended to non-differentiable (more precisely, infinite variation) functions. A truly remarkable piece of history came to light in the year 2000, when the contents of a sealed letter showed that Itô’s key insights were developed independently at about the same time, by the French–German mathematician Wolfgang Doeblin.
Whether you view GBM as a limiting case of the binomial model, or as a process born directly out of the stochastic calculus developed by Itô and Doeblin, the BSM model uses this process to represent the evolution of a risky asset price.5 You will also learn to derive the key results of the model from the specification. Here, we will simply state the results, highlighting their importance but suppressing their full justification.
With the risky asset established, we can consider derivatives written on it. Let $X$ denote a derivative payoff occurring at time $T$.
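The view of GBM as a limit of the binomial model can be explored numerically. The sketch below prices a European put on a multi-period tree by applying the one-period risk-neutral probability at every step. The choice $u = e^{\sigma\sqrt{\Delta t}}$, $d = 1/u$ is the Cox–Ross–Rubinstein parametrisation, which is an assumption here (the text does not specify how $u$ and $d$ relate to $\sigma$), as are the parameter values.

```python
import math

def binomial_price(S0, r, sigma, T, n, payoff):
    """Price of a European payoff on an n-step binomial tree (CRR factors assumed)."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))  # assumed CRR up factor
    d = 1.0 / u                          # assumed CRR down factor
    q = (math.exp(r * dt) - d) / (u - d)  # per-step risk-neutral probability
    expected = 0.0
    for k in range(n + 1):  # k = number of up moves over the n steps
        prob = math.comb(n, k) * q ** k * (1.0 - q) ** (n - k)
        expected += prob * payoff(S0 * u ** k * d ** (n - k))
    return math.exp(-r * T) * expected

# Illustrative (assumed) parameters and a put payoff with strike K.
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
put = lambda x: max(K - x, 0.0)

prices = {n: binomial_price(S0, r, sigma, T, n, put) for n in (1, 10, 100, 500)}
for n, price in prices.items():
    print(n, price)
```

As the number of steps grows, the tree prices settle down; under these assumptions the limit is the BSM price of the same put.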
Often, we can capture a derivative with a payoff function, i.e., $X = \Phi(S_T)$; note, though, that this does not capture path-dependent derivatives (make sure you can explain why!).6 A path-dependent derivative may depend on a finite number of points along the underlying’s path (e.g., $X = \Phi(S_{T_1}, \ldots, S_{T_n})$) but in principle can rely on the whole path, i.e., $X = \Phi(\{S_t\}_{0 \leq t \leq T})$. You will see, in other courses, how the probability-theory method
5 Binomial models can be seen as models in their own right, or as approximations to continuous-time models. Their discrete nature gives them certain tractability advantages, and there is a large body of literature studying them (and generalisations such as trinomial models, known collectively as tree models or methods).
6 Note also that path-dependent derivatives cannot be handled at all in the one-period model; it is just too crude, not involving any specification for the underlying between the initial and terminal times. The full binomial model can, though, and because of the contents of Footnote 5, it is in fact a popular method of pricing and hedging path-dependent claims.
of capturing information (in sigma algebras) is used to define the class of possible derivatives.
The most important theoretical result about the BSM model is that it is complete. This means that all derivatives can be replicated in the model, with a portfolio consisting of the underlying asset and cash account. One has to prove this, of course; we are stating a conclusion upfront (in a rigorous stochastic calculus course, the order of the material will be very different). Given that the model is complete, all derivative prices can be given as their replication cost. As in the one-period model, the idea of risk-neutral or risk-adjusted probabilities gives a very powerful way of calculating these costs, but the mathematics is more complicated. Roughly speaking, all the probabilities in a model can be collectively called the probability measure. The real-world measure is the probability measure that governs how things actually behave (such as the above price dynamics). The risk-neutral measure, a central part of mathematical finance theory, is a tool to price derivatives. Changing probability measures is an important topic in financial modelling theory (it is much more complicated, in general, than changing a single number, from $p$ to some $q$). In BSM, and other models based on Brownian motion, the Girsanov theorem facilitates this. Under the risk-neutral measure, the following dynamics describe the underlying asset:
$$dS_t = rS_t\,dt + \sigma S_t\,d\tilde{W}_t,$$
where $\{\tilde{W}_t\}$ is a Brownian motion under the risk-neutral measure. The intuition for why the risk-neutral dynamics take on this form (where $r$ has replaced $\mu$) is this: after the probabilities have been adjusted to account for risk preferences, the risky asset has no more tendency to drift upwards, to provide returns, than the risk-free asset. Indeed,
$$\tilde{E}[S_T] = S_0 e^{rT},$$
where $\tilde{E}[\cdot]$ indicates an expectation based on the risk-neutral probability measure. The above result holds much more widely than in BSM (exercise: show that it holds in the one-period model). In the one-period model, you do not need to know $p$ to price derivatives; in BSM, you do not need to know $\mu$; to put it generally, you do not need to know the real-world expectation of the future underlying asset value (or, the closely related expected return).7 For this reason, you will often see the real-world dynamics skipped over, with the above risk-neutral dynamics being introduced immediately (even though the idea and introduction of the risk-neutral measure rests on replication being possible in the context of the real-world-measure behaviour of the model).
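For the one-period model, the claim that the risky asset earns the risk-free rate under the risk-neutral probability can be verified in a few lines. The following worked derivation is a sketch using only the one-period definition $q = (e^{rT} - d)/(u - d)$ from Sect. 8.1.

```latex
\begin{aligned}
\tilde{E}[S_T] &= q\, S_0 u + (1 - q)\, S_0 d \\
               &= S_0 \left[ d + q\,(u - d) \right] \\
               &= S_0 \left[ d + \frac{e^{rT} - d}{u - d}\,(u - d) \right] \\
               &= S_0\, e^{rT}.
\end{aligned}
```

The real-world probability plays no role in the calculation, mirroring the disappearance of $\mu$ from the risk-neutral dynamics in BSM.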
7 This is a significant advantage of relative pricing methods, such as BSM, over absolute pricing methods, such as CAPM. For reasons beyond our scope (this was mentioned in Chap. 5 in the context of $\alpha$), it is difficult to estimate an investment’s expected return with accuracy.
It is common (though not universal) to use $\mathbb{Q}$ to denote the risk-neutral measure, and $\mathbb{P}$ the real-world measure.8 In other texts, you may see risk-neutral expectations denoted $E^{\mathbb{Q}}[\cdot]$. You may also see the measures labelled differently (e.g., $\tilde{P}$ or $P_0$ rather than $\mathbb{Q}$). Some notational choices are common but are an ultimately arbitrary way of referring to the underlying concepts.
The Girsanov theorem allows us to characterise the risk-neutral measure (once) and then price (any and all) derivatives. The following is the famous risk-neutral formula:
$$X_0 = e^{-rT}\,\tilde{E}[X],$$
where $X_0$ is the time-0 price of the derivative that pays $X$ at time $T$. The above should make intuitive sense to you, both as a generalisation of the one-period pricing expression and as an expression of risk neutrality (if things behave risk neutrally, as they do under $\mathbb{Q}$, pricing depends only on averages, not variation around the average). If one considers vanilla options (by setting $X = (S_T - K)^+$ or $X = (K - S_T)^+$) and calculates the above expectation, one obtains the famous Black–Scholes formulae (the BSM model can do much more than price calls and puts, but, being the most common derivative that cannot be priced in a model-independent way, this is the most famous application of BSM).
If the price of a derivative violates the risk-neutral formula, an easy long–short arbitrage (between the derivative and its replicating portfolio) is possible. It is important to understand, though, that this is a theoretical arbitrage, in the context of the model. Recall from Chap. 7 that a theoretical arbitrage may only be apparent because the assumptions propping it up may be faulty. For the model-dependent derivatives under consideration here, the assumptions are guaranteed to be faulty. There is a lot to be said about how an imperfect model interacts with the world that we cannot cover here.
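The Black–Scholes formulae referred to above are not written out in the text; the sketch below implements their standard closed forms for a European call and put, using the usual auxiliary quantities $d_1$ and $d_2$. The parameter values are illustrative assumptions, and the normal distribution function comes from Python's standard library.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution function

def bs_d1_d2(S0, K, r, sigma, T):
    """Standard auxiliary quantities d1 and d2 of the Black-Scholes formulae."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return d1, d1 - sigma * math.sqrt(T)

def bs_call(S0, K, r, sigma, T):
    """Time-0 price of a European call, X = (S_T - K)^+."""
    d1, d2 = bs_d1_d2(S0, K, r, sigma, T)
    return S0 * N(d1) - K * math.exp(-r * T) * N(d2)

def bs_put(S0, K, r, sigma, T):
    """Time-0 price of a European put, X = (K - S_T)^+."""
    d1, d2 = bs_d1_d2(S0, K, r, sigma, T)
    return K * math.exp(-r * T) * N(-d2) - S0 * N(-d1)

# Illustrative (assumed) parameters.
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
call_price = bs_call(S0, K, r, sigma, T)
put_price = bs_put(S0, K, r, sigma, T)

# Model-free sanity check: call minus put equals S0 - K e^{-rT}.
parity_gap = (call_price - put_price) - (S0 - K * math.exp(-r * T))

print(call_price, put_price, parity_gap)
```

Note that $\mu$ does not appear among the inputs, consistent with the discussion of risk-neutral pricing.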
Let us just note it would be unusual to attempt an arbitrage based on the difference between a market-prevailing and a model-implied derivative price. Such a difference is likely reflective of the model (whether the dynamics, or parameters, or other assumptions or aspects), not of a grossly inefficient market. It is also worth noting something that makes quantitative finance challenging (and interesting): one needs to develop an understanding of models, such as BSM, in their own right, as well as the financial markets in practice, both of which are hard enough, and only then attempt to understand their interaction, the application of models to the actual markets.
Given these limitations, what is the usefulness of the BSM model? First, one often needs to price a derivative, and a model-dependent price is better than no price at all. If a client approaches a bank wanting a derivative position, the model can at the very least provide a benchmark value against which the potential business can
8 Note also that this terminology is not universal. Other names for the real-world measure include the physical, historical, statistical, or objective measure. The risk-neutral measure is also known as the risk-adjusted, the martingale, or simply the pricing measure.
be considered. Second, the model suggests a way that the derivative can be hedged, by the bank in this example. Perhaps the bank is content to allow their exposure (resulting from entering the derivative agreement with its client) to sit naked on their books, but this is unlikely. Banks do carry risky exposures, but they do not like to do so in a haphazard way. The model-implied hedge gives them the opportunity to hedge the risk, or at least to understand how it could be hedged, allowing them to quantify and thus manage the risk.
Above, it was asserted that all derivatives can be replicated or hedged in the BSM model (i.e., that the model is complete). How does one go about replicating or hedging a derivative? The rough answer is with the delta-hedging rule. A derivative’s delta is its sensitivity, its partial derivative, with respect to the underlying asset price. When one computes the risk-neutral formula, one gets an expression in terms of $S_0$. The current estimate/expectation of future behaviour of the asset depends on the current state of the asset.9 The partial derivative of the expression with respect to $S_0$, the delta, is the number of units of the underlying one must hold (or short) to replicate (or hedge) the position. This is completely intuitive; if the derivative value is highly sensitive to the current value of the underlying, for instance, then you need a high number of units in the underlying to mimic the derivative value. What is not intuitive though is the specific delta of a specific position, and the BSM model addresses this. Note that this does not just apply at the initial time: to replicate or hedge a position, one must maintain a delta hedge over time, as things evolve and change.
The risk-neutral formula above is the special case at time zero of the general formula:
$$X_t = e^{-r(T - t)}\,\tilde{E}[X \mid S_t],$$
where the expectation is conditional: when you are sitting at time $t$ (not necessarily $t = 0$), and you want to know the derivative price $X_t$, you know $S_t$ and can use it (condition on it) when taking the expectation.10
Third, the idea of the delta can be extended, giving the so-called Greeks, which are extremely useful for understanding and managing the risks of a derivative position. The delta is useful for hedging a position, but also for quantifying how dependent the position is on the underlying asset. But derivative values, from the risk-neutral formula, depend on other things as well. One can differentiate with
9 It does not, however, in this and most models, depend on the past. This is known as the Markov property.
10 The delta-hedging rule is only a rough answer to how derivatives are replicated, because things can be a bit more mathematically delicate for path-dependent claims. If $X = \Phi(S_T)$, the delta-hedging rule is rock solid. If not, the result of the risk-neutral formula is not guaranteed to be differentiable. The martingale representation theorem, a very profound result about Brownian motion, is necessary to prove the completeness of the BSM model. Note that we have also glossed over the other (risk-free) part of the portfolio. The cash-account holding must vary in a way that finances the necessary delta-dictated holding alongside it. Attending to this formally requires some stochastic calculus and is known as ensuring the self-financing condition.
respect to time $t$; the result is called the theta, indicating how the derivative value changes over time, all else being equal. A derivative price depends on the volatility parameter $\sigma$, as this controls an important aspect of the future behaviour of the underlying, and the sensitivity to it is called the vega. The delta is the most important, though, with the underlying asset being the primary risk factor, and certain Greeks are used to understand how the delta behaves. The main example is gamma, the second partial derivative of the derivative price with respect to the underlying, in other words, the first derivative of the delta.
It would be remiss not to mention an alternative approach in which derivative prices can be computed in the BSM and other similar models. Instead of computing the risk-neutral formula, one can deduce a partial differential equation that the derivative pricing function must satisfy (recall that the result of the risk-neutral formula is a function of the asset price and other inputs). The underlying replication justification is still the same, but the price (the replication cost) can be calculated without using risk-neutral probabilities, or probabilities at all. In fact, this is the original method used by Black, Scholes, and Merton. It is a bit less general than the risk-neutral method, as it cannot (easily) handle path-dependent claims.11 The improved generality, mathematical elegance, and conceptual clarity of the risk-neutral method were pioneered by Harrison, Pliska and Kreps in the late 1970s and early 1980s, and then given full mathematical generality and rigour in the 1990s by Delbaen and Schachermayer. These results extend well beyond the BSM model, and we discuss them in the final section, below.
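The Greeks of a European call in BSM have well-known closed forms, which the sketch below compares against finite-difference approximations of the pricing function. The closed forms (delta $= N(d_1)$, gamma $= n(d_1)/(S\sigma\sqrt{T})$, vega $= S\,n(d_1)\sqrt{T}$, with $n$ the standard normal density) are standard results not stated in the text; the parameter values and bump sizes are illustrative assumptions.

```python
import math
from statistics import NormalDist

_std = NormalDist()  # standard normal distribution (cdf and pdf)

def d1(S, K, r, sigma, T):
    return (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))

def bs_call(S, K, r, sigma, T):
    a = d1(S, K, r, sigma, T)
    b = a - sigma * math.sqrt(T)
    return S * _std.cdf(a) - K * math.exp(-r * T) * _std.cdf(b)

def call_delta(S, K, r, sigma, T):
    return _std.cdf(d1(S, K, r, sigma, T))

def call_gamma(S, K, r, sigma, T):
    return _std.pdf(d1(S, K, r, sigma, T)) / (S * sigma * math.sqrt(T))

def call_vega(S, K, r, sigma, T):
    return S * _std.pdf(d1(S, K, r, sigma, T)) * math.sqrt(T)

# Illustrative (assumed) parameters and bump sizes.
S, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
h, e = 0.01, 1e-4

# Central finite differences of the pricing function.
fd_delta = (bs_call(S + h, K, r, sigma, T) - bs_call(S - h, K, r, sigma, T)) / (2 * h)
fd_gamma = (bs_call(S + h, K, r, sigma, T) - 2 * bs_call(S, K, r, sigma, T)
            + bs_call(S - h, K, r, sigma, T)) / h ** 2
fd_vega = (bs_call(S, K, r, sigma + e, T) - bs_call(S, K, r, sigma - e, T)) / (2 * e)

print(call_delta(S, K, r, sigma, T), fd_delta)
print(call_gamma(S, K, r, sigma, T), fd_gamma)
print(call_vega(S, K, r, sigma, T), fd_vega)
```

The finite-difference versions are useful in their own right: they apply to any pricing function, including ones with no closed-form derivative.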
8.3
Beyond Black–Scholes–Merton
Why is the BSM model so well-known and widely used? The primary reason is that it strikes an excellent balance between tractability and empirical plausibility. On the former, virtually all derivatives, including exotic and path-dependent ones, can be priced with BSM in closed form; this is ultimately due to the normal/Gaussian distribution underlying GBM.12 On the latter, while there is much to be said about how BSM fails to capture certain aspects of observed risky-asset behaviour, the normally distributed return implied by GBM is an excellent first approximation, capturing how returns vary over time, usually being moderate, but occasionally differing more significantly from the mean, with larger differences occurring less
11 The risk-neutral method we have covered is also called the martingale method. We have not described it in terms of martingales, but you will become extremely familiar with this concept in stochastic calculus courses; it is the natural mathematical way of characterising the risk-neutral measure and formula.
12 One can argue that the normal distribution is not itself especially tractable, but that it just happens that a great deal of effort has been devoted to approximating it. Indeed, the normal cumulative distribution is not known in closed form, strictly speaking. However, almost all software packages have very efficient and accurate means of approximating it, and in practice, it is like a closed-form function.
often.13 Largely for this reason, this balance of tractability and plausibility, but probably also due to some historical happenstance, BSM has become the central benchmark model of the derivatives markets.
An indication of this (and an interesting way to see that the BSM model is flawed, but that it is still very useful) is the idea of implied volatility. We noted above that the BSM pricing implications may not agree with the market, and one cannot simply enjoy arbitrage profits if this occurs. A common practice is to reverse the pricing process as we have described it: instead of calculating a price with a model, you take a price that prevails on the market and ask how you can make your model agree with that price. This is called calibration. To calibrate the BSM model, you manipulate the volatility parameter, $\sigma$, to ensure agreement with a market-prevailing vanilla option price.14 This volatility value is thus implied by the market. It turns out when you do this, you get different implied volatility values for different options (if you plot the implied volatility across different strike prices, you get the so-called volatility skew or smile; if you plot over strikes and expiries, you get the implied volatility surface). This contradicts the BSM assumption of a single, fixed volatility parameter. The BSM model is so standard that, in practice, vanilla option prices are quoted in terms of their implied volatility (you need to plug the quotes into the Black–Scholes formula to get the price in terms of ordinary currency). Implied volatility is not useful for pricing vanilla options (why not?) but is useful for: (i) hedging them in a way that is consistent with the market (changing the implied volatility changes the delta, and the hedge suggested by the model) and (ii) pricing exotic options in a way that is consistent with prevailing vanilla option prices (you could say that BSM allows you to extrapolate from vanilla to exotic option prices).
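Calibrating $\sigma$ to a single option price amounts to a one-dimensional root search. The sketch below uses bisection, which relies only on the call price being increasing in volatility; bisection is one simple choice among many and is not prescribed by the text. The "market" price here is synthetic, generated from the model itself with an assumed volatility of 25%, so the example is a round trip rather than a calibration to real quotes.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution function

def bs_call(S0, K, r, sigma, T):
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S0, K, r, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Volatility at which the BSM call price matches `price`, found by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S0, K, r, mid, T) > price:
            hi = mid  # model price too high: volatility must be lower
        else:
            lo = mid  # model price too low: volatility must be higher
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Illustrative (assumed) inputs; the target price is synthetic, not a market quote.
S0, K, r, T = 100.0, 110.0, 0.05, 1.0
true_sigma = 0.25
target_price = bs_call(S0, K, r, true_sigma, T)

recovered_sigma = implied_vol(target_price, S0, K, r, T)
print(target_price, recovered_sigma)
```

Repeating the search across a range of strikes, with observed prices in place of the synthetic one, would trace out the skew or smile described above.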
The BSM model, through implied volatility structures and other means, points the way to more complicated and realistic models. We said above that BSM is the central benchmark model. If you want to understand a more complicated model, and try to gauge whether the complexity is helpful, whether one is prepared to sacrifice tractability, you compare it to BSM. If you want to hedge a complicated derivative position in a way more sophisticated than BSM, you still calculate and give the BSM hedge as a benchmark. Other modellers will expect this comparison. The BSM model is a central reference point, simple enough to be very well understood, but complicated enough to be the basis for comparison, discussion, and extension. We conclude with two extremely important results in mathematical finance that pertain not just to BSM, but to all models, from the simplest discrete-time models to the extensions of BSM alluded to in the above paragraph. These results are so important that they are collectively known as the fundamental theorem(s) of
[13] For example, empirical return distributions tend to feature asymmetry (with a negative skewness in particular) and excess kurtosis (i.e., thicker tails than the normal distribution). Generalisations of BSM are often motivated by these kinds of observations.
[14] The Vega in BSM is always positive, ensuring a one-to-one relationship between volatility and implied option price. For this reason, implied volatilities are unique.
asset pricing (some authors say there are two separate theorems, some that there is one with two parts; it does not really matter). The first says that a financial model (any model, any specification of one or more risky assets as well as a risk-free one) is free of arbitrage if and only if there exists a risk-neutral measure in the model. The second says that a model is complete if and only if the risk-neutral measure is unique. If this does not impress you, you probably should not be studying quantitative finance. The fundamental theorems could not really be more simple, more general, or more interesting, linking the concepts of key financial interest (arbitrage, completeness, risk-neutral probabilities) in a natural and familiar mathematical way (in terms of existence and uniqueness).
Let us briefly discuss the first. Recall the idea that lack of arbitrage is not quite like the assumption of the ability to short or borrow at a constant rate; it is an open-ended condition to be investigated. Well, the investigation is concluded. If you want to know if a model admits arbitrage (a seemingly very difficult question: how to get a handle on the vast multitude of possible trading strategies and ensure none involve arbitrage?), you exploit the first (part of the) fundamental theorem: check whether you can construct a risk-neutral measure. Let us consider the one-period model. How could the existence of the risk-neutral measure fail? We simply defined the risk-neutral probability of an up jump as $q = \frac{e^{rT} - d}{u - d}$ (after which everything worked out; derivatives could be priced in a way that avoided arbitrage). The only way this could fail is if q was not between zero and one, i.e., was not a valid probability. And then (and only then) would there be arbitrage in the model. Exercise: if q < 0, show that you can borrow from the cash account and invest in the risky asset for a guaranteed, cost-free arbitrage profit (and flip this around if q > 1).
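The check described above is easy to carry out numerically. The following Python sketch computes q for the one-period model and, for a case with q < 0, tabulates the profit from borrowing to buy the risky asset. The function names and the parameter values are hypothetical, chosen only for illustration, and the sketch treats the boundary cases q = 0 and q = 1 as arbitrage as well (the usual strict condition d < e^{rT} < u).

```python
import math

def risk_neutral_prob(u, d, r, T):
    """Risk-neutral probability of an up jump: q = (e^{rT} - d) / (u - d)."""
    return (math.exp(r * T) - d) / (u - d)

def admits_arbitrage(u, d, r, T):
    """True when q fails to lie strictly between zero and one."""
    q = risk_neutral_prob(u, d, r, T)
    return not (0.0 < q < 1.0)

def borrow_and_buy_payoffs(S0, u, d, r, T):
    """Profit in the (up, down) states from borrowing S0 to buy one share.

    The position costs nothing today; at time T the share is sold and the
    loan, grown to S0 * e^{rT}, is repaid.
    """
    debt = S0 * math.exp(r * T)
    return S0 * u - debt, S0 * d - debt

# A sensible model: the risk-free growth factor lies between d and u.
print(risk_neutral_prob(u=1.2, d=0.9, r=0.05, T=1.0))

# A broken model: even the down move beats the cash account, so q < 0.
print(risk_neutral_prob(u=1.3, d=1.1, r=0.05, T=1.0))
print(borrow_and_buy_payoffs(S0=100.0, u=1.3, d=1.1, r=0.05, T=1.0))
```

In the second model both terminal profits are positive at zero initial cost, which is exactly the arbitrage the exercise asks for. The case q > 1 is the mirror image: short the risky asset and invest the proceeds in the cash account.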
For the second theorem, recall that model completeness is an incredibly important property of the BSM model.[15] It is useful, suggesting a hedge for any derivative of interest, but it goes further than this. It shows that the risk exhibited by derivatives is really just the risk associated with the underlying and that you may as well, in principle at least, just trade in the underlying directly. In practice, things are not so simple: with trading costs and other frictions, the derivative can be much more convenient and effective. But completeness shows conceptually how one can understand, access, and manage the common risk.
But models are not necessarily complete. Stochastic volatility models involve a volatility that changes randomly, unlike the constant volatility parameter in BSM. This is an additional risk source that cannot be hedged or replicated by trading in the underlying, giving rise to an incomplete model. If you take these models seriously, trying to perfectly hedge a volatility-sensitive derivative is a fool’s errand. What about the fundamental theorem? An incomplete model does not have a unique risk-neutral measure; there are multiple measures and therefore multiple risk-neutral formulae. Thus, derivatives will not necessarily have a unique price, reflecting the fact they cannot be replicated (although some will, e.g., those that can be replicated model-independently). The various risk-neutral
[15] It is also important for the one-period model, but in that context it is much more straightforward to prove and therefore less striking a result.
measures will imply a range of prices, which are the various prices compatible with the absence of arbitrage. For vanilla options, this range cannot be wider than the model-independent bounds we derived in Chap. 7 (why not?) but, depending on the model, it can be narrower. There is a significant body of literature on how to discriminate between these various prices.
There is a very great deal more to financial modelling than the BSM model, but you will struggle to understand much of it without understanding the BSM model first. This overview is itself inadequate, of course, but should give you some context and guidance for grappling with the details in other more technical courses. The fundamental finance of the earlier chapters, namely a basic picture of the financial system (Chap. 1), the ideas of risk and return (Chaps. 1 and 2), market efficiency (Chap. 3), diversification (Chap. 4), and risk premia and systemic risk (Chap. 5), should assist you in developing and applying financial intuitions that you should consciously bear in mind when dealing with mathematical details.
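The range of arbitrage-free prices in an incomplete model, discussed above, can be seen in a toy example. The Python sketch below uses a one-period model with three possible outcomes (a trinomial model), which is not a model from this chapter but is perhaps the simplest incomplete one: a single risky asset cannot span three states. All names and parameter values are assumptions made for illustration, and the grid scan over the middle-state probability is a crude device, not a recommended numerical method.

```python
import math

def risk_neutral_measures(u, m, d, r, T, n=1000):
    """Yield strictly positive (q_u, q_m, q_d) that sum to one and make the
    expected growth of the risky asset equal to the risk-free growth e^{rT}.

    q_m is scanned over a grid; q_u and q_d then follow from the two
    linear constraints.
    """
    R = math.exp(r * T)
    for i in range(1, n):
        q_m = i / n
        q_u = (R - q_m * m - (1.0 - q_m) * d) / (u - d)
        q_d = 1.0 - q_m - q_u
        if q_u > 0.0 and q_d > 0.0:
            yield q_u, q_m, q_d

def call_price_range(S0, K, u, m, d, r, T):
    """Smallest and largest risk-neutral call prices over the scanned measures."""
    disc = math.exp(-r * T)
    payoffs = [max(S0 * x - K, 0.0) for x in (u, m, d)]
    prices = [disc * sum(q * p for q, p in zip(qs, payoffs))
              for qs in risk_neutral_measures(u, m, d, r, T)]
    return min(prices), max(prices)

S0, K, r, T = 100.0, 100.0, 0.0, 1.0
low, high = call_price_range(S0, K, u=1.2, m=1.0, d=0.8, r=r, T=T)

# Standard model-independent bounds for a European call, assumed here to be
# the ones the text refers to: max(S0 - K e^{-rT}, 0) <= C <= S0.
lower_bound = max(S0 - K * math.exp(-r * T), 0.0)
upper_bound = S0
print(low, high, lower_bound, upper_bound)
```

In this example the price works out analytically to 10(1 − q_m), so the arbitrage-free prices fill the open interval between 0 and 10. That interval sits inside the model-independent bounds of 0 and 100, illustrating how a specific model can narrow the range without pinning down a single price.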
Literature Notes
This chapter is a working summary of modelling and arbitrage theory, which is rigorously developed in Björk (2004), Shreve (2004, 2005), Hull (2015), and Cvitanic and Zapatero (2004), to name a few prime examples. The ideas originate with Rendleman and Bartter (1979) and Cox et al. (1979), for the binomial model, and of course with Black and Scholes (1973) and Merton (1973). Credit for development of the theory, and the fundamental theorems in particular, goes to Harrison and Kreps (1979), Harrison and Pliska (1981), and Delbaen and Schachermayer (1994).
References
Björk, T. (2004). Arbitrage theory in continuous time. Oxford University Press.
Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637–654.
Cox, J. C., Ross, S. A., & Rubinstein, M. (1979). Option pricing: A simplified approach. Journal of Financial Economics, 7(3), 229–263.
Cvitanic, J., & Zapatero, F. (2004). Introduction to the economics and mathematics of financial markets. MIT Press.
Delbaen, F., & Schachermayer, W. (1994). A general version of the fundamental theorem of asset pricing. Mathematische Annalen, 300(1), 463–520.
Harrison, J. M., & Kreps, D. M. (1979). Martingales and arbitrage in multiperiod securities markets. Journal of Economic Theory, 20(3), 381–408.
Harrison, J. M., & Pliska, S. R. (1981). Martingales and stochastic integrals in the theory of continuous trading. Stochastic Processes and their Applications, 11(3), 215–260.
Hull, J. C. (2015). Options, futures, and other derivatives (9th ed.). Pearson.
Merton, R. C. (1973). Theory of rational option pricing. The Bell Journal of Economics and Management Science, 4, 141–183.
Rendleman, R. J., & Bartter, B. J. (1979). Two-state option pricing. Journal of Finance, 34(5), 1093–1110.
Shreve, S. E. (2004). Stochastic calculus for finance II: Continuous-time models. Springer.
Shreve, S. E. (2005). Stochastic calculus for finance I: The binomial asset pricing model. Springer.