HANDBOOK OF FINANCIAL ECONOMETRICS, MATHEMATICS, STATISTICS, AND MACHINE LEARNING
Volume 1

Editors
Cheng Few Lee, Rutgers University, USA
John C. Lee, Center for PBBEF Research, USA
World Scientific
New Jersey • London • Singapore • Beijing • Shanghai • Hong Kong • Taipei • Chennai • Tokyo
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Library of Congress Cataloging-in-Publication Data
Names: Lee, Cheng F., editor. | Lee, John C., editor.
Title: Handbook of financial econometrics, mathematics, statistics, and machine learning (in 4 volumes) / edited by Cheng Few Lee (Rutgers University, USA) & John C. Lee (Center for PBBEF Research, USA).
Description: New Jersey : World Scientific, 2020.
Identifiers: LCCN 2019013810 | ISBN 9789811202384 (set) | ISBN 9789811202414 (Vol. 1) | ISBN 9789811202421 (Vol. 2) | ISBN 9789811202438 (Vol. 3) | ISBN 9789811202445 (Vol. 4)
Subjects: LCSH: Econometrics--Handbooks, manuals, etc. | Finance--Statistical methods--Handbooks, manuals, etc.
Classification: LCC HB139 .H3634 2020 | DDC 332.01/5195--dc23
LC record available at https://lccn.loc.gov/2019013810
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
Copyright © 2021 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
For any available supplementary material, please visit
https://www.worldscientific.com/worldscibooks/10.1142/11335#t=suppl

Desk Editors: Aanand Jayaraman/Yulin Jiang
Typeset by Stallion Press
Email: [email protected]

Printed in Singapore
July 6, 2020
12:1
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-fm
Preface
Financial econometrics, mathematics, statistics, and machine learning have become very important tools for both empirical and theoretical research in finance and accounting. Econometric methods are essential for research in asset pricing, corporate finance, options and futures, and financial accounting. Important econometric methods used in this research include single-equation multiple regression, simultaneous-equation regression, panel data analysis, time-series analysis, spectral analysis, non-parametric analysis, semi-parametric analysis, GMM analysis, and other methods. Portfolio theory and management research has used different statistical distributions, such as the normal, stable, and log-normal distributions. Options and futures research has used the binomial, log-normal, non-central chi-square, and Poisson distributions, among others. Auditing research has used sampling survey techniques to determine the sampling error and non-sampling error in auditing.

Besides financial econometrics, financial mathematics and financial technology are also important for theoretical research and practical applications in all of the above-mentioned subjects. Based upon our years of experience working in industry, teaching classes, conducting research, writing textbooks, and editing journals on financial econometrics, mathematics, statistics, and technology, this handbook reviews, discusses, and integrates theoretical, methodological, and practical issues of financial econometrics, mathematics, statistics, and machine learning.

There are 131 chapters in this handbook. Chapter 1 presents an introduction to financial econometrics, mathematics, statistics, and machine learning and shows how readers can use this handbook. The remaining chapters, contributed by accredited authors, can be classified under the following 16 topics:
(i) Financial Accounting and Auditing (2, 3, 17, 22, 30, 32, 53, 54, 105, 130)
(ii) Mutual Funds and Hedge Funds (12, 87, 91, 108, 125)
(iii) Corporate Finance (45, 46, 49, 59, 67, 76, 96, 97)
(iv) Asset Pricing (10, 12, 19, 48, 62, 64, 70, 79, 81, 82, 98, 99, 100, 114, 122, 129)
(v) Options (7, 15, 24, 27, 33, 34, 40, 42, 50, 51, 83, 84, 85, 86, 89, 102, 103, 106, 109, 110, 128)
(vi) Portfolio Analysis (4, 13, 14, 43, 47, 80, 81, 82, 88, 93, 104, 121)
(vii) Risk Management (9, 15, 20, 35, 38, 39, 44, 50, 57, 58, 61, 65, 72, 73, 78, 101, 107, 127, 131)
(viii) International Finance (13, 52, 56, 60)
(ix) Investment and Behavioral Finance (16, 18, 30, 31, 36, 42, 68, 69, 74, 87, 90, 94, 95, 103, 118, 119, 121, 123)
(x) Banking Management (25, 63, 66, 71, 117)
(xi) Event Study (53, 67)
(xii) Futures and Index Futures (11, 89, 92)
(xiii) Financial Econometrics (2, 3, 4, 5, 6, 10, 11, 22, 23, 25, 26, 28, 29, 32, 37, 48, 55, 56, 60, 62, 63, 70, 75, 79, 90, 92, 94, 95, 98, 99, 100, 107, 108, 111, 112, 113, 116, 123, 124, 125, 129)
(xiv) Financial Mathematics (24, 27, 34, 40, 41, 49, 54, 86, 96, 97, 102, 104, 106, 109, 110, 128)
(xv) Financial Statistics (7, 8, 16, 19, 33, 35, 36, 38, 39, 52, 57, 58, 59, 61, 64, 65, 68, 69, 72, 73, 74, 77, 78, 85, 93, 113, 114, 115, 118, 120)
(xvi) Financial Machine Learning (14, 20, 21, 31, 39, 43, 44, 46, 55, 56, 84, 101, 112, 117, 119, 127)

In the preparation of this handbook, we would first like to thank the members of the advisory board and the contributors. In addition, we appreciate the extensive help of our editors, Mr. Yulin Jiang and Mr. Aanand Jayaraman; our assistants, Natalie Krawczyk and Yuanyuan Xiao; and our secretary, Rita Liu. There are undoubtedly some errors in the finished product, both typographical and conceptual. We invite readers to send suggestions, comments, criticisms, and corrections to Professor Cheng F. Lee at the Department of Finance and Economics, Rutgers University, at the email address cfl[email protected].

May 2020
Cheng Few Lee
John C. Lee
Advisory Board

Ivan Brick, Rutgers, The State University of New Jersey, USA
Stephen Brown, New York University, USA
Charles Q. Cao, Penn State University, USA
Wayne Ferson, Boston College, USA
Lawrence R. Glosten, Columbia University, USA
Martin J. Gruber, New York University, USA
Richard E. Kihlstrom, University of Pennsylvania, USA
E. H. Kim, University of Michigan, USA
Robert McDonald, Northwestern University, USA
Ehud I. Ronn, The University of Texas at Austin, USA
ChunChi Wu, State University of New York at Buffalo, USA
About the Editors
Cheng Few Lee is a Distinguished Professor of Finance and Economics at Rutgers University and was the chairperson of the Department of Finance from 1988 to 1995. He has also served on the faculties of the University of Illinois (IBE Professor of Finance) and the University of Georgia. He has maintained academic and consulting ties in Taiwan, Hong Kong, China, and the United States for the past four decades. He has been a consultant to many prominent organizations, including the American Insurance Group, the World Bank, and the United Nations, and to renowned groups such as The Marmon Group Inc., Wintek Corporation, and Polaris Financial Group.

Professor Lee founded the Review of Quantitative Finance and Accounting (RQFA) in 1990 and the Review of Pacific Basin Financial Markets and Policies (RPBFMP) in 1998, and serves as the managing editor of both journals. He was also a co-editor of the Financial Review (1985–1991) and the Quarterly Review of Economics and Business (1987–1989). In the past 47 years, Dr. Lee has written numerous textbooks on subjects ranging from financial management, corporate finance, security analysis, and portfolio management to financial analysis, planning and forecasting, and business statistics. Dr. Lee has also published more than 240 articles in more than 20 different journals on finance, accounting, economics, statistics, and management. Professor Lee has been ranked the most-published finance professor worldwide during 1953–2008. He authored the Handbook of Quantitative Finance and Risk Management (with John C. Lee and Alice C. Lee, 2010) and the Handbook of Financial Econometrics and Statistics (with John C. Lee, 2015); both handbooks were published by Springer. This book, the Handbook of Financial Econometrics, Mathematics, Statistics, and Machine Learning, will be published by World Scientific Publishing Co. in 2020.
Professor Lee earned the Siwei Cheng Award in Quantitative Management from the International Academy of Information Technology and Quantitative Management (IAITQM) in May 2013. He has trained more than 100 Ph.D. students in finance and accounting over the past 47 years. Most recently, Professor Lee published his autobiography, From East to West — Memoirs of a Finance Professor on Academia, Practice, and Policy (World Scientific Publishing Co., 2017).

John C. Lee is a Microsoft Certified Professional in Microsoft Visual Basic and Microsoft Excel VBA. He holds bachelor's and master's degrees in accounting from the University of Illinois at Urbana-Champaign. John has worked for over 20 years in both business and technical fields as an accountant, an auditor, a systems analyst, and a business software developer. He is the lead author of Essentials of Excel VBA, SAS, and MINITAB for Statistical and Financial Analysis (Springer, 2017), a companion text to Statistics of Business and Financial Economics, of which he is one of the co-authors. In addition, he published Financial Analysis, Planning and Forecasting, Third Edition (with Cheng Few Lee). John has been a Senior Technology Officer at the Chase Manhattan Bank and an Assistant Vice President at Merrill Lynch. Currently, he is the Director of the Center for PBBEF Research.
Contents of Volume 1

Preface, p. v
Advisory Board, p. vii
About the Editors, p. ix
List of Contributors, p. xxxiii

1. Introduction to Financial Econometrics, Mathematics, Statistics, and Machine Learning (C. F. Lee), p. 1
2. Do Managers Use Earnings Forecasts to Fill a Demand They Perceive from Analysts? (O. Barron, J. Cao, X. Sheng, M. Thevenot and B. Xin), p. 101
3. A Potential Benefit of Increasing Book–Tax Conformity: Evidence from the Reduction in Audit Fees (N.-T. Kuo and C. F. Lee), p. 151
4. Gold in Portfolio: A Long-Term or Short-Term Diversifier? (F.-L. Lin, S.-Y. Yang and Y.-F. Chen), p. 199
5. Econometric Approach to Financial Analysis, Planning, and Forecasting (C. F. Lee), p. 225
6. Forecast Performance of the Taiwan Weighted Stock Index: Update and Expansion (D.-Y. Ji, H.-Y. Chen and C. F. Lee), p. 275
7. Parametric, Semi-Parametric, and Non-Parametric Approaches for Option-Bound Determination: Review and Comparison (C. F. Lee and P. G. Zhang), p. 297
8. Measuring the Collective Correlation of a Large Number of Stocks (W.-F. Niu and H. H.-S. Lu), p. 335
9. Key Borrowers Detected by the Intensities of Their Interactions (F. Aleskerov, I. Andrievskaya, A. Nikitina and S. Shvydun), p. 355
10. Application of the Multivariate Average F-Test to Examine Relative Performance of Asset Pricing Models with Individual Security Returns (S. Rahman and M. J. Schneider), p. 391
11. Hedge Ratio and Time Series Analysis (S.-S. Chen, C. F. Lee and K. Shresth), p. 431
12. Application of Intertemporal CAPM on International Corporate Finance (J.-R. Chang, M.-W. Hung and C. F. Lee), p. 485
13. What Drives Variation in the International Diversification Benefits? A Cross-Country Analysis (W.-J. P. Chiou and K. Pukthuanthong), p. 519
14. A Heteroskedastic Black–Litterman Portfolio Optimization Model with Views Derived from a Predictive Regression (W.-H. Lin, H.-W. Teng and C.-C. Yang), p. 563
15. Pricing Fair Deposit Insurance: Structural Model Approach (T. Tai, C. F. Lee, T.-S. Dai, K. L. Wang and H.-Y. Chen), p. 583
16. Application of Structural Equation Modeling in Behavioral Finance: A Study on the Disposition Effect (H.-H. Chang), p. 603
17. External Financing Needs and Early Adoption of Accounting Standards: Evidence from the Banking Industry (S. I-L. Wang), p. 627
18. Improving the Stock Market Prediction with Social Media via Broad Learning (X. Zhang and P. S. Yu), p. 677
19. Sourcing Alpha in Global Equity Markets: Market Factor Decomposition and Market Characteristics (S. S. Mohanty), p. 737
20. Support Vector Machines Based Methodology for Credit Risk Analysis (J. Li, M. Liu, C. F. Lee and D. Wu), p. 791
21. Data Mining Applications in Accounting and Finance Context (W. Kwak, Y. Shi and C. F. Lee), p. 823
22. Trade-off Between Reputation Concerns and Economic Dependence for Auditors — Threshold Regression Approach (F.-C. Lin, C.-C. Chien, C. F. Lee, H.-C. Lin and Y.-C. Lin), p. 859
23. ASEAN Economic Community: Analysis Based on Fractional Integration and Cointegration (L. A. Gil-Alana and H. Carcel), p. 889
24. Alternative Methods for Determining Option Bounds: A Review and Comparison (C. F. Lee, Z. Zhong, T. Tai and H. Chuang), p. 917
25. Financial Reforms and the Differential Impact of Foreign Versus Domestic Banking Relationships on Firm Value (H.-C. Yu, C. F. Lee and B. J. Sopranzetti), p. 947
26. Time-Series Analysis: Components, Models, and Forecasting (C. F. Lee), p. 979
Contents of Volume 2

Preface, p. v
Advisory Board, p. vii
About the Editors, p. ix
List of Contributors, p. xxxiii

27. Itô's Calculus and the Derivation of the Black–Scholes Option-Pricing Model (G. Chalamandaris and A. G. Malliaris), p. 1025
28. Durbin–Wu–Hausman Specification Tests (R. H. Patrick), p. 1075
29. Jump Spillover and Risk Effects on Excess Returns in the United States During the Great Recession (J. Schlossberg and N. R. Swanson), p. 1109
30. Earnings Forecasts and Revisions, Price Momentum, and Fundamental Data: Further Explorations of Financial Anomalies (J. Guerard and A. Mark), p. 1151
31. Ranking Analysts by Network Structural Hole (R.-J. Guo, Y. Lu and L. Xie), p. 1211
32. The Association Between Book-Tax Differences and CEO Compensation (K.-W. Lee and G. H.-H. Yeo), p. 1245
33. Stochastic Volatility Models: Faking a Smile (D. Diavatopoulos and O. Sokolinskiy), p. 1271
34. Entropic Two-Asset Option (T. Sebehela), p. 1295
35. The Joint Determinants of Capital Structure and Stock Rate of Return: A LISREL Model Approach (H.-Y. Chen, C. F. Lee and T. Tai), p. 1345
36. Time-Frequency Wavelet Analysis of Stock-Market Co-Movement Between and Within Geographic Trading Blocs (B. Kaffel and F. Abid), p. 1399
37. Alternative Methods to Deal with Measurement Error (H.-Y. Chen, A. C. Lee and C. F. Lee), p. 1439
38. Simultaneously Capturing Multiple Dependence Features in Bank Risk Integration: A Mixture Copula Framework (X. Zhu, J. Li and D. Wu), p. 1485
39. GPU Acceleration for Computational Finance (C.-H. Han), p. 1519
40. Does VIX Truly Measure Return Volatility? (K. V. Chow, W. Jiang and J. Li), p. 1533
41. An ODE Approach for the Expected Discounted Penalty at Ruin in a Jump-Diffusion Model (Y.-T. Chen, C. F. Lee and Y.-C. Sheu), p. 1561
42. How Does Investor Sentiment Affect Implied Risk-Neutral Distributions of Call and Put Options? (W.-M. Szu, Y.-C. Wang and W.-R. Yang), p. 1599
43. Intelligent Portfolio Theory and Strength Investing in the Confluence of Business and Market Cycles and Sector and Location Rotations (H. Pan), p. 1637
44. Evolution Strategy-Based Adaptive Lq Penalty Support Vector Machines with Gauss Kernel for Credit Risk Analysis (J. Li, G. Li, D. Sun and C. F. Lee), p. 1675
45. Product Market Competition and CEO Pay Benchmarking (I. E. Brick and D. Palia), p. 1695
46. Equilibrium Rate Analysis of Cash Conversion Systems: The Case of Corporate Subsidiaries (W. Chen, B. Melamed, O. Sokolinskiy and B. S. Sopranzetti), p. 1725
47. Is the Market Portfolio Mean–Variance Efficient? (R. Grauer), p. 1763
48. Consumption-Based Asset Pricing with Prospect Theory and Habit Formation (J.-Y. Wang and M.-W. Hung), p. 1789
49. An Integrated Model for the Cost-Minimizing Funding of Corporate Activities Over Time (M. C. Gupta), p. 1821
50. Empirical Studies of Structural Credit Risk Models and the Application in Default Prediction: Review and New Evidence (H.-H. Lee, R.-R. Chen and C. F. Lee), p. 1845
51. Empirical Performance of the Constant Elasticity Variance Option Pricing Model (R. R. Chen, C. F. Lee and H.-H. Lee), p. 1903
52. The Jump Behavior of a Foreign Exchange Market: Analysis of the Thai Baht (J.-R. Chang, M.-W. Hung, C. F. Lee and H.-M. Lu), p. 1943
53. The Revision of Systematic Risk on Earnings Announcement in the Presence of Conditional Heteroscedasticity (C.-C. Chien, C. F. Lee and S.-C. Chiu), p. 1969
54. Applications of Fuzzy Set to International Transfer Pricing and Other Business Decisions (W. Kwak, Y. Shi, H. Lee and C. F. Lee), p. 1991
55. A Time-Series Bootstrapping Simulation Method to Distinguish Sell-Side Analysts' Skill from Luck (C. Su and H. Zhang), p. 2011
56. Acceptance of New Technologies by Employees in Financial Industry (V. Belousova, V. Solodkov, N. Chichkanov and E. Nikiforova), p. 2053
57. Alternative Method for Determining Industrial Bond Ratings: Theory and Empirical Evidence (L.-J. Kao and C. F. Lee), p. 2081
58. An Empirical Investigation of the Long Memory Effect on the Relation of Downside Risk and Stock Returns (C. Y.-H. Chen and T. C. Chiang), p. 2107
59. Analysis of Sequential Conversions of Convertible Bonds: A Recurrent Survival Approach (L.-J. Kao, L.-S. Chen and C. F. Lee), p. 2141
60. Determinants of Euro-Area Bank CDS Spreads (M.-E. K. Agoraki, D. A. Georgoutsos and G. T. Moratis), p. 2161
61. Dynamic Term Structure Models Using Principal Components Analysis Near the Zero Lower Bound (J. Juneja), p. 2199
62. Effects of Measurement Errors on Systematic Risk and Performance Measure of a Portfolio (C. F. Lee and F. C. Jen), p. 2251
63. Forecasting Net Charge-Off Rates of Banks: A PLS Approach (J. R. Barth, S. Joo, H. Kim, K. B. Lee, S. Maglic and X. Shen), p. 2265
64. Application of Filtering Methods in Asset Pricing (H. Chang and Y. Wu), p. 2303
65. Sampling Distribution of the Relative Risk Aversion Estimator: Theory and Applications (M. J. Karson, D. C. Cheng and C. F. Lee), p. 2323
Contents of Volume 3

Preface, p. v
Advisory Board, p. vii
About the Editors, p. ix
List of Contributors, p. xxxiii

66. Social Media, Bank Relationships and Firm Value (C.-H. Chao and H.-C. Yu), p. 2337
67. Splines, Heat, and IPOs: Advances in the Measurement of Aggregate IPO Issuance and Performance (Z. A. Smith, M. A. M. Al Janabi and M. Z. Mumtaz), p. 2373
68. The Effects of the Sample Size, the Investment Horizon and the Market Conditions on the Validity of Composite Performance Measures: A Generalization (S.-N. Chen and C. F. Lee), p. 2399
69. The Sampling Relationship Between Sharpe's Performance Measure and its Risk Proxy: Sample Size, Investment Horizon and Market Conditions (S.-N. Chen and C. F. Lee), p. 2419
70. VG NGARCH Versus GARJI Model for Asset Price Dynamics (L.-J. Kao and C. F. Lee), p. 2437
71. Why do Smartphone and Tablet Users Adopt Mobile Banking? (V. Belousova and N. Chichkanov), p. 2461
72. Non-Parametric Inference on Risk Measures for Integrated Returns (H. Tsai, H.-C. Ho and H.-Y. Chen), p. 2485
73. Copulas and Tail Dependence in Finance (W.-C. Lai and K.-L. Goh), p. 2499
74. Some Improved Estimators of Maximum Squared Sharpe Ratio (S. K. Choy and B.-q. Yang), p. 2525
75. Errors-in-Variables and Reverse Regression (S. Rahman and C. F. Lee), p. 2547
76. The Role of Financial Advisors in M&As: Do Domestic and Foreign Advisors Differ? (K.-S. Chuang), p. 2565
77. Discriminant Analysis, Factor Analysis, and Principal Component Analysis: Theory, Method, and Applications (C. F. Lee), p. 2599
78. Credit Analysis, Bond Rating Forecasting, and Default Probability Estimation (C. F. Lee), p. 2635
79. Market Model, CAPM, and Beta Forecasting (C. F. Lee), p. 2673
80. Utility Theory, Capital Asset Allocation, and Markowitz Portfolio-Selection Model (C. F. Lee), p. 2713
81. Single-Index Model, Multiple-Index Model, and Portfolio Selection (C. F. Lee), p. 2757
82. Sharpe Performance Measure and Treynor Performance Measure Approach to Portfolio Analysis (P. Chiou and C. F. Lee), p. 2801
83. Options and Option Strategies: Theory and Empirical Results (C. F. Lee), p. 2839
84. Decision Tree and Microsoft Excel Approach for Option Pricing Model (J.-R. Chang and J. Lee), p. 2885
85. Statistical Distributions, European Option, American Option, and Option Bounds (C. F. Lee), p. 2929
86. A Comparative Static Analysis Approach to Derive Greek Letters: Theory and Applications (C. F. Lee and Y. Xiao), p. 2965
87. Fundamental Analysis, Technical Analysis, and Mutual Fund Performance (C. F. Lee), p. 3001
88. Bond Portfolio Management, Swap Strategy, Duration, and Convexity (C. F. Lee), p. 3059
89. Synthetic Options, Portfolio Insurance, and Contingent Immunization (C. F. Lee), p. 3099
90. Alternative Security Valuation Model: Theory and Empirical Results (C. F. Lee), p. 3143
91. Opacity, Stale Pricing, Extreme Bounds Analysis, and Hedge Fund Performance: Making Sense of Reported Hedge Fund Returns (Z. A. Smith, M. A. M. Al Janabi and M. Z. Mumtaz), p. 3193
92. Does Quantile Co-Integration Exist Between Gold Spot and Futures Prices? (H.-C. Yu, C.-J. Lee and D.-T. Hsieh), p. 3219
93. Bayesian Portfolio Mean–Variance Efficiency Test with Sharpe Ratio's Sampling Error (L.-J. Kao, H. C. Soo and C. F. Lee), p. 3241
94. Does Revenue Momentum Drive or Ride Earnings or Price Momentum? (H.-Y. Chen, S.-S. Chen, C.-W. Hsin and C. F. Lee), p. 3263
95. Technical, Fundamental, and Combined Information for Separating Winners from Losers (H.-Y. Chen, C. F. Lee and W.-K. Shih), p. 3319
96. Optimal Payout Ratio Under Uncertainty and the Flexibility Hypothesis: Theory and Empirical Evidence (C. F. Lee, M. C. Gupta, H.-Y. Chen and A. C. Lee), p. 3367
97. Sustainable Growth Rate, Optimal Growth Rate, and Optimal Payout Ratio: A Joint Optimization Approach (H.-Y. Chen, M. C. Gupta, A. C. Lee and C. F. Lee), p. 3413
98. Cross-Sectionally Correlated Measurement Errors in Two-Pass Regression Tests of Asset-Pricing Models (T. Gramespacher, A. Bänziger and N. Hilber), p. 3465
99. Asset Pricing with Disequilibrium Price Adjustment: Theory and Empirical Evidence (C. F. Lee, C.-M. Tsai and A. C. Lee), p. 3491
Contents of Volume 4

Preface, p. v
Advisory Board, p. vii
About the Editors, p. ix
List of Contributors, p. xxxiii

100. A Dynamic CAPM with Supply Effect: Theory and Empirical Results (C. F. Lee, C.-M. Tsai and A. C. Lee), p. 3517
101. Estimation Procedures of Using Five Alternative Machine Learning Methods for Predicting Credit Card Default (H. W. Teng and M. Lee), p. 3545
102. Alternative Methods to Derive Option Pricing Models: Review and Comparison (C. F. Lee, Y. Chen and J. Lee), p. 3573
103. Option Price and Stock Market Momentum in China (J. Li, Y. Yao, Y. Chen and C. F. Lee), p. 3619
104. Advancement of Optimal Portfolio Models with Short-Sales and Transaction Costs: Methodology and Effectiveness (W.-J. P. Chiou and J.-R. Yu), p. 3649
105. The Path Leading up to the New IFRS 16 Leasing Standard: How was the Restructuring of Lease Accounting Received by Different Advocacy Groups? (C. Blecher and S. Kruse), p. 3675
106. Implied Variance Estimates for Black–Scholes and CEV OPM: Review and Comparison (C. F. Lee, Y. Chen and J. Lee), p. 3703
107. Crisis Impact on Stock Market Predictability (R. Mohnot), p. 3737
108. How Many Good and Bad Funds are There, Really? (W. Ferson and Y. Chen), p. 3753
109. Constant Elasticity of Variance Option Pricing Model: Integration and Detailed Derivation (Y. L. Hsu, T. L. Lin and C. F. Lee), p. 3829
110. An Integral Equation Approach for Bond Prices with Applications to Credit Spreads (Y.-T. Chen, C. F. Lee and Y.-C. Sheu), p. 3849
111. Sample Selection Issues and Applications (H.-L. Chuang and S.-Y. Chiu), p. 3867
112. Time Series and Neural Network Analysis (K. C. Tseng, O. Kwon and L. C. Tjung), p. 3887
113. Covariance Regression Model for Non-Normal Data (T. Zou, R. Luo, W. Lan and C.-L. Tsai), p. 3933
114. Impacts of Time Aggregation on Beta Value and R² Estimations Under Additive and Multiplicative Assumptions: Theoretical Results and Empirical Evidence (Y. Xiao, Y. Tang and C. F. Lee), p. 3947
115. Large-Sample Theory (S. Poshakwale and A. Mandal), p. 3985
116. Impacts of Measurement Errors on Simultaneous Equation Estimation of Dividend and Investment Decisions (C. F. Lee and F.-L. Lin), p. 4001
117. Big Data and Artificial Intelligence in the Banking Industry (T. R. Yu and X. Song), p. 4025
118. A Non-Parametric Examination of Emerging Equity Markets Financial Integration (K. Yang, S. Wahab, B. Kolluri and M. Wahab), p. 4043
119. Algorithmic Analyst (ALAN) — An Application for Artificial Intelligence Content as a Service (T. Hong, D. Lee and W. Wang), p. 4075
120. Survival Analysis: Theory and Application in Finance (F. Gao and X. He), p. 4087
121. Pricing Liquidity in the Stock Market (D. Du and O. Hu), p. 4119
122. The Evolution of Capital Asset Pricing Models: Update and Extension (Y.-C. Shih, S.-S. Chen, C. F. Lee and P.-J. Chen), p. 4149
123. The Multivariate GARCH Model and its Application to East Asian Financial Market Integration (Y. Tsukuda, J. Shimada and T. Miyakoshi), p. 4209
124. Review of Difference-in-Difference Analyses in Social Sciences: Application in Policy Test Research (W. H. Greene and M. Liu), p. 4255
125. Using Smooth Transition Regressions to Model Risk Regimes (L. A. Gallagher, M. C. Hutchinson and J. O'Brien), p. 4281
126. Application of Discriminant Analysis, Factor Analysis, Logistic Regression, and KMV-Merton Model in Credit Risk Analysis (C. F. Lee and H.-C. Yu), p. 4313
127. Predicting Credit Card Delinquencies: An Application of Deep Neural Networks (T. Sun and M. A. Vasarhalyi), p. 4349
128. Estimating the Tax-Timing Option Value of Corporate Bonds (P. H. Chen, S. Liu and C. Wu), p. 4383
129. DCC-GARCH Model for Market and Firm-Level Dynamic Correlation in S&P 500 (P. Chen, C. Wu and Y. Zhang), p. 4421
130. Using Path Analysis to Integrate Accounting and Non-Financial Information: The Case for Revenue Drivers of Internet Stocks (A. Kozberg), p. 4441
131. The Implications of Regulation in the Community Banking Sector: Risk and Competition (G. McKee and A. Kagan), p. 4473

Author Index, p. 4509
Subject Index, p. 4587
Reference Index, p. 4613
July 6, 2020
12:1
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-fm
List of Contributors
Fathi Abid University of Sfax, Tunisia Maria-Eleni K. Agoraki Athens University of Economics and Business, Athens, Greece Mazin A. M. Al Janabi EGADE Business School, Tecnologico de Monterrey, Santa Fe Campus, Mexico City, Mexico Fuad Aleskerov National Research University Higher School of Economics (HSE), Moscow, Russia V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences (ICS RAS), Moscow, Russia Irina Andrievskaya National Research University Higher School of Economics (HSE), Moscow, Russia University Niccolo Cusano, Roma, Italy Armin B¨anziger Zurich University of Applied Sciences, Winterthur, Switzerland Orie Barron Penn State University, PA, USA James R. Barth Auburn University, Auburn, AL, USA Veronika Belousova National Research University Higher School of Economics (HSE), Moscow, Russia Christian Blecher Kiel University, Kiel, Germany Ivan E. Brick Rutgers Business School at Newark and New Brunswick, Rutgers University, USA Jian Cao Florida Atlantic University, FL, USA Hector Carcel Bank of Lithuania, Vilnius, Lithuania George Chalamandaris Athens University of Economics and Business, Athens, Greece Jow-Ran Chang National Tsing Hua University, Hsinchu, Taiwan Hsin-Hue Chang Ming Chuan University, Taipei, Taiwan xxxiii
page xxxiii
July 6, 2020
12:1
Handbook of Financial Econometrics,. . . (Vol. 1)
xxxiv
Hao Chang
9.61in x 6.69in
b3568-v1-fm
List of Contributors
Rutgers Business School at Newark and New Brunswick, Rutgers University, USA Chia-Hui Chao Hsing Wu University, New Taipei, Taiwan Yu-Fen Chen Da-Yeh University, Changhua, Taiwan Hsiao-Yin Chen Kainan University, Taoyuan City, Taiwan Sheng-Syan Chen National Chengchi University, Taipei, Taiwan Hong-Yi Chen National Chengchi University, Taipei, Taiwan Yu-Ting Chen National Chiao Tung University, Hsinchu, Taiwan Weiwei Chen Rutgers Business School at Newark and New Brunswick, Rutgers University, USA Ren-Raw Chen Fordham University, New York, NY, USA Cathy Yi-Hsuan Chen Humboldt-Universit¨at, Berlin, Germany Li-Shya Chen Department of Statistics, National Chengchi University, Taipei, Taiwan Son-Nan Chen Shanghai Advanced Institute of Finance (SAIF), Shanghai Jiao Tong University, Shanghai, China Hung-Yin Chen Chung Yuan Christian University, Taoyuan City, Taiwan Yibing Chen Asset Allocation & Research Department, National Council for Social Security Fund, China Yong Chen Texas A&M University, College Station, TX, USA Peimin Chen Southwestern University of Finance and Economics, Chengdu, China Po-Jung Chen National Taiwan University, Taipei, Taiwan Peter Huaiyu Chen Youngstown State University, Youngstown, OH, USA David C. Cheng University of Alabama, Tuscaloosa, AL Thomas C. Chiang Drexel University, Philadelphia, PA, USA Nikolay Chichkanov National Research University Higher School of Economics (HSE), Moscow, Russia Chin-Chen Chien National Cheng Kung University, Tainan, Taiwan Wan-Jiun Paul Chiou Northeastern University, Boston, MA, USA She-Chih Chiu National Taipei University, Taipei, Taiwan Shih-Yung Chiu Soochow University, Taipei, Taiwan K. Victor Chow West Virginia University, Morgantown, West Virginia, USA Siu Kai Choy King’s College London, London, UK
Hongwei Chuang – Tohoku University, Sendai, Miyagi Prefecture, Japan
Kai-Shi Chuang – Tunghai University, Taichung, Taiwan
Hwei-Lin Chuang – National Tsing Hua University, Hsinchu, Taiwan
Tian-Shyr Dai – National Chiao Tung University, Hsinchu, Taiwan
Dean Diavatopoulos – Seattle University, Seattle, WA, USA
Ding Du – Office of the Comptroller of the Currency, USA
Wayne Ferson – University of Southern California, Los Angeles, CA, USA
Liam A. Gallagher – Dublin City University, Dublin, Ireland
Feng Gao – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Dimitris A. Georgoutsos – Athens University of Economics and Business, Athens, Greece
Luis Alberiko Gil-Alana – University of Navarra, Pamplona, Spain
Kim-Leng Goh – University of Malaya, Kuala Lumpur, Malaysia
Thomas Gramespacher – Zurich University of Applied Sciences, Winterthur, Switzerland
Robert Grauer – Simon Fraser University, Canada
William H. Greene – New York University, New York, NY, USA
John Guerard – McKinley Capital Management, LLC
Re-Jin Guo – University of Illinois at Chicago, Chicago, IL, USA
Manak C. Gupta – Temple University, Philadelphia, PA, USA
Chuan-Hsiang Han – National Tsing Hua University, Hsinchu, Taiwan
Xiaomin He – Taiho Oncology, Inc.
Norbert Hilber – Zurich University of Applied Sciences, Winterthur, Switzerland
Hwai-Chung Ho – Academia Sinica, Taipei, Taiwan; National Taiwan University, Taipei, Taiwan
Ted Hong – Beyondbond, Inc.
Der-Tzon Hsieh – National Taiwan University, Taipei, Taiwan
Chin-Wen Hsin – Yuan Ze University, Taoyuan City, Taiwan
Ying-Lin Hsu – National Chung Hsing University, Taichung, Taiwan
Ou Hu – Youngstown State University, Youngstown, OH, USA
Mao-Wei Hung – National Taiwan University, Taipei, Taiwan
Mark C. Hutchinson – University College, Cork, Ireland
Frank C. Jen – State University of New York at Buffalo, Buffalo, NY, USA
Deng-Yuan Ji – Chung Yuan Christian University, Taoyuan City, Taiwan
Wanjun Jiang – Peking University, Beijing, China
Sunghoon Joo – Auburn University, Auburn, AL, USA
Januj Juneja – San Diego State University, San Diego, CA, USA
Bilel Kaffel – University of Sfax, Tunisia
Lie-Jane Kao – Department of Actuarial Mathematics and Statistics, Heriot-Watt University Malaysia, Putrajaya, Malaysia
Marvin J. Karson – University of New Hampshire, Durham, NH, USA
Hyeongwoo Kim – Auburn University, Auburn, AL, USA
Bharat Kolluri – University of Hartford, West Hartford, CT, USA
Anthony Kozberg – Hunter College, New York, NY, USA
Stephanie Kruse – Hunter College, New York, NY, USA; Kiel University, Kiel, Germany
Nan-Ting Kuo – Nankai University, Tianjin, China
Wikil Kwak – University of Nebraska at Omaha, Omaha, Nebraska, USA
Ojoung Kwon – California State University at Fresno, Fresno, CA, USA
Wing-Choong Lai – University of Malaya, Kuala Lumpur, Malaysia
Wei Lan – Southwestern University of Finance and Economics, Chengdu, China
Cheng Few Lee – Department of Finance and Economics, Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Kin-Wai Lee – Nanyang Business School, Nanyang Technological University, Singapore
Alice C. Lee – State Street, MA, USA
Han-Hsing Lee – National Chiao Tung University, Hsinchu, Taiwan
Heeseok Lee – Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon, South Korea
Kang Bok Lee – Auburn University, Auburn, AL, USA
Chia-Ju Lee – Chung Yuan Christian University, Taoyuan City, Taiwan
Michael Lee – Computer Science, Intelligence and Modeling and Simulation Threads, Georgia Institute of Technology, USA
John Lee – Center for PBBEF Research, USA
Daniel Lee – Beyondbond, Inc.
Jianping Li – Institutes of Science and Development, Chinese Academy of Sciences, Beijing, China
Jingrui Li – West Virginia University, Morgantown, West Virginia, USA
Gang Li – Institutes of Science and Development, Chinese Academy of Sciences, Beijing, China
Fu-Lai Lin – Da-Yeh University, Changhua, Taiwan
Wei-Hung Lin – University of Leeds, Leeds, UK
Fang-Chi Lin – National Pingtung University, Pingtung, Taiwan
Hsuan-Chu Lin – National Cheng Kung University, Tainan, Taiwan
Yu-Cheng Lin – National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Tsung-I Lin – National Chung Hsing University, Taichung, Taiwan
Mingxi Liu – Institutes of Science and Development, Chinese Academy of Sciences, Beijing, China
Min (Shirley) Liu – Brooklyn College, CUNY, Brooklyn, NY, USA
Sheen Liu – Washington State University, Pullman, WA, USA
Henry Horng-Shing Lu – National Chiao Tung University, Hsinchu, Taiwan
Yingda Lu – University of Illinois at Chicago, Chicago, IL, USA
Hsin-min Lu – National Taiwan University, Taipei, Taiwan
Ronghua Luo – Southwestern University of Finance and Economics, Chengdu, China
Stevan Maglic – Regions Bank, Birmingham, AL, USA
A. G. Malliaris – School of Business Administration, Loyola University Chicago, Chicago, IL, USA
Anandadeep Mandal – University of Birmingham, UK
Andrew Mark – GlobeFlex Capital, LP
Benjamin Melamed – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Tatsuyoshi Miyakoshi – Hosei University, Tokyo, Japan
Subhransu S. Mohanty – St. Francis Institute of Management & Research, Mumbai, India
Rajesh Mohnot – Middlesex University Dubai, Dubai, UAE
George T. Moratis – Athens University of Economics and Business, Athens, Greece
Muhammad Z. Mumtaz – National University of Sciences and Technology, Islamabad, Pakistan
Ekaterina Nikiforova – National Research University Higher School of Economics (HSE), Moscow, Russia
Alisa Nikitina – National Research University Higher School of Economics (HSE), Moscow, Russia
Wei-Fang Niu – National Chiao Tung University, Hsinchu, Taiwan
John O'Brien – University College, Cork, Ireland
Darius Palia – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Heping Pan – Intelligent Finance Research Center, Chongqing Institute of Finance, Chongqing, China; Finance Research Center, School of Business, Chengdu University, Chengdu, China; Swingtum Prediction, Australia
Robert H. Patrick – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Sunil Poshakwale – Cranfield University, UK
Kuntara Pukthuanthong – University of Missouri, Columbia, MO, USA
Shafiqur Rahman – Rubicon Global Advisors, Portland, Oregon, USA
Jessica Schlossberg – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Matthew J. Schneider – Drexel University, Philadelphia, Pennsylvania, USA
Tumellano Sebehela – School of Construction Economics & Management, WITS University, Johannesburg, South Africa
Xuan Shen – Regions Bank, Birmingham, AL, USA
Xuguang Sheng – American University, Washington, DC, USA
Yuan-Chung Sheu – National Chiao Tung University, Hsinchu, Taiwan
Yong Shi – University of Nebraska at Omaha, Omaha, Nebraska, USA; Chinese Academy of Sciences, Beijing, China
Wei-Kang Shih – Public Company Accounting Oversight Board (PCAOB), Washington DC, USA
Yi-Cheng Shih – National Taipei University, Taipei, Taiwan
Junji Shimada – Aoyama-Gakuin University, Tokyo, Japan
Keshab Shrestha – Monash University Malaysia, Malaysia
Sergey Shvydun – National Research University Higher School of Economics (HSE), Moscow, Russia; V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences (ICS RAS), Moscow, Russia
Zachary A. Smith – Saint Leo University, St Leo, FL, USA
Oleg Sokolinskiy – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Vasily Solodkov – National Research University Higher School of Economics (HSE), Moscow, Russia
Xuehu (Jason) Song – California State University, Stanislaus, Turlock, CA, USA
Huei Ching Soo – Department of Actuarial Mathematics and Statistics, Heriot-Watt University Malaysia, Putrajaya, Malaysia
Ben J. Sopranzetti – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Chen Su – Newcastle University Business School, UK
Dongxia Sun – Institutes of Science and Development, Chinese Academy of Sciences, Beijing, China
Ting Sun – The College of New Jersey, Ewing, NJ, USA
Norman R. Swanson – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Wen-Ming Szu – National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Tzu Tai – Mezocliq, LLC, USA
Yushan Tang – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Huei-Wen Teng – National Chiao Tung University, Hsinchu, Taiwan
Maya Thevenot – Florida Atlantic University, FL, USA
Luna C. Tjung – Credit Suisse AG
Henghsiu Tsai – Academia Sinica, Taipei, Taiwan
Chiung-Min Tsai – The Central Bank of China, Taipei, Taiwan
Chih-Ling Tsai – University of California, Davis, Davis, CA, USA
K.C. Tseng – California State University at Fresno, Fresno, CA, USA
Yoshihiko Tsukuda – Tohoku University, Sendai, Miyagi Prefecture, Japan
Miklos A. Vasarhelyi – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Susan Wahab – University of Hartford, West Hartford, CT, USA
Mahmoud Wahab – University of Hartford, West Hartford, CT, USA
Keh Luh Wang – National Chiao Tung University, Hsinchu, Taiwan
Sophia I-Ling Wang – California State University, Fullerton, Fullerton, CA, USA
Yi-Chen Wang – National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Jr-Yan Wang – National Taiwan University, Taipei, Taiwan
Wenching Wang – Beyondbond, Inc.
Chunchi Wu – State University of New York at Buffalo, Buffalo, NY, USA
Dengsheng Wu – Institutes of Science and Development, Chinese Academy of Sciences, Beijing, China
Yangru Wu – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Yuanyuan Xiao – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Lingling Xie – Fudan University, Shanghai, China
Baohua Xin – University of Toronto, Toronto, Canada
Sheng-Yung Yang – National Chung Hsing University, Taichung, Taiwan
Chi-Chun Yang – Chung Yuan Christian University, Taoyuan City, Taiwan
Wan-Ru Yang – National University of Kaohsiung, Kaohsiung, Taiwan
Bu-qing Yang – Shanghai University of Finance and Economics, Shanghai, China
Ke Yang – University of Hartford, West Hartford, CT, USA
Yanzhen Yao – Institutes of Science and Development, Chinese Academy of Sciences, Beijing, China
Gillian Hian-Heng Yeo – Nanyang Business School, Nanyang Technological University, Singapore
Philip S. Yu – University of Illinois at Chicago, Chicago, IL, USA
Hai-Chin Yu – Chung Yuan Christian University, Taoyuan City, Taiwan
Jing-Rung Yu – National Chi-Nan University, Nantou, Taiwan
T. Robert Yu – University of Wisconsin–Whitewater, Whitewater, WI, USA
Peter Zhang – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Xi Zhang – Beijing University of Posts and Telecommunications, Beijing, China
Hanxiong Zhang – Lincoln International Business School, University of Lincoln, UK
Ying Zhang – Southwestern University of Finance and Economics, Chengdu, China
Zhaodong Zhong – Rutgers Business School at Newark and New Brunswick, Rutgers University, USA
Xiaoqian Zhu – Institutes of Science and Development, Chinese Academy of Sciences, Beijing, China
Tao Zou – The Australian National University, Canberra, Australia
Chapter 1

Introduction to Financial Econometrics, Mathematics, Statistics, and Machine Learning

Cheng Few Lee

Contents
1.1 Introduction
1.2 Financial Econometrics
1.2.1 Single equation regression methods
1.2.2 Simultaneous equation models
1.2.3 Panel data analysis
1.2.4 Alternative methods to deal with measurement error
1.2.5 Time-series analysis
1.2.6 Spectral analysis
1.3 Financial Mathematics
1.4 Financial Statistics
1.4.1 Statistical distributions
1.4.2 Principal components and factor analysis
1.4.3 Non-parametric and semi-parametric analyses
1.4.4 Cluster analysis
1.4.5 Fourier transformation method
1.5 Financial Technology and Machine Learning
1.5.1 Classification of financial technology
1.5.2 Classification of machine learning
1.5.3 Machine learning applications
1.5.4 Other computer science tools used for financial technology
1.6 Applications of Financial Econometrics, Mathematics, Statistics, and Machine Learning
1.6.1 Asset pricing
1.6.2 Corporate finance
1.6.3 Financial institution
1.6.4 Investment and portfolio management
1.6.5 Option pricing model
1.6.6 Futures and hedging
1.6.7 Mutual fund
1.6.8 Credit risk modeling
1.6.9 Other applications
1.7 Overall Discussion of this Book
1.7.1 Chapter title classification
1.7.2 Keyword classification
1.8 Summary and Concluding Remarks
Bibliography
Appendix 1A: Abstracts and Keywords for Chapters 2 to 131

Cheng Few Lee, Rutgers University, e-mail: cfl[email protected]
Abstract The main purpose of this introductory chapter is to give an overview of the following 130 papers, which discuss financial econometrics, mathematics, statistics, and machine learning. There are eight sections in this introductory chapter. Section 1 is the introduction, Section 2 discusses financial econometrics, Section 3 explores financial mathematics, and Section 4 discusses financial statistics. Section 5 discusses financial technology and machine learning, Section 6 explores applications of financial econometrics, mathematics, statistics, and machine learning, and Section 7 gives an overview of the handbook in terms of chapter and keyword classification. Finally, Section 8 provides a summary and some concluding remarks.

Keywords Asset pricing tests • Non-linear regression • Box–Cox transformation • Structural change • Generalized fluctuation • Probit regression • Logit regression • Poisson regression • Fuzzy regression • Path analysis • Three-stage least squares estimation (3SLS) method • Disequilibrium estimation method • Clustering effect model • Multi-factor and multi-indicator (MIMIC) model • Financial mathematics • Financial statistics • Financial technology.
1.1 Introduction

Financial econometrics, mathematics, statistics, and machine learning have been widely used in empirical research in both finance and accounting.
Specifically, econometric methods are important tools for asset pricing, corporate finance, options and futures, and financial accounting research. Econometric methods used in finance- and accounting-related research include single equation multiple regression, simultaneous equation regression, panel data analysis, time-series analysis, spectral analysis, non-parametric analysis, semi-parametric analysis, GMM analysis, and other methods.

In both theory and methodology, we rely upon mathematics, including linear algebra, geometry, differential equations, stochastic differential equations (Ito calculus), optimization, constrained optimization, and others. These forms of mathematics have been used to derive the capital market line, the security market line (capital asset pricing model), the option pricing model, portfolio analysis, and others.

Statistical distributions, such as the normal distribution, stable distributions, and the lognormal distribution, have been used in research related to portfolio theory and risk management. The binomial distribution, lognormal distribution, noncentral chi-square distribution, Poisson distribution, and others have been used in studies related to options and futures. Moreover, risk management research has used copula distributions and other distributions.

Both finance research and its applications need financial technology for empirical analysis. These technologies include Excel, Excel VBA, SAS, MINITAB, MATLAB, machine learning, and others. Simulation methods are also frequently used in empirical financial studies.

This handbook is composed of 130 chapters that show how financial econometrics, mathematics, statistics, and machine learning can be used both theoretically and empirically in finance research. Section 1.1 introduces the topics covered in this handbook. Section 1.2, which contains six subsections, discusses financial econometrics; the subsections cover single equation regression methods, simultaneous equation models, panel data analysis, alternative methods to deal with measurement error, time-series analysis, and spectral analysis. Section 1.3 covers financial mathematics. Section 1.4 and its five subsections discuss financial statistics: statistical distributions, principal components and factor analysis, non-parametric and semi-parametric analyses, cluster analysis, and the Fourier transformation method. Section 1.5 briefly discusses financial technology and machine learning in four subsections: the first and second cover the classification of financial technology and of machine learning, the third covers machine learning applications, and the fourth covers computer science tools used in financial technology. Section 1.6 discusses
applications of financial econometrics, mathematics, statistics, and machine learning, and includes nine subsections covering asset pricing, corporate finance, financial institutions, investment and portfolio management, the option pricing model, futures and hedging, mutual funds, credit risk modeling, and other applications. Section 1.7 gives an overall discussion of the handbook, and Section 1.8 provides a summary and some concluding remarks.

1.2 Financial Econometrics

1.2.1 Single equation regression methods

Heteroskedasticity, specification error, measurement error, skewness and kurtosis effects, non-linear regression and the Box–Cox transformation, structural change, generalized fluctuation, probit and logit regression, Poisson regression, and fuzzy regression are important issues related to single equation regression estimation. These issues are briefly discussed in the following subsections.

1.2.1.1 Heteroskedasticity

White (1980) and Newey and West (1987) authored two important papers discussing how tests for heteroskedasticity can be performed. Specifically, Newey and West (1987) treat heteroskedasticity in the presence of serial correlation.

1.2.1.2 Specification error

Specification error occurs when a relevant variable is omitted from a regression analysis. See the papers by Thursby (1985), Fok et al. (1996), Cheng and Lee (1986), and Maddala et al. (1996) for tests of the existence of specification error.

1.2.1.3 Measurement errors and asset pricing tests

The measurement error problem arises when the independent variables in a regression are observed imprecisely. Lee and Jen (1978), Kim (1995, 2010), Miller and Modigliani (1966), and Chen, Lee, and Lee (2015) have explored how measurement error methods can be applied in finance research. Chen, Lee, and Lee (2015) have discussed alternative errors-in-variables estimation methods and their application in finance research.
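The attenuation bias that errors in variables induce can be illustrated with a short simulation (a sketch with made-up data; the variable names and numbers are ours, not from any of the cited studies):

```python
import numpy as np

# Errors-in-variables sketch: regressing y on a noisily measured regressor
# biases the OLS slope toward zero (classical attenuation).
rng = np.random.default_rng(0)
n = 20_000
beta = 1.0
x_true = rng.normal(size=n)                     # true (unobserved) regressor
y = beta * x_true + rng.normal(scale=0.5, size=n)
x_obs = x_true + rng.normal(scale=1.0, size=n)  # observed with measurement error

# OLS slope of y on the mismeasured regressor
slope = np.cov(y, x_obs)[0, 1] / np.var(x_obs)

# plim of the slope is beta * var(x) / (var(x) + var(error)) = 0.5 here,
# so the estimate lands near 0.5 instead of the true 1.0
print(slope)
```

Grouping, instrumental variables, and the other corrections surveyed in Chen, Lee, and Lee (2015) are designed to undo exactly this bias.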
In his dissertation Errors-in-Variables Estimation Procedures with Applications to a Capital Asset Pricing Model, Lee (1973) extensively studied alternative methods
dealing with measurement error problems in regression. He also used individual security data instead of portfolio data to test asset pricing. Recently, researchers have realized that portfolio data are not suitable for testing asset pricing models and have therefore started to use individual firms' rates of return. These research papers include Kim (1995), Jegadeesh et al. (2019), and others. In sum, it is now clear that the traditional approach of using portfolio data to test asset pricing is no longer suitable.

For the last 50 years, empirical research in capital asset pricing has been carried out by many researchers. Besides the errors-in-variables problem mentioned above, Harvey et al. (2016) and McLean and Pontiff (2016) used adjusted t-statistics in multiple testing and concluded that the significance criterion should be t = 3 instead of t = 2.

1.2.1.4 Skewness and kurtosis effect

Skewness and kurtosis are two important measures in the analysis of stock return variation. Lee (1976), Sears and Wei (1988), and Lee and Wu (1985) discuss skewness and kurtosis issues in asset pricing.

1.2.1.5 Non-linear regression and Box–Cox transformation

Non-linear regression and the Box–Cox transformation are important tools for finance, accounting, and urban economics research. Lee (1976, 1977), Lee et al. (1990), Frecka and Lee (1983), and Liu (2006) have discussed how non-linear regression and Box–Cox transformation techniques can be used to improve the specification of finance and accounting models. In addition, Kau and Lee (1976) and Kau et al. (1986) have explored how the Box–Cox transformation can be used in empirical studies of urban structure.

1.2.1.6 Structural change

Yang (1989), Lee et al. (2011), and Lee et al. (2012) have discussed how structural change models can be used to improve empirical studies of dividend policy and the issuance of new equity.
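A structural change in regression coefficients can be checked with a split-sample F test; the sketch below simulates a slope shift at a known break point (the data, seed, and break date are illustrative, not taken from the cited studies):

```python
import numpy as np

# Split-sample (Chow-type) F test for a known break point.
rng = np.random.default_rng(1)
n, brk = 200, 100
x = rng.normal(size=n)
# slope shifts from 0.5 to 1.5 at the break
y = np.where(np.arange(n) < brk, 1.0 + 0.5 * x, 1.0 + 1.5 * x)
y = y + 0.2 * rng.normal(size=n)

def ssr(xs, ys):
    """Sum of squared OLS residuals from a regression with an intercept."""
    X = np.column_stack([np.ones_like(xs), xs])
    resid = ys - X @ np.linalg.lstsq(X, ys, rcond=None)[0]
    return float(resid @ resid)

k = 2                                        # parameters per regime
ssr_pooled = ssr(x, y)                       # restricted: one regime
ssr_split = ssr(x[:brk], y[:brk]) + ssr(x[brk:], y[brk:])  # unrestricted
F = ((ssr_pooled - ssr_split) / k) / (ssr_split / (n - 2 * k))
print(F)  # a large F rejects parameter stability
```

With the simulated slope shift, the F statistic is far above conventional critical values, so stability is rejected.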
Chow (1960) proposed a dummy variable approach to examine the existence of structural change in regression analysis. Zeileis et al. (2002) have developed software to perform the Chow test and other structural change tests, which have been frequently used in finance and economics research. In addition, Hansen (1996, 1997, 1999, 2000a, 2000b) has explored the
issue of threshold regressions and their applications in detecting structural change in regression.

1.2.1.7 Generalized fluctuation

Kuan and Hornik (1995) have discussed how the generalized fluctuation test can be used to detect structural change in regression. In addition, Lee et al. (2011) have used both theory and econometric methods to test for cross-sectional structural change in dividend policy research.

1.2.1.8 Probit and logit regression

Probit and logit regressions are frequently used in credit risk analysis. Ohlson (1980) used accounting ratios and macroeconomic data for credit risk analysis. Shumway (2001) used accounting ratios and stock rates of return for credit risk analysis in terms of probit and logit regression techniques. More recently, Hwang et al. (2008, 2009) and Cheng et al. (2010) have extended probit and logit regression for credit risk analysis by introducing non-parametric and semi-parametric techniques into this kind of regression analysis.

1.2.1.9 Poisson regression

Lee and Lee (2014) have discussed how Poisson regression can be used to investigate the relationship between multiple directorships, corporate ownership, and firm performance.

1.2.1.10 Fuzzy regression

Shapiro (2005), Angrist and Lavy (1999), and Van Der Klaauw (2002) have discussed how fuzzy regression can be performed. This method has potential applications in finance and accounting research.

1.2.1.11 Path analysis

Path analysis is a straightforward extension of multiple regression. Its aim is to provide estimates of the magnitude and significance of hypothesized causal connections between sets of variables. Applications of path analysis in finance and accounting can be found in Kozberg (2004) and Riahi-Belkaoui and Pavlik (1993).

Besides the above-mentioned methodologies, this handbook also presents other new econometric methodologies such as quantile cointegration
(Chapter 92), threshold regression (Chapter 22), the Kalman filter (Chapter 53), and filtering methods (Chapter 64).

1.2.2 Simultaneous equation models

Besides single equation regression models, we can estimate simultaneous equation models. The available techniques include the two-stage least squares (2SLS) method, the seemingly unrelated regression (SUR) method, the three-stage least squares (3SLS) method, the disequilibrium estimation method, and the generalized method of moments (GMM).

1.2.2.1 Two-stage least squares (2SLS) method

Miller and Modigliani (1966) used 2SLS to study the cost of capital for the utility industry. Lee (1976a) applied 2SLS to a market model. Moreover, Chen et al. (2007) have discussed the 2SLS method for investigating corporate governance.

1.2.2.2 Seemingly unrelated regression (SUR) method

Seemingly unrelated regression has frequently been used in economic and financial research. Lee and Zumwalt (1981) have discussed how the SUR method can be applied in asset pricing determination. Spies (1974) and de Leeuw (1965) have proposed a stacking technique to replace either SUR or constrained SUR.

1.2.2.3 Three-stage least squares (3SLS) method

Chen et al. (2007) have discussed how the 3SLS method can be applied in corporate governance research. Lee et al. (2016) have discussed applications of simultaneous equations in finance research.

1.2.2.4 Disequilibrium estimation method

Fair and Jaffee (1972), Amemiya (1974), Quandt (1988), Mayer (1989), and Martin (1990) have discussed how alternative disequilibrium estimation methods can be performed. Sealey et al. (1979), Tsai (2005), and Lee et al. (2011) have discussed how the disequilibrium estimation method can be applied in asset pricing tests and banking management analysis.
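A minimal two-stage least squares estimate of the kind described in Section 1.2.2.1 can be sketched with simulated data (the variables, instrument, and coefficients are illustrative, not from the cited studies):

```python
import numpy as np

# 2SLS sketch: "price" is endogenous (correlated with the structural error u);
# z is an instrument correlated with price but not with u.
rng = np.random.default_rng(2)
n = 50_000
z = rng.normal(size=n)
u = rng.normal(size=n)                       # structural error
price = z + 0.8 * u + rng.normal(size=n)     # endogenous regressor
y = 2.0 - 1.0 * price + u                    # true coefficient on price is -1

X = np.column_stack([np.ones(n), price])     # structural regressors
Z = np.column_stack([np.ones(n), z])         # instruments

# Stage 1: project the regressors on the instruments
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
# Stage 2: regress y on the fitted values
beta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_2sls[1], beta_ols[1])  # 2SLS near -1.0; OLS biased toward zero
```

The OLS slope is pulled toward zero by the correlation between price and the error, while the 2SLS slope recovers the structural coefficient.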
1.2.2.5 Generalized method of moments

Hansen (1992) and Hamilton (1994, Chapter 14) have discussed how the generalized method of moments can be performed. Chen et al. (2007) used the 2SLS, 3SLS, and GMM methods to investigate corporate governance.

1.2.3 Panel data analysis

Several important issues relate to panel data analysis, such as the fixed effect model, the random effect model, and the clustering effect model. Three well-known textbooks, by Wooldridge (2002), Baltagi (2008), and Hsiao (2014), discuss applications of panel data in finance, economics, and accounting research.

1.2.3.1 Fixed effect model

Chang and Lee (1977) and Lee et al. (2011) have discussed the role of the fixed effect model in panel data analysis of dividend research.

1.2.3.2 Random effect model

Arellano and Bover (1995) have explored the random effect model and its role in panel data analysis. Chang and Lee (1977) applied both fixed effect and random effect models to investigate the relationship between price per share, dividend per share, and retained earnings per share.

1.2.3.3 Clustering effect model

Petersen (2009), Cameron et al. (2011), and Thompson (2011) review the clustering effect model and its impact on panel data analysis.

1.2.4 Alternative methods to deal with measurement error

The LISREL model, the multi-factor and multi-indicator (MIMIC) model, the partial least squares method, and the grouping method can be used to deal with the measurement error problem.

1.2.4.1 LISREL model

Titman and Wessels (1988), Chang (1999), Chang et al. (2009), and Yang et al. (2010) have described the LISREL model and how it can resolve measurement error problems in finance research.
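The fixed effect (within) estimator from Section 1.2.3.1 can be sketched in a few lines on a simulated balanced panel, with the regressor deliberately correlated with the firm effects (names and numbers are illustrative):

```python
import numpy as np

# Within (fixed effect) estimator sketch on a toy firm-by-year panel.
rng = np.random.default_rng(3)
firms, years = 100, 8
alpha = rng.normal(size=firms)                         # unobserved firm effects
x = rng.normal(size=(firms, years)) + alpha[:, None]   # x correlated with effects
y = 2.0 * x + alpha[:, None] + 0.1 * rng.normal(size=(firms, years))

# Pooled OLS ignores the firm effects and is biased here
b_pooled = (x.ravel() @ y.ravel()) / (x.ravel() @ x.ravel())

# Demean within each firm, then run OLS on the transformed data;
# the firm effects drop out of the demeaned equation
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
b_within = (xd.ravel() @ yd.ravel()) / (xd.ravel() @ xd.ravel())

print(b_within, b_pooled)  # within estimate close to the true slope of 2.0
```

Demeaning within firms removes the time-invariant effects, so the within estimate recovers the true slope while pooled OLS overstates it.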
1.2.4.2 Multi-factor and multi-indicator (MIMIC) model

Wei (1984) and Chang et al. (2009) have applied the multi-factor and multi-indicator (MIMIC) model in capital structure and asset pricing research.

1.2.4.3 Partial least squares method

Lambert and Larcker (1987), Ittner et al. (1997), and Core (2000) have applied the partial least squares method to deal with measurement error problems in accounting research.

1.2.4.4 Grouping method

Black et al. (1972), Blume and Friend (1973), Fama and MacBeth (1973), Lee (1973), Lee (1977b), Chen (2011), and Lee and Chen (2011) analyze the grouping method and how it deals with the measurement error problem in capital asset pricing tests.

In addition, there are other errors-in-variables methods, such as the classical method, the instrumental variable method, the mathematical programming method, the maximum likelihood method, the GMM method, and Bayesian statistical methods. Chen, Lee, and Lee (2015) have discussed the above-mentioned methods in detail.

1.2.5 Time-series analysis

There are various important models in time-series analysis, such as the autoregressive integrated moving average (ARIMA) model, the autoregressive conditional heteroskedasticity (ARCH) model, the generalized autoregressive conditional heteroskedasticity (GARCH) model, fractional GARCH, and combined forecasting models. Anderson (1994) and Hamilton (1994) have discussed the issues related to time-series analysis. Myers (1991) discusses ARIMA's role in time-series analysis; Lien and Shrestha (2007) discuss ARCH and its impact on time-series analysis. Lien (2010) discusses GARCH and its role in time-series analysis, and Leon and Vaello-Sebastia (2009) extend this line of research with a model called fractional GARCH.

Granger and Newbold (1973), Granger and Newbold (1974), and Granger and Ramanathan (1984) have theoretically developed combined forecasting methods. Lee et al. (1986) have applied combined forecasting methods to forecast market betas and accounting betas.
Lee and Cummins (1998) have shown how to use combined forecasting methods to perform cost of capital estimates.

1.2.6 Spectral analysis

Anderson (1994), Chacko and Viceira (2003), and Heston (1993) have discussed how spectral analysis can be performed. Heston (1993) and Bakshi et al. (1997) have applied spectral analysis in the evaluation of option pricing.
1.3 Financial Mathematics

Mathematics used in finance research includes linear algebra, calculus, and Ito calculus. For portfolio analysis, we need to use constrained optimization. For CAPM derivation, we need portfolio optimization, the chain rule, partial derivatives, and some basic geometry. In option pricing model derivation, we need to use Ito calculus as well as related theories and propositions. For example, Black and Scholes (1973), Merton (1973), Hull (2018), Lee et al. (2016), Lee et al. (2013), and others have shown how Ito calculus can be used to derive the option pricing model and to address other research topics. In addition, Lee et al. (2013) have shown how the constrained maximization method and linear algebra can be used to obtain the optimum portfolio weights.

1.4 Financial Statistics

1.4.1 Statistical distributions

Cox et al. (1979) and Rendleman and Bartter (1979) have used binomial, normal, and lognormal distributions to develop option pricing models. Some researchers provide studies on these different statistical distributions. Black and Scholes (1973) have used the lognormal distribution to derive the option pricing model, and Aitchison and Brown (1973) is a well-known book investigating the lognormal distribution. Schroder (1989) has derived the option pricing model in terms of the noncentral chi-square distribution. In addition, Fama (1971) has used stable distributions to investigate the distribution of stock rates of return. Chen and Lee (1981) have derived the statistical distribution of the Sharpe performance measure and found that the Sharpe performance measure can be described by the Wishart distribution.
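As an illustration of how the binomial distribution enters option pricing, a minimal Cox, Ross, and Rubinstein (1979) lattice for a European call can be written as follows; as the number of steps grows, the lattice price converges to the Black and Scholes (1973) value. The parameter values below are illustrative, not from the chapter.

```python
import math

def crr_call(S, K, r, sigma, T, n):
    """European call via the Cox-Ross-Rubinstein binomial lattice."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))   # up factor per step
    d = 1 / u                             # down factor per step
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-r * dt)
    # Terminal payoffs at each node, then roll back one step at a time.
    values = [max(S * u**j * d**(n - j) - K, 0.0) for j in range(n + 1)]
    for _ in range(n):
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]

# With many steps this approaches the Black-Scholes value (about 10.45
# for S = 100, K = 100, r = 5%, sigma = 20%, T = 1 year).
print(round(crr_call(100, 100, 0.05, 0.2, 1.0, 500), 2))
```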
1.4.2 Principal components and factor analysis

Anderson's (2003) book An Introduction to Multivariate Statistical Analysis has discussed principal components and factor analysis in detail. Pinches and Mingo (1973), Chen and Shimerda (1981), and Kao and Lee (2012) have discussed how principal components and factor analysis can be used in finance and accounting research.

1.4.3 Non-parametric and semi-parametric analyses

Hutchison et al. (1994) and Ait-Sahalia and Lo (2000) have discussed how non-parametric methods can be used in risk management and the evaluation of financial derivatives. Hwang (2007), Hwang et al. (2010), and Chen et al. (2010) have used semi-parametric methods to conduct credit risk analysis.

1.4.4 Cluster analysis

Detailed procedures for using cluster analysis to find groups in data can be found in the textbook by Kaufman and Rousseeuw (1990). In addition, Brown and Goetzmann (1997) have applied cluster analysis in mutual fund research.

1.4.5 Fourier transformation method

Carr and Madan (1999) have used the fast Fourier transformation method to evaluate option pricing.

1.5 Financial Technology and Machine Learning

Financial machine learning is a subset of financial technology. In this book, we concentrate on the methodology of machine learning and its applications.

1.5.1 Classification of financial technology

Following Teng and Kao (2019), financial technology (FinTech) can generally be classified as BankingTech, PayTech, RegTech, WealthTech, LendTech, and InsurTech.

1.5.2 Classification of machine learning

Machine learning is one of the most important tools for financial technology. Machine learning can generally be classified as (i) supervised learning,
(ii) unsupervised learning, and (iii) others (reinforcement learning, semi-supervised learning, and active learning). Supervised learning includes (i) regression (lasso, ridge, loess, KNN, and splines) and (ii) classification (SVM, random forest, and deep learning). Unsupervised learning includes (i) clustering (K-means, hierarchical tree clustering) and (ii) factor analysis (principal component analysis, etc.).

1.5.3 Machine learning applications

The major machine learning applications in financial technology are (i) investment prediction/quantitative investment: high-frequency trading and portfolio management, (ii) risk management: credit risk, (iii) payment default of credit card holders and loan borrowers, (iv) fraud prevention, and (v) marketing. More specifically, machine learning applications in finance can be classified into current applications and the future value of machine learning in finance. The current applications include (i) portfolio management (Betterment, Wealthfront, Schwab Intelligent Portfolios), (ii) algorithmic trading, automated trading systems, and high-frequency trading (Renaissance Technologies, Walnut Algorithms), (iii) loan/insurance underwriting (Compare.com), (iv) credit risk management: default risk of credit card holders, and (v) fraud detection: the "perfect storm" for data security risk (Kount, APEX Analytics). There are several benefits of machine learning in finance: (i) customer service: chat bots and conversational interfaces (Kasisto), (ii) security 2.0: facial recognition, voice recognition, or other biometric data (FaceFirst, Cognitec), (iii) sentiment/news analysis (Hearsay Social), and (iv) sales/recommendations of financial products. Most recently, Nasekin and Chen (2019) have used deep learning technology to study cryptocurrency sentiment construction. Applications of machine learning in credit risk analysis will be discussed in Section 1.6.8.
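As a concrete instance of the unsupervised-learning branch in Section 1.5.2, a plain two-group K-means pass can be written directly in NumPy. The data below are synthetic two-dimensional observations of our own devising (think standardized fund characteristics); the deterministic seeding from two extreme observations is a simplification for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two synthetic clusters of two-dimensional observations;
# all numbers are illustrative.
data = np.vstack([
    rng.normal(loc=[-2.0, -2.0], scale=0.4, size=(60, 2)),
    rng.normal(loc=[2.0, 2.0], scale=0.4, size=(60, 2)),
])

def kmeans2(X, iters=20):
    """Plain two-group K-means: alternate nearest-centroid assignment
    and centroid updates, seeded from two extreme observations."""
    centers = X[[np.argmin(X.sum(axis=1)), np.argmax(X.sum(axis=1))]]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        centers = np.array([X[labels == j].mean(axis=0) for j in range(2)])
    return labels, centers

labels, centers = kmeans2(data)
# The recovered centroids should sit near (-2, -2) and (2, 2).
print(np.round(centers, 1))
```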
1.5.4 Other computer science tools used for financial technology

Other computer science tools used for FinTech include customer service, digital assistants, and network security. In finance research and applications, we need to use many computer programming tools, such as Excel, SAS, MINITAB, R, machine learning, and support-vector-machine-based methodology. Monte Carlo simulation techniques are frequently needed for financial research and practical applications. In addition, data mining and
big data analysis are also important financial technology tools. Time-series bootstrapping simulation can also be used in financial analysis. For example, Bradley and Mangasarian (2000) used linear support vector machine optimization to discriminate massive data, Meyer et al. (2012) used bootstrapping simulation to measure skill in investors' investment performance, Bzdok et al. (2018) discussed the relationship between statistics and machine learning, and Gu et al. (2018) used machine learning methods to perform capital asset pricing research. Finally, the book by Mitchell (1997) explores machine learning methodology in detail.

1.6 Applications of Financial Econometrics, Mathematics, Statistics, and Machine Learning

In this section, we briefly discuss how different methodologies of financial econometrics, mathematics, statistics, and machine learning can be applied to topics in finance and accounting research. Topics include asset pricing, corporate finance, investment and portfolio research, option pricing, futures and hedging, mutual funds, credit risk modeling, and other applications.

1.6.1 Asset pricing

Methodologies used in asset pricing research include heteroskedasticity, specification error, measurement error, skewness and kurtosis effects, non-linear regression and Box–Cox transformation, structural change, two-stage least squares estimation (2SLS) method, seemingly unrelated regression (SUR) method, three-stage least squares estimation (3SLS) method, disequilibrium estimation method, fixed effect model, random effect model, clustering effect model of panel data analysis, grouping method, ARIMA, ARCH, GARCH, fractional GARCH, and Wishart distribution.
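Among the asset pricing methodologies listed above, the grouping and cross-sectional regression tradition of Fama and MacBeth (1973), cited earlier, is easy to illustrate on simulated data with a known factor premium. The simulation design below is our own, purely for illustration: pass 1 estimates each asset's beta in time series, and pass 2 averages period-by-period cross-sectional slopes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate T periods of returns on N assets from a one-factor model
# with a known factor risk premium of 0.5% per period (illustrative).
T, N, premium = 600, 25, 0.005
true_betas = rng.uniform(0.5, 1.5, size=N)
factor = premium + rng.normal(scale=0.02, size=T)
returns = np.outer(factor, true_betas) + rng.normal(scale=0.02, size=(T, N))

# Pass 1: time-series regression estimates each asset's beta.
X = np.column_stack([np.ones(T), factor])
beta_hat = np.linalg.lstsq(X, returns, rcond=None)[0][1]

# Pass 2: cross-sectional regression of returns on estimated betas in
# each period; the premium estimate is the average of the slopes.
Z = np.column_stack([np.ones(N), beta_hat])
slopes = [np.linalg.lstsq(Z, returns[t], rcond=None)[0][1] for t in range(T)]
lambda_hat = float(np.mean(slopes))

# The estimate should land close to the assumed 0.5% premium.
print(round(lambda_hat, 4))
```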
1.6.2 Corporate finance

Methodologies used in corporate finance research include heteroskedasticity, specification error, measurement error, skewness and kurtosis effect, non-linear regression and Box–Cox transformation, structural change, probit and logit regression for credit risk analysis, Poisson regression, fuzzy regression, two-stage least squares estimation (2SLS) method, seemingly unrelated regression (SUR) method, three-stage least squares estimation (3SLS) method, fixed effect model, random effect model, clustering effect model of panel data analysis, and GMM analysis.
1.6.3 Financial institution

Methodologies used in financial institution research include heteroskedasticity, specification error, measurement error, skewness and kurtosis effect, non-linear regression and Box–Cox transformation, structural change, probit and logit regression for credit risk analysis, Poisson regression, fuzzy regression, two-stage least squares estimation (2SLS) method, seemingly unrelated regression (SUR) method, three-stage least squares estimation (3SLS) method, disequilibrium estimation method, fixed effect model, random effect model, clustering effect model of panel data analysis, and semi-parametric analysis.

1.6.4 Investment and portfolio management

Methodologies used in investment and portfolio management include heteroskedasticity, specification error, measurement error, skewness and kurtosis effect, non-linear regression and Box–Cox transformation, structural change, probit and logit regression for credit risk analysis, Poisson regression, and fuzzy regression. Gu et al. (2018) have used financial machine learning techniques to study capital asset pricing.

1.6.5 Option pricing model

Methodologies used in option pricing research include ARIMA, ARCH, GARCH, fractional GARCH, spectral analysis, binomial distribution, Poisson distribution, normal distribution, lognormal distribution, chi-square distribution, noncentral chi-square distribution, and non-parametric analysis.

1.6.6 Futures and hedging

Methodologies used in futures and hedging research include heteroskedasticity, specification error, measurement error, skewness and kurtosis effect, non-linear regression and Box–Cox transformation, structural change, probit and logit regression for credit risk analysis, Poisson regression, and fuzzy regression.

1.6.7 Mutual fund

Methodologies used in mutual fund research include heteroskedasticity, specification error, measurement error, skewness and kurtosis effect, non-linear regression and Box–Cox transformation, structural change, probit and logit
regression for credit risk analysis, Poisson regression, fuzzy regression, and cluster analysis.

1.6.8 Credit risk modeling

1.6.8.1 Traditional approach

Methodologies used in credit risk modeling include heteroskedasticity, specification error, measurement error, skewness and kurtosis effect, non-linear regression and Box–Cox transformation, structural change, two-stage least squares estimation (2SLS) method, seemingly unrelated regression (SUR) method, three-stage least squares estimation (3SLS) method, disequilibrium estimation method, fixed effect model, random effect model, clustering effect model of panel data analysis, ARIMA, ARCH, GARCH, and semi-parametric analysis.

1.6.8.2 Machine learning approach

Recently, machine learning techniques have been extensively used in credit risk analysis. Butaru et al. (2016) have used machine learning techniques to study risk and risk management. Atiya (2001) used neural networks to predict bankruptcy for credit risk. Bahrammirzaee (2010) examined artificial intelligence applications in finance such as artificial neural networks, expert systems, and hybrid intelligent systems. Crook et al. (2007) explored developments in consumer credit risk assessment. Demyanyk and Hasan (2010) studied prediction methods for financial crises and bank failures. Garrido et al. (2018) researched a profit measure for binary classification models. Keerthi and Lin (2003) studied asymptotic behaviors of support vector machines with the Gaussian kernel. Kingma and Ba (2014) explored a method for stochastic optimization. Kumar and Ravi (2007) studied bankruptcy in banks and firms using statistical and intelligent techniques. Thomas (2000) studied forecasting the financial risk of lending to consumers. Verbraken et al. (2014) used profit-based classification measures to develop consumer credit scoring models. Zopounidis et al. (1997) have surveyed the use of knowledge-based decision support systems in financial management. Finally, Lee et al. (2019) have used machine learning techniques for predicting the default of credit card holders and the success of Kickstarter campaigns.

1.6.9 Other applications

Financial econometrics is also an important tool for conducting research in trading cost (transaction cost) modeling, hedge fund research, microstructure, earnings announcements, real option research, financial accounting, managerial accounting, auditing, and term structure modeling.
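Returning to the machine-learning approach to credit risk in Section 1.6.8.2, a minimal default-prediction model can be sketched as a logistic regression fitted by batch gradient descent. The borrower data below are synthetic and all coefficient values are our own illustrative assumptions; the studies cited above use far richer models and real data.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic borrowers with one standardized feature (think debt-to-income);
# higher values raise the true default probability. All values illustrative.
n = 500
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(-1.0 + 2.0 * x)))
default = (rng.uniform(size=n) < p_true).astype(float)

# Logistic regression fitted by plain batch gradient descent on the
# negative log-likelihood.
X = np.column_stack([np.ones(n), x])
w = np.zeros(2)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ w))        # predicted default probabilities
    w -= 0.1 * X.T @ (p - default) / n  # gradient step

# The fit should roughly recover the assumed coefficients (-1, 2).
print(np.round(w, 2))
```

In practice the fitted probabilities would then be thresholded or ranked to build a credit score, which is where the profit-based evaluation measures of Verbraken et al. (2014) enter.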
1.7 Overall Discussion of this Book

In this section, we classify the 130 papers (Chapters 2–131) presented in Appendix 1 in accordance with chapter titles and keywords.

1.7.1 Chapter title classification

Based on the chapter titles, we classify the 130 chapters into the following 16 topics:

(i) Financial Accounting and Auditing (Chapters 2, 3, 17, 22, 30, 32, 53, 54, 105 and 130)
(ii) Mutual Funds and Hedge Funds (Chapters 12, 87, 91, 108 and 125)
(iii) Corporate Finance (Chapters 45, 46, 49, 59, 67, 76, 96 and 97)
(iv) Asset Pricing (Chapters 10, 12, 19, 48, 62, 64, 70, 79, 81, 82, 98, 99, 100, 114, 122 and 129)
(v) Options (Chapters 7, 15, 24, 27, 33, 34, 40, 42, 50, 51, 83, 84, 85, 86, 89, 102, 103, 106, 109, 110 and 128)
(vi) Portfolio Analysis (Chapters 4, 13, 14, 43, 47, 80, 81, 82, 88, 93, 104 and 121)
(vii) Risk Management (Chapters 9, 15, 20, 35, 38, 39, 44, 50, 57, 58, 61, 65, 72, 73, 78, 101, 107, 127 and 131)
(viii) International Finance (Chapters 13, 52, 56 and 60)
(ix) Investment and Behavior Finance (Chapters 16, 18, 30, 31, 36, 42, 68, 69, 74, 87, 90, 94, 95, 103, 118, 119, 121 and 123)
(x) Banking Management (Chapters 25, 63, 66, 71, 117 and 131)
(xi) Event Study (Chapters 53 and 67)
(xii) Futures and Index Futures (Chapters 11, 89 and 92)
(xiii) Financial Econometrics (Chapters 2, 3, 4, 5, 6, 10, 11, 22, 23, 25, 26, 28, 29, 32, 37, 48, 55, 56, 60, 62, 63, 70, 75, 79, 90, 92, 94, 95, 98, 99, 100, 107, 108, 111, 112, 113, 116, 123, 124, 125, 129 and 131)
(xiv) Financial Mathematics (Chapters 24, 27, 34, 40, 41, 49, 54, 86, 96, 97, 102, 104, 106, 109, 110 and 128)
(xv) Financial Statistics (Chapters 7, 8, 16, 19, 33, 35, 36, 38, 39, 52, 57, 58, 59, 61, 64, 65, 68, 69, 72, 73, 74, 77, 78, 85, 93, 113, 114, 115, 118 and 120)
(xvi) Machine Learning (Chapters 14, 20, 21, 31, 39, 43, 44, 46, 55, 56, 84, 101, 112, 117, 119 and 127)
1.7.2 Keyword classification

The number behind each keyword, as listed in Appendix 1, indicates the chapter with which it is associated.

Accounting beta (79), Acquisitions (116), Adaptive penalty (44), Additive and multiplicative rates of return (114), Advocacy groups (105), AI Content as a Service (AICaaS) (119), Algorithmic bias (117), American option (84, 85), American options (24), Analyst coverage network (31), Analyst recommendation revisions (55), Analysts' information (2), Analytic hierarchy process (21), Announcement returns (76), Approximation approach (106), ARCH (119), ARCH & GARCH (107), ARCH (Autoregressive conditional heteroskedasticity) (11), ARCH method (11), Archimedean copula (73), ARIMA (119), ARIMA-GARCH model (6), ARIMA models (87), Artificial intelligence (101, 117, 127), Artificial regression (28), ASEAN (23), Asian financial crisis (52), Asset (100), Asset allocation (80), Asset portfolio (43), Asset pricing (10, 12, 98), Asset pricing tests (1), Asymmetric information (66), Asymmetric taxes (128), Audit fees (3, 22), Audit opinion prediction (21), Auditor change (21), Auditor independence (22), Auditor reputation (22), Autoregressive forecasting model (26), Balance of trade (23), Bank credit risk (60), Bank regulatory compliance (117), Bank relationships (66), Bank risk (38), Banking (56), Bankruptcy (15, 21), Banks (17), Barrier option (15), Basel committee on banking supervision (38), Bayes estimation (74), Bayes factor (93), Bayes rule (108), Bayesian Approach (37), Bayesian factor (52), Bayesian net (21), Bayesian test (93), Behavior finance (122), Behavioral finance (16), Beta coefficient (81), Betting against beta (19), Big data (21, 117), Binomial option pricing model (84, 102), Bipower variation tests (29), Black–Litterman model (14), Black–Scholes model (84), Black–Scholes option pricing model (102), Bond price (110), Bond strategies (88), Book–tax conformity (3), Book–tax differences (32), Book-to-market (10, 121), Boosting (101),
Bootstrap (108), BOS ratio (95), Box–Cox transformation (1), Box–Jenkins ARIMA methodology (112), Calendar (Time) spreads (83), Calibration (33), Call option (84), Capital Asset Pricing Model (19, 37), Capital gain (128), Capital structure (35, 37), Capital-rationed firms (46), CAPM (53, 79, 100), CARA utility function (11), Cash conversion cycle (46), Cash conversion System (46), Causal inference (124), Centrality (9), CEO compensation (32, 45), CEO talent (45), CEV model
(106), China (21), Classical method (37), Clayton copula (73), Clustering effect model (1), Coefficient determination (114), Cognitive biases (16), Coincident indicators (26), Co-integration and error correction method effectiveness (11), Collar (83), Collective correlation (8), Combination forecasting model (6), Combined investment strategy (95), Comment letters (105), Commodity diversifier (4), Common stock valuation (90), Commonality (2), Community bank (131), Component analysis (87), Composite forecasting (79, 87), Computational finance (39), Concave utility function (80), Conditional multivariate F-test (93), Conditional tail expectation (72), Conditional value at risk model (104), Confidence index (87), Confirmatory factor analysis (CFA) (35), Conservative-minus-aggressive (19), Constant elasticity of variance model (109), Constant–Elasticity-of-Variance (CEV) process (51), Consumer sentiment (42), Consumption-based asset pricing model (48), Contagion (129), Continuous wavelet analysis (4), Corporate governance (32), Correlation (118), Correlation breakdown (8), Cost of capital (37, 100), Cost-minimization (49), Covariance (81), Covariance regression model (113), Covered call (83), Cox process (34), Credit analysis (78), Credit card (101), Credit card delinquency (127), Credit default swaps (60), Credit risk (38, 101), Credit risk classification (20, 44), Credit spread (110), Credit-scoring model (57), Cross-section of stock returns (121), Cross-section data (26), CRSP value-weighted index (93), Currency risk (12), Cyclical component (26), Data mining (21, 55), DCC-GARCH model (123), DCC-MVGARCH (129), Debt-like signal (59), Decision table (21), Decision tree (21, 101), Decomposition of estimated regression coefficient (62), Deep learning (119), Deep neural network (127), Default (101), Default barrier (110), Default prediction (50), Default probability (78, 126), Default risk (128), Delinquency (101), Delta (Δ) (86), Demand function (99, 122), Deposit
insurance (15), Difference-in-differences (124), Dimension reduction (8), Direct and reverse regression (75), Direct effect (130), Disclosure and countersignaling (17), Discounted value (49), Discriminant analysis (77, 78, 126), Discriminatory power (57), Disequilibrium effect (99), Disequilibrium estimation method (1), Disequilibrium model (99), Disposition effect (16), Disruptive technologies (56), Distributed lag models (91), Diversification (116), Diversification benefits (13), Dividend Policy (97, 116), Dividends (96), Dodd–Frank (131), Domestic and foreign advisors (76), Dow theory (87), Down-and-out barrier model (50), Downside risk (58), DTSM (Dynamic term structure models) (61), Due Process (105), Duration (88), Durbin, Wu, Hausman (DWH) Specification Tests (28), Dynamic capital budgeting decision (5), Dynamic CAPM (122), Dynamic conditional correlation
(123, 129), Dynamic conditional variance decomposition (123), Dynamic Factors (63), Dynamic hedging (89), Early adoption (17), Earnings forecasts (30), Earnings management (32), Earnings revisions (30), Earnings Surprises (94), East Asian bond and stock markets (123), Econometric and statistical methods (47), Efficiency (131), Efficiency hypothesis (32), EGARCH model (14), Elliptical copula (73), Emerging markets (25), Empirical methods (131), Empirical performance (51), Employees (56), Endogeneity (28), Endogenous industry structure (45), Endogenous supply (100), Endogenous variables (5), Equality of tail risks (72), Equity-like signal (59), Error correction model (6), Errors-in-Variables (37, 75, 98, 116), Estimate implied variance (106), Estimation (116), Estimation approach (50), Estimation stability (114), ETFs (29), Euler equations (12), European options (24, 84), Event extraction (18), Evolution strategy (44), Ex-ante probability (70), Ex post sharpe ratio (93), Exactly identified (100), Ex-ante moments (40), Excel program (84), Excel VBA (84), Excess returns (29), Exchange option (34), Exogenous variables (5), Expected discounted penalty (41), Expected payoff (7), Explanatory power (57), Exponential smoothing (26), Exponential smoothing constant (26), Extended Kalman filter (64), External financing (17), Extra-legal institution (3), Extreme bound analysis (91), Factor analysis (77, 78, 126), Factor attributes (119), Factor loading (10, 77, 78), Factor models (10), Factor score (77), False discovery rates (108), Fama and French factor models (121), FDIC (15), Feature extraction (20), Feltham–Ohlson model (90), Finance — Investment (69), Financial constraints (49), Financial Crisis (107), Financial econometrics (61), Financial market integration (123), Financial mathematics (1), Financial ratios (90), Financial reform (25), Financial statement analysis (95), Financial statistics (1), Financial technology (1), Financial z-score (78, 126), Financing costs (49), 
Finite sample (74), Finite difference method of the SV model (51), Firm Value (66), First-difference method (124), Fixed Effects (FE) (28), Fixed-effects model (96), Flexibility hypothesis (96), Forecast timeliness (31), Forecasting Stock Prices (112), Foreign bank debt (25), Foreign bank relationships (25), Fractional integration (23), Francis and Rowell model (90), Fund performance (108), Fundamental analysis (87, 95, 112), Funding decisions (49), Funding requirements (49), Future Contract (92), Fuzzy set (21, 54), Fuzzy regression (1), Gamma (Γ) (86), GARCH (1, 1) (123), GARCH (Generalized Autoregressive conditional heteroskedasticity) (11), GARCH method (11), GARCH model (14), GARCH-jump (70), GARJI model (70), Gaussian copula (73), Gauss–Markov conditions (115), Generalize fluctuation (1), Generalized Method of Moments (GMM) (28), Global financial market
integration (118), Global investing (119), Goal programming (57), Gold (4, 92), Goodness of fit (108), GPU (39), Great recession (29), Grouping method (37), Growth rate (97), GRS test (10), Gumbel copula (73), GVSpread (40), GVIX Index (40), Habit formation (48), Hazard model (78, 126), Heckman's two-stage estimation (111), Hedge fund (108, 125), Hedge funds performance (91), Hedge ratio (11), Hedging (86), Herding behaviors (113), Heteroskedasticity (52), High frequency data (39), High-frequency data (29), High-frequency jumps (29), High-Minus-Low (19), High-ranked analysts (31), Holt/Winters exponential smoothing (112), Holt–Winters forecasting model (26), Hyper-parameter optimization (20), Identification (28), Identification problem (116), Idiosyncratic standard deviation (78), Idiosyncratic risk (98), Implied risk-neutral distribution (42), Implied volatility (39, 40), Implied volatility Smile/skew/surface (33), Implied volatility spread (103), Incomplete market (24), Indifference curve (80), Indirect effect (130), Industry portfolios (121), Inference (72), Information fusion (18), Initial Public Offerings (67), Instrumental variable method (37), Instrumental variables (IV) (28), Insurance premium (15), Integrated process (72), Intelligent portfolio theory (43), Intention (71), Interconnectedness (9), Interest-rate anticipation swap (88), Intermarket-spread swap (88), Internal capital market (46), Internal control weakness (21), International CAPM (122), International finance (12), International portfolio (13), International stock market linkage (36), Internet stock (130), Intertemporal (12), Intertemporal CAPM (122), Intervention (6), Inverse Fourier Transform and Poisson Process (34), Investment (10, 13, 121), Investment banks (76), Investment constraints (13), Investment decision (116), Investment Equation (37), Investment Horizon (68), Investment horizons (4), Investor sentiment (42), IPO Issuance and Performance (67), Irregular component (26), Itô's lemma (27), Japan (21), Jump (52), Jump diffusion (110), Jump risks (29), Jump spillover (29), Jump-diffusion (41), Kalman filter (53, 64), Kernel function selection (20), Kernel Smoothing (108), Key borrower (9), KMV-Merton model (78, 126), K-nearest neighbors (101), Korea (21), Kruskal–Wallis Test (105), Kurtosis (7), Lagging indicators (26), Lagrangian calculus maximization (81), Lagrangian multipliers (82), Lagrangian objective function (80), Large-sample theory (115), Leading indicators (26), Lease Accounting (105), Leverage effect (58), Linear programming (7, 24, 81), Linear utility function (80), Linear-equation system (77), Liquidity risk (10), Liquidity shocks (121), Liquidity-based CAPM (122), LISREL (35), LISREL Method (37), Logistic Equation (97), Logistic regression (126), Logit (21), Logit model (78), Logit regression (1), Log-normal distribution (85), Lognormal distribution method
(102), Long call (83), Long memory (23, 58), Long Put (83), Long Straddle (83), Long vertical (Bull) spread (83), Loss aversion (48), Low interest rate environment (61), Lower bound (7), LSTM (119), Machine learning (101, 117, 119, 127), Make-to-stock inventory (46), Management earnings forecasts (2), Managerial implications (131), Mann–Whitney test (105), Margrabe model (34), Market beta (79), Market model (79, 81), Market portfolio (10), Market risk (38), Markovian models (46), Markowitz modern portfolio theory (14), Mathematical programming method (37), Matlab (39), MATLAB approach (106), Matrices (77), Maturity (88), Maximum likelihood estimation (50, 73), Maximum Likelihood Estimation (MLE) (50), Maximum likelihood estimator (65, 99), Maximum likelihood method (37), Maximum mean extended-gini coefficient hedge ratio (11), Mean reverting process (97), Mean squared error (26), Mean-variance capital asset pricing (47), Mean-variance efficiency (93), Measurement error (28, 37, 62, 75, 98), Mental accounting (16), Mergers (116), Mergers and acquisitions (76), Merton distance model (126), MINIMAX goal programming (57), Minimum generalized semi-variance hedge ratio (11), Minimum value at risk hedge ratio (11), Minimum variance hedge ratio (11), Minimum variance unbiased estimator (65), Mixture copula (38), Mixture Kalman filter (64), Mobile banking (71), Model of Ang and Piazzesi (2003) (61), Model of Joslin, Le, and Singleton (2013a) (61), Model of Joslin, Singleton, and Zhu (2011) (61), Moderating effect (16), Momentum (10, 19, 121), Momentum factor (103), Momentum strategies (94, 95), Money market liquidity premium (121), Moral hazard (15), Moving average (87), Multivariate skew-normal distribution method (11), Multi-factor risk model (119), Multinomial logit model (111), Multiperiod dynamic CAPM (99), Multiple criteria and multiple constraint level (MC2) linear programming (54), Multiple criteria linear programming data mining (21), Multiple discriminant analysis (21), Multiple factor transfer pricing model (54), Multiple-index model (81), Multivariate Discriminant Analysis (MDA) (78), Multivariate F-test (10), Multivariate GARCH (129), Multivariate log-normal distribution (85), Multivariate normal distribution (85), Multi-factor and multi-indicator (MIMIC) model (1), Mutual fund (108), Natural language generation (119), Natural language processing (21), Net chargeoff rates (63), Neural network (101), Neural network model (112), NLG (119), Non-parametric tests (28), Non-audit fees (22), Noncentral Chi-square distribution (109), Non-linear regression (1), Noncentral t distribution (65, 69), Non-normal data (113), Non-parametric (24), Non-parametric method (120), Non-parametric
regression (118), Non-systematic risk (79), Normal distribution (85), N-Period OPM (84), Numerical experiment (51), Odd-Lot theory (87), OLS (45), Omega model (104), Omitted variables (28), One-period OPM (84), Operating profitability (121), Operational risk (38), Optimal capital structure (41), Optimal financial policy (49), Optimization (49), Optimum mean variance hedge ratio (11), Optimum mean MEG hedge ratio (11), Option (128), Option bound (7, 85), Option bounds (24), Option price (103), Option pricing (33, 109), Option pricing model (51), Options pricing (27), Out-ofsample forecasts (63), Panel data (28), Panel vector auto-regressions (60), Parallel computing (39), Parametric method (120), Partial adjustment (100), Partial adjustment model (97), Partial least squares (63), Particle filter (64), Partition function (8), Past stock returns (103), Path analysis (1, 130), Payout policy (96), Payout Ratio (97), PCA (Principal components analysis) (61), PCDTSM (Principal component-based DTSM) (61), Peer benchmarking (45), Percentage of moving average (26), Performance Manipulation (91), Performance measure (62), Phase-type distribution (41), Planning horizon (49), Poisson regression (1), Policy (15), Policy analyses (124), Portfolio (69), Portfolio construction (30), Portfolio management (30), Portfolio optimization (30), Portfolio selection (104), Portfolio theory (30), Post-earnings-announcement drift (94), Post-earnings-announcement drifts (53), Power index (9), Predictability (107), Price pressure (103), Principal Component Analysis (63), Principal components model (118), Probability integral transform (73), Probability limit for regression coefficient (62), Probit (21), Probit model (111, 126), Probit regression (1), Product market competition (45), Production cost (90), Profitability (10), Prospect theory (48), Protective put (83), Pure-yield-pickup swap (88), Put option (84), Put options (89), Put-call parity (42, 83), Python (101), Quadratic cost (100), 
Quality-minus-junk (19), Quantile (25, 72), Quantile co-integrated (92), Quantitative analysis (18), Random coefficient model (114), Random coefficient method (11), Random Effects (RE) (28), Realized variation (33), Recurrent survival analysis (59), Reduced-form (100), Regime-switching GARCH method (11), Regret avoidance (16), Related mergers (116), Relative risk aversion distribution (65), Rent-seeking hypothesis (32), Revenue surprises (94), Rho (ρ) (86), Risk assessment (127), Risk aversion (80), Risk dependence (38), Risk integration (38), Risk management (38, 129), Risk-free rate (82), Risk-mitigating effect (59), Risk-return tradeoff (58), Risk-shifting (59), RNN (119), Robo-advisor (117), Robust Hausman (28), Robust standard errors that incorporate firm-level clustering (45), Robust-minus-weak (19), Sample estimators (115), Sample properties (115),
page 22
July 6, 2020
10:14
Handbook of Financial Econometrics,. . . (Vol. 1)
Introduction
9.61in x 6.69in
b3568-v1-ch01
23
Sample selection bias (111), Sample size (68), Sarbanes–Oxley (131), Scoring system (119), Seasonal component (26), Seasonal index (26), Seasonal index method (26), Sector & location rotation (43), Security market line (122), Seemingly unrelated regression (SUR) (25, 100), Self-control (16), Sell-side analysts (55), Semi-parametric (24), Semi-parametric method (7, 120), Semi-parametric regressions (118), Sentiment analysis (18), Sequential Conversion (59), Shape parameter (70), Sharpe (68), Sharpe hedge ratio (11), Sharpe performance measure (82), Sharpe ratio (74), Short call (83), Short put (83), Short sales allowed (82), Short sales not allowed (82), Short selling (80, 104), Short straddle (83), Short vertical (Bear) spread (83), Significance identification (105), Simple summation approach (38), Simulation (7, 98), Simulation and bootstrap techniques (28), Simultaneous econometric models (5), Simultaneous equations (100), Simultaneous equations systems (116), Single-index model (81), Size (10, 121), Skewness (7), Sklar’s theorem (73), Small-minus-big (19), Social media (66), Social network (18), Sources of funds (49), Specification error (97), Spline regression analysis (67), Stale pricing (91), Standardized student’s t-distribution (73), State-space model (64), Static CAPM (122), Statistical analysis of response behavior (105), Statistics — Sampling (69), Stochastic calculus (102), Stochastic dominance (7, 24), Stochastic volatility (33), Stochastic volatility model (72), Stochastic volatility model with independent jumps (52), Stock correlation (18), Stock index futures (89), Stock market liquidity (121), Stock market momentum (103), Stock market returns (107), Stock prediction (18), Stock repurchase (59), Stock return comovement (113), Stop-loss orders (89), Strength investing (43), Structural breaks (61), Structural change (1), Structural credit risk model (50), Structural equation model (16), Structural equation modeling (SEM) (35), Structural hole (31), 
Student’s t copula (73), Subsidiaries (46), Substitution swap (88), Supervised learning (101), Supply chain financial management (46), Supply function (99, 122), Support vector machine (44, 101), Support vector machines (20, 21), SUR (5), Survival analysis (120), Survival model (21), Swapping (88), Synergies (116), Synthetic option (84, 89), Systematic risk (62, 79), Systematic risk coefficient (114), Systemic importance (9), TAIEX options (42), Tail dependence (38, 73), Tail risk (58), Tail wag the dog (89), Tax timing (128), Technical analysis (87, 95, 112), Technologies acceptance (56), Technology acceptance (71), Temporal aggregation (114), Test (72), Test power (108), The investor’s views (14), Theta (Θ) (86), Three-stage least squares estimation (3SLS) method (1), Threshold regression model (22), Time series decomposition (112), Time-series bootstrapping simulations (55), Time-series data (26), Time-series
regression (121), Total risk (79), Trading strategies (108), Trading strategy portfolio (43), Trading volume (87, 95), Trading-day component (26), Transaction cost (128), Transaction costs (104), Transfer function model (6), Transfer pricing (54), Tree model (15), Trend component (26), Trend following, business & market cycle (43), Trend–cycle component (26), Treynor and Jensen measures (68), Treynor performance measure (82), Two-pass regression (98), Two-period OPM (84), Two-stage least squares method (116), US (21), Unbiased estimation (74), Uncertainty (2), Unrelated mergers (116), Unscented Kalman filter (64), Upper bound (7), User adoption (71), Utility function (80), Utility theory (80), Validity and reliability (16), Value at Risk (72), Value at risk model (104), Value-at-risk (58), Variance-covariance approach (38), Variance-gamma process (70), Vectors (77), Vega (ν) (86), VG NGARCH model (70), VIX (40), Volatility clustering (14), Warren and Shelton model (90), Wavelet coherence (36), Wavelet correlation (36), Wavelet multiple cross-correlation (36), X-11 model (26), ZLB (Zero lower bound) (61), 2SLS (5), and 3SLS (5).

1.8 Summary and Concluding Remarks

This chapter has discussed important financial econometric and statistical methods used in finance and accounting research. We discussed regression models and topics related to financial econometrics, including single-equation regression models, simultaneous equation models, panel data analysis, alternative methods to deal with measurement error, and time-series analysis. We also introduced topics related to financial statistics, including statistical distributions, principal components and factor analysis, non-parametric and semi-parametric analyses, and cluster analysis. Financial econometrics, mathematics, and statistics are important tools for conducting research in finance and accounting.
We also briefly introduced applications of econometric, mathematical, and statistical models in finance and accounting research. Research topics include asset pricing, corporate finance, financial institutions, investment and portfolio management, option pricing models, futures and hedging, mutual funds, credit risk modeling, and others.

Bibliography

Aitchison, J. and Brown, J. A. C. (1973). The Lognormal Distribution, with Special Reference to Its Uses in Economics. Cambridge University Press.
Ait-Sahalia, Y. and Lo, A. W. (2000). Non-parametric Risk Management and Implied Risk Aversion. Journal of Econometrics, 94, 9–51.
Amemiya, T. (1974). A Note on a Fair and Jaffee Model. Econometrica, 42, 759–762.
Anderson, T. W. (1994). The Statistical Analysis of Time Series. Wiley-Interscience.
Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis. Wiley-Interscience.
Angrist, J. D. and Lavy, V. (1999). Using Maimonides’ Rule to Estimate the Effect of Class Size on Scholastic Achievement. Quarterly Journal of Economics, 114(2), 533–575.
Arellano, M. and Bover, O. (1995). Another Look at the Instrumental Variable Estimation of Error-Components Models. Journal of Econometrics, 68(1), 29–51.
Atiya, A. F. (2001). Bankruptcy Prediction for Credit Risk Using Neural Networks: A Survey and New Results. IEEE Transactions on Neural Networks, 12(4), 929–935. doi:10.1109/72.935101.
Bahrammirzaee, A. (2010). A Comparative Survey of Artificial Intelligence Applications in Finance: Artificial Neural Networks, Expert System and Hybrid Intelligent Systems. Neural Computing & Applications, 19(8), 1165–1195.
Bakshi, G., Cao, C. and Chen, Z. (1997). Empirical Performance of Alternative Option Pricing Models. Journal of Finance, 52(5), 2003–2049.
Bakshi, G., Cao, C. and Chen, Z. (2010). Option Pricing and Hedging Performance under Stochastic Volatility and Stochastic Interest Rate. In Cheng F. Lee, Alice C. Lee, and John Lee, ed.: Handbook of Quantitative Finance and Risk Management (Springer, Singapore).
Baltagi, B. (2008). Econometric Analysis of Panel Data (4th ed.). Wiley.
Black, F. and Scholes, M. (1973). The Pricing of Options and Corporate Liabilities. Journal of Political Economy, 81(3), 637–654.
Black, F., Jensen, M. C. and Scholes, M. (1972). The Capital Asset Pricing Model: Some Empirical Tests. In M. C. Jensen, ed.: Studies in the Theory of Capital Markets (Praeger).
Blume, M. E. and Friend, I. (1973). A New Look at the Capital Asset Pricing Model. Journal of Finance, 28, 19–33.
Bradley, P. S. and Mangasarian, O. L. (2000). Massive Data Discrimination via Linear Support Vector Machines. Optimization Methods and Software, 13(1), 1–10.
Brick, I. E., Palmon, O. and Patro, D. K. (2015). The Motivations for Issuing Putable Debt: An Empirical Analysis. In Cheng F. Lee and John Lee, ed.: Handbook of Financial Econometrics and Statistics (Springer, Singapore).
Brown, S. J. and Goetzmann, W. N. (1997). Mutual Fund Styles. Journal of Financial Economics, 43(3), 373–399.
Bzdok, D., Altman, N. and Krzywinski, M. (2018). Statistics Versus Machine Learning. Nature Methods, 15(4), 233–234.
Cameron, A. C., Gelbach, J. B. and Miller, D. L. (2011). Robust Inference with Multiway Clustering. Journal of Business & Economic Statistics, 29, 238–249.
Carr, P. and Madan, D. (1999). Option Valuation Using the Fast Fourier Transform. Journal of Computational Finance, 2(4), 61–73.
Cederburg, S. and O’Doherty, M. S. (2015). Asset-Pricing Anomalies at the Firm Level. Journal of Econometrics, 186, 113–128.
Chacko, G. and Viceira, L. M. (2003). Spectral GMM Estimation of Continuous-time Processes. Journal of Econometrics, 116(1), 259–292.
Chang, C. F. (1999). Determinants of Capital Structure and Management Compensation: The Partial Least Squares Approach. Ph.D. Dissertation, Rutgers University.
Chang, C., Lee, A. C. and Lee, C. F. (2009). Determinants of Capital Structure Choice: A Structural Equation Modeling Approach. Quarterly Review of Economics and Finance, 49(2), 197–213.
Chang, H. S. and Lee, C. F. (1977). Using Pooled Time-series and Cross Section Data to Test the Firm and Time Effects in Financial Analysis. Journal of Financial and Quantitative Analysis, 12, 457–471.
Chen, K. H. and Shimerda, T. A. (1981). An Empirical Analysis of Useful Finance Ratios. Financial Management, 10(1), 51–60.
Chen, H. Y. (2011). Momentum Strategies, Dividend Policy, and Asset Pricing Test. Ph.D. Dissertation, Rutgers University.
Chen, H. Y., Lee, A. C. and Lee, C. F. (2015). Alternative Errors-in-variables Models and the Applications in Finance Research. The Quarterly Review of Economics and Finance, 58, 213–227.
Chen, H. Y., Gupta, M. C., Lee, A. C. and Lee, C. F. (2013). Sustainable Growth Rate, Optimal Growth Rate, and Optimal Payout Ratio: A Joint Optimization Approach. Journal of Banking and Finance, 37, 1205–1222.
Chen, S. N. and Lee, C. F. (1981). The Sampling Relationship between Sharpe’s Performance Measure and its Risk Proxy: Sample Size, Investment Horizon and Market Conditions. Management Science, 27(6), 607–618.
Chen, W. P., Chung, H., Lee, C. F. and Liao, W. L. (2007). Corporate Governance and Equity Liquidity: Analysis of S&P Transparency and Disclosure Rankings. Corporate Governance: An International Review, 15(4), 644–660.
Cheng, D. C. and Lee, C. F. (1986). Power of Alternative Specification Errors Tests in Identifying Misspecified Market Models. The Quarterly Review of Economics and Business, 16(3), 6–24.
Cheng, K. F., Chu, C. K. and Hwang, R. C. (2010). Predicting Bankruptcy Using the Discrete-time Semiparametric Hazard Model. Quantitative Finance, 10, 1055–1066.
Chow, G. C. (1960). Tests of Equality between Sets of Coefficients in Two Linear Regressions. Econometrica, 28, 591–605.
Chu, C. C. (1984). Alternative Methods for Determining the Expected Market Risk Premium: Theory and Evidence. Ph.D. Dissertation, University of Illinois at Urbana-Champaign.
Core, J. E. (2000). The Directors’ and Officers’ Insurance Premium: An Outside Assessment of the Quality of Corporate Governance. Journal of Law, Economics, and Organization, 16(2), 449–477.
Cox, J. C., Ross, S. A. and Rubinstein, M. (1979). Option Pricing: A Simplified Approach. Journal of Financial Economics, 7(3), 229–263.
Crook, J. N., Edelman, D. B. and Thomas, L. C. (2007). Recent Developments in Consumer Credit Risk Assessment. European Journal of Operational Research, 183(3), 1447–1465.
de Leeuw, F. (1965). A Model of Financial Behavior. In The Brookings Quarterly Econometric Model of the United States (J. S. Duesenberry, G. Fromm, L. R. Klein, and E. Kuh, eds.), 465–532. Chicago: Rand McNally.
Demyanyk, Y. and Hasan, I. (2010). Financial Crises and Bank Failures: A Review of Prediction Methods. Omega, 38(5), 315–324. doi:10.1016/j.omega.2009.09.007.
Fair, R. C. and Jaffee, D. M. (1972). Methods of Estimation for Markets in Disequilibrium. Econometrica, 40, 497–514.
Fama, E. F. (1971). Parameter Estimates for Symmetric Stable Distributions. Journal of the American Statistical Association, 66, 331–338.
Fama, E. F. and MacBeth, J. D. (1973). Risk, Return, and Equilibrium: Empirical Tests. Journal of Political Economy, 81, 607–636.
Fok, R. C. W., Lee, C. F. and Cheng, D. C. (1996). Alternative Specifications and Estimation Methods for Determining Random Beta Coefficients: Comparison and Extensions. Journal of Financial Studies, 4(2), 61–88.
Frecka, T. J. and Lee, C. F. (1983). Generalized Financial Ratio Adjustment Processes and their Implications. Journal of Accounting Research, 21, 308–316.
Garrido, F., Verbeke, W. and Bravo, C. (2018). A Robust Profit Measure for Binary Classification Model Evaluation. Expert Systems with Applications, 92, 154–160. doi:10.1016/j.eswa.2017.09.045.
Granger, C. W. J. and Newbold, P. (1973). Some Comments on the Evaluation of Economic Forecasts. Applied Economics, 5(1), 35–47.
Granger, C. W. J. and Newbold, P. (1974). Spurious Regressions in Econometrics. Journal of Econometrics, 2, 111–120.
Granger, C. W. J. and Ramanathan, R. (1984). Improved Methods of Combining Forecasts. Journal of Forecasting, 3, 197–204.
Gu, S., Kelly, B. and Xiu, D. (2018). Empirical Asset Pricing via Machine Learning. Technical Report No. 18-04, Chicago Booth Research Paper.
Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press.
Hansen, B. E. (1996). Inference when a Nuisance Parameter is not Identified under the Null Hypothesis. Econometrica, 64(2), 413–430.
Hansen, B. E. (1997). Approximate Asymptotic P-values for Structural Change Tests. Journal of Business and Economic Statistics, 15, 60–67.
Hansen, B. E. (1999). Threshold Effects in Non-dynamic Panels: Estimation, Testing, and Inference. Journal of Econometrics, 93, 345–368.
Hansen, B. E. (2000a). Sample Splitting and Threshold Estimation. Econometrica, 68(3), 575–603.
Hansen, B. E. (2000b). Testing for Structural Change in Conditional Models. Journal of Econometrics, 97, 93–115.
Hansen, L. P. (1982). Large Sample Properties of Generalized Method of Moments Estimators. Econometrica, 50(4), 1029–1054.
Harvey, C. R., Liu, Y. and Zhu, H. (2016). … and the Cross-Section of Expected Returns. The Review of Financial Studies, 29(1), 5–68.
Heston, S. L. (1993). A Closed-form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options. Review of Financial Studies, 6(2), 327–343.
Hsiao, C. (2014). Analysis of Panel Data (3rd ed.). Econometric Society Monographs, Cambridge University Press.
Hull, J. C. (2018). Options, Futures, and Other Derivatives (10th ed.). Prentice Hall.
Hutchinson, J. M., Lo, A. W. and Poggio, T. (1994). A Non-parametric Approach to Pricing and Hedging Derivative Securities via Learning Networks. Journal of Finance, 49(3), 851–889.
Hwang, R. C., Wei, H. C., Lee, J. C. and Lee, C. F. (2008). On Prediction of Financial Distress Using the Discrete-time Survival Model. Journal of Financial Studies, 16, 99–129.
Hwang, R. C., Chung, H. and Chu, C. K. (2010). Predicting Issuer Credit Ratings Using a Semiparametric Method. Journal of Empirical Finance, 17(1), 120–137.
Hwang, R. C., Cheng, K. F. and Lee, C. F. (2009). On Multiple-class Prediction of Issuer Credit Ratings. Journal of Applied Stochastic Models in Business and Industry, 25, 535–550.
Hwang, R. C., Cheng, K. F. and Lee, J. C. (2007). A Semiparametric Method for Predicting Bankruptcy. Journal of Forecasting, 26, 317–342.
Ittner, C. D., Larcker, D. F. and Rajan, M. V. (1997). The Choice of Performance Measures in Annual Bonus Contracts. Accounting Review, 72(2), 231–255.
Jegadeesh, N., Noh, J., Pukthuanthong, K., Roll, R. and Wang, J. L. (2019). Empirical Tests of Asset Pricing Models with Individual Assets: Resolving the Errors-in-variables Bias in Risk Premium Estimation. Journal of Financial Economics, 133(2), 273–298.
Kao, L. J. and Lee, C. F. (2012). Alternative Method for Determining Industrial Bond Ratings: Theory and Empirical Evidence. International Journal of Information Technology & Decision Making, 11, 1215–1235.
Kau, J. B., Lee, C. F. and Sirmans, C. F. (1986). Urban Econometrics: Model Developments and Empirical Results. Research in Urban Economics, Vol. 6, JAI Press.
Kau, J. B. and Lee, C. F. (1976). Functional Form, Density Gradient and the Price Elasticity of Demand for Housing. Urban Studies, 13(2), 181–192.
Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley-Interscience.
Keerthi, S. S. and Lin, C.-J. (2003). Asymptotic Behaviors of Support Vector Machines with Gaussian Kernel. Neural Computation, 15(7), 1667–1689.
Kim, D. (1995). The Errors in the Variables Problem in the Cross-section of Expected Stock Returns. Journal of Finance, 50(5), 1605–1634.
Kim, D. (1997). A Reexamination of Firm Size, Book-to-market, and Earnings Price in the Cross-section of Expected Stock Returns. Journal of Financial and Quantitative Analysis, 32(4), 463–489.
Kim, D. (2010). Issues Related to the Errors-in-variables Problems in Asset Pricing Tests. In Handbook of Quantitative Finance and Risk Management, Part V, Chapter 70, 1091–1108.
Kingma, D. P. and Ba, J. (2014). Adam: A Method for Stochastic Optimization. Paper presented at the International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA. https://arxiv.org/abs/1412.6980.
Kozberg, A. (2004). Using Path Analysis to Integrate Accounting and Non-Financial Information: The Case for Revenue Drivers of Internet Stocks. Advances in Quantitative Analysis of Finance and Accounting, 33–63.
Kuan, C. M. and Hornik, K. (1995). The Generalized Fluctuation Test: A Unifying View. Econometric Reviews, 14, 135–161.
Kumar, P. R. and Ravi, V. (2007). Bankruptcy Prediction in Banks and Firms via Statistical and Intelligent Techniques — A Review. European Journal of Operational Research, 180(1), 1–28. doi:10.1016/j.ejor.2006.08.043.
Lambert, R. and Larcker, D. (1987). An Analysis of the Use of Accounting and Market Measures of Performance in Executive Compensation Contracts. Journal of Accounting Research, 25, 85–125.
Lee, C. F. and Lee, J. C. (2015). Handbook of Financial Econometrics and Statistics, Vol. 1, Springer Reference, New York.
Lee, A. (1996). Cost of Capital and Equity Offerings in the Insurance Industry. Ph.D. Dissertation, University of Pennsylvania.
Lee, A. C. and Cummins, J. D. (1998). Alternative Models for Estimating the Cost of Capital for Property/casualty Insurers. Review of Quantitative Finance and Accounting, 10(3), 235–267.
Lee, C. F. (1973). Errors-in-variables Estimation Procedures with Applications to a Capital Asset Pricing Model. Ph.D. Dissertation, The State University of New York at Buffalo.
Lee, C. F. (1976a). A Note on the Interdependent Structure of Security Returns. Journal of Financial and Quantitative Analysis, 11, 73–86.
Lee, C. F. (1976b). Functional Form and the Dividend Effect of the Electric Utility Industry. Journal of Finance, 31(5), 1481–1486.
Lee, C. F. (1977a). Functional Form, Skewness Effect and the Risk-return Relationship. Journal of Financial and Quantitative Analysis, 12, 55–72.
Lee, C. F. (1977b). Performance Measure, Systematic Risk and Errors-in-variable Estimation Method. Journal of Economics and Business, 122–127.
Lee, C. F., Lee, A. C. and Lee, J. (2019). Handbook of Financial Econometrics, Mathematics, Statistics, and Technology, World Scientific, Singapore, Forthcoming.
Lee, C. F., Lee, A. C. and Lee, J. (2010). Handbook of Quantitative Finance and Risk Management, Springer, New York.
Lee, C. F. and Wu, C. C. (1985). The Impacts of Kurtosis on Risk Stationarity: Some Empirical Evidence. Financial Review, 20(4), 263–269.
Lee, C. F. and Zumwalt, J. K. (1981). Associations between Alternative Accounting Profitability Measures and Security Returns. Journal of Financial and Quantitative Analysis, 16, 71–93.
Lee, C. F., Wu, C. C. and Wei, K. C. J. (1990). Heterogeneous Investment Horizon and Capital Asset Pricing Model: Theory and Implications. Journal of Financial and Quantitative Analysis, 25, 361–376.
Lee, C. F., Tsai, C. M. and Lee, A. C. (2013). Asset Pricing with Disequilibrium Price Adjustment: Theory and Empirical Evidence. Quantitative Finance, 13, 227–239.
Lee, C. F., Wei, K. C. J. and Bubnys, E. L. (1989). The APT versus the Multi-factor CAPM: Empirical Evidence. Quarterly Review of Economics and Business, Vol. 29.
Lee, C. F., Gupta, M. C., Chen, H. Y. and Lee, A. C. (2011). Optimal Payout Ratio under Uncertainty and the Flexibility Hypothesis: Theory and Empirical Evidence. Journal of Corporate Finance, 17(3), 483–501.
Lee, C. F., Newbold, P., Finnerty, J. E. and Chu, C. C. (1986). On Accounting-based, Market-based and Composite-based Beta Predictions: Method and Implications. Financial Review, 21, 51–68.
Lee, C. F. and Jen, F. C. (1978). Effects of Measurement Errors on Systematic Risk and Performance Measure of a Portfolio. Journal of Financial and Quantitative Analysis, 13(2), 299–312.
Lee, C. F., Finnerty, J., Lee, J. C., Lee, A. C. and Wort, D. (2013). Security Analysis, Portfolio Management, and Financial Derivatives (3rd ed.). World Scientific Publishing.
Lee, C. F., Chen, Y. and Lee, J. (2016). Alternative Methods to Derive Option Pricing Models: Review and Comparison. Review of Quantitative Finance and Accounting, 47(2), 417–451.
Lee, K. W. and Lee, C. F. (2014). Are Multiple Directorships Beneficial in East Asia? Accounting and Finance, 54, 999–1032.
Lee, C. F., Liang, W. L., Lin, F. L. and Yang, Y. (2016). Applications of Simultaneous Equations in Finance Research: Methods and Empirical Results. Review of Quantitative Finance and Accounting, 47(4), 943–971.
Lee, M., Teng, H. W. and Kao, L. J. (2019). Machine Learning for Predicting Default of Credit Card Holders and Success of Kickstarters. Working Paper.
Leon, A. and Vaello-Sebastia, A. (2009). American GARCH Employee Stock Option Valuation. Journal of Banking and Finance, 33(6), 1129–1143.
Lien, D. (2010). A Note on the Relationship between the Variability of the Hedge Ratio and Hedging Performance. Journal of Futures Markets, 30(11), 1100–1104.
Lien, D. and Shrestha, K. (2007). An Empirical Analysis of the Relationship between Hedge Ratio and Hedging Horizon Using Wavelet Analysis. Journal of Futures Markets, 27(2), 127–150.
Lin, F. C., Chien, C. C., Lee, C. F., Lin, H. C. and Lin, Y. C. (2019). Tradeoff between Reputation Concerns and Economic Dependence for Auditors: A Threshold Regression Approach. In Handbook of Financial Econometrics, Mathematics, Statistics, and Technology, World Scientific, Singapore, Forthcoming.
Liu, B. (2006). Two Essays in Financial Econometrics: I. Functional Forms and Pricing of Country Funds; II. The Term Structure Model of Inflation Risk Premia. Ph.D. Dissertation, Rutgers University.
Maddala, G. S. and Rao, C. R. (1996). Handbook of Statistics 14: Statistical Methods in Finance. Elsevier Science & Technology.
Martin, C. (1990). Corporate Borrowing and Credit Constraints: Structural Disequilibrium Estimates for the UK. Review of Economics and Statistics, 72(1), 78–86.
Mayer, W. J. (1989). Estimating Disequilibrium Models with Limited a Priori Price-adjustment Information. Journal of Econometrics, 41(3), 303–320.
McLean, R. D. and Pontiff, J. (2016). Does Academic Research Destroy Stock Return Predictability? Journal of Finance, 71(1), 5–32.
Merton, R. C. (1973). Theory of Rational Option Pricing. The Bell Journal of Economics and Management Science, 4(1), 141–183.
Meyer, S., Schmoltzi, D., Stammschulte, C., Kaesler, S., Loos, B. and Hackethal, A. (2012). Just Unlucky? — A Bootstrapping Simulation to Measure Skill in Individual Investors’ Investment Performance. Working Paper, Goethe University, Frankfurt.
Miller, M. and Modigliani, F. (1966). Some Estimates of the Cost of Capital to the Electric Utility Industry, 1954–57. American Economic Review, 56(3), 333–391.
Mitchell, T. (1997). Machine Learning. McGraw Hill.
Myers, R. J. (1991). Estimating Time-varying Optimal Hedge Ratios on Futures Markets. Journal of Futures Markets, 11(1), 39–53.
Nasekin, S. and Chen, C. Y. H. (2019). Deep Learning-Based Cryptocurrency Sentiment Construction (December 10, 2018). SSRN: https://ssrn.com/abstract=3310784 or http://dx.doi.org/10.2139/ssrn.3310784.
Newey, W. K. and West, K. D. (1987). A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica, 55(3), 703–708.
Ohlson, J. A. (1980). Financial Ratios and the Probabilistic Prediction of Bankruptcy. Journal of Accounting Research, 18(1), 109–131.
Petersen, M. A. (2009). Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches. Review of Financial Studies, 22, 435–480.
Pike, R. and Sharp, J. (1989). Trends in the Use of Management Science Techniques in Capital Budgeting. Managerial and Decision Economics, 10, 135–140.
Pinches, G. E. and Mingo, K. A. (1973). A Multivariate Analysis of Industrial Bond Ratings. Journal of Finance, 28(1), 1–18.
Quandt, R. E. (1988). The Econometrics of Disequilibrium. Basil Blackwell Inc., New York.
Rendleman, R. J. Jr. and Bartter, B. J. (1979). Two-state Option Pricing. Journal of Finance, 34, 1093–1110.
Riahi-Belkaoui, A. and Pavlik, E. (1993). Effects of Ownership Structure, Firm Performance, Size and Diversification Strategy on CEO Compensation: A Path Analysis. Managerial Finance, 19(2), 33–54.
Rubinstein, M. (1994). Implied Binomial Trees. Journal of Finance, 49, 771–818.
Schroder, M. (1989). Computing the Constant Elasticity of Variance Option Pricing Formula. Journal of Finance, 44(1), 211–219.
Sealey, C. W. Jr. (1979). Credit Rationing in the Commercial Loan Market: Estimates of a Structural Model under Conditions of Disequilibrium. Journal of Finance, 34, 689–702.
Sears, R. S. and Wei, K. C. J. (1988). The Structure of Skewness Preferences in Asset Pricing Model with Higher Moments: An Empirical Test. Financial Review, 23(1), 25–38.
Shapiro, A. F. (2005). Fuzzy Regression Models. Working Paper, Penn State University.
Shumway, T. (2001). Forecasting Bankruptcy More Accurately: A Simple Hazard Model. The Journal of Business, 74, 101–124.
Spies, R. R. (1974). The Dynamics of Corporate Capital Budgeting. Journal of Finance, 29, 29–45.
Teng, H. W. and Kao, L. J. (2019). Machine Learning versus Statistical Learning: Applications in FinTech. Working Paper.
Thomas, L. C. (2000). A Survey of Credit and Behavioral Scoring: Forecasting Financial Risk of Lending to Consumers. International Journal of Forecasting, 16, 149–172. doi:10.1016/S0169-2070(00)00034-0.
Thompson, S. B. (2011). Simple Formulas for Standard Errors that Cluster by Both Firm and Time. Journal of Financial Economics, 99(1), 1–10.
Thursby, J. G. (1985). The Relationship among the Specification Error Tests of Hausman, Ramsey and Chow. Journal of the American Statistical Association, 80(392), 926–928.
Titman, S. and Wessels, R. (1988). The Determinants of Capital Structure Choice. Journal of Finance, 43(1), 1–19.
Tsai, G. M. (2005). Alternative Dynamic Capital Asset Pricing Models: Theories and Empirical Results. Ph.D. Dissertation, Rutgers University.
Van Der Klaauw, W. (2002). Estimating the Effect of Financial Aid Offers on College Enrollment: A Regression-discontinuity Approach. International Economic Review, 43(4), 1249–1287.
Verbraken, T., Bravo, C., Weber, R. and Baesens, B. (2014). Development and Application of Consumer Credit Scoring Models Using Profit-based Classification Measures. European Journal of Operational Research, 238(2), 505–513. doi:10.1016/j.ejor.2014.04.001.
Wei, K. C. (1984). The Arbitrage Pricing Theory versus the Generalized Intertemporal Capital Asset Pricing Model: Theory and Empirical Evidence. Ph.D. Dissertation, University of Illinois at Urbana-Champaign.
White, H. (1980). A Heteroskedasticity-consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica, 48(4), 817–838.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). The MIT Press.
Yang, C. C. (1989). The Impact of New Equity Financing on Firms’ Investment, Dividend and Debt-financing Decisions. Ph.D. Dissertation, University of Illinois at Urbana-Champaign.
Yang, C. C., Lee, C. F., Gu, Y. X. and Lee, Y. W. (2010). Co-determination of Capital Structure and Stock Returns — A LISREL Approach: An Empirical Test of Taiwan Stock Markets. Quarterly Review of Economics and Finance, 50(2), 222–233.
Zeileis, A., Leisch, F., Hornik, K. and Kleiber, C. (2002). strucchange: An R Package for Testing for Structural Change in Linear Regression Models. Journal of Statistical Software, 7, 1–38.
Zopounidis, C., Doumpos, M. and Matsatsinis, N. F. (1997). On the Use of Knowledge-based Decision Support Systems in Financial Management: A Survey. Decision Support Systems, 20(3), 259–277. doi:10.1016/S0167-9236(97)00002-X.
Appendix 1A: Abstracts and Keywords for Chapters 2 to 131

Chapter 1: Introduction

In this chapter we first discuss the overall literature related to financial econometrics, mathematics, statistics, and machine learning. Then we give a brief review of all the chapters included in this handbook in accordance with the subject classification and methodology classification. In Appendix 1A, we list the abstracts and keywords of all of the papers included in this book in detail.

Chapter 2: Do Managers Use Earnings Forecasts to Fill a Demand They Perceive from Analysts?

This paper examines how the nature of the information possessed by individual analysts influences managers’ decisions to issue forecasts and the consequences of those decisions. Our analytical model yields the prediction that managers prefer to issue guidance when they perceive their private information to be more precise, and analysts possess mostly common, imprecise information (i.e., there is high commonality and uncertainty). Based on an econometric model, we obtain theory-based analyst variables, and our empirical evidence confirms our predictions. High commonality and uncertainty in analysts’ prior information are accompanied by increases in analysts’ forecast revisions and trading volume following guidance, consistent with greater analyst incentives to generate idiosyncratic information. Yet, management guidance increases only with the commonality contained in analysts’ predisclosure information, but not with the level of uncertainty. Indeed, the disclosure propensity among a subset of firms (those with less able managers, bad news, and infrequent forecasts) has an inverse relationship with analyst uncertainty, because that uncertainty reflects the low precision of management information. Our results are robust to a variety of alternative analyses, including the use of propensity-score matched pairs with similar disclosure environments but differing degrees of commonality and uncertainty among analysts.
We also demonstrate that the use of forecast dispersion as an empirical proxy for analysts’ prior information may lead to erroneous inferences. Overall, we define and support improved measures of analyst information environment based on an econometric model and find that the commonality of information among analysts acts as a reliable forecast antecedent by informing managers about the amount of idiosyncratic information in the market.
page 32
July 6, 2020
10:14
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch01
Introduction
33
Keywords: Management earnings forecasts, Analysts' information, Uncertainty, Commonality.

Chapter 3: A Potential Benefit of Increasing Book–Tax Conformity: Evidence from the Reduction in Audit Fees

Our study explores a possible benefit of conforming book income to taxable income. We expect that increased book–tax conformity can reduce audit fees by simplifying tax accruals and increasing tax authorities' monitoring, which reduce audit workload and audit risk, respectively. Consistent with our expectations, we find that a higher country-level requirement of book–tax conformity leads to lower audit fees. Moreover, firm-level book–tax differences are positively associated with audit fees. We also find that the negative association between the country-level requirement of book–tax conformity and audit fees is mitigated among firms with larger book–tax differences. Our findings are robust to including country-level legal investor protection or other extra-legal institutions. Overall, our results suggest that one benefit of increasing book–tax conformity is the reduction in audit fees. In the appendix, we extend our main empirical test by including firm fixed effects and clustering the standard errors of the regression coefficients, and we find that these do not change our conclusions.

Keywords: Audit fee, Book–tax conformity, Book–tax difference, Legal institution, Extra-legal institution.

Chapter 4: Gold in Portfolio: A Long-Term or Short-Term Diversifier?

The purpose of this chapter is to evaluate the role played by gold in a diversified portfolio composed of bonds and stocks. Continuous wavelet transform analysis is applied to capture the correlation between gold and other risky assets at specific time horizons to determine whether gold should be included in a diversified portfolio. This chapter uses US stock, bond, and gold data from 1990 until 2013 to investigate the optimal weights of gold obtained from the minimum variance portfolio.
Empirical findings offer little evidence that gold acts as an efficient diversifier in a traditional stock and bond portfolio. Gold was typically a long-term diversifier in the traditional bond-and-stock portfolio only before the early 2000s, and acts as a short-term diversifier in times of
crisis periods. The significant drop in the long-term weight of gold indicates that gold loses much of its long-term role in the diversified portfolio. These findings are useful for portfolio managers to justify gold's diversification benefits over different investment horizons.

Keywords: Gold, Commodity diversifier, Investment horizons, Continuous wavelet analysis.

Chapter 5: Econometric Approach to Financial Analysis, Planning, and Forecasting

In this chapter, we first review the basic models related to simultaneous equations, such as 2SLS, 3SLS, and SUR. Then we discuss how to estimate different kinds of simultaneous equation models. The application of these models in financial analysis, planning, and forecasting is also explored. The simultaneity and dynamics of corporate budgeting are explored in detail in terms of data from Johnson & Johnson.

Keywords: Simultaneous econometric models, Endogenous variables, Exogenous variables, Two-stage least squares (2SLS), SUR, Linear programming, Goal programming, Dynamic capital budgeting decision.

Chapter 6: Forecast Performance of the Taiwan Weighted Stock Index: Update and Expansion

This research introduces the following to establish a TAIEX prediction model: intervention analysis integrated into the ARIMA-GARCH model, ECM, intervention analysis integrated into the transfer function model, the simple average combination forecasting model, and the minimum error combination forecasting model. The results show that intervention analysis integrated into the transfer function model yields a more accurate prediction model than ECM and intervention analysis integrated into the ARIMA-GARCH model. The minimum error combination forecasting model improves prediction accuracy much more than the noncombination models and also maintains robustness.
Intervention analysis integrated into the transfer function model shows that the TAIEX is affected by external factors: the INDU, the exchange rate, and the consumer price index. Therefore, facing different TAIEX conditions, the government could pursue macroeconomic policies to reach its policy goals.
Keywords: ARIMA-GARCH model, Transfer function model, Intervention, Combination forecasting model, Outlier detection, Error correction model, MAPE, RMSE, Turning point method.

Chapter 7: Parametric, Semi-Parametric, and Non-Parametric Approaches for Option-Bound Determination: Review and Comparison

Based upon Ritchken (1985), Levy (1985), Lo (1987), Zhang (1994), Jackwerth and Rubinstein (1996), and others, this chapter discusses alternative methods for determining option bounds in terms of the first two moments of the distribution. These approaches include the stochastic dominance method and the linear programming method; we then discuss semi-parametric and non-parametric methods for option-bound determination. Finally, we incorporate both skewness and kurtosis explicitly by extending Zhang (1994) to provide bounds for the prices of the expected payoffs of options, given the first two moments together with skewness and kurtosis.

Keywords: Option-bound, Upper bound, Lower bound, Expected payoff, Simulation, Skewness, Kurtosis, Stochastic dominance, Linear programming, Non-parametric method, Semi-parametric method, Arbitrage theory.

Chapter 8: Measuring the Collective Correlation of a Large Number of Stocks

Market makers or liquidity providers play a central role in the operation of stock markets. In general, these agents execute contrarian strategies, so their profitability depends on the distribution of stock returns across the market. The more widespread the distribution is, the more arbitrage opportunities are available. This implies that the collective correlation of stocks is an indicator of possible turmoil in the market. This paper proposes a novel approach to measuring the collective correlation of the stock market, with a network as a tool for extracting information. The market network can be easily constructed by digitizing pairwise correlations.
As the number of stocks becomes very large, the network can be approximated by an exponential random graph model, under which the clustering coefficient of the market network is a natural candidate for measuring the collective correlation of the stock market. With a sample of S&P 500 components in the period from January 1996 to August 2009, we show that the clustering coefficient can be
used as an alternative risk measure in addition to volatility. Furthermore, investigations of higher-order statistics also reveal distinctions in the clustering effect between bear markets and bull markets.

Keywords: Collective correlation, Correlation breakdown, Dimension reduction, Partition function, Random graph.

Chapter 9: Key Borrowers Detected by the Intensities of Their Interactions

We propose a novel method to estimate the level of interconnectedness of a financial institution or system, as the measures currently suggested in the literature do not fully take into consideration an important aspect of interconnectedness: the group interactions of agents. Our approach is based on the power index and centrality analysis and is employed to find a key borrower in a loan market. It has three distinctive features: it considers long-range interactions among agents, agents' attributes, and the possibility that an agent is affected by a group of other agents. This approach allows us to identify systemically important elements that cannot be detected by classical centrality measures or other indices. The proposed method is employed to analyze banking foreign claims as of 1Q 2015. Using our approach, we detect two types of key borrowers: (a) major players with high ratings and a positive credit history; and (b) intermediary players, which conduct large-scale financial activities by organizing favorable investment conditions and a positive business climate.

Keywords: Power index, Key borrower, Systemic importance, Interconnectedness, Centrality, S-long-range interactions, Paths.

Chapter 10: Application of the Multivariate Average F-Test to Examine Relative Performance of Asset Pricing Models with Individual Security Returns

The standard multivariate test of Gibbons et al.
(1989) used in studies examining relative performance of alternative asset pricing models requires the number of stocks to be less than the number of time series observations, which requires stocks to be grouped into portfolios. This results in a loss of disaggregate stock information. We apply a new statistical test to get around this problem. We find that the multivariate average F -test developed
by Hwang and Satchell (2014) has superior power to discriminate among competing models and does not reject tested models altogether, unlike the standard multivariate test. Application of the multivariate average F-test to examine the relative performance of asset pricing models demonstrates that a parsimonious 6-factor model with the market, size, orthogonal value, profitability, investment, and momentum factors outperforms all other models.

Keywords: Multivariate F-test, Asset pricing, Factor models, Market portfolio, Size, Book-to-market, Momentum, Investment, Profitability, Liquidity risk, Factor loadings, GRS test.

Chapter 11: Hedge Ratio and Time Series Analysis

This chapter discusses both static and dynamic hedge ratios in detail. In the static analysis, we discuss the minimum-variance hedge ratio, the Sharpe hedge ratio, and the optimum mean-variance hedge ratio. In addition, several time series methods, such as the multivariate skew-normal distribution method, the autoregressive conditional heteroskedasticity (ARCH) and generalized autoregressive conditional heteroskedasticity (GARCH) methods, the regime-switching GARCH model, and the random coefficient method, are used to show how the hedge ratio can be estimated.

Keywords: Hedge ratio, Minimum variance hedge ratio, CARA utility function, Optimum mean variance hedge ratio, Sharpe hedge ratio, Maximum mean extended-Gini coefficient hedge ratio, Optimum mean MEG hedge ratio, Minimum generalized semi-variance hedge ratio, Minimum value-at-risk hedge ratio, Multivariate skew-normal distribution method, ARCH method, GARCH method, Regime-switching GARCH method, Random coefficient method, Co-integration and error correction method, Hedging effectiveness.

Chapter 12: Application of Intertemporal CAPM on International Corporate Finance

This chapter discusses both the intertemporal asset pricing model and the international asset pricing model (IAPM) in detail.
For the intertemporal asset pricing model, we discuss the Campbell (1993) model, in which investors are assumed to be endowed with Kreps–Porteus utility and consumption is substituted out of the model. In addition, this chapter extends Campbell's (1993)
model to develop an intertemporal IAPM. We show that the expected international asset return is determined by a weighted average of market risk, market hedging risk, exchange rate risk, and exchange rate hedging risk. A test of the conditional version of our intertemporal IAPM using a multivariate GARCH process supports the asset pricing model. We find that exchange rate risk is important for pricing international equity returns and is much more important than intertemporal hedging risk.

Keywords: International finance, Asset pricing, Currency risk, Intertemporal, Log-linear budget constraint, Euler equations, Non-expected utility.

Chapter 13: What Drives Variation in the International Diversification Benefits? A Cross-Country Analysis

In this chapter, we show that, as the world becomes increasingly integrated, the benefits of global diversification remain positive and economically significant over time. Both regression analysis and explanatory power tests show that international integration, measured by the adjusted R² from a multifactor model, has a more profound impact on the diversification benefits than correlation. Our results support Roll's (2013) argument that R², but not correlation, is an appropriate measure of market integration. We examine the impact of market integration determinants such as default risk, inflation, the TED spread, past local equity market return, liquidity, and the relative performance of the domestic portfolio on the potential diversification benefits.

Keywords: Diversification benefits, Investment constraints, International portfolio.

Chapter 14: A Heteroskedastic Black–Litterman Portfolio Optimization Model with Views Derived from a Predictive Regression

The modern portfolio theory of Markowitz (1952) is a cornerstone of investment management, but its implementation is challenging in that the optimal portfolio weight is extremely sensitive to the estimates of the mean and covariance of the asset returns.
As a sophisticated modification, the Black–Litterman portfolio model allows the optimal portfolio's weight to rely on a combination of the implied market equilibrium returns and investors' views (Black and Litterman, 1991). However, the performance of a Black–Litterman model is closely related to investors' views and the estimated
covariance matrix. To overcome these problems, we first propose a predictive regression to form investors' views, where asset returns are regressed against their lagged values and the market return. Second, motivated by the stylized features of volatility clustering, heavy-tailed distributions, and leverage effects, we estimate the covariance of asset returns via heteroskedastic models. Empirical analysis using five industry indexes in the Taiwan stock market shows that the proposed portfolio outperforms existing ones in terms of cumulative returns.

Keywords: Markowitz modern portfolio theory, Black–Litterman model, GARCH model, EGARCH model, The investor's views, Volatility clustering.

Chapter 15: Pricing Fair Deposit Insurance: Structural Model Approach

In this chapter, we propose a structural model in terms of the Stair Tree model and a barrier option to evaluate the fair deposit insurance premium in accordance with the constraints of deposit insurance contracts and the consideration of bankruptcy costs. First, we show that the deposit insurance model in Brockman and Turtle (2003) is a special case of our model. Second, the simulation results suggest that insurers should adopt a forbearance policy instead of a strict policy for closure regulation to avoid losses from bankruptcy costs. An appropriate deposit insurance premium can alleviate potential moral hazard problems caused by a forbearance policy. Our simulation results can be used as a reference in risk management for individual banks and for the Federal Deposit Insurance Corporation (FDIC).

Keywords: Deposit insurance, FDIC, Bankruptcy, Moral hazard, Insurance premium, Tree model, Barrier option, Policy.
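The structural pricing idea behind this chapter can be illustrated with the classic Merton (1977) benchmark, in which the deposit guarantee is a European put on bank assets. This is a simpler sketch without the Stair Tree or barrier features of the chapter's model, and the input numbers are hypothetical:

```python
from math import log, sqrt, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def merton_deposit_insurance(V, D, sigma, T):
    """Merton (1977) benchmark: fair premium per dollar of insured
    deposits, given bank asset value V, insured deposits D, asset
    volatility sigma, and time T until the next audit.  The guarantee
    is a European put on V struck at D; the premium is the put's
    value per dollar of deposits."""
    d1 = (log(V / D) + 0.5 * sigma ** 2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    put = D * norm_cdf(-d2) - V * norm_cdf(-d1)
    return put / D

# Hypothetical bank: assets 20% above insured deposits, audited yearly
premium = merton_deposit_insurance(V=1.2, D=1.0, sigma=0.1, T=1.0)
```

As expected, the premium rises with asset volatility and falls as the capital cushion V/D grows, which is the economic mechanism behind the moral hazard discussion above.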
Chapter 16: Application of Structural Equation Modeling in Behavioral Finance: A Study on the Disposition Effect

Studies on behavioral finance argue that cognitive/emotional biases could influence investors' decisions and result in the disposition effect, wherein investors have the tendency to sell winning stocks too early and hold losing stocks too long. In this regard, this study proposes a conceptual model to examine the relationship among cognitive/emotional biases, the disposition effect, and investment performance. Cognitive/emotional biases mainly consist of mental accounting, regret avoidance, and self-control. Furthermore,
this study examines whether gender and marital status moderate the relationship between these biases and the disposition effect by collecting quantitative data through a questionnaire survey and employing a structural equation modeling (SEM) approach for the estimation procedure. The results of this study show that mental accounting has the most significant influence on the disposition effect, which implies that prospect theory is an alternative to expected utility theory in accounting for investors' behavior. The findings of the moderating analysis indicate that female investors display a larger disposition effect than male investors.

Keywords: Structural equation model, Behavioral finance, Cognitive biases, Disposition effect, Mental accounting, Regret avoidance, Self-control, Validity and reliability, Moderating effect, Path relationship.
Chapter 17: External Financing Needs and Early Adoption of Accounting Standards: Evidence from the Banking Industry

Economic intuition and theories suggest that banks are motivated to voluntarily disclose information and signal their quality, for example, through early adoption of accounting standards, to better access capital markets. Examining accounting standards from January 1995 to March 2008, I find that US bank holding companies (BHCs) with lower profitability and higher risk profiles are more likely to choose early adoption. This evidence is consistent with a BHC's incentive to better access external financing through information disclosure and signaling. Moreover, a counter-signaling effect of decisions not to adopt early is identified for the first time, because early-adopting BHCs are not necessarily the least risky and the most profitable. I also find the counter-signaling effect to be most evident when an accounting standard has no effect on the financial statements proper (i.e., only disclosure requirements). This finding complements prior research showing that managers treat recognition and disclosure differently and that financial statement users place more weight on recognized than on disclosed values. Finally, the results show that early adopters generally experience higher fund growth in uninsured debts than matched late adopters in economic expansions, times when BHCs are most motivated to obtain funds. This finding is consistent with the bank capital structure literature showing that banks have shifted towards nondeposit debts to finance their balance sheet growth.

Keywords: Early adoption, Banks, Disclosure and counter-signaling, External financing.
Chapter 18: Improving the Stock Market Prediction with Social Media via Broad Learning

This chapter discusses how to exploit various Web information to improve stock market prediction. We first discuss the impacts of investors' social networks on the stock market, and then propose several information fusion methods, namely, the tensor-based model and the multiple-instance learning model, to integrate Web information with quantitative information to improve predictive capability.

Keywords: Stock prediction, Event extraction, Information fusion, Social network, Sentiment analysis, Quantitative analysis, Stock correlation.

Chapter 19: Sourcing Alpha in Global Equity Markets: Market Factor Decomposition and Market Characteristics

The sources of risk in a marketplace are systematic, cross-sectional, and time varying in nature. Though the CAPM provides an excellent risk-return framework and the market beta may reflect the risk associated with risky assets, there are opportunities for investors to take advantage of dimensional and time-varying return anomalies in order to improve their investment returns. In this paper, we restrict our analysis to return variations linked to market factor anomalies or factor/dimensional beta, using the Fama–French 3-factor, Carhart 4-factor, and Asness, Frazzini, and Pedersen (AFP) 5- and 6-factor models. We find significant variations in explaining sources of risk across 22 developed and 21 emerging markets with data over a long period from 1991 to 2016. Each market is unique in terms of factor risk characteristics, and market risk as explained by the CAPM is not the true risk measure. Hence, contrary to the risk-return efficiency framework, we find that lower market risk results in higher excess return in 19 out of the 22 developed markets, which is a major anomaly.
However, although in the majority of markets the AFP models reduce market risk (15 countries) and enhance alpha (11 countries), it is also very interesting to note that the CAPM ranks second in generating excess returns in the developed markets. We are conscious of the fact, however, that each market is unique in its composition and trend even over a long time horizon, and hence a generalized approach to asset allocation cannot be adopted across all the markets.

Keywords: Capital asset pricing model, Small-minus-big, High-minus-low, Momentum, Robust-minus-weak, Conservative-minus-aggressive, Quality-minus-junk, Betting against beta.
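The factor-decomposition exercise starts from time-series regressions of excess returns on factor returns. A minimal pure-Python sketch of the one-factor (CAPM) case, estimating Jensen's alpha and market beta by OLS; the return series are toy numbers, not the chapter's data, and the multifactor models simply add size, value, momentum, and quality factors as extra regressors:

```python
def capm_alpha_beta(excess_asset, excess_market):
    """OLS of asset excess returns on market excess returns:
    beta = cov(asset, market) / var(market), alpha = the intercept
    (Jensen's alpha, the 'sourced' excess return)."""
    n = len(excess_asset)
    mean_a = sum(excess_asset) / n
    mean_m = sum(excess_market) / n
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(excess_asset, excess_market)) / n
    var_m = sum((m - mean_m) ** 2 for m in excess_market) / n
    beta = cov / var_m
    alpha = mean_a - beta * mean_m
    return alpha, beta

# Toy monthly excess returns (hypothetical numbers)
mkt = [0.02, -0.01, 0.03, 0.01, -0.02, 0.015]
asset = [0.5 * m + 0.001 for m in mkt]   # beta 0.5, alpha 0.1% by construction
alpha, beta = capm_alpha_beta(asset, mkt)
```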
Chapter 20: Support Vector Machines Based Methodology for Credit Risk Analysis

Credit risk analysis is a classical and crucial problem that has attracted great attention from both academic researchers and financial institutions. Through the accurate classification of borrowers, it enables financial institutions to develop lending strategies that obtain optimal profit and avoid potential risk. In recent decades, several different kinds of classification methods have been widely used to solve this problem. Owing to specific attributes of credit data, such as small sample sizes and non-linear characteristics, support vector machines (SVMs) show their advantages and have been widely used for years. SVM adopts the principle of structural risk minimization (SRM), which avoids the "curse of dimensionality" and has great generalization ability. In this study, we systematically review and analyze SVM-based methodology in the field of credit risk analysis, covering feature extraction methods, kernel function selection, and hyper-parameter optimization methods. For verification purposes, two UCI credit datasets and a real-life credit dataset are used to compare the effectiveness of SVM-based methods and other frequently used classification methods. The experimental results show that the adaptive Lq SVM model with Gauss kernel and ES hyper-parameter optimization (ES-ALqG-SVM) outperforms all the other models listed in this study, and its average classification accuracy in the two UCI datasets achieves 90.77% and 75.21%, respectively. Moreover, the classification accuracy of SVM-based methods is generally better than or equal to that of other kinds of methods, such as See5, DT, MCCQP, and other popular algorithms.
Besides, Gauss-kernel SVM models show better classification accuracy than models with linear and polynomial kernel functions under the same penalty form, and the classification accuracy of Lq-based methods is generally better than or equal to that of L1- and L2-based methods. In addition, for a given SVM model, hyper-parameter optimization using an evolution strategy (ES) can effectively reduce the computing time while guaranteeing higher accuracy, compared with grid search (GS), particle swarm optimization (PSO), and simulated annealing (SA).

Keywords: Support vector machines, Feature extraction, Kernel function selection, Hyper-parameter optimization, Credit risk classification.
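A full SVM solver is beyond a short sketch, but the kernel-trick decision function that Gauss-kernel SVMs rely on can be illustrated with a toy kernelized perceptron; like an SVM, it scores a point as a kernel-weighted sum over training examples. The two-feature credit data below is hypothetical, and a production model would use a real SVM optimizer (e.g., SMO) with tuned C and gamma:

```python
from math import exp

def gauss_kernel(x, y, gamma=1.0):
    """RBF (Gauss) kernel, the kernel family favored in the chapter."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_perceptron(X, y, gamma=1.0, epochs=20):
    """Toy kernelized classifier (perceptron, not a true SVM solver).
    Scores points as sign(sum_i alpha_i * y_i * K(x_i, x)), the same
    functional form as an SVM decision function."""
    alpha = [0.0] * len(X)
    for _ in range(epochs):
        for j, xj in enumerate(X):
            score = sum(a * yi * gauss_kernel(xi, xj, gamma)
                        for a, yi, xi in zip(alpha, y, X))
            if y[j] * score <= 0:        # misclassified: strengthen this example
                alpha[j] += 1.0
    def predict(x):
        s = sum(a * yi * gauss_kernel(xi, x, gamma)
                for a, yi, xi in zip(alpha, y, X))
        return 1 if s >= 0 else -1
    return predict

# Hypothetical 2-feature borrower data: +1 = good borrower, -1 = default
X = [(0.9, 0.8), (0.8, 0.9), (0.2, 0.1), (0.1, 0.3)]
y = [1, 1, -1, -1]
predict = kernel_perceptron(X, y)
```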
Chapter 21: Data Mining Applications in Accounting and Finance Context

This chapter shows examples of applying several current data mining approaches and alternative models in an accounting and finance context, such as predicting bankruptcy using US, Korean, and Chinese capital market data. Big data in the accounting and finance context is a good fit for data analytic tool applications like data mining. Our previous study also empirically tested Japanese capital market data and found similar prediction rates. However, overall prediction rates depend on countries and time periods (Mihalovic, 2016). These results are an improvement on previous bankruptcy prediction studies using traditional probit or logit analysis or multiple discriminant analysis. The recent survival model shows similar prediction rates in bankruptcy studies. However, we need longitudinal data to use the survival model. Because of advances in computer technology, it is easier to apply data mining approaches. In addition, current data mining methods can be applied to other accounting and finance contexts such as auditor changes, audit opinion prediction studies, and internal control weakness studies. Our first paper shows 13 data mining approaches to predict bankruptcy after the Sarbanes–Oxley Act (SOX, 2002) implementation, using 2008–2009 US data with 13 financial ratios and internal control weakness, dividend payout, and market return variables. Our second paper shows an application of a multiple criteria linear programming data mining approach using Korean data. Our last paper shows bankruptcy prediction models using Chinese firm data via several data mining tools, compared with those of traditional logit analysis. The Analytic Hierarchy Process and fuzzy sets can also be applied as alternative methods to data mining tools in accounting and finance studies.
Natural language processing can be used as part of the artificial intelligence domain in accounting and finance in the future (Fisher et al., 2016).

Keywords: Data mining, Big data, Bankruptcy, Multiple criteria linear programming data mining, China, Korea, Japan, US, Probit, Logit, Multiple discriminant analysis, Survival model, Auditor change, Audit opinion prediction, Internal control weakness, Decision tree, Bayesian net, Decision table, Analytic hierarchy process, Support vector machines, Fuzzy set, Natural language processing.
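The traditional logit baseline against which the data mining tools are benchmarked can be sketched in a few lines of pure Python; the financial ratios and bankruptcy labels below are hypothetical, and a real study would use maximum likelihood on the full ratio set:

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def fit_logit(X, y, lr=0.5, epochs=500):
    """Plain logit model fitted by stochastic gradient ascent on the
    log-likelihood; returns [intercept, slope_1, ..., slope_k]."""
    w = [0.0] * (len(X[0]) + 1)          # w[0] is the intercept
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p                 # gradient of the log-likelihood
            w[0] += lr * err
            for k, xk in enumerate(xi):
                w[k + 1] += lr * err * xk
    return w

def predict_proba(w, x):
    """Estimated probability of bankruptcy for feature vector x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical financial ratios (leverage, ROA); label 1 = bankrupt
X = [(0.9, -0.05), (0.8, -0.02), (0.3, 0.08), (0.2, 0.10)]
y = [1, 1, 0, 0]
w = fit_logit(X, y)
```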
Chapter 22: Trade-off Between Reputation Concerns and Economic Dependence for Auditors — Threshold Regression Approach

This chapter utilizes a panel threshold regression model to probe two of the most profound issues in auditing: first, does economic bonding compromise audit quality, and second, does the SOX prohibition of certain nonaudit services mitigate the association between fees and auditor independence? Empirical results suggest that there indeed exists a threshold value beyond which nonaudit services impair audit quality. Moreover, the threshold value has yet to plummet subsequent to the SOX prohibition of certain nonaudit services designed to mitigate auditors' economic bonding with their clients, suggesting that the effort made by the authorities has been by and large ineffective. The results lead us to ponder whether the fee structure and the existing practice of employing auditors at the discretion of management should be rigorously reviewed to warrant audit quality.

Keywords: Audit fees, Auditor independence, Auditor reputation, Nonaudit fees, Threshold regression model.

Chapter 23: ASEAN Economic Community: Analysis Based on Fractional Integration and Cointegration

This paper deals with the analysis of the trade balances of the 10 countries that form the ASEAN Economic Community (Brunei, Cambodia, Indonesia, Laos, Malaysia, Myanmar, the Philippines, Singapore, Thailand, and Vietnam). For this purpose, we use standard unit roots along with fractional integration and cointegration methods. The latter techniques are more general than those based on integer differentiation and allow for a greater degree of flexibility in the dynamic specification of the series. The results based on unit roots were very inconclusive about the order of integration of the series.
In fact, using fractional integration, the two hypotheses of stationarity I(0) and non-stationarity I(1) were decisively rejected in all cases, with orders of integration ranging between 0 and 1 and thus displaying long memory and mean reverting behavior. Focusing on the bivariate long-run equilibrium relationships between the countries, a necessary condition is that the two series must display the same degree of integration. This condition was fulfilled in a large number of cases. We observe some relations where cointegration could be satisfied, mainly involving countries such as Cambodia, Indonesia, Malaysia and the Philippines. Keywords: Fractional integration, Long memory, Balance of trade, ASEAN.
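The fractional differencing operator (1 - L)^d behind these orders of integration can be sketched via its binomial expansion; this is a generic illustration of the long-memory machinery, not the chapter's estimator:

```python
def frac_diff_weights(d, n):
    """Weights of the binomial expansion of (1 - L)^d:
    w_0 = 1 and w_k = w_{k-1} * (k - 1 - d) / k.  For 0 < d < 1
    the weights decay hyperbolically, the long-memory signature of
    fractionally integrated series."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (k - 1 - d) / k)
    return w

def frac_diff(series, d):
    """Fractionally difference a series (truncated expansion).
    d = 1 gives the ordinary first difference; d = 0 is the identity."""
    w = frac_diff_weights(d, len(series))
    return [sum(w[k] * series[t - k] for k in range(t + 1))
            for t in range(len(series))]
```

For example, `frac_diff([1, 3, 6, 10], 1.0)` reproduces the ordinary first difference (with the initial level kept), while intermediate d between 0 and 1, as estimated for the ASEAN trade balances, leaves slowly decaying weights on distant lags.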
Chapter 24: Alternative Methods for Determining Option Bounds: A Review and Comparison

This paper first reviews alternative methods for determining option bounds. These methods include stochastic dominance, linear programming, semi-parametric, and non-parametric methods for European options. Then option bounds for American and Asian options are discussed. Finally, we discuss empirical applications to equities and equity indices, index futures, foreign exchange rates, and real options.

Keywords: Option bounds, Stochastic dominance, Linear programming, Semi-parametric, Non-parametric, Incomplete market, American options, European options.

Chapter 25: Financial Reforms and the Differential Impact of Foreign Versus Domestic Banking Relationships on Firm Value

This study documents a substantial difference in the impact on an emerging market firm's value of its use of foreign bank debt relative to domestic bank debt. It finds a positive association between the use of collateral by foreign banks and firm value; however, it finds no corresponding association for the use of collateral by domestic banks. The results suggest that as an emerging market's banking system matures and becomes more sophisticated, the difference between the information contained in local versus foreign bank lending diminishes; this diminishment erodes the differential impact on firm value of foreign versus local bank lending.

Keywords: Foreign bank relationships, Foreign bank debt, Emerging markets, Financial reform.

Chapter 26: Time-Series Analysis: Components, Models, and Forecasting

In this chapter, we first discuss the classical time-series component model; then we discuss moving average and seasonally adjusted time series. A discussion of linear and log-linear time trend regressions follows. The autoregressive forecasting model as well as the ARIMA model are both reviewed. Finally, composite forecasting is discussed.
Keywords: Autoregressive forecasting model, Coincident indicators, Cross-section data, Cyclical component, Exponential smoothing, Exponential smoothing constant, Holt–Winters forecasting model, Irregular component,
Lagging indicators, Leading indicators, Mean squared error, Percentage of moving average, Seasonal component, Seasonal index, Seasonal index method, Time-series data, Trend component, Trend–cycle component, Trading-day component, X-11 model.
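Two of the smoothing methods that Chapter 26 surveys, the centered moving average and simple exponential smoothing, can be sketched in a few lines. The toy series and the smoothing constant below are hypothetical illustrations, not data from the chapter.

```python
# Illustrative sketch of two smoothing methods from Chapter 26: a centered
# moving average and simple exponential smoothing. The series and the
# smoothing constant alpha are hypothetical choices.

def centered_moving_average(series, window=3):
    """Return the centered moving average; endpoints are left undefined."""
    half = window // 2
    return [
        sum(series[i - half:i + half + 1]) / window
        for i in range(half, len(series) - half)
    ]

def exponential_smoothing(series, alpha=0.5):
    """S_t = alpha * y_t + (1 - alpha) * S_{t-1}, seeded with y_0."""
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed

sales = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
print(centered_moving_average(sales))   # 3-period centered moving average
print(exponential_smoothing(sales))
```

A smaller smoothing constant gives smoother output that reacts more slowly to new observations; composite forecasts of the kind the chapter discusses would combine such smoothed predictions with, e.g., autoregressive forecasts.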
Chapter 27: Itô's Calculus and the Derivation of the Black–Scholes Option-Pricing Model

The purpose of this chapter is to develop certain relatively recent mathematical discoveries known generally as stochastic calculus, or more specifically as Itô's calculus, and to illustrate their application in the pricing of options. The mathematical methods of stochastic calculus are illustrated in alternative derivations of the celebrated Black–Scholes–Merton model. The topic is motivated by a desire to provide an intuitive understanding of certain probabilistic methods that have found significant use in financial economics.

Keywords: Stochastic calculus, Itô's lemma, Options pricing, Martingale.
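The end product of the derivation in Chapter 27, the Black–Scholes European call formula, is compact enough to state as code. This is a minimal sketch; the parameter values in the example are hypothetical.

```python
# Black–Scholes European call price, the formula Chapter 27 derives via
# Itô's lemma. Inputs: spot S, strike K, risk-free rate r, volatility
# sigma, and time to maturity T (in years). Example values are hypothetical.
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    """C = S*N(d1) - K*exp(-rT)*N(d2)."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

print(round(bs_call(100, 100, 0.05, 0.2, 1.0), 4))  # ≈ 10.4506
```

The at-the-money value of roughly 10.45 for these inputs is a standard check against published tables.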
Chapter 28: Durbin–Wu–Hausman Specification Tests

This chapter discusses Durbin, Wu, and Hausman (DWH) specification tests and provides examples of their application and interpretation. DWH tests compare alternative parameter estimates and can be useful in discerning endogeneity issues (omitted variables, measurement error/errors in variables, and simultaneity), incorrect functional form, contemporaneous correlation in the lagged dependent variable — serial correlation model, testing alternative estimators for a model, and testing alternative theoretical models. Empirical applications are provided illustrating the use of DWH tests in comparing LS, IV, FE, RE, and GMM estimators.

Keywords: Durbin, Wu, Hausman (DWH) specification tests, Endogeneity, Measurement error, Omitted variables, Instrumental variables (IV), Panel data, Fixed effects (FE), Random effects (RE), Generalized method of moments (GMM), Identification, Robust Hausman, Artificial regression, Simulation and bootstrap techniques, Non-parametric tests.
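The logic of comparing estimators can be sketched in the single-parameter case: contrast an estimator that is efficient under the null (e.g., OLS or RE) with one that remains consistent under the alternative (e.g., IV or FE). The numeric inputs below are hypothetical, and this scalar form is only a simplification of the general matrix version discussed in Chapter 28.

```python
# Scalar sketch of the Hausman contrast: under H0 both estimators are
# consistent and the efficient one has smaller variance, so
# H = (b1 - b0)^2 / (V1 - V0) is asymptotically chi-square(1).
# All numbers here are hypothetical illustrations.

def hausman_scalar(b_consistent, b_efficient, var_consistent, var_efficient):
    """Return the scalar Hausman statistic; H ~ chi-square(1) under H0."""
    var_diff = var_consistent - var_efficient
    if var_diff <= 0:
        raise ValueError("variance difference must be positive")
    return (b_consistent - b_efficient) ** 2 / var_diff

H = hausman_scalar(b_consistent=1.45, b_efficient=1.20,
                   var_consistent=0.04, var_efficient=0.01)
# 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
print(H, "reject H0 at 5%" if H > 3.841 else "fail to reject H0")
```

In practice the variance difference can fail to be positive definite in finite samples, which is one motivation for the artificial-regression and robust variants mentioned in the keywords.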
Chapter 29: Jump Spillover and Risk Effects on Excess Returns in the United States During the Great Recession

In this chapter, we review econometric methodology that is used to test for jumps and to decompose realized volatility into continuous and jump components. To illustrate how to implement the methods discussed, we also present the results of an empirical analysis in which we separate continuous asset return variation and finite-activity jump variation from excess returns on various US market-sector exchange traded funds (ETFs) during and around the Great Recession of 2008. Our objective is to characterize the financial contagion that was present during one of the greatest financial crises in US history. In particular, we study how shocks, as measured by jumps, propagate through nine different market sectors. One element of our analysis involves the investigation of causal linkages associated with jumps (via use of vector autoregressions), and another involves the examination of the predictive content of jumps for excess returns. We find that as early as 2006, jump spillover effects became more pronounced in the markets. We also observe that jumps had a significant effect on excess returns during 2008 and 2009, but not in the years before and after the recession.

Keywords: High-frequency jumps, Jump spillover, Jump risks, Excess returns, ETFs, Great Recession, High-frequency data, Bipower variation tests, Swap variance based tests, Jump decompositions.
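The decomposition behind the bipower-variation tests named in the keywords rests on a simple contrast: realized variance picks up both continuous variation and jumps, while bipower variation is robust to finite-activity jumps, so their difference isolates the jump part. The return series below is a hypothetical toy, not data from the chapter.

```python
# Sketch of the realized-variance / bipower-variation jump decomposition
# reviewed in Chapter 29. The intraday return series is a hypothetical toy.
from math import pi

def realized_variance(returns):
    """RV = sum of squared returns; captures continuous + jump variation."""
    return sum(r * r for r in returns)

def bipower_variation(returns):
    """BV = (pi/2) * sum |r_t||r_{t-1}|; robust to finite-activity jumps."""
    return (pi / 2) * sum(abs(a) * abs(b)
                          for a, b in zip(returns[1:], returns[:-1]))

def jump_component(returns):
    """Nonnegative jump part of realized variance: max(RV - BV, 0)."""
    return max(realized_variance(returns) - bipower_variation(returns), 0.0)

smooth = [0.01, -0.012, 0.008, -0.009, 0.011]          # no jump
jumpy = smooth[:2] + [0.08] + smooth[2:]               # one large "jump" return
print(jump_component(smooth), jump_component(jumpy))
```

Formal tests (Barndorff-Nielsen–Shephard and related statistics) scale this difference by an estimate of its sampling variability before comparing it to a normal critical value.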
Chapter 30: Earnings Forecasts and Revisions, Price Momentum, and Fundamental Data: Further Explorations of Financial Anomalies

Earnings forecasting data has been a consistent, and highly statistically significant, source of excess returns. This chapter discusses CTEF, a composite model of earnings forecasts, revisions, and breadth; this model of forecasted earnings acceleration was developed in 1997 to identify mispriced stocks. Our most important result is that the forecasted earnings acceleration variable has produced statistically significant Active and Specific Returns in the Post-Global Financial Crisis Period. Simple earnings revisions and forecasted yields have not enhanced returns in the past 7–20 years, leading many financial observers to declare earnings research passé. We disagree! Moreover, earnings forecasting models complement fundamental data (earnings, book
value, cash flow, sales, dividends, liquidity) and price momentum strategies in a composite model for stock selection. The composite model strategy's excess returns are greater in international stocks than in US stocks. The models reported in Guerard and Mark (2003) are highly statistically significant in their post-publication time period, including booms, recessions, and highly volatile market conditions.

Keywords: Portfolio theory, Portfolio construction, Portfolio management, Earnings forecasts, Earnings revisions, Portfolio optimization.

Chapter 31: Ranking Analysts by Network Structural Hole

This paper proposes a novel approach to rank analysts using their positions in a network constructed by peer analysts connected through overlapping firm coverage. We hypothesize that analysts occupying the network's structural holes can produce higher-quality equity research through better access to their peer analysts' wealth and diversity of information and knowledge. We report consistent empirical evidence that high-ranked analysts identified by network structural holes have greater ability to affect stock prices. Furthermore, those analysts tend to issue timely opinions, but not necessarily more accurate or consistent earnings forecasts. Analysts occupying structural holes tend to be more experienced, have a higher impact on stock prices when they work for large brokerages, and are rewarded with better career outcomes.

Keywords: Analyst coverage network, Structural hole, High-ranked analysts, Forecast timeliness, Social network research.

Chapter 32: The Association Between Book–Tax Differences and CEO Compensation

We examine the effect of book–tax differences on CEO compensation. We posit that CEOs can opportunistically exercise the discretion in GAAP to increase accounting income without affecting taxable income and, in so doing, increase their compensation. We test the data to determine which competing hypothesis dominates — efficiency or rent-seeking.
Under the efficiency hypothesis, the board of directors uses the information in book–tax differences to undo CEOs' attempts to artificially inflate accounting income, and hence CEO compensation is negatively associated with book–tax differences. Under the rent-seeking hypothesis, CEOs gain effective control of the pay-setting process so that they set their own pay with little oversight from
shareholders and directors. Directors do not use the information in book–tax differences to undo CEOs' attempted earnings manipulation, and this gives rise to a positive association between CEO compensation and book–tax differences. Consistent with the efficiency hypothesis, we find that CEO compensation is negatively associated with book–tax differences, suggesting that directors use the information in book–tax differences to reduce excessive CEO compensation. We also find that a strong corporate governance structure strengthens the negative association between CEO compensation and book–tax differences. Specifically, firms with high insider equity ownership and a high proportion of independent directors on the board have lower CEO compensation when book–tax differences are large.

Keywords: CEO compensation, Book–tax differences, Efficiency hypothesis, Rent-seeking hypothesis, Earnings management, Corporate governance.

Chapter 33: Stochastic Volatility Models: Faking a Smile

Stochastic volatility models of option prices treat variance as a variable. However, the application of such models requires calibration to market prices that often treats variance as an optimized parameter. If variance represents a variable, option pricing models should reflect measure-invariant features of its historic evolution. Alternatively, if variance is a parameter used to generate desired features of the implied volatility surface, stochastic volatility models lose their connection to the historic evolution of variance. This chapter obtains evidence that variance in stochastic volatility models is an artificial construct used to confer desired properties on the generated implied volatility surface.

Keywords: Stochastic volatility, Implied volatility smile/skew/surface, Realized variation, Calibration, Option pricing.

Chapter 34: Entropic Two-Asset Option

This chapter extends the Margrabe formula so that it can account for any type of jump in stock prices.
Despite the fact that prices of an exchange option are characterized by jumps, it seems no study has explored the price jumps of an exchange option. The jump in this chapter is modeled by a Poisson process. Moreover, the Poisson process can be extended to a Cox process when there is more than
one jump. The results illustrate that incompleteness in an exchange option leads to a premium, which in turn increases the option value, while hedging strategies reveal mixed results.

Keywords: Cox process, Exchange option, Inverse Fourier transform, Poisson process, Synthetic hedging.

Chapter 35: The Joint Determinants of Capital Structure and Stock Rate of Return: A LISREL Model Approach

We develop a simultaneous determination model of capital structure and stock returns. Specifically, we incorporate the managerial investment autonomy theory into structural equation modeling with confirmatory factor analysis to jointly determine capital structure and stock return. Besides attributes introduced in previous studies, we introduce indicators affecting a firm's financing decision, such as managerial entrenchment, macroeconomic factors, government financial policy, and pricing factors. Empirical results show that stock returns, asset structure, growth, industry classification, uniqueness, volatility, financial rating, profitability, government financial policy, and managerial entrenchment are major determinants of capital structure.

Keywords: LISREL, Structural equation modeling (SEM), Confirmatory factor analysis (CFA), Capital structure.

Chapter 36: Time-Frequency Wavelet Analysis of Stock-Market Co-Movement Between and Within Geographic Trading Blocs

In the context of globalization, through a growing process of market liberalization, advanced technology, and economic trading blocs, national stock markets have become more interdependent, which limits international portfolio diversification opportunities. This chapter investigates the degree of stock-market co-movement between and within 13 developed European Union markets, six developing Latin American markets, two developed North American markets, 10 developing Asian markets, and the markets of Norway, Switzerland, Australia, and Japan.
The research methodology employed includes wavelet correlation, wavelet multiple cross-correlation, and wavelet coherence. Results show a positive correlation both within and between trading blocs at all investment horizons and over time, and they show that the linkage
between stock returns increases with the time scale, implying that international diversification benefits have largely disappeared in globalized world markets. Moreover, we find a high degree of co-movement at low frequencies in both crisis and non-crisis periods, which indicates a fundamental relationship between stock-market returns. Finally, multiple cross-correlation analysis reveals that stock markets are positively correlated at all wavelet scales and at all lags, and that France's stock market is the potential leader or follower of the other European and other major world stock markets at low and high frequencies.

Keywords: International stock market linkage, Wavelet correlation, Wavelet multiple cross-correlation, Wavelet coherence.

Chapter 37: Alternative Methods to Deal with Measurement Error

Specification error and measurement error are two major issues in finance research. The main purpose of this chapter is (i) to review and extend existing errors-in-variables (EIV) estimation methods, including the classical method, grouping method, instrumental variable method, mathematical programming method, maximum likelihood method, LISREL method, and the Bayesian approach; (ii) to investigate how EIV estimation methods have been used in finance-related studies, such as cost of capital, capital structure, investment equations, and tests of capital asset pricing models; and (iii) to give a more detailed explanation of the methods used by Almeida et al. (2010).

Keywords: Bayesian approach, Capital asset pricing model, Capital structure, Classical method, Cost of capital, Errors-in-variables, Grouping method, Instrumental variable method, Investment equation, Mathematical programming method, Maximum likelihood method, Measurement error, LISREL method.
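The problem that motivates all of the EIV methods in Chapter 37 is easy to exhibit by simulation: classical measurement error in a regressor biases the OLS slope toward zero (attenuation). All numbers in this sketch are hypothetical.

```python
# Simulation sketch of the classical errors-in-variables problem surveyed in
# Chapter 37: noise in the regressor attenuates the OLS slope toward zero.
# Sample size, true slope, and noise levels are hypothetical choices.
import random

random.seed(42)
n, true_beta, noise_sd = 20_000, 2.0, 1.0

x_true = [random.gauss(0, 1) for _ in range(n)]
y = [true_beta * x + random.gauss(0, 0.5) for x in x_true]
x_obs = [x + random.gauss(0, noise_sd) for x in x_true]   # mismeasured regressor

def ols_slope(x, y):
    """Slope of the simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

# Theoretical attenuation factor: var(x) / (var(x) + var(noise)) = 0.5 here.
print(ols_slope(x_true, y))   # close to the true slope of 2.0
print(ols_slope(x_obs, y))    # attenuated, close to 1.0
```

The grouping, instrumental-variable, and maximum-likelihood methods the chapter reviews are all ways of recovering the unattenuated slope when only the noisy regressor is observed.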
Chapter 38: Simultaneously Capturing Multiple Dependence Features in Bank Risk Integration: A Mixture Copula Framework

This chapter proposes a mixture copula framework for the integration of different types of bank risks, which is able to capture comprehensively the nonlinearity, tail dependence, tail asymmetry, and structure asymmetry of
bank risk dependence. We analyze why the mixture copula is well-suited for bank risk integration, discuss how to construct a proper mixture copula, and present detailed steps for using mixture copulas. In the empirical analysis, the proposed framework is employed to model the dependence structure between the credit risk, market risk, and operational risk of Chinese banks. Comparisons with seven other major approaches provide strong evidence of the effectiveness of the constructed mixture copulas and help to uncover several important pitfalls and misunderstandings in risk dependence modeling.

Keywords: Bank risk, Credit risk, Market risk, Operational risk, Risk management, Risk dependence, Risk integration, Simple summation approach, Variance–covariance approach, Mixture copula, Basel Committee on Banking Supervision, Tail dependence.

Chapter 39: GPU Acceleration for Computational Finance

Recent progress in graphics processing unit (GPU) computing, with applications in science and technology, has demonstrated tremendous impact over the last decade. However, financial applications of GPU computing are less discussed, which may pose an obstacle to the development of financial technology, an emerging and disruptive field focusing on improving the efficiency of our current financial system. This chapter aims to raise attention to GPU computing in finance by first empirically investigating the performance of three basic computational methods: solving a linear system, the fast Fourier transform, and Monte Carlo simulation. Then a fast calibration of the wing model to implied volatility is explored with a set of traded futures and option data in high frequency. An execution-time reduction of at least 60% on this calibration is obtained under the Matlab computational environment. This finding enables the disclosure of instant market changes, so that real-time surveillance of financial markets can be established for either trading or risk management purposes.
Keywords: GPU, Parallel computing, Matlab, Computational finance, Implied volatility, High-frequency data.

Chapter 40: Does VIX Truly Measure Return Volatility?

This chapter demonstrates theoretically that, without imposing any structure on the underlying forcing process, the model-free CBOE volatility index (VIX) does not measure market expectation of volatility but that of a linear
moment-combination. In particular, VIX undervalues (overvalues) volatility when the market return is expected to be negatively (positively) skewed. As an alternative, we develop a model-free generalized volatility index (GVIX). With no diffusion assumption, GVIX is formulated directly from the definition of log-return variance, and VIX is a special case of the GVIX. Empirically, VIX generally understates the true volatility, and the estimation errors enlarge considerably during volatile markets. The spread between GVIX and VIX follows a mean-reverting process.

Keywords: Implied volatility, VIX, Ex ante moments.

Chapter 41: An ODE Approach for the Expected Discounted Penalty at Ruin in a Jump-Diffusion Model

Under the assumption that the asset value follows a phase-type jump-diffusion, we show that the expected discounted penalty satisfies an ODE, and we obtain a general form for the expected discounted penalty. In particular, if only downward jumps are allowed, we get an explicit formula in terms of the penalty function and the jump distribution. On the other hand, if the downward jump distribution is a mixture of exponential distributions (and upward jumps are determined by a general Lévy measure), we obtain closed-form solutions for the expected discounted penalty. As an application, we work out an example in Leland's structural model with jumps. For earlier and related results, see Gerber and Landry (1998), Hilberink and Rogers (2002), Asmussen et al. (2004), and Kyprianou and Surya (2007).

Keywords: Jump-diffusion, Expected discounted penalty, Phase-type distribution, Optimal capital structure.

Chapter 42: How Does Investor Sentiment Affect Implied Risk-Neutral Distributions of Call and Put Options?

This chapter investigates the characteristics of implied risk-neutral distributions derived separately from call and put option prices. Differences in risk-neutral moments between call and put options indicate deviations from put–call parity.
We find that the sentiment effect is significantly related to differences between call and put option prices. Our results suggest a differential impact of investor sentiment and consumer sentiment on call and put option traders' expectations. Rational and irrational sentiment components have different influences on call and put option traders' beliefs as well.
Keywords: Implied risk-neutral distribution, Put–call parity, Investor sentiment, Consumer sentiment, TAIEX options.

Chapter 43: Intelligent Portfolio Theory and Strength Investing in the Confluence of Business and Market Cycles and Sector and Location Rotations

This chapter presents the state of the art of intelligent portfolio theory, which consists of three parts: the basic theory (principles and framework of intelligent portfolio management), the strength-investing methodology as the driving engine, and the dynamic investability map in the confluence of business and market cycles and sector and location rotations. The theory is based on the tenet of "invest in trading" beyond "invest in assets", distinguishing between asset portfolios and trading strategies and integrating them into a multi-asset portfolio that consists of many multi-strategy portfolios, one for each asset. The multi-asset portfolio is managed within an active portfolio-management framework, where the asset-allocation weights are dynamically estimated from a multi-factor model. The weighted investment in each single asset is then managed via a portfolio of trading strategies. Each trading strategy is itself a dynamically adapting trading agent with its own optimization mechanism. Strength investing, as a methodology for asset selection with market timing, focuses on dynamically tracing a small open cluster of assets that exhibit stronger trends and simultaneously following the trends of those assets, so as to alleviate the drawbacks of single-asset trend following such as drawdown and stop loss. In the real world of global financial markets, investability, both in terms of asset selection and trade timing, emerges in the confluence of business cycles and market cycles, as well as sector rotation for stock markets and location rotation for real estate markets.
Keywords: Intelligent portfolio theory, Asset portfolio, Trading strategy portfolio, Strength investing, Trend following, Business & market cycle, Sector & location rotation.

Chapter 44: Evolution Strategy-Based Adaptive Lq Penalty Support Vector Machines with Gauss Kernel for Credit Risk Analysis

Credit risk analysis has long attracted great attention from both academic researchers and practitioners. However, the recent global financial crisis has made the issue even more important because of the need to further enhance the accuracy of borrower classification. In this study, an evolution
strategy (ES)-based adaptive Lq SVM model with Gauss kernel (ES-ALqG-SVM) is proposed for credit risk analysis. The support vector machine (SVM) is a classification method that has been extensively studied in recent years. Many improved SVM models have been proposed, with non-adaptive and pre-determined penalties. However, different credit data sets have different structures that are suitable for different penalty forms in real life. Moreover, traditional parameter search methods, such as the grid search method, are time consuming. The proposed ES-based adaptive Lq SVM model with Gauss kernel (ES-ALqG-SVM) aims to solve these problems. The penalty exponent q is allowed to vary over (0, 2] to fit different credit data structures, and the Gauss kernel is used to improve classification accuracy. For verification purposes, two UCI credit datasets and a real-life credit dataset are used to test our model. The experimental results show that the proposed approach performs better than See5, DT, MCCQP, SVM light, and other popular algorithms listed in this study, and that its computing speed is greatly improved compared with the grid search method.

Keywords: Adaptive penalty, Support vector machine, Credit risk classification, Evolution strategy.

Chapter 45: Product Market Competition and CEO Pay Benchmarking

This chapter examines the impact of product market competition on the benchmarking of a CEO's compensation to that of counterparts in peer companies. Using a large sample of US firms, we find a significantly greater effect of CEO pay benchmarking in more competitive industries than in less competitive industries. Using three proxies for managerial talent that have been used by Albuquerque et al. (2013), we find that CEO benchmarking is more pronounced in competitive markets wherein managerial talent is more valuable. This suggests that pay benchmarking and product market competition are complements. The above results are not due to industry homogeneity.
Keywords: CEO compensation, Peer benchmarking, Product market competition, CEO talent, Endogenous industry structure, OLS, Robust standard errors that incorporate firm-level clustering.

Chapter 46: Equilibrium Rate Analysis of Cash Conversion Systems: The Case of Corporate Subsidiaries

This chapter defines and studies a class of cash conversion systems in firms, consisting of a funds pool, a single-product make-to-stock inventory, and
a receivables pool. The system implements a perpetual flow cycle in which funds convert to product and back to funds. The equilibrium rate analysis (ERA) methodology is used to analyze the firm's operational and financial performance metrics, including the net profit rate, the rate of return on investment, and cash conversion cycle statistics. Specifically, in this chapter, we model the case where the firm is a subsidiary of a financially stable parent corporation and the subsidiary's cash conversion system is capital rationed. We model this system as a discrete-state, continuous-time Markovian process and compute its stochastic equilibrium distribution using analytic and numerical methods. These are used, in turn, to compute the aforesaid financial metrics in stochastic equilibrium. Finally, we present a methodology that uses these financial metrics to optimize the financial and operational design of the system, specifically the firm's capital structure and the sizing of the inventory's base stock level. Numerical results show that optimal designs for profit rate maximization and rate of return maximization can differ substantially, reflecting the differing interests of firm managers and investors.

Keywords: Cash conversion cycle, Capital-rationed firms, Credit-limited firms, Equilibrium rate analysis, Make-to-stock inventory, Markovian models, Stochastic equilibrium, Supply chain financial management.

Chapter 47: Is the Market Portfolio Mean–Variance Efficient?

This chapter investigates the characteristics of a subset of the infinite number of Security Market Lines (SMLs) that ensure the market portfolio is mean–variance efficient both at a point in time and over time. The analysis employs raw rather than excess returns. With some specifications of the SML, the risk-free rate exceeds the market portfolio's equilibrium mean, which is inconsistent with CAPM theory.
At a point in time, a Hotelling's T² test may reject most of the SMLs or none of them, although other mean–variance criteria may indicate that some are economically reasonable and others are not.

Keywords: Mean–variance capital asset pricing, Econometric and statistical methods, Hotelling's T² test, CAPM, SML.

Chapter 48: Consumption-Based Asset Pricing with Prospect Theory and Habit Formation

In this chapter, we propose a novel model that incorporates prospect theory into the consumption-based asset pricing model, where habit formation of
consumption is employed to determine the reference point endogenously. Our model is motivated by the common element of prospect theory and habit formation of consumption: investors care little about the absolute level of wealth (consumption), but rather pay attention to gains or losses (excess or shortage in consumption level) compared to a reference point. The results show that if investors evaluate their excess or shortage amounts in consumption relative to their habit consumption levels based on prospect theory, the equity premium puzzle can be resolved.

Keywords: Prospect theory, Habit formation, Loss aversion, Consumption-based asset pricing model.

Chapter 49: An Integrated Model for the Cost-Minimizing Funding of Corporate Activities Over Time

To enhance the value of a firm, the firm's management must attempt to minimize the total discounted cost of financing over a planning horizon. Unfortunately, the variety of sources of funds and the constraints that may be imposed on accessing funds from any one source make this exercise a difficult task. The model presented and illustrated here accomplishes this task by considering issuing new equity and new bonds, refunding the bonds, borrowing short term from financial institutions, temporarily parking surplus funds in short-term securities, repurchasing the firm's stock, and retaining part or all of the firm's earnings. The proportions of these sources of funds are determined subject to their associated costs and various constraints, such as not exceeding a specific debt/equity ratio and following a stable dividend policy, among others.

Keywords: Cost-minimization, Discounted value, Financing costs, Financial constraints, Funding decisions, Funding requirements, Optimal financial policy, Optimization, Planning horizon, Sources of funds.
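The prospect-theory preferences that Chapter 48 builds on can be sketched through the Tversky–Kahneman value function, with the habit consumption level serving as the reference point. The curvature and loss-aversion parameters below are the commonly cited 1992 estimates, used purely for illustration; they are not the chapter's calibration.

```python
# Sketch of a prospect-theory value function with a habit-based reference
# point, in the spirit of Chapter 48. Parameters (alpha, beta, lam) are the
# widely cited Tversky–Kahneman (1992) estimates, hypothetical here.

def prospect_value(consumption, habit, alpha=0.88, beta=0.88, lam=2.25):
    """Value of consumption measured relative to the habit reference point."""
    x = consumption - habit           # gain (x >= 0) or loss (x < 0)
    if x >= 0:
        return x ** alpha             # concave over gains
    return -lam * ((-x) ** beta)      # steeper (loss-averse) over losses

# Loss aversion: a shortfall of 10 hurts more than a surplus of 10 pleases.
print(prospect_value(110, 100))       # gain of 10 relative to habit
print(prospect_value(90, 100))        # loss of 10 relative to habit
```

The kink and the asymmetry at the reference point are exactly what allow such models to generate a large equity premium from modest consumption risk.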
Chapter 50: Empirical Studies of Structural Credit Risk Models and the Application in Default Prediction: Review and New Evidence

This chapter first reviews empirical evidence and estimation methods of structural credit risk models. Next, an empirical investigation of the performance of default prediction under the down-and-out barrier option framework is provided. In the literature review, a brief overview of the structural credit risk models is given. Empirical investigations in extant
literature papers are described in some detail, and their results are summarized in terms of the subject and estimation method adopted in each paper. Current estimation methods and their drawbacks are discussed in detail. In our empirical investigation, we adopt the maximum likelihood estimation method proposed by Duan (1994). This method has been shown by Ericsson and Reneby (2005), through simulation experiments, to be superior to the volatility restriction approach commonly adopted in the literature. Our empirical results surprisingly show that the simple Merton model outperforms the Brockman and Turtle (2003) model in default prediction. The inferior performance of the Brockman and Turtle model may be the result of its unreasonable assumption of a flat barrier.

Keywords: Structural credit risk model, Estimation approach, Default prediction, Maximum likelihood estimation (MLE), Monte Carlo experiment, Down-and-out barrier model, KMV estimation method.

Chapter 51: Empirical Performance of the Constant Elasticity Variance Option Pricing Model

In this chapter, we empirically test the constant-elasticity-of-variance (CEV) option pricing model of Cox (1975, 1996) and Cox and Ross (1976) and compare the performance of the CEV and alternative option pricing models, mainly the stochastic volatility model, in terms of European option pricing and cost-accuracy-based analysis of their numerical procedures. In European-style option pricing, we have tested the empirical pricing performance of the CEV model and compared the results with those of Bakshi et al. (1997). The CEV model, introducing only one more parameter than the Black–Scholes formula, improves the performance notably in all of the tests: in-sample, out-of-sample, and the stability of implied volatility. Furthermore, with a much simpler model, the CEV model can still perform better than the stochastic volatility model in the short-term and out-of-the-money categories.
When applied to American option pricing, high-dimensional lattice models are prohibitively expensive. Our numerical experiments clearly show that the CEV model performs much better in terms of the speed of convergence to its closed-form solution, while the implementation cost of the stochastic volatility model is too high and practically infeasible for empirical work. In summary, with much lower implementation cost and faster computational speed, the CEV option pricing model could be a better candidate than more complex option pricing models, especially
when one wants to apply the CEV process to pricing more complicated path-dependent options or credit risk models. Keywords: Constant-elasticity-of-variance (CEV) process, Option pricing model, Empirical performance, Numerical experiment, Stochastic volatility option pricing model, Finite difference method of the SV model. Chapter 52: The Jump Behavior of a Foreign Exchange Market: Analysis of the Thai Baht We study the heteroskedasticity and jump behavior of the Thai baht using square-root stochastic volatility models with and without jumps. The Bayes factor is used to evaluate the explanatory power of competing models. The results suggest that, in our sample, the square-root stochastic volatility model with independent jumps in the observation and state equations (SVIJ) has the best explanatory power for the 1996 Asian financial crisis. Using the estimation results of the SVIJ model, we are able to link the major events of the Asian financial crisis to jump behavior in either volatility or observations. Keywords: Heteroskedasticity, Jump, Stochastic volatility model with independent jumps, Asian financial crisis, Bayes factor. Chapter 53: The Revision of Systematic Risk on Earnings Announcement in the Presence of Conditional Heteroscedasticity This chapter attempts to explore the puzzle of post-earnings-announcement drift by focusing on the revision of systematic risk subsequent to the release of earnings information. It proposes a market model with time-varying systematic risk by incorporating ARCH into the CAPM. The Kalman filter is then employed to estimate how the market revises its risk assessment subsequent to an earnings announcement. The chapter also conducts an empirical analysis based on a sample of US publicly held companies over the five-fiscal-year sample period 2010–2014.
After controlling for the revision of risk and isolating the potential confounding effect, this chapter finds that the phenomenon of post-earnings-announcement drift, so well documented in the accounting literature, no longer exists. Keywords: ARCH (autoregressive conditional heteroskedasticity), CAPM, Kalman filter, Post-earnings-announcement drifts.
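The filtering step behind a time-varying beta of the kind described above can be sketched in a few lines. The following is a minimal illustrative example, not the chapter's actual specification: it assumes a simple random-walk beta with fixed, invented noise variances (the chapter's model additionally incorporates ARCH in the observation variance, and in practice q and obs_var would be estimated, e.g., by MLE).

```python
import numpy as np

def kalman_time_varying_beta(r, rm, q=1e-4, obs_var=2.5e-5):
    """Scalar Kalman filter for the illustrative model
        beta_t = beta_{t-1} + w_t,    w_t ~ N(0, q)        (state)
        r_t    = beta_t * rm_t + e_t, e_t ~ N(0, obs_var)  (observation).
    q and obs_var are invented values for illustration only."""
    beta_path = np.empty(len(r))
    b, p = 1.0, 1.0                    # prior mean and variance for beta
    for t, (rt, h) in enumerate(zip(r, rm)):
        p = p + q                      # predict: beta may drift
        s = h * p * h + obs_var        # innovation variance
        k = p * h / s                  # Kalman gain
        b = b + k * (rt - h * b)       # update with the observed return
        p = (1.0 - k * h) * p
        beta_path[t] = b
    return beta_path

# simulated example: systematic risk drifts upward over the sample
rng = np.random.default_rng(0)
rm = rng.normal(0.0, 0.02, 500)                    # market returns
true_beta = np.linspace(0.8, 1.4, 500)
r = true_beta * rm + rng.normal(0.0, 0.005, 500)   # stock returns
est = kalman_time_varying_beta(r, rm)
```

The state-noise variance q governs how quickly the filtered beta is allowed to move, which is exactly what lets the filter pick up a post-announcement revision in risk.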
Chapter 54: Applications of Fuzzy Set to International Transfer Pricing and Other Business Decisions In today's dynamic but ambiguous business environment, fuzzy set applications continue to grow as one of a manager's most useful decision-making tools. Recent fuzzy set business applications show promising results (Alcantud et al., 2017; Frini, 2017; Toklu, 2017; Wang et al., 2017). International transfer pricing has recently received more attention as the US wages trade wars with China and other countries, and as some firms adopt tax minimization as a transfer pricing strategy. This chapter demonstrates how to apply fuzzy sets to international transfer pricing problems. Applications of fuzzy sets to other business decisions are also discussed in some detail. Keywords: Fuzzy set, Transfer pricing, Multiple criteria and multiple constraint level (MC2) linear programming, Multiple factor transfer pricing model. Chapter 55: A Time-Series Bootstrapping Simulation Method to Distinguish Sell-Side Analysts' Skill from Luck Data mining is quite common in econometric modeling when a given dataset is used multiple times for the purpose of inference; this, in turn, can bias inference. Given the existence of data mining, it is possible that any reported investment performance is simply due to random chance (luck). This study develops a time-series bootstrapping simulation method to distinguish skill from luck in the investment process. Empirically, we find little evidence that investment strategies based on UK analyst recommendation revisions can generate statistically significant abnormal returns. Our rolling-window-based bootstrapping simulations confirm that the reported insignificant portfolio performance is due to sell-side analysts' lack of skill in making valuable stock recommendations, rather than their bad luck, irrespective of whether they work for more prestigious brokerage houses or not.
Keywords: Data mining, Time-series bootstrapping simulations, Sell-side analysts, Analyst recommendation revisions. Chapter 56: Acceptance of New Technologies by Employees in Financial Industry Banks are now facing strong competition from both technological giants and small fintech startups. Under these conditions, banks have also started to
implement disruptive technologies in their day-to-day operations. However, in some cases, huge investments in different technological systems do not lead to an increase in company performance because of employee resistance. In this chapter, we focus on both internal and external factors that may influence employees' labor productivity and the performance of the whole company. The sample includes 148 employees with education in banking and finance. The model was estimated using partial least squares structural equation modeling (PLS-SEM). It was shown that both motivation to use disruptive technologies and digital skills have a strong impact on labor productivity, while both labor productivity and organizational support positively contribute to the improvement of company performance based on the usage of new technologies. Keywords: Disruptive technologies, Banking, Technologies acceptance, Employees. Chapter 57: Alternative Method for Determining Industrial Bond Ratings: Theory and Empirical Evidence A financial-ratio-based credit-scoring model for a bond rating system requires the simultaneous maximization of two conflicting objectives, namely, explanatory power and discriminatory power, which had not been directly addressed in the literature. The main purpose of this study is to develop a credit-scoring model that combines principal component analysis and Fisher's discriminant analysis using the MINIMAX goal programming technique so that the maximization of the two conflicting objectives can be compromised. The performance of alternative credit-scoring models, including the stepwise discriminant analysis of Pinches and Mingo, Fisher's discriminant analysis, and principal component analysis, is analyzed and compared using datasets from previous studies. We find that the proposed hybrid credit-scoring model outperforms the alternative models in both explanatory and discriminatory power.
Keywords: Explanatory power, Discriminatory power, Credit-scoring model, MINIMAX goal programming, Goal programming. Chapter 58: An Empirical Investigation of the Long Memory Effect on the Relation of Downside Risk and Stock Returns This chapter resolves an inconclusive issue in the empirical literature about the relationship between downside risk and stock returns for Asian markets. This study demonstrates that the mixed signs on the risk coefficient stem from the fact that the excess stock return series is assumed to be stationary
with a short memory, which is inconsistent with the downside risk series featuring a long memory process. After we appropriately model the long memory property of downside risk and apply a fractional difference to the downside risk series, the evidence consistently supports a significant and positive risk–return relation. This holds true for downside risk not only in the domestic market but also across markets. The evidence suggests that the risk premium is higher if the risk originates in a dominant market, such as the US. These findings are robust even when we consider the leverage effect, value-at-risk feedback, and the long memory effect in the conditional variance. Keywords: Downside risk, Value-at-risk, Tail risk, Long memory, Risk–return tradeoff, Leverage effect. Chapter 59: Analysis of Sequential Conversions of Convertible Bonds: A Recurrent Survival Approach This study uses a recurrent survival analysis technique to show that a higher spread of conversion-stock prices and a higher buy-back ratio of stock repurchases provide convertible bonds' (CBs') debt-like signals of Constantinides and Grundy (1989), while a lower risk-free rate, higher capital expenditures, higher non-management institutional ownership, and higher total asset value provide the CBs' equity-like signals of Stein (1992). While the equity-like signals might accelerate the rate of sequential conversions and weaken the CBs' risk-mitigating effect in the presence of risk-shifting potential, this study shows that this can happen only in a financially healthy firm with higher free cash flow. For financially distressed firms, the CBs' risk-mitigating effect is maintained. Keywords: Recurrent survival analysis, Debt-like signal, Equity-like signal, Stock repurchase, Sequential conversion, Risk-mitigating effect, Risk-shifting.
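The fractional differencing applied to the downside risk series above can be sketched with the standard binomial-expansion weights of (1 − L)^d, where w_0 = 1 and w_k = −w_{k−1}(d − k + 1)/k. The sketch below is illustrative only: the value of d, the truncation length, and the simulated series are invented, not the chapter's data or estimates.

```python
import numpy as np

def frac_diff(x, d, n_weights=100):
    """Apply (1 - L)^d to x using the truncated binomial expansion:
    w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k."""
    w = np.empty(n_weights)
    w[0] = 1.0
    for k in range(1, n_weights):
        w[k] = -w[k - 1] * (d - k + 1) / k
    n = len(x)
    out = np.empty(n - n_weights + 1)
    for t in range(n_weights - 1, n):
        window = x[t - n_weights + 1:t + 1][::-1]  # x_t, x_{t-1}, ...
        out[t - n_weights + 1] = w @ window
    return out

# fractionally differencing a persistent series (d = 0.4 is illustrative)
x = np.cumsum(np.random.default_rng(1).normal(size=500))
y = frac_diff(x, 0.4)
```

Setting d = 1 recovers the ordinary first difference and d = 0 returns the series unchanged, which is a convenient sanity check on the weight recursion.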
Chapter 60: Determinants of Euro-Area Bank CDS Spreads This study relies on a structural approach model to investigate the determinants of credit default swap (CDS) spread changes for the Euro-zone's financial institutions over the period January 2005 to October 2015. Going beyond the structural model, this study incorporates features such as the role of systemic risk factors, bank-specific characteristics, and credit ratings. We adopt the dynamic framework provided by panel vector autoregressive models, which allows for endogeneity issues; this is a novelty of our approach. The main findings are that structural models seem to be more relevant during
highly volatile periods and that the relation between the CDS and its theoretical determinants is not constant over time. Overall, the empirical results suggest that structural models perform well in explaining bank credit risk, but the determinants of CDS spreads also depend on the underlying economic situation, which should be taken into consideration when interpreting CDS spread changes. Keywords: Credit default swaps, Bank credit risk, Panel vector autoregressions. Chapter 61: Dynamic Term Structure Models Using Principal Components Analysis Near the Zero Lower Bound This chapter examines the empirical performance of dynamic Gaussian affine term structure models (DGATSMs) at the zero lower bound (ZLB) when principal components analysis (PCA) is used to extract factors. We begin by providing a comprehensive review of DGATSMs when PCA is used to extract factors, highlighting their numerous auspicious qualities. A DGATSM specifies bond yields as a simple linear function of underlying Gaussian factors, which is especially favorable since, in principle, PCA works best when the model is linear and the first two moments are sufficient to describe the data, among other characteristics. DGATSMs have a strong theoretical foundation grounded in the absence of arbitrage, and they produce reasonable cross-sectional fits of the yield curve. Both of these qualities are inherited by the model when PCA is used to extract the state vector. Additionally, the implementation of PCA is simple in that it takes a matter of seconds to estimate the factors, and it is convenient to include in estimation as most software packages have ready-to-use algorithms to compute the factors immediately. The results from our empirical investigation lead us to conclude that DGATSMs, when PCA is employed to extract factors, perform very poorly at the ZLB, frequently crossing the ZLB en route to producing negative out-of-sample forecasts for bond yields.
The main implication of this study is that, despite their numerous positive characteristics, DGATSMs produce poor empirical forecasts around the ZLB when PCA is used to extract factors. Keywords: Financial econometrics, PCA, DTSM, ZLB, Structural breaks, Model of Ang and Piazzesi (2003), Model of Joslin, Singleton, and Zhu (2011), Model of Joslin, Le, and Singleton (2013a), PCDTSM, Low interest rate environment.
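Extracting the state vector by PCA, as described above, amounts to taking the leading principal components of demeaned yields. A minimal sketch on simulated data follows; the two-factor yield panel (level and slope factors, Nelson–Siegel-style slope loadings, noise scale) is entirely invented for illustration and is not the chapter's dataset.

```python
import numpy as np

def pca_factors(Y, k=3):
    """Extract the first k principal components of demeaned yields via SVD."""
    Yc = Y - Y.mean(axis=0)
    U, S, Vt = np.linalg.svd(Yc, full_matrices=False)
    factors = U[:, :k] * S[:k]            # T x k factor series
    shares = S[:k] ** 2 / (S ** 2).sum()  # variance explained by each PC
    return factors, Vt[:k], shares

# simulated panel of yields driven by a level and a slope factor (illustrative)
rng = np.random.default_rng(1)
T = 300
maturities = np.array([0.25, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0])
level = np.cumsum(rng.normal(0.0, 0.05, T))
slope = np.cumsum(rng.normal(0.0, 0.03, T))
slope_loadings = (1.0 - np.exp(-maturities)) / maturities
Y = (3.0 + level[:, None] + slope[:, None] * slope_loadings
     + rng.normal(0.0, 0.02, (T, len(maturities))))

factors, loadings, shares = pca_factors(Y)
```

As the chapter notes, this step is nearly instantaneous, and the first component (the "level" factor) typically accounts for the bulk of the variance.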
Chapter 62: Effects of Measurement Errors on Systematic Risk and Performance Measure of a Portfolio In this chapter, we first investigate how measurement errors can affect the estimators of the CAPM, such as α̂_j and β̂_j. Then, we derive Plim b̂ assuming R_m and R_b are measured with error. Finally, we develop an alternative hypothesis-testing procedure for the CAPM. Keywords: Measurement error, Probability limit for regression coefficient, Decomposition of estimated regression coefficient, Performance measure, Systematic risk. Chapter 63: Forecasting Net Charge-Off Rates of Banks: A PLS Approach This chapter relies on a factor-based forecasting model for the net charge-off rates of banks in a data-rich environment. More specifically, we employ a partial least squares (PLS) method to extract target-specific factors and find that it outperforms the principal component approach in-sample by construction. Further, we apply PLS to out-of-sample forecasting exercises for aggregate bank net charge-off rates on various loans, as well as for similar individual bank rates, using over 250 quarterly macroeconomic series from 1987Q1 to 2016Q4. Our empirical results demonstrate the superior performance of PLS over benchmark models, including both a stationary autoregressive-type model and a non-stationary random walk model. Our approach can help banks identify important variables that contribute to bank losses so that they are better able to contain losses to manageable levels. Keywords: Net charge-off rates, Partial least squares, Principal component analysis, Dynamic factors, Out-of-sample forecasts. Chapter 64: Application of Filtering Methods in Asset Pricing Filtering methods such as the Kalman filter (KF) and its extended algorithms have been widely used in estimating asset pricing models on many topics, such as rational stock bubbles, the interest rate term structure, and derivative pricing.
The basic idea of filtering is to cast the discrete- or continuous-time model of asset prices into a discrete state-space model in which the state variables are the latent factors driving the system and the observable variables are usually asset prices. Based on a state-space model, we can
choose a specific filtering method to compute its likelihood and estimate the unknown parameters by the maximum likelihood method. The classical KF can be used to estimate a linear state-space model with Gaussian measurement error. If the model becomes nonlinear, we can rely on the extended Kalman filter (EKF), the unscented Kalman filter (UKF), or the particle filter (PF) for estimation. For a piecewise linear state-space model with regime switching, the mixture Kalman filter (MKF), which inherits the merits of both the KF and the PF, can be employed. However, if the measurement error is non-Gaussian, the PF is the only applicable method. For each filtering method, we review its algorithm, application scope, computational efficiency, and asset pricing applications. This chapter provides a brief summary of the applications of filtering methods in estimating asset pricing models. Keywords: State-space model, Kalman filter, Extended Kalman filter, Unscented Kalman filter, Mixture Kalman filter, Particle filter, Asset pricing. Chapter 65: Sampling Distribution of the Relative Risk Aversion Estimator: Theory and Applications Brown and Gibbons (1985) developed a theory of relative risk aversion estimation in terms of average market rates of return and the variance of market rates of return. However, the exact sampling distribution of an appropriate relative risk aversion estimator had not been derived. First, we derive theoretically the density of Brown and Gibbons' maximum likelihood estimator, and show that the central t distribution is not appropriate for assessing the significance of the estimated relative risk aversion. We then derive the minimum variance unbiased estimator through a linear transformation of Brown and Gibbons' maximum likelihood estimator. Its density function is neither a central nor a noncentral t distribution, and the density function of this new distribution has been tabulated. An empirical example illustrates the application of this new sampling distribution.
Keywords: Relative risk aversion distribution, Maximum likelihood estimator, Minimum variance unbiased estimator, Noncentral t distribution. Chapter 66: Social Media, Bank Relationships and Firm Value This study examines how a firm’s social media efforts and banking relationships influence the firm value. Using a sample of 855 non-financial Taiwanese
listed companies, with a total of 6,651 firm-year observations from 2008 to 2015, we find that while social media positively influences firm value, the number of bank relationships negatively affects firm value. There is a substitute relationship between social media and bank relationships. Social media has a function similar to that of banks in mitigating the information asymmetry between firms and investors, and its impact on firm value is even stronger than that of banking relationships. Keywords: Social media, Bank relationships, Asymmetric information, Firm value. Chapter 67: Splines, Heat, and IPOs: Advances in the Measurement of Aggregate IPO Issuance and Performance The objective of this chapter is to provide an update to the literature on initial public offering (IPO) performance and issuance, focusing explicitly on the methodological approaches used to conduct these analyses, and to develop a more general approach to evaluating aggregate IPO issuance and performance. Traditionally, empirical studies of IPO performance have been critically dependent on the general methodology that researchers use to adjust the individual IPO's returns to account for market performance and the time horizon of the study; however, more recent studies have examined the patterns of returns that IPOs emit in general, sometimes prior to performance adjustments. In the US market, for instance, changes in the regulatory regime resulting from the introduction of the JOBS Act and events such as the financial collapse have led to a period of relatively benign IPO issuance. This has recently led to new questions about the true relationship between the volume of IPO issuance and performance. Historically, we have assumed that hot and cold market cycles affect performance; however, recently the methodology used to determine whether markets are indeed hot or cold has been questioned.
In addition, there has been a renaissance of late as researchers critically examine the validity of research projects that claim to identify hot and cold markets or cyclicality in IPO performance. The research has evolved from segmenting a population of IPO returns into quartiles or terciles and labeling the segments hot and cold, to Markov two- and three-state regime-shifting models, to more recent applications of event-specific and spline regression models. Researchers have been working to uncover what actually causes the IPO markets to move, and the cyclical nature of IPO performance and issuance seems to indicate that the current state of research on IPOs needs some
restructuring and clarification. This chapter has important implications for financial market participants, portfolio managers, investment bankers, regulatory bodies, and business owners. Furthermore, this review chapter can aid in setting benchmarks for the valuation of IPOs; help investors, business owners, and managers understand the relationship between IPO performance and issuance so that they are better positioned to make wise investment decisions when purchasing IPOs or issuing their own; and enable researchers to think more critically about developing their models of IPO issuance and performance. Keywords: Initial public offerings, IPO issuance and performance, Spline regression analysis. Chapter 68: The Effects of the Sample Size, the Investment Horizon and the Market Conditions on the Validity of Composite Performance Measures: A Generalization In our previous study, the empirical relationship between the Sharpe measure and its risk proxy was shown to depend on the sample size, the investment horizon, and market conditions. This important result is generalized in the present study to include the Treynor and Jensen performance measures. Moreover, it is shown that the conventional sample estimate of the ex ante Treynor measure is biased; as a result, the ranking of mutual fund performance based on the biased estimate is not the unbiased ranking implied by the ex ante Treynor measure. In addition, a significant relationship between the estimated Jensen measure and its risk proxy may produce a potential bias associated with the cumulative average residual technique, which is frequently used for testing the market efficiency hypothesis. Finally, the impact of the dependence between risk and average return on Friend and Blume's findings is also investigated. Keywords: Sharpe, Treynor and Jensen measures, Sample size, Investment horizon.
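The small-sample dependence between the estimated Sharpe measure and its risk proxy described above is easy to reproduce by simulation. The sketch below is illustrative only (sample sizes, return parameters, and the function name are invented): it generates many funds with identical true parameters and shows that the correlation between the estimated Sharpe measure and the estimated risk is negative when the expected excess return is positive and positive when it is negative.

```python
import numpy as np

rng = np.random.default_rng(2)

def sharpe_risk_corr(mu_excess, n_obs=24, n_funds=2000, sigma=0.05):
    """Correlation, across funds with identical true parameters, between the
    estimated Sharpe measure (mean/std of excess returns) and its risk proxy."""
    x = rng.normal(mu_excess, sigma, (n_funds, n_obs))
    m = x.mean(axis=1)                 # estimated mean excess return
    s = x.std(axis=1, ddof=1)          # estimated risk proxy
    return np.corrcoef(m / s, s)[0, 1]

corr_bull = sharpe_risk_corr(0.02)    # E(Rm) > rf: negative dependence
corr_bear = sharpe_risk_corr(-0.02)   # rf > E(Rm): positive dependence
```

Since every fund has the same true Sharpe ratio, any systematic correlation here is pure estimation bias, which is the phenomenon the two chapters analyze.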
Chapter 69: The Sampling Relationship Between Sharpe’s Performance Measure and its Risk Proxy: Sample Size, Investment Horizon and Market Conditions Sharpe’s, Treynor’s and Jensen’s measures have been extensively used for performance evaluation of mutual funds or portfolios. These three widely used performance measures have been found to be highly correlated with
their corresponding risk measures by a number of empirical studies. This paper focuses its investigation on the possible sources of the bias in the empirical relationship between the estimated Sharpe measure and its estimated risk measure. In general, the sample size, the investment horizon, and market conditions are three important factors determining the strong relationship between the ex post Sharpe measure and its estimated risk surrogate. The interesting findings of this study are as follows: (1) the estimated Sharpe measure is uncorrelated with the estimated risk measure either when the risk-free rate of interest equals the expected return on the market portfolio over the sample period or when the sample size is infinite; (2) the estimated Sharpe measure is positively (or negatively) correlated with the estimated risk measure if the risk-free rate of interest is greater than (or less than) the expected return on the market portfolio; (3) an observation horizon shorter than the true investment horizon can reduce the dependence of the estimated Sharpe measure on its estimated risk measure; and (4) an observation horizon longer than the true investment horizon will magnify the dependence. The results indicate that, in conducting empirical research, a shorter observation horizon and a large sample size should be used to reduce the bias associated with the estimated Sharpe measure. Keywords: Finance — Investment, Portfolio, Statistics — Sampling, Noncentral t-distribution. Chapter 70: VG NGARCH Versus GARJI Model for Asset Price Dynamics This study proposes and calibrates the VG NGARCH model, which provides a more informative and parsimonious model by formulating the dynamics of log-returns as a variance-gamma (VG) process, following Madan et al. (1998). An autoregressive structure is imposed on the shape parameter of the VG process, which describes the news arrival rates that affect price movements.
The performance of the proposed VG NGARCH model is compared with that of the GARCH-jump model with autoregressive conditional jump intensity (GARJI) by Chan and Maheu (2002), in which two conditionally independent autoregressive processes describe stock price movements caused by normal and extreme news events, respectively. The comparison is based on the daily stock prices of five financial companies in the S&P 500, namely, Bank of America, Wells Fargo, J.P. Morgan, Citigroup, and AIG, from January 3, 2006 to December 31, 2009. The goodness of fit of the
VG NGARCH model and its ability to predict the ex ante probabilities of large price movements are demonstrated and compared with those of the benchmark GARJI model. Keywords: VG NGARCH model, Variance-gamma process, Shape parameter, GARCH-jump, GARJI model, Goodness of fit, Ex ante probability. Chapter 71: Why Do Smartphone and Tablet Users Adopt Mobile Banking? Purpose: The increased penetration of mobile phones has created great opportunities for increasing the level of financial inclusion around the world. Digital channels help banks not only attract new customers but also ensure that existing ones remain loyal. This chapter studies the incentives that encourage the use of mobile banking by smartphone and tablet users. Design/methodology/approach: An online survey is conducted to explore possible relations between the potential determinants of the intention to use mobile banking. The model is assessed with the partial least squares structural equation modeling (PLS-SEM) technique. Findings: The results show that perceived usefulness and perceived effort tend to be the most significant factors in the adoption of mobile banking. However, factors such as perceived risk, compatibility with lifestyle, and social influence are found to be insignificant due to certain cultural and institutional features of CIS countries. Originality/value: This chapter contributes to the field of m-banking studies by focusing on both smartphone and tablet users. Moreover, the majority of respondents represent generations Y and Z, who seem to be moving from traditional banking to digital channels. Keywords: Mobile banking, Technology acceptance, User adoption, Intention. Chapter 72: Non-Parametric Inference on Risk Measures for Integrated Returns When evaluating the market risk of long-horizon equity returns, it is always difficult to provide a statistically sound solution due to the limitation of the sample size.
To solve the problem for the value-at-risk (VaR) and the conditional tail expectation (CTE), Ho et al. (2016, 2018) introduce a general multivariate stochastic volatility return model from which asymptotic formulas for the VaR and the CTE are derived for integrated returns with the
length of integration increasing to infinity. Based on these formulas, simple non-parametric estimators for the two popular risk measures of long-horizon returns are constructed. The estimators are easy to implement and are shown to be consistent and asymptotically normal. In this chapter, we further address the issue of testing the equality of the CTEs of integrated returns. Extensive finite-sample analysis and real data analysis are conducted to demonstrate the efficiency of the t-statistics we propose. Keywords: Conditional tail expectation, Equality of tail risks, Inference, Integrated process, Quantile, Stochastic volatility model, Test, Value at risk. Chapter 73: Copulas and Tail Dependence in Finance This chapter discusses copula methods for applications in finance. It provides an overview of the concept of a copula and the underlying statistical theories and theorems involved. The focus is on two copula families, namely, the elliptical and Archimedean copulas. The Gaussian and Student's t copulas in the family of elliptical copulas, which have symmetric tails in their distributions, are explained. The Clayton and Gumbel copulas in the family of Archimedean copulas, whose distributions are asymmetric, are also described. Elaborations are given on tail dependence and the associated measures for these copulas. The estimation process is illustrated by applying the methods to the returns of two exchange series. Keywords: Elliptical copula, Archimedean copula, Gaussian copula, Student's t copula, Clayton copula, Gumbel copula, Tail dependence, Maximum likelihood estimation, Sklar's theorem, Probability integral transform, Standardized Student's t-distribution. Chapter 74: Some Improved Estimators of Maximum Squared Sharpe Ratio By assuming a multivariate normal distribution of excess returns, we find that the sample maximum squared Sharpe ratio (MSR) has a significant upward bias.
We then construct estimators of the MSR based on Bayes estimation and on unbiased estimation of the squared slope of the asymptote to the minimum variance frontier (ψ²). While the often-used unbiased estimator may produce unreasonable negative estimates in finite samples, Bayes estimators never produce negative values as long as the prior is bounded below by zero, although they have a larger bias. We also design a mixed estimator by combining the Bayes estimator with the unbiased estimator. We show by simulation that the new mixed estimator performs as well as the unbiased
estimator in terms of bias and root mean square error, and it is always positive. The mixed estimators are particularly useful in trend analysis when the MSR is very low, for example, during a crisis or depression. While negative or zero estimates from the unbiased estimator are not admissible, the Bayes and mixed estimators can provide more information. Keywords: Sharpe ratio, Finite sample, Unbiased estimation, Bayes estimation. Chapter 75: Errors-in-Variables and Reverse Regression Errors-in-variables (EIV) and measurement errors are commonly encountered in asset prices and returns in capital markets. This study examines the explanatory power of the direct and reverse regression technique to bound the true regression estimates in the presence of EIV and measurement error. We also derive the standard error of the reverse regression estimates in order to compute t-ratios for testing their statistical significance. Keywords: Measurement error, Errors-in-variables, Direct and reverse regression. Chapter 76: The Role of Financial Advisors in M&As: Do Domestic and Foreign Advisors Differ? This study investigates how financial advisors influence deal outcomes in M&As. In particular, it examines whether firms that hire domestic financial advisors outperform those that hire foreign counterparts. Using 333 targets and 949 bidders from 1995 to 2011, the results show that targets take more (less) time to complete deals when hiring low-reputation (foreign) financial advisors. When bidders hire low-reputation or foreign financial advisors, they can complete deals faster. In addition, the evidence indicates that low-reputation financial advisors create higher gains for both targets and bidders around the announcement date. However, bidders advised by less prestigious financial advisors suffer larger losses during the post-announcement period.
Interestingly, when hiring domestic advisors, both targets and bidders obtain higher announcement returns. The regression analysis further reveals that bidders obtain higher post-announcement returns when they hire domestic advisors. Hence, this study shows that domestic advisors play an important role in M&As. Keywords: Mergers and acquisitions, Investment banks, Announcement returns, Domestic and foreign advisors.
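The direct and reverse regression bounds of Chapter 75 can be illustrated in a few lines: with measurement error in the regressor, the direct OLS slope is attenuated toward zero while the reverse-regression slope overshoots, so the two estimates bracket the true coefficient. The data below are simulated and all parameter values are invented for illustration.

```python
import numpy as np

def direct_and_reverse_slopes(x, y):
    """Direct OLS slope of y on x, and the reverse-regression bound
    (the reciprocal of the slope from regressing x on y)."""
    cov_xy = np.cov(x, y, ddof=1)[0, 1]
    b_direct = cov_xy / np.var(x, ddof=1)   # attenuated toward zero by EIV
    b_reverse = np.var(y, ddof=1) / cov_xy  # overshoots the true slope
    return b_direct, b_reverse

# simulated data with a known slope of 2.0 and error in the regressor
rng = np.random.default_rng(3)
n = 5000
x_true = rng.normal(0.0, 1.0, n)
y = 2.0 * x_true + rng.normal(0.0, 1.0, n)   # true slope = 2.0
x = x_true + rng.normal(0.0, 0.5, n)         # regressor observed with error

lo, hi = direct_and_reverse_slopes(x, y)     # the true slope lies in [lo, hi]
```

With these invented parameters, the probability limits of the two bounds are 1.6 and 2.5, so the interval indeed contains the true slope of 2.0.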
page 71
July 6, 2020
10:14
72
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch01
C. F. Lee
Chapter 77: Discriminant Analysis, Factor Analysis, and Principal Component Analysis: Theory, Method, and Applications In this chapter, we first discuss the basic concepts of linear algebra, linear combinations, and their distributions. Then we discuss the concepts of vectors, matrices, and their operations. Linear-equation systems and their solutions are also explored in detail. Based upon this information, we discuss discriminant analysis, factor analysis, and principal component analysis. Some applications of these three analyses are also demonstrated. Keywords: Vectors, Matrices, Linear-equation system, Discriminant analysis, Factor analysis, Factor score, Factor loading. Chapter 78: Credit Analysis, Bond Rating Forecasting, and Default Probability Estimation In this chapter, we discuss how to use discriminant analysis to perform credit analysis and calculate the financial z-score. We then use both discriminant analysis and factor analysis to forecast bond ratings from financial-ratio information. In addition, we discuss Ohlson's model and the KMV–Merton model for default probability estimation. Finally, we cite some empirical results on default probability estimation and compare the results of the two probability estimation models. Keywords: Credit analysis, Default probability, Discriminant analysis, Factor analysis, Factor score, Factor loading, Financial z-score, Hazard model, Idiosyncratic standard deviation, KMV–Merton model, Logit model, Multivariate discriminant analysis (MDA). Chapter 79: Market Model, CAPM, and Beta Forecasting This chapter uses the concepts of basic portfolio analysis and the dominance principle to derive the CAPM. A graphical approach is first utilized to derive the CAPM, after which a mathematical approach to the derivation is developed that illustrates how the market model can be used to decompose total risk into two components.
This is followed by a discussion of the importance of beta in security analysis and further exploration of the determination and forecasting of beta. The discussion closes with the applications and
implications of the CAPM, and the appendix offers empirical evidence of the risk–return relationship. In this chapter, we define both market beta and accounting beta and show how they are determined by different accounting and economic information. Then, we forecast both market beta and accounting beta. Finally, we propose a composite method to forecast beta. Keywords: Total risk, Systematic risk, Non-systematic risk, Market model, CAPM, Accounting beta, Market beta, Composite forecasting. Chapter 80: Utility Theory, Capital Asset Allocation, and Markowitz Portfolio-Selection Model In this chapter, we first discuss utility theory and utility functions in detail, then we show how asset allocation can be done in terms of the quadratic utility function. Based upon these concepts, we show how Markowitz's portfolio-selection model can be executed using a constrained maximization approach. Real-world examples in terms of three securities are also demonstrated. In the Markowitz selection model, we consider the cases in which short selling is allowed and in which it is not. Keywords: Asset allocation, Concave utility function, Indifference curve, Lagrangian objective function, Linear utility function, Risk aversion, Short selling, Utility theory, Utility function. Chapter 81: Single-Index Model, Multiple-Index Model, and Portfolio Selection This chapter offers some simplifying assumptions that reduce the overall number of calculations of Markowitz models through the use of the Sharpe single-index and multiple-index models. Besides the single-index model, we also discuss how the multiple-index model can be applied to portfolio selection. We demonstrate theoretically how single-index and multiple-index portfolio selection models can be used to replace the Markowitz portfolio selection model. An Excel example of how to apply the single-index model approach is also demonstrated.
Keywords: Beta coefficient, Covariance, Lagrangian calculus maximization, Linear programming, Market model, Multiple-index model, Single-index model.
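The computational saving of the single-index model comes from the covariance structure it implies: off-diagonal covariances reduce to the product of betas times the market variance. A small sketch of that identity follows, verifying that portfolio variance computed from the full implied covariance matrix equals the index decomposition; the betas, residual variances, and weights below are hypothetical numbers chosen for illustration.

```python
# Single-index covariance structure: sigma_ij = beta_i * beta_j * var_m for i != j,
# and sigma_ii = beta_i**2 * var_m + resid_i on the diagonal.
betas = [0.8, 1.0, 1.3]       # hypothetical security betas
resid = [0.02, 0.03, 0.015]   # hypothetical residual (idiosyncratic) variances
var_m = 0.04                  # hypothetical market-return variance
w = [0.3, 0.4, 0.3]           # portfolio weights

cov = [[betas[i] * betas[j] * var_m + (resid[i] if i == j else 0.0)
        for j in range(3)] for i in range(3)]
var_full = sum(w[i] * w[j] * cov[i][j] for i in range(3) for j in range(3))

# Equivalent decomposition: portfolio-beta term plus weighted residual variances
beta_p = sum(wi * bi for wi, bi in zip(w, betas))
var_index = beta_p ** 2 * var_m + sum(wi ** 2 * ri for wi, ri in zip(w, resid))
```

The two variance numbers agree exactly, which is why only the betas and residual variances, rather than every pairwise covariance, need to be estimated.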
Chapter 82: Sharpe Performance Measure and Treynor Performance Measure Approach to Portfolio Analysis The main points of this chapter show how Markowitz's portfolio selection method can be simplified by either the Sharpe performance measure or the Treynor performance measure. These two approaches do not require a constrained optimization procedure; however, they do require the existence of a risk-free rate. Overall, this chapter mathematically demonstrates how the Sharpe measure and the Treynor measure can be used to determine optimal portfolio weights. Keywords: Lagrangian multipliers, Risk-free rate, Sharpe performance measure, Short sales allowed, Short sales not allowed, Treynor performance measure. Chapter 83: Options and Option Strategies: Theory and Empirical Results This chapter aims to establish a basic knowledge of options and the markets in which they are traded. It begins with the most common types of options, calls and puts, explaining their general characteristics and discussing the institutions where they are traded. In addition, the concepts relevant to the newer types of options on indexes and futures are introduced. The next focus is the basic pricing relationship between puts and calls, known as put–call parity. The final study concerns how options can be used as investment tools, and alternative option strategies are presented. Excel is used to demonstrate how different option strategies can be executed. Keywords: Put–call parity, Long call, Short call, Long put, Short put, Short straddle, Long vertical (Bull) spread, Short vertical (Bear) spread, Calendar (Time) spreads, Long straddle, Protective put, Covered call, Collar. Chapter 84: Decision Tree and Microsoft Excel Approach for Option Pricing Model In this chapter, we (i) use the decision-tree approach to derive the binomial option pricing model (OPM) in terms of the method used by Rendleman and Bartter (RB, 1979) and Cox et al.
(CRR, 1979) and (ii) use Microsoft Excel to show how the decision-tree model converges to the Black–Scholes model as the number of periods increases to infinity. In addition, we develop a binomial tree model for American options and a trinomial tree model. The
efficiency of the binomial and trinomial tree methods is also compared. In sum, this chapter shows how the binomial OPM can be converted step by step into the Black–Scholes OPM. Keywords: Binomial option pricing model, Call option, Put option, One-period OPM, Two-period OPM, N-period OPM, Synthetic option, Excel program, Black–Scholes model, Excel VBA, European option, American option. Chapter 85: Statistical Distributions, European Option, American Option, and Option Bounds In this chapter, we first review the basic theory of the normal and log-normal distributions and their relationship; the bivariate normal density function is then analyzed in detail. Next, we discuss American options in terms of random dividend payments. We then use the bivariate normal density function to analyze American options with random dividend payments. Computer programs are used to show how American options can be evaluated. Finally, option pricing bounds are analyzed in some detail. Keywords: Normal distribution, Log-normal distribution, American option, Option bound, Multivariate normal distribution, Multivariate log-normal distribution. Chapter 86: A Comparative Static Analysis Approach to Derive Greek Letters: Theory and Applications Based on comparative static analysis, we first discuss different kinds of Greek letters in terms of the Black–Scholes option pricing model, then we show how these Greek letters can be applied to perform hedging and risk management. The relationship between delta, theta, and gamma is also explored in detail. Keywords: Delta (Δ), Theta (Θ), Gamma (Γ), Vega (ν), Rho (ρ), Hedging. Chapter 87: Fundamental Analysis, Technical Analysis, and Mutual Fund Performance This chapter discusses the methods and applications of fundamental analysis and technical analysis. In addition, it investigates the ranking performance of the Value Line and the timing and selectivity of mutual funds. A detailed
investigation of technical versus fundamental analysis is first presented. This is followed by an analysis of regression, time-series, and composite methods for forecasting security rates of return. Value Line ranking methods and their performance are then discussed, leading finally into a study of the classification of mutual funds and mutual-fund managers' timing and selectivity ability. In addition, hedging ability is also briefly discussed. All of these topics can help improve performance in security analysis and portfolio management. Keywords: Fundamental analysis, Technical analysis, Dow theory, Odd-Lot theory, Confidence index, Trading volume, Moving average, Component analysis, ARIMA models, Composite forecasting, Sharpe performance measure, Treynor performance measure, Jensen performance measure. Chapter 88: Bond Portfolio Management, Swap Strategy, Duration, and Convexity This chapter first focuses on the bond strategies of riding the yield curve and structuring the maturity of the bond portfolio in order to generate additional return. This is followed by a discussion of bond swap strategies. Next is an analysis of duration, the measure of the portfolio's sensitivity to changes in interest rates, with and without convexity, after which immunization is the focus. Convexity is discussed in terms of the non-linear relationship between bond price and interest rates. Finally, a case study of bond-portfolio management in the context of portfolio theory is presented. Overall, this chapter shows how interest rate changes affect bond prices and how maturity and duration can be used to manage portfolios. Keywords: Bond strategies, Swapping, Substitution swap, Intermarket-spread swap, Interest-rate anticipation swap, Pure-yield-pickup swap, Duration, Maturity.
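The duration and convexity measures discussed in Chapter 88 can be sketched numerically: the second-order approximation ΔP/P ≈ −D*·Δy + ½·C·(Δy)² tracks the exact repriced value closely for moderate yield shocks. The 5-year, 6% annual-coupon bond below is a made-up example, not one from the chapter.

```python
def bond_metrics(cash_flows, y):
    """Price, Macaulay duration, modified duration, and convexity
    for annual cash flows given as (year, amount) pairs."""
    price = sum(cf / (1 + y) ** t for t, cf in cash_flows)
    macaulay = sum(t * cf / (1 + y) ** t for t, cf in cash_flows) / price
    modified = macaulay / (1 + y)
    convexity = sum(t * (t + 1) * cf / (1 + y) ** (t + 2) for t, cf in cash_flows) / price
    return price, macaulay, modified, convexity

# Hypothetical 5-year bond: 6% annual coupon, face value 100, priced at a 5% yield
cfs = [(t, 6.0) for t in range(1, 5)] + [(5, 106.0)]
p0, mac, mod, conv = bond_metrics(cfs, 0.05)

dy = 0.01                                              # 100 bp yield shock
exact = sum(cf / (1 + 0.05 + dy) ** t for t, cf in cfs)
approx = p0 * (1 - mod * dy + 0.5 * conv * dy ** 2)    # duration-plus-convexity estimate
```

For a zero-coupon bond the Macaulay duration equals the maturity, which is a quick sanity check on the formulas.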
Chapter 89: Synthetic Options, Portfolio Insurance, and Contingent Immunization This chapter discusses how futures, options, and futures options can be used in portfolio insurance (dynamic hedging). Four alternative portfolio insurance strategies are discussed: (i) stop-loss orders, (ii) portfolio insurance with listed put options, (iii) portfolio insurance with synthetic options, and (iv) portfolio insurance with dynamic hedging. In addition, the techniques of combining stocks and futures to derive
synthetic options are explored in detail. Finally, important literature related to portfolio insurance is also reviewed. Keywords: Synthetic option, Stop-loss orders, Put options, Dynamic hedging, Tail wag the dog, Stock index futures. Chapter 90: Alternative Security Valuation Model: Theory and Empirical Results In this chapter, we discuss four alternative security valuation models: (i) the Warren and Shelton model, (ii) the Francis and Rowell model, (iii) the Feltham–Ohlson model, and (iv) the combined forecasting model. We show how accounting, stock price, and economic information can be used to determine security values in terms of finance theory. Algebraic simultaneous equations, econometric models, and Excel programs are used for the empirical studies. Keywords: Financial ratios, Francis and Rowell model, Feltham–Ohlson model, Warren and Shelton model, Production cost, Common stock valuation. Chapter 91: Opacity, Stale Pricing, Extreme Bounds Analysis, and Hedge Fund Performance: Making Sense of Reported Hedge Fund Returns The purpose of this chapter is to critically evaluate the methods used to examine hedge fund performance, to review and synthesize studies that attempt to explain the inconsistencies associated with the performance of hedge funds, and to compare the returns of hedge funds against more liquid investments. Research related to hedge fund performance has largely focused on whether hedge fund managers manipulate their performance and what investors should think about such manipulation; however, recent studies have questioned whether this perceived performance manipulation is manipulation per se or something else. In general, researchers have used a number of different techniques to model hedge fund performance and the relative opacity and latency that is evident in the reporting of hedge fund returns.
Nevertheless, the very nature of the structure of a hedge fund makes it difficult to mark the returns to market on a frequent basis, and even if managers wanted their performance marked to market, which would unveil their positioning through time, the relative illiquidity and stale pricing associated with some of the investments
that are held by hedge funds make pricing the hedge fund a difficult and somewhat pointless exercise. To this end, studies that analyze and evaluate aggregate hedge fund returns have focused on identifying the true determinants of hedge fund performance, attempting to account for and explain the relative staleness of pricing in hedge fund returns, and relating the performance of hedge funds to more liquid and transparent investments. This chapter offers key suggestions for financial market participants such as hedge fund managers, portfolio managers, risk managers, regulatory bodies, financial analysts, and investors about their evaluation and interpretation of hedge fund performance. In addition, this critical review can benefit investors, portfolio managers, and researchers in establishing a yardstick for the assessment of hedge fund performance and the performance of assets that have stale pricing and are relatively opaque. Keywords: Hedge fund performance, Stale pricing, Performance manipulation, Distributed lag models, Extreme bounds analysis.
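One common way the literature models the stale pricing discussed above is as a geometric smoothing of true returns, which can then be inverted to "unsmooth" a reported series. The toy sketch below shows the filter and its exact inverse; the smoothing parameter θ and the return series are illustrative assumptions, not values from any study cited in the chapter.

```python
def smooth(returns, theta):
    """Stale-pricing model: the observed return is a weighted average of the
    current true return and the previously observed return."""
    out, prev = [], 0.0
    for r in returns:
        obs = theta * r + (1 - theta) * prev
        out.append(obs)
        prev = obs
    return out

def unsmooth(observed, theta):
    """Invert the smoothing filter to recover the underlying return series."""
    out, prev = [], 0.0
    for o in observed:
        out.append((o - (1 - theta) * prev) / theta)
        prev = o
    return out

true_returns = [0.02, -0.01, 0.03, 0.0]
reported = smooth(true_returns, 0.6)       # what a stale-priced fund might report
recovered = unsmooth(reported, 0.6)        # recovers the true series exactly
```

The reported series is visibly less volatile than the true one, which is why naive performance statistics on stale-priced returns look better than they should.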
Chapter 92: Does Quantile Co-Integration Exist Between Gold Spot and Futures Prices? This study examines the relationships between gold spot and futures prices with different maturities using a time-varying and quantile-dependent approach, namely, the quantile co-integration model. This model allows the co-integrating coefficient to vary over the conditional distribution of gold spot prices. We find that at lower quantiles of gold returns, the co-integration between gold spot prices and one- to six-month gold futures prices is weaker than at higher quantiles; when gold returns are at high quantiles, these relationships become even stronger. In terms of the co-integration between gold and the VIX (CBOE Volatility Index), we find that the co-integration of gold spot prices, futures prices, and the VIX at high quantiles is stronger than that observed at low quantiles. Our work adds another cross-sectional dimension to the extant literature, which uses only the time-series dimension to examine co-integration. Furthermore, the results suggest that when investors intend to hedge risk by exercising futures contracts, short-term futures are a better choice than long-term contracts. Keywords: Quantile co-integration, Gold, Futures contract, VIX.
Chapter 93: Bayesian Portfolio Mean–Variance Efficiency Test with Sharpe Ratio's Sampling Error This study proposes a Bayesian test of a test portfolio p's mean–variance efficiency that takes into account the sampling errors associated with the ex post Sharpe ratio of the test portfolio p. The test is based on the Bayes factor that compares the joint likelihoods under the null hypothesis H0 and the alternative H1, respectively. Using historical monthly return data of 10 industrial portfolios and a test portfolio, namely, the CRSP value-weighted index, from January 1941 to December 1973 and January 1980 to December 2012, the power function of the proposed Bayesian test is compared to the conditional multivariate F-test by Gibbons, Ross and Shanken (1989) and the Bayesian test by Shanken (1987). In an independent simulation study, the performance of the proposed Bayesian test is also demonstrated. Keywords: Bayesian test, Mean–variance efficiency, Ex post Sharpe ratio, Bayes factor, CRSP value-weighted index, Conditional multivariate F-test.
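The sampling error that Chapter 93 accounts for can be illustrated with the ex post Sharpe ratio and the standard iid large-sample approximation to its standard error, SE ≈ sqrt((1 + SR²/2)/T) from Lo (2002). This is a generic sketch of the sampling-error idea only, not the chapter's Bayesian machinery; the three-observation return series is a made-up example.

```python
import math
import statistics

def sharpe_ratio(excess_returns):
    """Ex post Sharpe ratio: mean excess return over its sample standard deviation."""
    return statistics.mean(excess_returns) / statistics.stdev(excess_returns)

def sharpe_se(sr, t):
    """Approximate standard error of the Sharpe ratio under iid returns (Lo, 2002)."""
    return math.sqrt((1 + 0.5 * sr ** 2) / t)

sr = sharpe_ratio([0.01, 0.02, 0.03])   # mean 0.02 / stdev 0.01 = 2.0
```

The standard error shrinks with the sample length T, which is why efficiency tests on short samples must treat the observed Sharpe ratio as a noisy estimate.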
Chapter 94: Does Revenue Momentum Drive or Ride Earnings or Price Momentum? This chapter examines the profits of revenue, earnings, and price momentum strategies in an attempt to understand investor reactions when facing multiple signals of firm performance in various scenarios. We first offer evidence that no single momentum strategy dominates among the revenue, earnings, and price momentums, suggesting that revenue surprises, earnings surprises, and prior returns each carry some exclusive unpriced information content. We next show that the profits of momentum driven by firm fundamental performance information (revenue or earnings) depend upon the accompanying firm market performance information (price), and vice versa. The robust monotonicity in multivariate momentum returns is consistent with the argument that the market underestimates not only the individual signals but also the joint implications of multiple signals of firm performance, particularly when they point in the same direction. A three-way combined momentum strategy may offer a monthly return as high as 1.44%. The information conveyed by revenue surprises and earnings surprises combined accounts for about 19% of price momentum effects, a finding that adds to the large literature on tracing the sources of price momentum.
Keywords: Revenue surprises, Earnings surprises, Post-earnings-announcement drift, Momentum strategies.
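At their core, the momentum sorts behind Chapters 94 and 95 reduce to ranking assets on a past-performance signal and going long the top-ranked winners and short the bottom-ranked losers. A minimal sketch of that ranking step follows; the ticker names and past returns are made up for illustration.

```python
def momentum_portfolios(past_returns, k):
    """Rank assets by a prior-performance signal; return the top-k winners
    (long leg) and bottom-k losers (short leg)."""
    ranked = sorted(past_returns, key=past_returns.get, reverse=True)
    return ranked[:k], ranked[-k:]

past = {"A": 0.12, "B": -0.05, "C": 0.30, "D": 0.01, "E": -0.20}
winners, losers = momentum_portfolios(past, 2)
```

The same scaffold applies whether the ranking signal is prior return, a revenue-surprise measure, or a combined score such as FSCORE or GSCORE; only the contents of `past` change.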
Chapter 95: Technical, Fundamental, and Combined Information for Separating Winners from Losers This study examines how fundamental accounting information can be used to supplement technical information to separate momentum winners from losers. We first introduce the ratio of liquidity buy volume to liquidity sell volume (BOS ratio) to proxy the level of information asymmetry for stocks and show that the BOS momentum strategy can enhance the profits of the momentum strategy. We further propose a unified framework, produced by incorporating two fundamental indicators — the FSCORE (Piotroski, 2000) and the GSCORE (Mohanram, 2005) — into the momentum strategy. The empirical results show that the combined investment strategy captures stocks with greater information content that the market cannot reflect in a timely manner, and therefore the combined investment strategy outperforms the momentum strategy by generating significantly higher returns. Keywords: BOS ratio, Combined investment strategy, Financial statement analysis, Fundamental analysis, Momentum strategies, Technical analysis, Trading volume. Chapter 96: Optimal Payout Ratio Under Uncertainty and the Flexibility Hypothesis: Theory and Empirical Evidence Following the dividend flexibility hypothesis used by DeAngelo and DeAngelo (2006), Blau and Fuller (2008), and others, we theoretically extend DeAngelo and DeAngelo's (2006) proposition of optimal payout policy in terms of the dividend flexibility hypothesis. In addition, we introduce growth rate, systematic risk, and total risk variables into the theoretical model. To test the theoretical results derived in this paper, we use US data from 1969 to 2009 to investigate the impact of growth rate, systematic risk, and total risk on the optimal payout ratio in terms of the fixed-effects model. We find that, based on flexibility considerations, a company will reduce its payout when the growth rate increases.
In addition, we find that a non-linear relationship exists between the payout ratio and risk. In other words, the relationship between the payout ratio and risk is negative (or positive) when the growth rate is higher (or lower) than the
rate of return on total assets. Our theoretical model and empirical results can therefore be used to identify whether the flexibility or the free cash flow hypothesis should be used to determine dividend policy. Keywords: Dividends, Flexibility hypothesis, Payout policy, Fixed-effects model. Chapter 97: Sustainable Growth Rate, Optimal Growth Rate, and Optimal Payout Ratio: A Joint Optimization Approach A large number of studies have examined issues of dividend policy, but they rarely consider the investment decision and dividend policy jointly from a non-steady state to a steady state. We extend Higgins' (1977, 1981, 2008) sustainable growth rate model and develop a dynamic model which jointly optimizes the growth rate and payout ratio. We optimize the firm value to obtain the optimal growth rate in terms of a logistic equation and find that the steady-state growth rate can be used as the benchmark for the mean-reverting process of the optimal growth rate. We also investigate the specification error of the mean and variance of dividend per share when introducing the stochastic growth rate. Empirical results support the mean-reverting process of the growth rate and the importance of the covariance between profitability and the growth rate in determining dividend payout policy. In addition, the intertemporal behavior of this covariance may shed some light on the disappearance of dividends over recent decades. Keywords: Dividend policy, Payout ratio, Growth rate, Specification error, Logistic equation, Partial adjustment model, Mean reverting process. Chapter 98: Cross-Sectionally Correlated Measurement Errors in Two-Pass Regression Tests of Asset-Pricing Models It is well known that in simple linear regression, measurement errors in the explanatory variable lead to a downward bias in the OLS slope estimator.
In two-pass regression tests of asset-pricing models, one is confronted with such measurement errors because the second-pass cross-sectional regression uses as explanatory variables imprecise estimates of asset betas extracted from the first-pass time-series regression. The slope estimator of the second-pass regression is used to obtain an estimate of the pricing model's factor risk-premium. Since the significance of this estimate is decisive for the validity
of the model, knowledge of the properties of the slope estimator, in particular its bias, is crucial. First, we show that cross-sectional correlations in the idiosyncratic errors of the first-pass time-series regression lead to correlated measurement errors in the betas used in the second-pass cross-sectional regression. We then study the effect of correlated measurement errors on the bias of the OLS slope estimator. Using a Taylor approximation, we develop an analytic expression for the bias in the slope estimator of the second-pass regression with a finite number of test assets N and a finite time-series sample size T. The bias is found to depend in a non-trivial way not only on the size and correlations of the measurement errors but also on the distribution of the true values of the explanatory variable (the betas). In fact, while the bias increases with the size of the errors, it decreases the more the errors are correlated. We illustrate and validate our result using a simulation approach based on empirical return data commonly used in asset-pricing tests. In particular, we show that correlations seen in empirical returns (e.g., due to industry effects in sorted portfolios) substantially suppress the bias. Keywords: Asset pricing, CAPM, Errors in variables, Simulation, Idiosyncratic risk, Two-pass regression, Measurement error. Chapter 99: Asset Pricing with Disequilibrium Price Adjustment: Theory and Empirical Evidence Breeden (1979), Grinols (1984), and Cox et al. (1985) describe the importance of the supply side for capital asset pricing. Black (1976) derives a dynamic, multi-period CAPM, integrating endogenous demand and supply. However, this theoretically elegant model has never been empirically tested for its implications in dynamic asset pricing. We first review and theoretically extend Black's CAPM to allow for a price adjustment process.
We then derive the disequilibrium model for asset pricing in terms of the disequilibrium framework developed by Fair and Jaffe (1972), Amemiya (1974), Quandt (1988), and others. We discuss two methods of estimating an asset pricing model with a disequilibrium price adjustment effect. Finally, using price per share, dividend per share, and outstanding shares data, we test for the existence of a disequilibrium price adjustment process with international index data and US equity data. We find that a disequilibrium price adjustment process exists in our empirical data. Our results support Lo and Wang's (2000) finding that trading volume is one of the important factors in determining capital asset pricing. Keywords: Multiperiod dynamic CAPM, Demand function, Supply function, Disequilibrium model, Disequilibrium effect, Two-stage least squares (2SLS) estimator, Maximum likelihood estimator.
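The 2SLS estimator listed in the keywords can be sketched in its simplest one-instrument form, where β_IV = cov(z, y)/cov(z, x). The simulation below (all parameter values are hypothetical, not from the chapter) shows the instrumental-variables slope correcting the simultaneity bias of OLS when the regressor is correlated with the structural error.

```python
import random

random.seed(1)
beta_true, n = 1.5, 5000
zs, xs, ys = [], [], []
for _ in range(n):
    z = random.gauss(0, 1)                      # instrument: drives x, unrelated to u
    u = random.gauss(0, 1)                      # structural error
    x = 0.8 * z + 0.5 * u + random.gauss(0, 1)  # x is endogenous: correlated with u
    zs.append(z); xs.append(x); ys.append(beta_true * x + u)

def cov(a, b):
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n

b_ols = cov(xs, ys) / cov(xs, xs)   # biased upward because cov(x, u) > 0
b_iv = cov(zs, ys) / cov(zs, xs)    # one-instrument IV, equivalent to 2SLS here
```

With one instrument and one endogenous regressor, this ratio form and the two-stage procedure give the same estimate, which is why the sketch omits the explicit first-stage regression.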
Chapter 100: A Dynamic CAPM with Supply Effect: Theory and Empirical Results Breeden [An intertemporal asset pricing model with stochastic consumption and investment opportunities. Journal of Financial Economics 7, (1979) 265–296], Grinols [Production and risk leveling in the intertemporal capital asset pricing model. Journal of Finance 39, 5, (1984) 1571–1595] and Cox et al. [An intertemporal general equilibrium model of asset prices. Econometrica 53, (1985) 363–384] have described the importance of the supply side for capital asset pricing. Black [Rational response to shocks in a dynamic model of capital asset pricing. American Economic Review 66, (1976) 767–779] derives a dynamic, multiperiod CAPM, integrating endogenous demand and supply. However, Black's theoretically elegant model has never been empirically tested for its implications in dynamic asset pricing. We first theoretically extend Black's CAPM. Then we use price, dividend per share, and earnings per share data to test for the existence of a supply effect with US equity data. We find that the supply effect is important in US domestic stock markets. This finding holds when we break the companies listed in the S&P 500 into 10 portfolios by different levels of payout ratio. It also holds consistently if we use individual stock data. A simultaneous equation system is constructed through a standard structural form of a multiperiod equation to represent the dynamic relationship between supply and demand for capital assets. The equation system is exactly identified under our specification. Then, two hypotheses related to the supply effect are tested regarding the parameters in the reduced-form system. The equation system is estimated by the seemingly unrelated regression (SUR) method, since SUR allows one to estimate the system simultaneously while accounting for correlated errors.
Keywords: CAPM, Asset, Endogenous supply, Simultaneous equations, Reduced-form, Seemingly unrelated regression (SUR), Exactly identified, Cost of capital, Quadratic cost, Partial adjustment. Chapter 101: Estimation Procedures of Using Five Alternative Machine Learning Methods for Predicting Credit Card Default Machine learning has successful applications in credit risk management, portfolio management, automatic trading, and fraud detection, to name a few, in the domain of financial technology. Reformulating and solving these problems adequately and accurately is problem-specific and challenging, given
the availability of complex and voluminous data. In credit risk management, one major problem is to predict the default of credit card holders using a real data set. We review five machine learning methods: k-nearest neighbors, decision trees, boosting, support vector machines, and neural networks, and apply them to the above problem. In addition, we give explicit Python scripts to conduct the analysis using a real data set of 29,999 instances with 23 features collected from a major bank in Taiwan. We show that the decision tree performs best among these methods in terms of validation curves. Keywords: Artificial intelligence, Machine learning, Supervised learning, K-nearest neighbors, Decision tree, Boosting, Support vector machine, Neural network, Python, Delinquency, Default, Credit card, Credit risk. Chapter 102: Alternative Methods to Derive Option Pricing Models: Review and Comparison The main purposes of this paper are: (i) to review three alternative methods for deriving option pricing models (OPM), (ii) to discuss the relationship between the binomial OPM and the Black–Scholes OPM, (iii) to compare the Cox et al. method and the Rendleman and Bartter method for deriving the Black–Scholes OPM, (iv) to discuss the lognormal distribution method for deriving the Black–Scholes OPM, and (v) to show how the Black–Scholes model can be derived by stochastic calculus. This paper shows that the main methodologies used to derive the Black–Scholes model are the binomial distribution, the lognormal distribution, and differential and integral calculus. If we assume risk neutrality, then we do not need stochastic calculus to derive the Black–Scholes model. However, the stochastic calculus approach for deriving the Black–Scholes model is still presented in Section 102.6. In sum, this paper can help statisticians and mathematicians understand how alternative methods can be used to derive the Black–Scholes option model.
Keywords: Black–Scholes option pricing model, Binomial option pricing model, Lognormal distribution method, Stochastic calculus. Chapter 103: Option Price and Stock Market Momentum in China Option prices tend to be correlated with past stock market returns due to market imperfections. This chapter discusses this issue in the Chinese derivatives market. The implied volatility spread based on pairs of options is constructed to measure price pressure in the option market. By regressing the implied
volatility spread on past stock returns, we find that past stock returns exert a strong influence on the pricing of index options. Specifically, SSE 50 ETF calls are significantly overvalued relative to SSE 50 ETF puts after stock price increases, and vice versa. Moreover, we empirically validate that momentum effects in the underlying stock market are responsible for the price pressure. These findings are both economically and statistically significant and have important implications. Keywords: Option price, Implied volatility spread, Past stock returns, Stock market momentum, Price pressure, Momentum factor. Chapter 104: Advancement of Optimal Portfolio Models with Short-Sales and Transaction Costs: Methodology and Effectiveness This chapter presents advancements of several widely applied portfolio models to ensure flexibility in their applications: the mean–variance (MV), mean-absolute deviation (MAD), linearized value-at-risk (LVaR), conditional value-at-risk (CVaR), and Omega models. We include short sales and transaction costs in modeling portfolios and further investigate their effectiveness. Using daily data on international ETFs over 15 years, we generate the results of the rebalancing portfolios. The empirical findings show that the MV, MAD, and Omega models yield higher realized returns with lower portfolio diversity than the LVaR and CVaR models. The outperformance of these risk–return-based models over the downside-risk-focused models comes from efficient asset allocation and not only from savings in transaction costs. Keywords: Portfolio selection, Conditional value at risk model, Value at risk model, Omega model, Transaction costs, Short selling. Chapter 105: The Path Leading up to the New IFRS 16 Leasing Standard: How was the Restructuring of Lease Accounting Received by Different Advocacy Groups? The due process of the International Financial Reporting Standards (IFRS) enables interested parties to comment on the development of new IFRS.
Unsurprisingly, different advocacy groups have very different perspectives and interests. For example, businesses are more likely to be interested in “user-friendly” rules, whereas standard-setters and academics tend to prefer theoretically coherent standards.
page 85
July 6, 2020
10:14
86
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch01
C. F. Lee
This paper analyzes the response behavior of different advocacy groups using the lease accounting reform as an example, since leasing is a promising case for such an analysis. First, to analyze the response behavior, five different advocacy groups are defined. The 657 comment letters submitted on the Re-Exposure Draft "Leases" are then assigned to these five advocacy groups. The Re-Exposure Draft formulates questions about different aspects of the new standard and asks for comments on these aspects. Next, the response behavior of the different advocacy groups with respect to the most relevant questions is examined quantitatively and qualitatively. The quantitative analysis uses the Kruskal–Wallis test (H-test) and the Mann–Whitney test (U-test) to evaluate the response behavior. The main result of the study is that the response behavior to various questions differs significantly between advocacy groups. In particular, it is shown that the response behavior differs drastically between more "user-oriented" and more "theoretically oriented" advocacy groups.

Keywords: Lease accounting, Advocacy groups, Due process, Comment letters, Statistical analysis of response behavior, Kruskal–Wallis test, Mann–Whitney test, Significance identification.

Chapter 106: Implied Variance Estimates for Black–Scholes and CEV OPM: Review and Comparison

The main purpose of this chapter is to demonstrate how to estimate implied variance for both the Black–Scholes option pricing model (OPM) and the constant elasticity of variance (CEV) OPM. For the Black–Scholes OPM, we classify the estimators into two different routines: numerical search methods and closed-form derivation approaches. Both the MATLAB approach and the approximation method are used to empirically estimate implied variance for American and Chinese options. For the CEV model, we present the theory and demonstrate in detail how to use the related Excel program.
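As a minimal illustration of the numerical-search route (not the chapter's own MATLAB or Excel programs), implied variance can be recovered by bisection on the Black–Scholes call price, which is monotone in volatility; the option parameters below are hypothetical.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, r, T, sigma):
    """Black–Scholes European call price."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, r, T, lo=1e-6, hi=5.0, tol=1e-8):
    """Bisection search: the call price is increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, T, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Recover the volatility that generated a known price
p = bs_call(100, 100, 0.05, 0.5, 0.30)
sigma_hat = implied_vol(p, 100, 100, 0.05, 0.5)
implied_variance = sigma_hat**2
```

Bisection is slower than Newton–Raphson but cannot diverge, which is why it is a common fallback in numerical-search implementations.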
Keywords: Implied variance, Black–Scholes model, MATLAB approach, Approximation approach, CEV model.

Chapter 107: Crisis Impact on Stock Market Predictability

This research paper examines the predictability of Spanish stock market returns. Earlier studies suggest that stock market returns in
developed countries can be predicted with a noise term. This study specifically covers two time horizons, a pre-crisis period and the current crisis period, to evaluate the predictability of stock market returns. Since mean returns do not always prove to be an efficient predictor while the variance of such returns does, various autoregressive models have been used to test for persistent volatility in the Spanish stock market. The empirical results show that higher-order autoregressive models such as ARCH(5) and GARCH(2, 2) can be used to predict future risk in the Spanish stock market in both the pre-crisis and the current crisis periods. The paper also reveals a positive correlation between Spanish stock market returns and the conditional standard deviations produced by ARCH(5) and GARCH(2, 2), implying that the models have some success in predicting future risk in the Spanish stock market. The predictability of stock market returns is not found to be affected during the crisis period, though the degree of predictability may be.

Keywords: Predictability, Autoregressive conditional heteroskedasticity (ARCH) and generalized autoregressive conditional heteroskedasticity (GARCH), Stock market returns, Financial crisis.

Chapter 108: How Many Good and Bad Funds are There, Really?

Building on the work of Barras, Scaillet and Wermers (BSW, 2010), we propose a modified approach to inferring performance for a cross-section of investment funds. Our model assumes that funds belong to groups of different abnormal performance, or alpha. Using the structure of the probability model, we simultaneously estimate the alpha locations and the fractions of funds in each group, taking multiple testing into account. Our approach allows for tests with imperfect power that may falsely classify good funds as bad, and vice versa.
Examining both mutual funds and hedge funds, we find smaller fractions of zero-alpha funds and more funds with abnormal performance compared with the BSW approach. We also use the model as prior information about the cross-section of funds to evaluate and predict fund performance.

Keywords: Hedge fund, Mutual fund, Fund performance, False discovery rates, Bayes rule, Bootstrap, Goodness of fit, Test power, Trading strategies, Kernel smoothing.
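The multiple-testing idea behind false discovery rates can be illustrated with a minimal Benjamini–Hochberg sketch. This is the standard FDR procedure, not the BSW estimator or this chapter's modified approach, and the fund p-values below are made up.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.10):
    """Return a boolean mask of rejected hypotheses at FDR level q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresh = q * np.arange(1, m + 1) / m          # q*i/m for ranked p-values
    below = p[order] <= thresh
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])          # largest i with p_(i) <= q*i/m
        reject[order[: k + 1]] = True
    return reject

# Hypothetical p-values from fund alpha t-tests: a few "skilled" funds among noise
pvals = [0.001, 0.004, 0.019, 0.03, 0.21, 0.45, 0.62, 0.74, 0.88, 0.95]
mask = benjamini_hochberg(pvals, q=0.10)
n_discoveries = int(mask.sum())
```

Controlling the FDR rather than the family-wise error rate is what lets such procedures estimate the fraction of truly nonzero-alpha funds instead of just flagging the single most extreme one.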
Chapter 109: Constant Elasticity of Variance Option Pricing Model: Integration and Detailed Derivation

In this chapter, we review the renowned constant elasticity of variance (CEV) option pricing model and give detailed derivations. There are two purposes: first, we show the details of the formulae needed in deriving the option pricing and bridge the gaps in deriving the necessary formulae for the model; second, we use a result by Feller to obtain the transition probability density function of the stock price at time t. In addition, some computational considerations are given to facilitate computing the CEV option pricing formula.

Keywords: Constant elasticity of variance model, Noncentral chi-square distribution, Option pricing.

Chapter 110: An Integral Equation Approach for Bond Prices with Applications to Credit Spreads

We study bond prices in the Black–Cox model with jumps in asset value. We assume that the jump size distribution is arbitrary and that, if default occurs, the payoff at the maturity date depends on a general writedown function, following Longstaff and Schwartz [A Simple Approach to Valuing Risky Fixed and Floating Rate Debt. Journal of Finance 50 (1995), 789–819] and Zhou [The Term Structure of Credit Spreads with Jump Risk. Journal of Banking & Finance 26 (2001), 2015–2040]. Under this general setting, we propose an integral equation approach for the bond prices. As an application of this approach, we study the analytic properties of the bond prices. We also derive an infinite series expression for the bond prices.

Keywords: Jump diffusion, Default barrier, Writedown function, Bond price, Credit spread.

Chapter 111: Sample Selection Issues and Applications

On many occasions in regression analysis, researchers may encounter the problem of a non-random sample, which leads to a biased estimator when the OLS method is used. This study thus examines some related issues of sample selection bias due to non-random sampling.
We first explain the source of the bias caused by non-random sampling and then demonstrate that, in most cases, the direction of such bias cannot be ascertained from prior information.
By treating the sample selection as informative sampling, we can formulate the sample selection bias issue as an omitted variable problem in the regression model. Heckman (1979) proposed a two-stage estimation procedure to correct for selection bias. The first stage applies the Probit model to produce the estimated value of the inverse Mills ratio, which is then included in the second-stage regression model as an explanatory variable to yield unbiased estimators. As the sample selection rule may not always derive from a yes–no choice, our study further utilizes Lee's (1983) extension, applying the Multinomial Logit model in the first-stage estimation to allow for a multi-choice sample selection rule. Since the pioneering works on sample selection issues are mostly in the field of labor economics, we give two examples of empirical studies in labor economics to demonstrate, respectively, applications of the Probit correction approach and the Multinomial Logit correction approach. Finally, we point out that the problem of a non-random sample is not limited to applications in economics: in the past 20 years, quite a few researchers have taken the issue of sample selection into account in studies of finance and management.

Keywords: Sample selection bias, Heckman's two-stage estimation, Probit model, Multinomial logit model.

Chapter 112: Time Series and Neural Network Analysis

This chapter discusses and compares the performance of traditional time-series models and the neural network (NN) model to see which does a better job of predicting changes in stock prices, and to identify critical predictors for forecasting stock prices in order to increase forecasting accuracy for professionals in the market. Time-series analysis is somewhat parallel to technical analysis, but it differs from the latter by using different statistical methods and models to analyze historical stock prices and predict future prices.
Neural network approaches can make important contributions since they can incorporate a very large number of variables and observations into their models. In this study, the authors apply traditional time-series decomposition (TSD), Holt/Winters (H/W) models, the Box–Jenkins (B/J) methodology, and a neural network (NN) model to 50 randomly selected stocks from September 1, 1998 to December 31, 2010, with a total of 3105 observations of each company's closing stock price. This sample period covers the high-tech boom and bust, the historic 9/11 event, the housing boom
and bust, and the recent serious recession and slow recovery. During this exceptionally uncertain period of global economic and financial crises, stock prices are expected to be extremely difficult to predict.

Keywords: Forecasting stock prices, Neural network model, Time-series decomposition, Holt/Winters exponential smoothing, Box–Jenkins ARIMA methodology, Technical analysis, Fundamental analysis.

Chapter 113: Covariance Regression Model for Non-Normal Data

Recently, Zou et al. (2017) proposed a novel covariance regression model to study the relationship between the covariance matrix of responses and their associated similarity matrices induced by auxiliary information. To estimate the covariance regression model, they introduced five estimators: the maximum likelihood, ordinary least squares, constrained ordinary least squares, feasible generalized least squares, and constrained feasible generalized least squares estimators. Among these five, they recommended the constrained feasible generalized least squares estimator for its estimation efficiency and computational convenience. Under the normality assumption, they further demonstrated the theoretical properties of these estimators. However, data in finance and accounting may exhibit heavy tails. Hence, to broaden the usefulness of the covariance regression model, we relax the normality assumption and employ Lee's (2004) approach to obtain inferences for the covariance regression parameters based on the five estimators proposed by Zou et al. (2017). Two empirical examples are presented to illustrate practical applications of the covariance regression model in analyzing stock return comovement and the herding behavior of mutual funds.

Keywords: Covariance regression model, Herding behaviors, Non-normal data, Stock return comovement.
Chapter 114: Impacts of Time Aggregation on Beta Value and R² Estimations Under Additive and Multiplicative Assumptions: Theoretical Results and Empirical Evidence

Data for big and small market-value firms are used to evaluate the effects of temporal aggregation on beta estimates, t-values, and R² estimates. In addition to our analysis of the standard market model within the additive rates-of-return framework, the standard model under the assumption of multiplicative rates of return is also discussed. Furthermore, a dynamic model is estimated in
this study to evaluate differences in the short-term and long-term dynamic relationships between the market and each type of firm. It is found that temporal aggregation has important effects on both the specification of a market model and the stability of beta and R² estimates.

Keywords: Temporal aggregation, Additive and multiplicative rates of return, Random coefficient model, Coefficient of determination, Estimation stability.

Chapter 115: Large-Sample Theory

In this chapter, we discuss large-sample theory, which can be applied under conditions that are quite likely to be met in large samples even when the Gauss–Markov conditions are broken. There are two reasons for using large-sample theory. First, there may be some problems that corrupt our estimators in small samples but tend to die down as the sample gets bigger. Thus, if we cannot get a perfect small-sample estimator, we will usually want to choose the one that will be best in large samples. Second, in some circumstances, the theory used to derive the properties of estimators in small samples simply does not work, and working out those properties can be impossible. This makes it very hard to choose between alternative estimators. In these circumstances we judge different estimators on their "large-sample properties" because their "small (or finite) sample properties" are unknown.

Keywords: Large-sample theory, Gauss–Markov conditions, Sample properties, Sample estimators.

Chapter 116: Impacts of Measurement Errors on Simultaneous Equation Estimation of Dividend and Investment Decisions

This chapter analyzes errors-in-variables problems in the simultaneous equation estimation of dividend and investment decisions. We first investigate the effects of measurement errors in exogenous variables on the estimation of a just-identified or an over-identified simultaneous equations system. The impacts of measurement errors on the estimation of structural parameters are discussed.
Moreover, we use a simultaneous system of dividend and investment policies to illustrate how, in theory, the unknown variance of the measurement errors can be identified from the over-identifying information. Finally, we summarize the findings.
Keywords: Errors-in-variables, Simultaneous equations system, Estimation, Identification problem, Investment decision, Dividend policy, Two-stage least squares method.

Chapter 117: Big Data and Artificial Intelligence in the Banking Industry

Big data and artificial intelligence (AI) assist businesses with decision-making. They help companies create new products and processes or improve existing ones. As the amount of data grows exponentially and the costs of data storage and computing power drop, AI is predicted to have great potential for banks. This chapter discusses the implications of big data and AI for the banking industry. First, we provide background on big data and AI. Second, we identify areas in which banks can benefit from big data and AI, and evaluate their applications for the banking industry. Third, we discuss the implications of big data and AI for regulatory compliance and supervision. Last, we conclude with the limitations and challenges facing the use of big-data-based AI.

Keywords: Big data, Artificial intelligence, Machine learning, Bank, Robo-advisor, Bank regulatory compliance, Algorithmic bias.

Chapter 118: A Non-Parametric Examination of Emerging Equity Markets Financial Integration

Prior studies on financial market integration use parametric estimators whose underlying assumptions of linearity and normality are, at best, questionable, particularly when using high-frequency data. We re-examine the evidence on financial integration trends using data for 14 emerging equity markets from Southeast Asia, Latin America, and the Middle East, along with the US and Japan. We employ non-parametric estimators of Pukthuanthong and Roll's (2009) adjusted R² measure of financial integration. Results from non-parametric estimators are contrasted with results from parametric estimators of the adjusted R² financial integration measure using bi-daily returns for contiguous yearly sub-periods from 1993 to 2016. We find two key results.
First, we confirm prior evidence in Pukthuanthong and Roll (2009) that simple correlation (SC) understates financial integration trends compared to the parametric adjusted R². Second, the parametric adjusted R² understates financial integration trends relative to the non-parametric adjusted R². Hence, emerging equity markets may be
more financially integrated, and offer fewer diversification benefits to global investors, than previously thought. The results underscore the need for caution when drawing inferences about financial market integration from parametric estimators.

Keywords: Financial integration, Non-parametric regression, Locally-weighted regression, Principal components regression, Simple correlation.
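The parametric version of the Pukthuanthong–Roll integration measure can be sketched roughly as follows: extract global principal components from a panel of market returns, regress each market's returns on those components, and read off the adjusted R². This is a simplified illustration on simulated returns, not the chapter's non-parametric estimator; the factor structure and sample sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated daily returns: 6 markets driven by 2 common "global" factors plus noise
T, n, k = 500, 6, 2
factors = rng.normal(size=(T, k))
loadings = rng.normal(size=(k, n))
returns = factors @ loadings + 0.5 * rng.normal(size=(T, n))

# Global principal components via SVD of the demeaned return panel
X = returns - returns.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
pcs = U[:, :k] * s[:k]                 # first k principal-component scores

def adjusted_r2(y, Z):
    """Adjusted R^2 from an OLS regression of y on Z (with intercept)."""
    Z1 = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    resid = y - Z1 @ beta
    r2 = 1 - resid.var() / y.var()
    p = Z1.shape[1] - 1
    return 1 - (1 - r2) * (len(y) - 1) / (len(y) - p - 1)

# One integration measure per market: high values mean global factors explain the market
integration = [adjusted_r2(returns[:, j], pcs) for j in range(n)]
```

A non-parametric variant would replace the OLS step with, e.g., a locally-weighted regression, which is where the chapter's contribution lies.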
Chapter 119: Algorithmic Analyst (ALAN) — An Application for Artificial Intelligence Content as a Service

This chapter presents Algorithmic Analyst (ALAN), an application that combines statistical and artificial intelligence methods with natural language generation to publish multimedia financial reports in Chinese and English. ALAN is part of a long-term project to develop an Artificial Intelligence Content as a Service (AICaaS) platform. ALAN gathers global capital market data, performs algorithm-driven big data analysis, and makes market forecasts. ALAN uses a multi-factor risk model to identify equity risk factors and ranks stocks based on a set of over 150 financial market variables. For each instrument analyzed, ALAN computes and produces narrative metadata describing its historical trends, forecast results, and any causal relationships with global macroeconomic variables. ALAN generates English and Chinese text commentaries in HTML and PDF formats, audio in MP3 format, and video in MP4 format for the US and Taiwanese equity markets on a daily basis.

Keywords: Natural language generation (NLG), Multi-factor risk (MFR) model, AI Content as a Service (AICaaS), ARIMA, Generalized autoregressive conditional heteroskedasticity (GARCH), Machine learning, Deep learning, RNN, LSTM, Factor attributes, Scoring system, Global investing.
Chapter 120: Survival Analysis: Theory and Application in Finance

This chapter outlines some commonly used statistical methods for studying the occurrence and timing of events, i.e., survival analysis, which is also called duration analysis or transition analysis in econometrics. Statistical methods for survival data usually include non-parametric, parametric, and semi-parametric methods. While some non-parametric estimators (e.g.,
the Kaplan–Meier estimator and the life-table estimator) estimate survivor functions, others (e.g., the Nelson–Aalen estimator) estimate the cumulative hazard function. The commonly used non-parametric test for comparing survivor functions is the log-rank test. Parametric models such as the exponential model, the Weibull model, and the generalized Gamma model are based on different distributional assumptions for survival time. The commonly used semi-parametric regression model is the Cox proportional hazards (PH) model, which is estimated by the method of partial likelihood and does not require a distributional assumption for survival time. Applications to discrete-time data and the competing risks model are also introduced.

Keywords: Survival analysis, Non-parametric methods, Parametric methods, Semi-parametric methods, Discrete time data, Competing risks.

Chapter 121: Pricing Liquidity in the Stock Market

In this study, we test the pricing power of market liquidity in the cross-section of US stock returns. We examine three liquidity measures: Pástor and Stambaugh's (2003) liquidity factor, Bali et al.'s (2014) liquidity shocks, and Drechsler, Savov, and Schnabl's (2017) money market liquidity premium. With a large set of test assets and the time-series regression approach of Fama and French (2015), we find that aggregate liquidity is not priced in the cross-section of stock returns. That is, adding the liquidity factor to common asset-pricing models does not significantly improve model performance. Our results therefore call for more research on the impact of aggregate liquidity on the stock market.

Keywords: Stock market liquidity, Liquidity shocks, Money market liquidity premium, Cross-section of stock returns, Time-series regression, Fama and French factor models, Size, Book-to-market, Momentum, Operating profitability, Investment, Industry portfolios.
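The logic of a time-series pricing test can be sketched in a few lines: regress test-asset excess returns on the candidate factors and examine the intercepts (alphas) and how much an added factor changes them. This is a generic OLS illustration on simulated series, not the chapter's empirical design; every number below is made up.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 600

# Simulated monthly factor returns: market plus a candidate liquidity factor
mkt = 0.005 + 0.04 * rng.normal(size=T)
liq = 0.002 + 0.02 * rng.normal(size=T)

# Test asset: loads on the market only, with a true alpha of zero
asset = 1.0 * mkt + 0.0 * liq + 0.01 * rng.normal(size=T)

def alpha_tstat(y, factors):
    """OLS intercept and its t-statistic from a time-series regression."""
    X = np.column_stack([np.ones(len(y))] + list(factors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    s2 = resid @ resid / dof
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[0], beta[0] / np.sqrt(cov[0, 0])

a1, t1 = alpha_tstat(asset, [mkt])          # market model alone
a2, t2 = alpha_tstat(asset, [mkt, liq])     # augmented with the liquidity factor
# If liquidity is not priced, adding it barely moves the alpha
```

In an actual test one would run this across a large cross-section of portfolios and apply a joint test such as the GRS statistic rather than inspecting one asset.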
Chapter 122: The Evolution of Capital Asset Pricing Models: Update and Extension

Since Sharpe (1964) derived the CAPM, it has been the benchmark of asset pricing models and has been used to calculate the cost of equity capital and for other asset pricing determinations for more than four decades. Many researchers have tried to relax the original assumptions and generalize the static CAPM. In addition, Merton (1973) and Black (1976) have generalized the static CAPM in terms of the intertemporal CAPM. In this
chapter, we survey the important alternative theoretical models of capital asset pricing and provide a complete review of the evolution of both static and intertemporal asset pricing models. We also discuss the interrelationships among these models and suggest several possible directions for future research. In addition, we review asset pricing tests in terms of individual companies' data instead of portfolio data. Our results may serve as a guideline for future theoretical and empirical research in capital asset pricing.

Keywords: Security market line, Static CAPM, Dynamic CAPM, Intertemporal CAPM, Liquidity-based CAPM, Demand function, Supply function, International CAPM, Behavioral finance.

Chapter 123: The Multivariate GARCH Model and its Application to East Asian Financial Market Integration

We briefly review multivariate GARCH models in contrast with univariate GARCH models, and clarify the statistical perspective of the DCC-GARCH model introduced by Engle (2002). This model ingeniously reconciles two contrary requirements of model construction: it is sufficiently flexible to capture the behavior of actually observed data processes, yet sufficiently parsimonious for statistical analysis in practice. We then illustrate the practical usefulness of the DCC-GARCH model through its application to the bond and stock markets of the emerging East Asian countries. The DCC-GARCH model can evaluate the comovements of different financial assets by means of dynamic variance decomposition (volatility spillover) in addition to the DCCs. The empirical investigation shows that bond market integration is still limited in terms of both DCCs and volatility spillover, while the stock markets are highly integrated both regionally and globally.
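The core DCC recursion can be sketched as follows: given volatility-standardized residuals z_t, the quasi-correlation matrix evolves as Q_t = (1 - a - b)*S + a*z_{t-1}z'_{t-1} + b*Q_{t-1} and is rescaled to a proper correlation matrix R_t. This is a minimal illustration with assumed parameters (a = 0.05, b = 0.90) and placeholder residuals, not the chapter's fitted model; in practice a and b are estimated by quasi-maximum likelihood and z comes from first-stage univariate GARCH fits.

```python
import numpy as np

def dcc_correlations(z, a=0.05, b=0.90):
    """Dynamic conditional correlation matrices from standardized residuals z (T x n)."""
    T, n = z.shape
    S = np.corrcoef(z, rowvar=False)      # unconditional correlation target
    Q = S.copy()                          # Q_0 initialized at the target
    R = np.empty((T, n, n))
    for t in range(T):
        d = 1.0 / np.sqrt(np.diag(Q))
        R[t] = Q * np.outer(d, d)         # rescale Q_t to a correlation matrix
        zt = z[t][:, None]
        Q = (1 - a - b) * S + a * (zt @ zt.T) + b * Q   # update to Q_{t+1}
    return R

rng = np.random.default_rng(1)
z = rng.normal(size=(1000, 3))            # placeholder standardized residuals
R = dcc_correlations(z)
```

The mean-reverting form with a + b < 1 is what keeps the model parsimonious: two scalar parameters drive the entire time path of the correlation matrix, whatever the cross-sectional dimension.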
Keywords: Dynamic conditional correlation (DCC) generalized autoregressive conditional heteroskedasticity (GARCH) model, Dynamic conditional correlation, Dynamic conditional variance decomposition, Financial market integration, East Asian bond and stock markets.

Chapter 124: Review of Difference-in-Difference Analyses in Social Sciences: Application in Policy Test Research

In this chapter, we review the difference-in-differences (DID) method and the first-difference method, which have been widely used in quantitative research
designs in the social sciences (e.g., economics, finance, accounting, etc.). First, we define the DID and first-difference methods. Then, we explain the models that may be used in the DID and first-difference methods and briefly discuss the critical assumptions required when researchers draw a causal inference from the results. Next, we use examples documented in previous studies to illustrate how to apply the DID and first-difference methods in research related to policy implementation. Finally, we compare the DID method to the comparative interrupted time series (CITS) design and briefly introduce two popular methods that researchers have used to create a control sample in order to reduce sample selection bias in a quasi-experimental design: propensity score matching (PSM) and regression discontinuity design (RDD).

Keywords: Difference-in-differences (DID), First-difference method, Causal inference, Policy analyses.
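The canonical two-period DID regression is y = b0 + b1*treat + b2*post + b3*(treat x post) + e, where b3 is the treatment effect. A minimal sketch on simulated data (all coefficients below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000

# Simulated sample: treatment-group indicator and post-policy indicator
treat = rng.integers(0, 2, size=n)
post = rng.integers(0, 2, size=n)

# Outcome with a group gap (1.0), a common time trend (0.5), and a true DID effect of 2.0
y = 3.0 + 1.0 * treat + 0.5 * post + 2.0 * treat * post + rng.normal(scale=0.5, size=n)

# OLS with the interaction term; its coefficient is the DID estimate
X = np.column_stack([np.ones(n), treat, post, treat * post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
did_effect = beta[3]
```

The design nets out both the fixed group gap and the common time trend, which is exactly why the parallel-trends assumption discussed in the chapter is the critical one.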
Chapter 125: Using Smooth Transition Regressions to Model Risk Regimes

The smooth transition regression (STR) methodology was developed to model nonlinear relationships over the business cycle. We demonstrate that the methodology can be used to analyze return series whose exposure to financial market risk factors depends on the market regime. The smooth transition between regimes inherent in STR is particularly appropriate for risk models, as it allows for gradual transition of risk factor exposures. Variations in the methodology, and tests of its appropriateness, are defined and discussed. We apply the STR methodology to model the risk of the return series of the convertible arbitrage (CA) hedge fund strategy. CA portfolios are composed of instruments that have both equity and bond characteristics and alternate between the two depending on the market level (state). This dual character makes the CA strategy a strong candidate for nonlinear risk models. Using the STR model, we confirm that the strategy's risk factor exposure changes with the market regime and, using this result, we are able to account for the abnormal returns reported for the strategy in earlier studies.

Keywords: Regime switching, Smooth transition regression, Risk measurement, Hedge funds.
Chapter 126: Application of Discriminant Analysis, Factor Analysis, Logistic Regression, and KMV-Merton Model in Credit Risk Analysis

The main purposes of this paper are to review and integrate the applications of discriminant analysis, factor analysis, and logistic regression in credit risk management. First, we discuss how discriminant analysis can be used for credit rating, such as calculating a financial z-score to determine a firm's chance of bankruptcy; we also discuss how discriminant analysis can be used to classify banks into problem and non-problem banks. Second, we discuss how factor analysis can be combined with discriminant analysis to perform bond rating forecasting. Third, we show how the logistic regression technique can be used to calculate the default risk probability. Fourth, we discuss the KMV-Merton model and the Merton distance model for calculating default probability. Finally, we compare all the techniques discussed, draw conclusions, and give suggestions for future research. We propose using the CEV option model to improve the original Merton DD model. In addition, we propose a modified naïve model to improve Bharath and Shumway's (2008) naïve model.

Keywords: Discriminant analysis, Factor analysis, Logistic regression, KMV-Merton model, Probit model, Hazard model, Merton distance model, Financial z-score, Default probability.

Chapter 127: Predicting Credit Card Delinquencies: An Application of Deep Neural Networks

The objective of this paper is twofold. First, it develops a prediction system to help the credit card issuer model credit card delinquency risk. Second, it explores the potential of deep learning (also called deep neural networks), an emerging artificial intelligence technology, in the credit risk domain.
With real-life credit card data on 711,397 credit card holders from a large bank in Brazil, this study develops a deep neural network to evaluate the risk of credit card delinquency based on clients' personal characteristics and spending behaviors. Compared to the machine learning algorithms of logistic regression, naïve Bayes, traditional artificial neural networks, and decision trees, the deep neural network has better overall predictive performance, with the highest F scores and AUC. The successful application of
deep learning implies that artificial intelligence has great potential to support and automate credit risk assessment for financial institutions and credit bureaus.

Keywords: Credit card delinquency, Deep neural network, Artificial intelligence, Risk assessment, Machine learning.

Chapter 128: Estimating the Tax-Timing Option Value of Corporate Bonds

US tax laws give investors an incentive to time the sales of their bonds to minimize tax liability. This grants a tax-timing option that affects bond value. In reality, corporate bond investors' tax-timing strategy is complicated by the risk of default. In this chapter, we assess the effects of taxes and stochastic interest rates on the timing option value and the equilibrium price of corporate bonds, considering discount and premium amortization, multiple trading dates, transaction costs, and changes in the level and volatility of interest rates. We find that the value of the tax-timing option accounts for a substantial proportion of the corporate bond price and that the option value increases with bond maturity and credit risk.

Keywords: Tax timing, Option, Capital gain, Default risk, Transaction cost, Asymmetric taxes.

Chapter 129: DCC-GARCH Model for Market and Firm-Level Dynamic Correlation in S&P 500

Understanding the dynamic correlations among asset returns is essential for ascertaining the behavior of asset prices and their comovements. It also has important implications for portfolio diversification and risk management. In this chapter, we apply the DCC-GARCH model pioneered by Engle (2001) and Engle and Sheppard (2002) to investigate the dynamics of correlations among S&P 500 stocks during the sub-prime crisis. Using daily data on the stocks in the S&P 500 index, we document strong evidence of persistent dynamic correlations among the returns of the index component stocks.
Conditional correlations between the S&P 500 index and its component stocks increase substantially during the sub-prime crisis period, showing strong evidence of contagion. In addition, stock return variance is time-varying and peaks at the crest of the financial crisis. The results show that the DCC-GARCH model is a powerful tool for forecasting return correlations and performing value-at-risk portfolio analysis.
Keywords: Dynamic conditional correlation, Multivariate GARCH, DCC-MVGARCH, Contagion, Risk management.

Chapter 130: Using Path Analysis to Integrate Accounting and Non-Financial Information: The Case for Revenue Drivers of Internet Stocks

This chapter utilizes path analysis, an approach common in the behavioral and natural science literatures but relatively unseen in finance and accounting, to improve inferences drawn from a combined database of financial and non-financial information. Focusing on the revenue-generating activities of internet firms, the paper extends the literature on internet valuation while addressing the potentially endogenous and multicollinear nature of the internet activity measures applied in such tests. Results suggest that both SG&A and R&D have significant explanatory power over the web activity measures, suggesting that these expenditures represent investments in product quality. Evidence from the path analysis also indicates that both accounting and non-financial measures, in particular SG&A and pageviews, are significantly associated with firm revenues. Finally, the paper suggests other areas of accounting research that could benefit from a path analysis approach.

Keywords: Direct effect, Indirect effect, Path analysis, Internet stock, Non-financial information.

Chapter 131: The Implications of Regulation in the Community Banking Sector: Risk and Competition

This chapter examines the relationship between financial performance, regulatory reform, and the management of community banks. The consequences of the Sarbanes–Oxley Act (SOX) and Dodd–Frank Act (DFA) regulations are observed. Risk management responses to regulatory reforms, as observed in the loan loss provision, are examined in relation to these reforms. We also observe the consequences of compliance costs on product offerings and competitive conditions.
Empirical methods and results provided here show that sustained operations for community banks will require a commitment to developing management expertise that observes the consequences of regulatory objectives at the firm level.

Keywords: Community bank, Dodd–Frank, Sarbanes–Oxley, Efficiency, Empirical methods, Managerial implications.
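The direct/indirect effect decomposition at the heart of the path analysis summarized above (Chapter 130) can be illustrated with a minimal OLS sketch. The variable roles here (an expenditure `x`, a web-activity mediator `m`, revenue `y`) are hypothetical stand-ins, not the chapter's actual model:

```python
import numpy as np

def path_decomposition(x, m, y):
    """Decompose the total effect of x on y into a direct effect and an
    indirect effect routed through mediator m, via three OLS fits:
        m = a*x          (path a)
        y = c'*x + b*m   (paths c' and b)
        y = c*x          (total effect c)
    For OLS with intercepts on the same sample, c == c' + a*b exactly."""
    def ols(cols, target):
        X = np.column_stack([np.ones(len(target))] + list(cols))
        return np.linalg.lstsq(X, target, rcond=None)[0][1:]  # drop intercept
    a = ols([x], m)[0]
    cp, b = ols([x, m], y)
    c = ols([x], y)[0]
    return {"direct": cp, "indirect": a * b, "total": c}
```

The identity total = direct + indirect is the familiar omitted-variable algebra of nested OLS regressions, which is what makes the path coefficients additive.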
Chapter 2
Do Managers Use Earnings Forecasts to Fill a Demand They Perceive from Analysts?

Orie Barron (Penn State University, e-mail: [email protected])
Jian Cao (Florida Atlantic University, e-mail: [email protected])
Xuguang Sheng (American University, e-mail: [email protected])
Maya Thevenot (Florida Atlantic University, e-mail: [email protected])
Baohua Xin (University of Toronto, e-mail: [email protected])

Contents
2.1 Introduction
2.2 A Simple Model
2.3 The Econometric Model
2.4 Research Design
2.5 Sample and Descriptive Statistics
    2.5.1 Sample
    2.5.2 Descriptive statistics
2.6 Empirical Results
    2.6.1 The decision to forecast
    2.6.2 Posterior market belief revisions
    2.6.3 Management uncertainty as a correlated variable
    2.6.4 Cross-sectional analyses of the relation of guidance with commonality and uncertainty
    2.6.5 Propensity-score matched samples
    2.6.6 Other analyses
2.7 Conclusion
Bibliography
Appendix 2A: Theoretical Framework
Appendix 2B: Estimation of the Conditional Variance $\sigma^2_{\lambda t}$ by GARCH Models
Abstract

This paper examines how the nature of the information possessed by individual analysts influences managers' decisions to issue forecasts and the consequences of those decisions. Our analytical model yields the prediction that managers prefer to issue guidance when they perceive their private information to be more precise and analysts possess mostly common, imprecise information (i.e., there is high commonality and uncertainty). Based on an econometric model, we obtain theory-based analyst variables, and our empirical evidence confirms our predictions. High commonality and uncertainty in analysts' prior information are accompanied by increases in analysts' forecast revisions and trading volume following guidance, consistent with greater analyst incentives to generate idiosyncratic information. Yet, management guidance increases only with the commonality contained in analysts' pre-disclosure information, not with the level of uncertainty. Indeed, the disclosure propensity among a subset of firms (those with less able managers, bad news, and infrequent forecasts) has an inverse relationship with analyst uncertainty, because that uncertainty reflects the low precision of management information. Our results are robust to a variety of alternative analyses, including the use of propensity-score matched pairs with similar disclosure environments but differing degrees of commonality and uncertainty among analysts. We also demonstrate that the use of forecast dispersion as an empirical proxy for analysts' prior information may lead to erroneous inferences. Overall, we define and support improved measures of the analyst information environment based on an econometric model and find that the commonality of information among analysts acts as a reliable forecast antecedent by informing managers about the amount of idiosyncratic information in the market.

Keywords Management earnings forecasts • Analysts' information • Uncertainty • Commonality.
2.1 Introduction

Anecdotes and empirical research suggest that managers often issue guidance to ensure sell-side consensus forecasts and market expectations are
reasonable and fill a demand they perceive from analysts.1 Specifically, management uses its earnings forecasts as a device to walk-down analysts’ consensus forecasts to avoid penalties associated with failing to meet analysts’ expectations (Matsumoto, 2002; Richardson, Teoh, and Wysocki, 2004; and Cotter, Tuna, and Wysocki, 2006). Poor alignment of analysts’ expectations often leads to the decision to stop earnings guidance (Feng and Koch, 2010; Houston, Lev, and Tucker, 2010; and Chen, Matsumoto, and Rajgopal, 2011). Despite evidence on the use of earnings guidance as a tool to facilitate expectation alignment, relatively little is known about how management earnings guidance strategy is affected by market participants’ incentives to develop private information. Since market participants possess different prior beliefs or likelihood functions (Barron, Byard, and Kim, 2002), an important problem facing the managers is how to issue earnings forecasts to influence idiosyncratic beliefs among market participants and, in turn, security prices. In this paper, we examine how the nature of the information possessed by individual analysts influences managers’ decisions to issue forecasts and the consequences of those decisions. Previous studies examine the relation between management guidance and two proxies for analysts’ information environment: analyst coverage and forecast dispersion. Empirical evidence relating to this issue is limited and mixed. For example, Houston et al. (2010) and Chen et al. (2011) find that firms with decreasing analyst following are more likely to stop providing earnings guidance. Although an unstated reason for this stoppage could be poor performance and repeated consensus misses, stoppers that publicly announced the decision to stop guidance did not experience a change in analyst coverage. Balakrishnan et al. 
(2014) examine losses of analyst coverage (i.e., closures of brokerage operations) that are unrelated to individual firms' future prospects and show that firms respond to the exogenous shocks by providing more timely and informative earnings guidance and that such efforts improve trading liquidity. In the absence of significant coverage termination events, a firm's analyst following does not often change quickly, while analysts constantly attempt to produce information from various sources (Brown et al., 2014). We explore how variation in analysts' incentives to develop private information affects managers' decision to forecast (while the level of analyst following is held constant).

1 According to a 2009 forward-looking guidance practices survey by the National Investor Relations Institute among its public company members, the primary reason for issuing guidance is to ensure sell-side consensus and market expectations are reasonable (http://www.niri.org/findinfo/Guidance.aspx).
Forecast dispersion, on the other hand, likely captures several different aspects of analysts' information environment. For example, Cotter et al. (2006) and Feng and Koch (2010) suggest that management guidance is less likely as the dispersion in analyst forecasts increases, whereas Houston et al. (2010) suggest that high dispersion is one of the antecedents related to stopping guidance. Both Houston et al. (2010) and Chen et al. (2011) find that analyst forecast dispersion and forecast error increase following the stoppage of guidance. Seen in those contexts, dispersion serves as a "catch-all" information proxy for a number of different constructs, such as analyst herding, information asymmetry, and forecasting uncertainty. Barron, Stanford, and Yu (2009) suggest that dispersion captures these constructs to different degrees and that its appropriateness as a proxy for a given factor varies by the setting and empirical specification. This argument is confirmed by our econometric model, which reveals that dispersion is only one component of the analyst information environment. It is clearly not possible from those studies to determine the effect of analysts' private information incentives on management guidance practices. We infer market participants' incentives to develop private information from the nature of the information contained in individual analysts' forecasts and, more specifically, analysts' level of uncertainty and the commonality of their beliefs. Sell-side financial analysts serve as sophisticated processors of accounting disclosures, whose primary role in the capital markets is to search, analyze, and interpret information about industry trends, company strategy, and profit potential to generate value for their clients and themselves (Brown et al., 2014).
If individual forecasts convey relatively little idiosyncratic information (i.e., there is high commonality), analysts would seek to develop more uniquely private information in their forecasts to maintain competitive advantages or obtain trading profits (Barron et al., 2002). Analysts’ career advancement is also affected by their forecast accuracy (Mikhail, Walther, and Willis, 1999; and Wu and Zang, 2009). High levels of uncertainty in individual analysts’ information (i.e., lack of precision) would stimulate analysts to create new information to increase forecast precision (Frankel, Kothari, and Weber, 2006). Relying on a simple theoretical model, we predict that managers prefer to issue guidance when they perceive their private information to be more precise, and analysts possess mostly common, imprecise information. We maintain that higher analysts’ incentives to develop private information will lead to more analysts’ effort and the processing of more public disclosures. Alternatively, as commonality and uncertainty of information
among analysts increase, managers likely feel increasing pressure to provide guidance to fill the demand they perceive from analysts. Yet, since management has different goals than analysts, analysts’ incentives may be in significant conflict with managers’ personal goals. The intuition underlying our proposition is that the decision to forecast depends not only on analysts’ incentives to develop private information, but also on the precision of management’s private information. When the level of uncertainty among analysts is high, managers similarly face significant constraints that could preclude them from disclosure, such as a lack of information precision or an inability to predict future changes themselves. To the extent that analysts’ uncertainty corresponds with low information precision faced by managers, managers may not always desire to issue new forecasts. Based on improved empirical measures obtained from our econometric model, we present evidence that confirms our predictions. We find that managers provide more guidance when pre-disclosure commonality among analysts’ beliefs is high. We corroborate this finding by showing that managers are more likely to issue forecasts when the precision of analysts’ common (idiosyncratic) information is high (low). These findings support the view that analysts possess an innate tendency or desire to develop private information of their own and management guidance is provided to fit this specific need of analysts. We also find that high uncertainty among analysts sometimes prompts less disclosure due, at least in part, to its correlation with the (unobservable) uncertainty contained in managers’ information. The inverse relation between uncertainty and guidance is mostly driven by firms whose managers have low ability, firms that provide infrequent guidance, and firms that report bad earnings news. 
Our results continue to hold in propensity-score matched pairs with similar disclosure environments but differing degrees of commonality and uncertainty among analysts. Our results also largely support the conjecture that the uncertainty and the commonality of information contained in individual analysts’ earnings forecasts lead to more analysts’ effort and the generation of idiosyncratic information. We find that high commonality and uncertainty in analysts’ prior information are accompanied by increases in analysts’ forecast revisions and trading volume following guidance. These findings suggest that analysts and investors revise their beliefs differentially according to the properties of pre-disclosure information in the market. The differential belief revision around management forecasts arises from a lack of both diversity and uncertainty in market participants’ prior information. We demonstrate that the use of dispersion as an empirical proxy for analysts’ prior information may
lead to erroneous inferences, such as finding no variation in market reactions conditional on analysts' prior information. Taken together, our results suggest that market participants' incentives to develop private information are a reliable forecast antecedent and that the market's differential interpretation of management earnings forecasts leads to subsequent analyst forecast revision and significant trading. The commonality of information among analysts based on our econometric model is the best empirical measure of market participants' incentives to develop private information, because it reflects solely the amount of idiosyncratic information in the market and not management attributes, and it has a significant effect on managers' decision to forecast. Analysts' uncertainty is correlated with that of managers (an omitted factor), making it difficult to infer how the nature of analysts' information affects managers' disclosure decisions. This paper contributes to the literature in several important ways. First, we add to the literature on management's forecast decisions by providing evidence on the role of market participants' incentives to develop private information in motivating managers to supply guidance. Although studies have looked at the general relation between management earnings forecasts and analysts' information environment (Feng and Koch, 2010; Houston et al., 2010; and Chen et al., 2011), none have addressed how the decision to forecast is affected by the idiosyncratic element of analysts' prior information. Our study provides additional insights beyond prior studies in this area: the commonality of analysts' prior information acts as a more reliable forecast antecedent (compared to alternatives such as levels of uncertainty and forecast dispersion), and managers care about the amount of idiosyncratic information in the market.
Second, we add to prior research on the effect of earnings announcements on belief revisions to include the disclosure of managers' forecasts. Barron et al. (2002) show that earnings releases trigger the generation of idiosyncratic information by financial analysts, and Bamber, Barron, and Stober (1999) show that analysts' idiosyncratic interpretations of the disclosure lead to more informed trading. We find a positive association between the commonality and uncertainty of information among analysts and analyst forecast revisions and trading volume pursuant to management forecast releases. Our findings suggest that either high uncertainty or high commonality may induce analysts to move out of their comfort zone and actively seek out management-provided information to develop new idiosyncratic information. Finally, this paper contributes to the literature on managers' walking-down of analysts' forecasts over the horizon (Cotter et al., 2006) by
demonstrating a different side of the "game" between analysts and managers. Our findings that analysts' incentives to develop private information explain why managers choose to forecast and what the resulting forecast consequences are suggest that managers can strategically forecast to achieve a desired result. Hence, our results are relevant to the debate about whether firms should discontinue guidance to analysts due to the potential myopic incentive effects created by providing guidance (Houston et al., 2010; and Chen et al., 2011). The rest of the paper is organized as follows. Section 2.2 provides a simple disclosure model and motivates the hypotheses to be tested. Section 2.3 presents the econometric model behind our analyst information environment proxies and Section 2.4 discusses the research design. Section 2.5 provides the sample selection and descriptive statistics and Section 2.6 discusses the empirical results. Finally, Section 2.7 concludes.

2.2 A Simple Model

The prior research on managerial disclosure incentives in connection with analysts' interest has focused on incentives to bias analyst outputs. Whereas managers use their earnings forecasts to strategically manage the analysts' consensus earnings forecasts (Fuller and Jensen, 2002), analysts have a tendency to curry favor with management due to the importance of maintaining strong relationships with management and generating brokerage revenues (Lim, 2001; O'Brien, McNichols, and Lin, 2005; Cowen, Groysberg, and Healy, 2006; and Brochet, Miller, and Srinivasan, 2014). While the prior research suggests a game between management and analysts, analysts have many competing incentives tied to their information role in capital markets. Analysts generally have an interest in building a good reputation for issuing accurate forecasts, signaling private information, and maximizing trading in the stocks they cover (Beyer et al., 2010).
Prior research suggests that analysts seek and assess management disclosure, and expanded disclosure creates additional analyst and investor interest in the stocks (Healy, Hutton, and Palepu, 1999), implying that managers have incentives to increase analysts’ ability to effectively understand and forecast the firm. However, evidence on the direct interplay between the nature of analysts’ prior information and management voluntary disclosure response is limited. The existing literature suggests two properties of analysts’ information environment, which may be indicative of their incentives to develop private
information — the levels of commonality and uncertainty among analysts. High commonality in analysts’ beliefs indicates that either analysts lack private information or that they do not fully use their private information when issuing forecasts (Barron et al., 1998, BKLS hereafter; and Clement and Tse, 2005). Moreover, Brown et al. (2014) find that issuing forecasts below consensus earns analysts credibility with their clients, rather than negatively impacting their compensation or career opportunities. Building on the Indjejikian (1991) and Fischer and Verrecchia (1998) models about analysts’ motives to increase the idiosyncratic information in their forecasts, Barron et al. (2002) suggest that an important role of accounting disclosures is to trigger the generation of idiosyncratic information by financial analysts, which decreases the commonality in their information. Increased public disclosure also increases investor demand for idiosyncratic interpretations of the disclosure and, accordingly, analysts expect greater profits from trading on their private information. Prior research suggests that uncertainty in the information environment adversely impacts analysts’ forecast accuracy (Zhang, 2006; and Amiram et al., 2013). There is a higher probability of job changes for analysts whose forecast accuracy is lower than that of their peers (Mikhail et al., 1999; and Hong, Kubik, and Solomon, 2000). However, Frankel et al. (2006) find that the informativeness of analyst research increases with stock return volatility, suggesting high uncertainty presents analysts with more opportunity to gain from information acquisition. Waymire (1986) and Clement, Frankel, and Miller (2003) show that management forecasts improve posterior analyst forecast accuracy and reduce analyst forecast dispersion, indicating reduced uncertainty about future earnings. 
We provide a stylized framework to demonstrate the theoretical underpinnings of our hypotheses about the effects of analysts' information environment on management voluntary disclosure decisions.2 Consider the following setting. A firm has underlying earnings with states being either high or low ($x_H$ or $x_L$). The manager of the firm learns some private but imperfect information s that is stochastically associated with the underlying earnings and characterized by its precision, r. There is no credible way for the manager to convey his private information to the capital market directly due to the non-verifiable nature of the information, but he has the option to issue a voluntary disclosure based on that information. Thus, the voluntary disclosure
2 For expositional purposes, we focus on the intuition here and relegate the technical description to Appendix 2A.
(or lack of) potentially conveys information about the manager’s private information, and further, true earnings. The price of the firm is determined by risk-neutral investors’ inference of the firm’s value based on the manager’s disclosure. Later, the underlying earnings are revealed. If the underlying earnings are not contained in the manager’s voluntary disclosure, the manager will need to pay a personal penalty, c. The manager tries to maximize a fraction, α, of the share price at the voluntary disclosure stage, net the expected penalty for a “faulty” voluntary disclosure. Because the state of nature is binary, the only two possible voluntary disclosures are (1) silence, interpreted in equilibrium as earnings being either high or low; and (2) the earnings is high. Provided that the expected probability of incurring a penalty associated with voluntarily disclosing earnings being high is a decreasing function of the manager’s information,3 an equilibrium exists where the manager applies a switching-strategy: if the manager observes a sufficiently high signal, he voluntarily discloses that earnings is high; otherwise, he remains silent. The intuition goes as follows. The manager faces the following tradeoff when determining his voluntary disclosure choices: disclosing that earnings is high has the benefit of a higher share price, but also increases the probability of bearing a penalty because the realization of the true earnings may be low. If the manager observes a sufficiently high signal, the probability of a future penalty is sufficiently low, and the benefit from the inflated price of disclosing high earnings outweighs the expected penalty. In contrast, if the signal is sufficiently low, the posterior probability of incurring a penalty is high, and the expected cost of realizing low earnings outweighs the benefits from the inflated price of disclosing high earnings. The manager then rationally chooses to keep silent. 
In this equilibrium, the threshold at which the manager decides to make a voluntary disclosure or remain silent satisfies $s^* = \frac{1+r}{2r} - \frac{\alpha (x_H - x_L)}{2c}$. Furthermore, it is intuitive that the probability of issuing earnings guidance increases (i.e., the threshold value, $s^*$, decreases) as (1) the responsiveness of the manager's utility to the share price increases ($\partial s^*/\partial \alpha < 0$), and (2) the manager's private information becomes more precise ($\partial s^*/\partial r < 0$). Within the context of this paper, the fraction α captures to some extent market participants' incentives to develop private information. It represents a variable unrelated to the asset's true, economic value, but which nonetheless affects
3 This is a standard assumption in the literature that makes the private information informative.
the manager's payoff arising from his own disclosure choice. A greater α may dampen the effect of disclosure on price change through the generation of new idiosyncratic information from the public announcement. As analysts' incentives to impound idiosyncratic information in their forecasts increase, analysts are more likely to react to the information contained in management earnings forecasts (i.e., revising their forecasts). Likewise, the stock market will react more strongly to management forecasts as investors are likely to uncover more idiosyncratic interpretations of the disclosure (such as those provided by analysts). As a result, managers' utility becomes more responsive to the market's demand for management guidance.4 Whereas we focus primarily on the interpretations of α and r, the analysis also indicates that the threshold value increases with the expected penalty for a "faulty" voluntary disclosure ($\partial s^*/\partial c > 0$). The penalty c is simply introduced as a constraint on the manager's voluntary disclosure. Our model suggests that managers prefer to issue guidance when market participants' incentives to develop private information are high (i.e., a higher α) and their private signal is more precise (i.e., a higher r). The comparative static result of $\partial s^*/\partial \alpha < 0$ implies that when there is little idiosyncratic information in the market, the manager responds by lowering the disclosure threshold. This interpretation is consistent with public announcements creating idiosyncratic beliefs in Barron et al. (2002), in which earnings announcements trigger generation of new idiosyncratic information by financial analysts. As discussed earlier, there are two major instances of increasing analysts' (as the primary market agents) incentives to develop private information: when the degree of commonality or uncertainty among analysts is high. However, the parameter r in the model captures the manager's overall information uncertainty about the underlying earnings.
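Reading the model's threshold as $s^* = \frac{1+r}{2r} - \frac{\alpha (x_H - x_L)}{2c}$ (a reconstruction from the comparative statics reported in the text), all three comparative statics can be checked numerically. The parameter values below are purely illustrative:

```python
def s_star(r, alpha, c, x_h=1.0, x_l=0.0):
    """Disclosure threshold: the manager guides iff his signal exceeds s*."""
    return (1 + r) / (2 * r) - alpha * (x_h - x_l) / (2 * c)

# Comparative statics at an illustrative baseline (r=1, alpha=0.5, c=1):
base = s_star(1.0, 0.5, 1.0)
assert s_star(1.0, 0.6, 1.0) < base   # ds*/dalpha < 0: stronger price incentive
assert s_star(2.0, 0.5, 1.0) < base   # ds*/dr < 0: more precise private signal
assert s_star(1.0, 0.5, 2.0) > base   # ds*/dc > 0: larger penalty deters guidance
```

The baseline evaluates to $s^* = 1 - 0.25 = 0.75$, and each perturbation moves the threshold in the direction the text predicts.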
The comparative statics result of $\partial s^*/\partial r < 0$ suggests that as the precision of management's private signal decreases, the preferred disclosure policy ex ante leans toward non-disclosure. This prediction is consistent with managers caring about the errors in their earnings forecasts (Beyer, 2009). In the context
4 Our framework is essentially a good news disclosure story, as in Dye (1985) and Jung and Kwon (1988), i.e., good news is disclosed, and bad news is suppressed. A stronger market demand for new information increases the impact that the manager's disclosure has on the share price. As a result, the firm may award the manager more shares in the manager's compensation package to incentivize him to disclose, or the manager may choose to exercise more of his existing options due to the increase in share price in response to disclosure.
of our study, managers may withhold information when their incentives (reducing their forecast errors) are partially misaligned with the goals of analysts, such as in the case of high information uncertainty. Prior theoretical work also suggests that uncertainty prevents full disclosure in equilibrium as managers themselves may lack the information, or lack the ability to predict changes (Verrecchia, 2001).5 In short, the presence of high commonality among analysts increases the likelihood of a forecast being issued to fit analysts' incentives to develop new idiosyncratic information, whereas the existence of high uncertainty may either encourage or inhibit the disclosure because of possible misalignment in the manager's and analysts' incentives (e.g., accurate reporting vs. information seeking). The above discussion leads to two testable empirical predictions:

Hypothesis 2.1 The commonality among analysts' beliefs increases the likelihood of a firm issuing management earnings forecasts.

Hypothesis 2.2 The level of analysts' earnings forecast uncertainty does not influence the likelihood of a firm issuing management earnings forecasts.

2.3 The Econometric Model

This section presents the full econometric model, summarized in Sheng and Thevenot (2012), and derives the constructs of analyst commonality and uncertainty that exist at the time a forecast is made. For N analysts, T target years, and H forecast horizons, let $F_{ith}$ be the h-quarter-ahead earnings forecast made by analyst i for target year t. If $A_t$ is the actual earnings, then analyst i's forecast error $e_{ith}$ can be defined as

$$e_{ith} = A_t - F_{ith}. \quad (2.1)$$
Following Davies and Lahiri (1995), we write $e_{ith}$ as the sum of a common component, $\lambda_{th}$, and an idiosyncratic error, $\varepsilon_{ith}$:

$$e_{ith} = \lambda_{th} + \varepsilon_{ith}, \quad (2.2)$$

5 For example, Dye (1985) suggests that information is withheld because there is doubt about whether the manager is informed or, equivalently, whether the information in question has yet to arrive. Trueman (1986) and Verrecchia (1990) suggest that executives may abstain from disclosure due to lack of confidence in their ability to predict future changes or concerns about the adverse effects of inaccuracies such as increased litigation risk and market volatility.
where

$$\lambda_{th} = \sum_{j=1}^{h} u_{tj}. \quad (2.3)$$
The idiosyncratic errors $\varepsilon_{ith}$ arise from analysts' private information and differences in their information acquisition, processing, interpretation, judgment and forecasting models. The common component $\lambda_{th}$ denotes forecast errors that all analysts would make due to the unpredictable events that affect target earnings and occur from the time the analyst issues a forecast until the end of the time over which target earnings are realized. These shocks could be economy-wide, like the events of September 11, 2001, or firm-specific events, like an unanticipated merger, loss of a major customer or bankruptcy. Equation (2.3) shows that this accumulation of shocks is the sum of each quarterly shock $u_{tj}$ that occurs between the time the analyst releases a forecast and the end of the fiscal period over which earnings are realized. Hence, even if analysts make "perfect" forecasts, i.e., they have perfect private information, the forecast error may still be non-zero due to shocks which are, by nature, unpredictable, but nevertheless affect target earnings. In line with Lahiri and Sheng (2010), we make the following simplifying assumptions:

Assumption 1 (Common Component). $E(u_{tj}) = 0$; $\mathrm{var}(u_{tj}) = \sigma^2_{u_{tj}}$ for any t and j; $E(u_{tj} u_{ts}) = 0$ for any t and $j \neq s$; $E(u_{th} u_{t-k,h}) = 0$ for any t, h and $k \neq 0$.

Assumption 2 (Idiosyncratic Component). $E(\varepsilon_{ith}) = 0$; $\mathrm{var}(\varepsilon_{ith}) = \sigma^2_{\varepsilon_{ih}}$ for any i, t and h; $E(\varepsilon_{ith} \varepsilon_{jth}) = 0$ for any t, h and $i \neq j$.

Assumption 3 (Identification Condition). $E(\varepsilon_{ith} u_{t-k,j}) = 0$ for any i, t, h, k and j.

Assumption 1 implies that the unanticipated shocks are uncorrelated over time and horizons. The idiosyncratic errors are taken to be mutually independent (Assumption 2). In addition, the common component and idiosyncratic disturbances are assumed to be independent (Assumption 3). Taken together, Assumptions 1 to 3 allow the individual forecast error to be decomposed into a common and an idiosyncratic component as specified in equations (2.2) and (2.3).
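Under Assumptions 1-3, the decomposition can be checked with a small Monte Carlo sketch. All parameter values below are illustrative, not estimates from the paper:

```python
import numpy as np

# Simulate forecast errors per (2.2)-(2.3): a common shock component that
# accumulates h quarterly shocks, plus an idiosyncratic error per analyst.
rng = np.random.default_rng(42)
T, N, h = 20000, 5, 4            # target years, analysts, forecast horizon (quarters)
sigma_u, sigma_eps = 0.3, 0.5    # std devs of quarterly shocks and idiosyncratic errors

u = rng.normal(0.0, sigma_u, size=(T, h))      # quarterly shocks u_tj
lam = u.sum(axis=1)                            # common component lambda_th, eq. (2.3)
eps = rng.normal(0.0, sigma_eps, size=(T, N))  # idiosyncratic errors eps_ith
e = lam[:, None] + eps                         # forecast errors e_ith, eq. (2.2)

# Total uncertainty: Var(e_ith) = sigma_lambda^2 + sigma_eps^2, where
# sigma_lambda^2 = h * sigma_u^2 because the h shocks are mutually uncorrelated.
implied_U = h * sigma_u**2 + sigma_eps**2
sample_U = e.var()
```

With these values the implied uncertainty is $4 \times 0.09 + 0.25 = 0.61$, and the sample variance of the simulated errors converges to it as T grows.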
Note that the model structure and assumptions described above are similar to the model of Abarbanell et al. (1995) that
page 112
July 6, 2020
10:15
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
Management Forecasts and Analysts’ Information Demand
b3568-v1-ch02
113
assumes an exogenous individual forecast with two error terms: one common and one idiosyncratic. As in previous research, the observed dispersion among analysts, d_th, is expressed as

d_{th} \equiv \frac{1}{N}\sum_{i=1}^{N} (F_{ith} - F_{\bullet th})^2, \qquad (2.4)
where F_{\bullet th} = \frac{1}{N}\sum_{j=1}^{N} F_{jth} is the mean forecast averaged over analysts. As Lahiri and Sheng (2010) suggest, the uncertainty associated with the forecast of any given analyst is measured by the variance of the individual forecast errors, which, given equations (2.1) and (2.2), can be expressed as the sum of the variances of the two error components:

U_{ith} \equiv \mathrm{Var}(e_{ith}) = \mathrm{Var}(\lambda_{th} + \varepsilon_{ith}) = \sigma^2_{\lambda_{th}} + \sigma^2_{\varepsilon_{ih}}, \qquad (2.5)

where σ²_{λth} = Var(λ_th). An individual analyst's forecast uncertainty in equation (2.5) comprises two components: the uncertainty associated with forthcoming shocks, σ²_{λth}, which is common to all analysts, and the variance of his idiosyncratic error, σ²_{εih}. In line with BKLS, we measure overall forecast uncertainty, U_th, as the average of the individual forecast error variances, which can be interpreted as the uncertainty associated with a typical analyst's forecast. Therefore, U_th can be expressed as:

U_{th} \equiv \frac{1}{N}\sum_{i=1}^{N} U_{ith} = \sigma^2_{\lambda_{th}} + \frac{1}{N}\sum_{i=1}^{N} \sigma^2_{\varepsilon_{ih}}. \qquad (2.6)
Alternatively, in the absence of individual forecast bias, i.e., if E(A_t − F_ith) = 0, U_th equals the expectation of the average squared individual forecast errors.^6 Following Engle (1983), we decompose the average squared individual forecast errors as:

\frac{1}{N}\sum_{i=1}^{N} (A_t - F_{ith})^2 = (A_t - F_{\bullet th})^2 + d_{th}. \qquad (2.7)

^6 Prior research suggests that analysts are optimistically biased (Francis and Philbrick, 1993). More recent studies show that individual forecast bias has decreased over time (Matsumoto, 2002), decreases over the forecast horizon (Richardson, Teoh and Wysocki, 2004), and is more pronounced for firms reporting losses (Brown, 2001). However, as we discuss in Section 4, with the GARCH model estimation any systematic bias in the mean forecast errors is absorbed by the model intercept and eliminated from the estimate of common uncertainty.
Taking expectations on both sides, given all information available at time t − h, including F_ith and d_th, we obtain the following conditional relationship between uncertainty, the variance of the mean forecast errors and observed dispersion:

U_{th} = E(A_t - F_{\bullet th})^2 + d_{th}. \qquad (2.8)

The first term on the right-hand side of equation (2.8) can alternatively be written as (Markowitz, 1959, p. 111):

E(A_t - F_{\bullet th})^2 = \frac{1}{N^2}\sum_{i=1}^{N} E(A_t - F_{ith})^2 + \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j \neq i} E(A_t - F_{ith})(A_t - F_{jth}). \qquad (2.9)
Under Assumption 3, that is, the independence between the common and idiosyncratic error components, equation (2.9) can be expressed as:

E(A_t - F_{\bullet th})^2 = \sigma^2_{\lambda_{th}} + \frac{1}{N^2}\sum_{i=1}^{N} \sigma^2_{\varepsilon_{ih}}. \qquad (2.10)
Note that as the number of forecasters gets large, the second term on the right-hand side approaches zero and the uncertainty about the mean forecast, E(A_t − F_{\bullet th})², reflects only the uncertainty in common information, σ²_{λth}. Substituting equation (2.10) into (2.8), we obtain

U_{th} = \sigma^2_{\lambda_{th}} + d_{th} + \frac{1}{N^2}\sum_{i=1}^{N} \sigma^2_{\varepsilon_{ih}}. \qquad (2.11)
For large values of N, the last term on the right-hand side of equation (2.11) is close to zero and can be ignored. Hence, given the model assumptions and for large values of N, ex ante forecast uncertainty, dispersion and the variance of forthcoming aggregate shocks are related as in the following proposition.

Proposition 1: Suppose Assumptions 1–3 hold and N is large. Then earnings forecast uncertainty can be expressed as:

U_{th} = \sigma^2_{\lambda_{th}} + d_{th}. \qquad (2.12)
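Proposition 1 is easy to verify numerically. The sketch below is a minimal Monte Carlo check, assuming normally distributed shocks and a common idiosyncratic variance across analysts (both assumptions are ours, for illustration only):

```python
import numpy as np

# Monte Carlo check of Proposition 1 under illustrative assumptions:
# normal shocks, equal idiosyncratic variance across analysts.
rng = np.random.default_rng(0)
N = 2000            # analysts; large, so the (1/N^2) term in (2.11) vanishes
T = 5000            # simulated forecast occasions
sigma_lambda = 0.3  # std of the common shock component lambda_th
sigma_eps = 0.5     # std of each idiosyncratic error eps_ith

U_hat, rhs = [], []
for _ in range(T):
    lam = rng.normal(0.0, sigma_lambda)          # common error, shared by all analysts
    eps = rng.normal(0.0, sigma_eps, size=N)     # idiosyncratic errors
    e = lam + eps                                # individual forecast errors e_ith
    d = np.mean((e - e.mean()) ** 2)             # observed dispersion d_th, eq. (2.4)
    U_hat.append(np.mean(e ** 2))                # average squared individual error
    rhs.append(sigma_lambda ** 2 + d)            # sigma_lambda^2 + d_th, eq. (2.12)

print(np.mean(U_hat), np.mean(rhs))   # the two averages nearly coincide
```

With N = 2000 analysts, the 1/N² term in equation (2.11) is negligible, so the average squared individual error and σ²_λ + d agree to the third decimal.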
The proposition shows that the difference between uncertainty and dispersion will be determined partly by the length of the forecast horizon over
which the unanticipated shocks accumulate: the longer the forecast horizon, the bigger the difference on average, because σ²_{λth} > σ²_{λtk} for h > k. It also suggests that the robustness of the relationship between the two depends on the variability of the aggregate shocks over time, i.e., if the time period is volatile, then dispersion will be a noisy measure of uncertainty. Note that Barry and Jennings (1992, p. 173) and BKLS (p. 425) derive a similar relationship between uncertainty and dispersion:

U_{th} = C_{th} + D_{th}, \qquad (2.13)
where D_th is the expected across-analyst dispersion, i.e., D_th ≡ E(d_th), and C_th is the average covariance among forecast errors:

C_{th} = \frac{1}{N(N-1)}\sum_{i=1}^{N}\sum_{j \neq i} \mathrm{Cov}(A_t - F_{ith}, A_t - F_{jth}). \qquad (2.14)
Their result justifies forecast dispersion as one component of forecast uncertainty. Under our framework, we can simplify the expression for the average covariance among individual forecast errors in equation (2.14) as^7

C_{th} = \frac{1}{N(N-1)}\sum_{i=1}^{N}\sum_{j \neq i} E[(\lambda_{th} + \varepsilon_{ith})(\lambda_{th} + \varepsilon_{jth})] = \sigma^2_{\lambda_{th}}, \qquad (2.15)
which can be interpreted as the uncertainty shared by all forecasters owing to their exposure to common unpredictable shocks. Thus, the added structure we impose leads to equation (2.15), which greatly simplifies the results in Barry and Jennings (1992) and BKLS.

2.4 Research Design

Based on the general model described above, BKLS provides a direct empirical estimate of analysts' overall uncertainty (V) and an estimate of the proportion of analysts' information that is common (ρ) using observable features of analysts' forecasts:

\rho_{it} = \frac{SE_{it} - \frac{D_{it}}{N}}{SE_{it} + \left(1 - \frac{1}{N}\right) D_{it}}, \qquad (2.16)

V_{it} = SE_{it} + \left(1 - \frac{1}{N}\right) D_{it}, \qquad (2.17)

^7 Note that E(ε_ith ε_jth) = 0 for any t, h and i ≠ j (Assumption 2) and E(ε_ith λ_{t−k,j}) = 0 for any i, t, h, k and j (Assumption 3).
where ρ is the commonality in analysts' beliefs, measured as the expected covariance of the errors in individual forecasts; V is the uncertainty in the information conveyed by analysts' forecasts, measured as the expected variance of the errors in individual forecasts; SE is the expected squared error of the mean forecast; D is the expected forecast dispersion; and N is the number of analysts. The BKLS commonality measure can also be expressed as the ratio of the precision of analysts' common information to the precision of their total information, h/(h + s), where h and s are the precisions of the common and idiosyncratic information, respectively. As with the estimation of ρ, the estimation of h and s is based on observable features of analysts' forecasts:

h_{it} = \frac{SE_{it} - \frac{D_{it}}{N}}{\left[SE_{it} + \left(1 - \frac{1}{N}\right) D_{it}\right]^2} \qquad (2.18)

and

s_{it} = \frac{D_{it}}{\left[SE_{it} + \left(1 - \frac{1}{N}\right) D_{it}\right]^2}. \qquad (2.19)
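Once SE, D and N are in hand, equations (2.16)–(2.19) are simple to compute. A minimal sketch (the function name and the illustrative inputs are ours):

```python
def bkls_measures(se, d, n):
    """BKLS information-environment proxies from the expected squared error of
    the mean forecast (se), expected dispersion (d) and analyst count (n),
    following equations (2.16)-(2.19)."""
    v = se + (1.0 - 1.0 / n) * d      # overall uncertainty, eq. (2.17)
    rho = (se - d / n) / v            # commonality, eq. (2.16)
    h = (se - d / n) / v ** 2         # precision of common information, eq. (2.18)
    s = d / v ** 2                    # precision of idiosyncratic information, eq. (2.19)
    return rho, v, h, s

rho, v, h, s = bkls_measures(se=0.04, d=0.02, n=10)
print(rho, v, h, s)
```

Two internal consistency checks follow directly from the definitions: ρ = h/(h + s) and h + s = 1/V, i.e., total precision is the reciprocal of overall uncertainty.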
BKLS suggests that one can use observed dispersion and the mean squared error as proxies for D_it and SE_it to empirically estimate the constructs in equations (2.1) through (2.4). Theoretically, however, these are ex ante constructs, attached to a forecast before the actual earnings are known; hence, they must be constructed using data available to analysts at the time the forecasts are issued. Our model and the proxies defined in equations (2.16)–(2.18) suggest that the information environment is a function of dispersion and the variance of accumulated aggregate shocks. While the estimation of dispersion is straightforward and is based on ex ante information, i.e., information available prior to the revelation of actual earnings, estimating the variance of aggregate shocks empirically poses a problem, because some periods are likely to be more volatile than others and volatile periods tend to cluster.

To deal with these problems, Engle (1982) developed the celebrated Autoregressive Conditional Heteroskedasticity (ARCH) model, which Bollerslev (1986) generalized to form GARCH. These models can be used to estimate volatility conditional on historical data and are therefore suitable for our purpose. The method is now a standard approach for modeling different types of uncertainty in economics and finance (e.g., Batchelor and Dua, 1993; Giordani and Söderlind, 2003) and was introduced to accounting by Sheng and Thevenot (2012). In this setting, the GARCH model assumes that the variability of common forecast errors depends on past forecast errors and lagged earnings forecast uncertainty. The method uses the time-series of
Figure 2.1: Timeline. EA_{t−1} and EA_t mark the quarter t − 1 and quarter t earnings announcements. Uncertainty_{t−1} and Consensus_{t−1} are measured over the 90 days ending at EA_{t−1} (Quarter_{t−1}); Guidance_t is measured over the 30 days following EA_{t−1}.
mean analyst forecast errors, in which any idiosyncratic errors are expected to average out, to provide an estimate of the variance of common errors. We estimate a simple GARCH(1, 1) model and generate the conditional variance, σ̂²_{λth}, which is then used as an estimate of SE_it in the proxies above. This procedure provides a stable, reliable and comprehensive estimate of the analyst information environment that can be used in settings where other estimates cannot, such as when firms' operations are affected by significant unanticipated events like 9/11, bankruptcy or large restructuring charges, and when the construct of interest is the change in the information environment (Sheng and Thevenot, 2012).

We examine the association between pre-disclosure commonality and uncertainty among analysts and management's decision to issue future earnings guidance. Since we are interested in how managers respond to their firms' information environment, we first discuss the timing of the variable measurement. As Figure 2.1 illustrates, we measure our variables of interest sequentially. The information environment variables are obtained using analyst forecasts of the current quarter's earnings issued in the 90 days prior to the quarter t − 1 earnings announcement; this window excludes the day of the announcement. Guidance is measured using management forecasts issued between the announcement of quarter t − 1 earnings and the 30 days following it, where the day of the announcement is included in this window because managers often bundle their forecasts with the earnings announcement.^8 Our goal is to ascertain whether the issuance of guidance is a response to the prior information in analyst forecasts; including the announcement day is therefore a limitation of our study design, as one could alternatively view the issuance of guidance as partially a response to the earnings news. Our primary results, however, are not attributable to this aspect of our design, because we obtain consistent results when we restrict our sample to guidance issued more than 5 days after the quarter t − 1 earnings announcement.

^8 The target period for guidance is not restricted, but in a robustness check we restrict guidance to be for quarter t only and find similar results.
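The GARCH(1,1) variance recursion used to generate σ̂²_{λth} can be sketched as follows. In practice the parameters (ω, α, β) would be estimated by maximum likelihood (e.g., with a package such as `arch`), so the values below are purely hypothetical, as is the choice to initialize at the sample variance:

```python
import numpy as np

def garch11_variance(errors, omega, alpha, beta):
    """GARCH(1,1) conditional variance recursion:
    sigma2_t = omega + alpha * e_{t-1}^2 + beta * sigma2_{t-1},
    applied to the time series of mean analyst forecast errors."""
    sigma2 = np.empty(len(errors))
    sigma2[0] = errors.var()                  # start at the sample variance
    for t in range(1, len(errors)):
        sigma2[t] = omega + alpha * errors[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Hypothetical parameters; with alpha + beta < 1 the process is stationary
# and the long-run variance is omega / (1 - alpha - beta).
rng = np.random.default_rng(0)
e = rng.normal(0.0, 1.0, size=200)            # stand-in for mean forecast errors
sig2 = garch11_variance(e, omega=0.1, alpha=0.1, beta=0.8)
print(sig2[-1])
```

Because the recursion conditions only on past errors, the resulting σ̂²_{λth} is an ex ante quantity, which is what the proxies above require.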
To test our hypotheses, we estimate the following regression model:

Guide_{it} = \beta_0 + \beta_1 Uncert_{it-1} + \beta_2 Comm_{it-1} + \sum_{k} \beta_k X^k_{it-1} + e_{it}, \qquad (2.20)
where Guide equals one if a firm issues a forecast in the 30 days following the announcement of quarter t − 1 earnings, and zero otherwise; Uncert and Comm are uncertainty and commonality, respectively, measured prior to the announcement of quarter t − 1 earnings; and X^k represents a vector of k control variables including PriorGuide, Prior8Guide, Assets, BM, FourthQ, Optimism, EPSVolat, Return, Loss, FSE, Following, Litigation, Restat and News, which are defined below. The control variables are measured as of the end of quarter t − 1.

We follow prior guidance research and control for other factors that may affect management's forecasting behavior. Extant studies show that guidance behavior is "sticky", and we include two variables intended to control for the firm's guidance history: PriorGuide is equal to one if the firm issued guidance in the previous quarter, and zero otherwise, and Prior8Guide is equal to the number of quarters from t − 8 to t − 1 during which the firm issued guidance. We include Assets, the amount of total assets as of the end of quarter t − 1, because larger firms are more likely to issue guidance. BM, the firm's book value of equity divided by its market value of equity at the end of quarter t − 1, controls for the effect of value versus growth firms. FourthQ is an indicator variable equal to one if quarter t − 1 is the fourth quarter, because managers may be more responsive to unfavorable information environment characteristics concerning the fourth quarter. We also explicitly control for analyst following: Following is the number of analysts following the firm during quarter t − 1, included because firms with a larger analyst following are more likely to provide management guidance (Ajinkya, Bhojraj, and Sengupta, 2005).
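Since Guide is binary, a model like equation (2.20) would typically be estimated as a logit or probit; the chapter does not spell out the link function at this point, so the following is a bare-bones Newton-Raphson logit on simulated data (all names, coefficients and the data-generating process are hypothetical):

```python
import numpy as np

def fit_logit(X, y, iters=25):
    """Newton-Raphson maximum likelihood for a logit model
    P(y=1 | X) = 1 / (1 + exp(-X @ b)); X includes an intercept column."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        grad = X.T @ (y - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        b += np.linalg.solve(hess, grad)
    return b

# Hypothetical data: guidance driven by lagged uncertainty and commonality.
rng = np.random.default_rng(1)
n = 20000
uncert = rng.uniform(0.0, 1.0, n)
comm = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), uncert, comm])
true_b = np.array([-1.0, 1.5, -0.5])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ true_b))).astype(float)
b_hat = fit_logit(X, y)
print(b_hat)   # estimates close to true_b
```

In an actual application one would add the control vector, industry and year fixed effects, and firm-clustered standard errors, e.g., via statsmodels' `Logit` with a cluster-robust covariance.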
We also include controls for earnings volatility (EPSVolat), prior returns (Return), the presence of prior losses (Loss), the extent to which managers have failed to meet analysts' prior expectations (FSE), litigation risk (Litigation), and the incidence of restatement (Restat) (Brochet, Faurel, and McVay, 2011).^9 EPSVolat is equal to the standard deviation of quarterly earnings per share over quarters t − 8 through t − 1 and is included because firms with more volatile earnings are less likely to issue guidance (Waymire, 1985). Return is equal to the cumulative size-adjusted return over quarter t − 1, and Loss is the percentage of quarters during which the firm reported negative earnings over quarters t − 8 to t − 1. These variables control for performance, as Miller (2002) shows that firms with good performance are more likely to issue guidance. FSE is the percentage of quarters during which the firm failed to meet the consensus analyst forecast upon announcement of quarterly earnings over t − 4 to t − 1 and controls for the possibility that firms with historically disappointing results are less likely to issue guidance (Feng and Koch, 2010). Litigation is equal to one if the firm operates within a high-litigation-risk industry, and zero otherwise. Restat is equal to one if the firm announces a restatement during quarters t − 1 and t, and zero otherwise. Litigation and Restat are included to control for managers' incentives or disincentives to provide guidance when their firms are affected by such uncertain events. Finally, News is equal to the difference between actual earnings and the analyst consensus forecast of quarter t earnings issued in the 30 days following the announcement of quarter t − 1 earnings, where at least two analysts provide a forecast in this period, scaled by the absolute value of actual earnings. We include this variable because managers often issue forecasts to preempt bad news and avoid legal repercussions (Skinner, 1994; Kasznik and Lev, 1995).

^9 Brochet et al. (2011) also include an indicator variable for restructuring. We do not control for restructuring in our main analysis because restructuring data are available starting in 2001, which decreases our sample. However, our results are robust to including an indicator variable for restructuring, where Restruct is equal to one if the firm reports restructuring charges in quarter t − 1, and to restricting our sample to the later years.

Since many of our continuous variables are skewed and have outlying observations, we use ranked variables. In each year, firm/quarters are assigned a decile rank based on the continuous variables, i.e., Uncert, Comm, Prior8Guide, Assets, BM, EPSVolat, Return, FSE, Following and News.
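The exact ranking formula is not given; a common implementation assigns within-year decile ranks 0–9 and divides by 9 so the ranked variable lies in [0, 1]. The function below is our hypothetical sketch and ignores ties:

```python
import numpy as np

def decile_rank_scaled(x):
    """Assign within-cross-section decile ranks (0-9) and scale them to [0, 1]."""
    rank = np.argsort(np.argsort(x))           # 0 .. n-1; smallest value gets rank 0
    decile = np.floor(rank * 10 / len(x))      # collapse ranks into deciles 0 .. 9
    return decile / 9.0

# Applied year by year to a skewed variable such as Assets:
assets = np.array([120.0, 5.0, 90000.0, 14.0, 33.0, 2500.0, 47.0, 310.0, 2.0, 880.0])
print(decile_rank_scaled(assets))
```

Ranking of this kind discards the extreme magnitudes that drive the skewness while preserving the ordering of the observations within each year.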
The decile ranks are scaled to [0, 1] and used in place of the respective continuous independent variables in the regressions; an "R" at the end of a variable name indicates the ranked variable.^10 In addition, we follow prior guidance research and include industry and year fixed effects, and we cluster standard errors by firm.

^10 Our results are qualitatively similar if we use the raw variables or the natural logarithm transformation of the variables that take only positive values, such as Uncert, Comm, Prior8Guide, Assets, BM, EPSVolat, Loss, FSE and Following.

2.5 Sample and Descriptive Statistics

2.5.1 Sample

Table 2.1 summarizes our sample selection procedure. Our initial sample includes all US firms with quarterly forecasts in the I/B/E/S Detail tape
Table 2.1: Sample selection.

                                                         Firm/Quarters    Firms
Sample with at least 40 consecutive quarters
  in period 1983–2010                                         36,919        636
Less firm/quarters prior to 1997                              17,442         18
Initial sample                                                19,477        618
Less firm/quarters with unavailable data
  for control variables                                        4,054         51
Final sample                                                  15,423        567

                                                         Firm/Quarters  Percent
Initial sample
  Firm/Quarters where Guide = 1                                5,621      28.86
  Firm/Quarters where Guide = 0                               13,853      71.14
Final sample
  Firm/Quarters where Guide = 1                                4,105      26.62
  Firm/Quarters where Guide = 0                               11,318      73.38

Notes: Our sample includes all US firms with quarterly forecasts in the I/B/E/S Detail tape for the period 1983–2010 with at least two analyst forecasts in the 90 days prior to the quarterly earnings announcement.
for the period 1983–2010. For the purposes of obtaining the measures of interest in this paper, we use analyst forecasts made in the 90 days prior to the quarterly earnings announcement, where the earnings announcement is made within 90 days of the quarter end. If an analyst issues multiple forecasts in this period, we retain only the forecast closest to the earnings announcement date. In order to calculate our information environment variables, we require at least two forecasts in each firm/quarter and a minimum of 40 consecutive quarters of observations in the 1983–2010 period. This yields a total of 36,919 firm/quarter observations from 636 unique firms. Further, we eliminate observations prior to 1997, because guidance data on the First Call Company Issued Guidelines file begin in 1995 and we need up to eight quarters prior to each firm/quarter to calculate the controls for previous guidance; in addition, restatement data are available starting in 1997. We also eliminate observations where commonality is less than zero, which represent measurement errors. This leaves us with 19,319 firm/quarter observations in the period 1997–2010 from 618 unique firms, which constitute the sample used for our initial analysis of the relationships of dispersion with uncertainty and commonality. We further eliminate 4,054 observations from 51 firms with unavailable data for our control variables. Our final sample
includes 15,423 firm/quarter observations from 567 unique firms, with managers issuing guidance in approximately 27% of the firm/quarters.

2.5.2 Descriptive statistics

The descriptive statistics for the variables used in our empirical analyses are presented in Table 2.2. Panel A displays results for the full sample, and Panel B shows statistics by the presence of guidance and reports tests of the differences between the means and medians of the subsamples. Panel A shows that our sample comprises large and heavily followed firms, which trade at a substantial premium over book value. Based on the summary statistics for commonality, it appears that analysts following our sample firms rely more on common, rather than idiosyncratic, information. The summary statistics for uncertainty, dispersion and information precision show large standard deviations and skewness, supporting the use of ranked, rather than raw, variables. Further, Panel B of Table 2.2 shows that dispersion, uncertainty, commonality and information precision are significantly higher in no-guidance quarters than in guidance quarters. Firms that guide less frequently are bigger, have higher book-to-market ratios and more volatile prior earnings, and are more likely to report losses and miss analyst forecasts. On the other hand, guiders are more likely to face higher litigation risk or be involved in restatements. Overall, the evidence is generally consistent with prior research, and our research design explicitly controls for the differences between guiding and non-guiding firms. Table 2.3 provides Pearson and Spearman correlation coefficients between the variables used in the regression analysis. Most correlation coefficients are significant at the 5% level. Uncertainty and commonality are positively correlated, suggesting that when uncertainty is high, analysts likely rely more on common, rather than idiosyncratic, information.
Public and private information precisions are negatively related to both uncertainty and commonality, indicating that analysts' information tends to be imprecise when they are highly uncertain. Moreover, analysts tend to have higher commonality when they have imprecise information. Dispersion is strongly positively related to analysts' uncertainty but strongly negatively related to the commonality in analysts' information. These preliminary findings suggest that, if dispersion is included in a model from which uncertainty and commonality are excluded while being correlated with the dependent variable, then the coefficient on dispersion will be biased. The direction of the bias, assuming no other variables are considered, will be driven by how uncertainty and commonality relate to
Table 2.2: Descriptive statistics.

Panel A: Full Sample (N = 15,423)

Variable           Mean        Std Dev        Q1        Median        Q3
Uncert             0.089          1.634     0.001         0.002     0.010
Comm               0.784          0.219     0.688         0.860     0.951
PublicPrec     1,612.460      6,062.890    70.812       296.604 1,208.150
PrivatePrec      517.167      3,553.150     8.124        51.466   266.803
Disp               0.010          0.201     0.000         0.000     0.001
PriorGuide         0.305          0.460     0.000         0.000     1.000
Prior8Guide        2.411          3.032     0.000         1.000     5.000
Assets        42,214.630    153,984.850 2,576.750     7,700.320 24,639.000
BM                 0.464          0.366     0.246         0.400     0.585
FourthQ            0.238          0.426     0.000         0.000     0.000
EPSVolat           0.511          0.789     0.145         0.270     0.556
Return             0.014          0.193    −0.092         0.001     0.101
Loss               0.103          0.184     0.000         0.000     0.125
FSE                0.259          0.267     0.000         0.250     0.500
Following         16.167          6.476    11.000        15.000    20.000
Litigation         0.176          0.381     0.000         0.000     0.000
Restat             0.011          0.106     0.000         0.000     0.000
News              −0.067        142.980    −0.044         0.021     0.110

Panel B: By Presence of Guidance

Guide = 1 (N = 4,105)

Variable            Q1        Median          Q3
Uncert           0.001         0.002       0.005
Comm             0.668         0.848       0.944
PublicPrec     134.199       409.006   1,310.760
PrivatePrec     16.094        83.999     335.121
Disp             0.000         0.000       0.001
PriorGuide       1.000         1.000       1.000
Prior8Guide      5.000         8.000       8.000
Assets           3,225         9,878      27,288
BM               0.238         0.357       0.534
FourthQ          0.000         0.000       0.000
EPSVolat         0.136         0.236       0.472
Return          −0.076         0.005       0.096
Loss             0.000         0.000       0.125
FSE              0.000         0.000       0.250
Following       13.000        17.000      22.000
Litigation       0.000         0.000       1.000

Guide = 0 (N = 11,318)

Variable          Mean       Std Dev        Q1       Median          Q3
Uncert           0.113         1.905     0.001        0.003       0.012
Comm             0.789         0.217     0.697        0.865       0.953
PublicPrec   1,729.750     6,770.640    57.962      254.439   1,148.980
PrivatePrec    543.250     4,062.280     6.533       41.773     241.912
Disp             0.012         0.232     0.000        0.000       0.001
PriorGuide       0.137         0.343     0.000        0.000       0.000
Prior8Guide      0.987         1.737     0.000        0.000       1.000
Assets          46,413       172,566     2,405        7,076      23,607
BM               0.482         0.391     0.251        0.416       0.603
FourthQ          0.236         0.425     0.000        0.000       0.000
EPSVolat         0.547         0.855     0.149        0.282       0.592
Return           0.013         0.203    −0.098       −0.001       0.103
Loss             0.111         0.189     0.000        0.000       0.125
FSE              0.290         0.276     0.000        0.250       0.500
Following       15.602         6.388    11.000       15.000      20.000
Litigation       0.141         0.348     0.000        0.000       0.000

Notes: t-test and Wilcoxon rank sum test columns compare the Guide = 1 and Guide = 0 subsamples.
(7A.12)
Let x₄ = α(s) stand for the implicit function determined by P(x₄) = 0. Taking the derivative of both sides of P(x₄) = 0 with respect to s yields

\frac{\partial \alpha(s)}{\partial s} = \frac{3s^2}{P'(x_4)}\left[\phi - \left(\frac{x_4}{1-s}\right)^4\right]. \qquad (7A.13)

Setting the right-hand side of equation (7A.13) equal to zero yields

\alpha(s) = (1-s)\,\phi^{1/4}. \qquad (7A.14)

Taking the derivative of both sides of equation (7A.13) and using equation (7A.14) yields

\frac{\partial^2 \alpha(s)}{\partial s^2} = \frac{12 s^2 \phi}{(1-s)\, P'(x_4)}. \qquad (7A.15)

Equation (7A.15) implies that the necessary and sufficient condition for x₄ = α(s) to have a maximum (R̄_{4C2} reaches its maximum simultaneously as x₄ = α(s) reaches its maximum) is φ > 0. For positively skewed asset prices, κ > 0, and κ > 0 is sufficient for φ > 0 but not necessary. When φ > 0, the negativeness of equation (7A.15) shows that x₄ = α(s) reaches its maximum at the s determined by equation (7A.14). Equation (7A.14) is a function of s = F(K); it determines s as a function of φ. Therefore, the optimal s = F(K), which yields the maximum x₄, can be written as F(K) = G(μ, δ, η, K).

E(Y^4; v, \lambda) = (v+\lambda)^6 + 18(v+\lambda)^4(v+2\lambda) + 4(v+\lambda)^3[16(v+3\lambda) + 3(v+2\lambda)] + (v+\lambda)^2[432(v+4\lambda) + 96(v+3\lambda) + 12(v+2\lambda) + 48(v+2\lambda)^2] + 640[(v+3\lambda)^2 + (v+2\lambda)^2] + 1440(v+\lambda)(v+4\lambda) + 288(v+\lambda)(v+2\lambda)(v+3\lambda) + 48(v+\lambda)(v+3\lambda) + 2304(v+\lambda)(v+5\lambda) + 288(v+2\lambda)(v+3\lambda) + 288(v+\lambda)(v+4\lambda) + 72(v+\lambda)(v+2\lambda)^2 + 3840(v+6\lambda).
Chapter 8
Measuring the Collective Correlation of a Large Number of Stocks

Wei-Fang Niu and Henry Horng-Shing Lu

Contents
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
8.2 Random Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
8.3 Constructing Market Network and Measuring Collective Correlation . . 340
    8.3.1 Constructing market network . . . . . . . . . . . . . . . . 341
    8.3.2 Properties of the market network . . . . . . . . . . . . . . 342
    8.3.3 Measuring collective correlation of stock market . . . . . . 345
8.4 Empirical Investigations . . . . . . . . . . . . . . . . . . . . . 346
8.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Abstract
Market makers, or liquidity providers, play a central role in the operation of stock markets. In general, these agents execute contrarian strategies, so their profitability depends on the distribution of stock returns across the market: the more widespread the distribution, the more arbitrage opportunities are available. This implies that the collective correlation of stocks is an indicator of possible turmoil in the market. This paper proposes a novel approach to measuring the collective correlation of the stock market, using the network as a tool for extracting information. The market network can be easily
Wei-Fang Niu
National Chiao Tung University
e-mail: [email protected]

Henry Horng-Shing Lu
National Chiao Tung University
e-mail: [email protected]
constructed by digitizing pairwise correlations. When the number of stocks becomes very large, the network can be approximated by an exponential random graph model, under which the clustering coefficient of the market network is a natural candidate for measuring the collective correlation of the stock market. With a sample of S&P 500 components from January 1996 to August 2009, we show that the clustering coefficient can be used as an alternative risk measure in addition to volatility. Furthermore, investigations of higher-order statistics also reveal distinctions in the clustering effect between bear markets and bull markets.

Keywords: Collective correlation • Correlation breakdown • Dimension reduction • Partition function • Random graph.
8.1 Introduction

It is relatively easy to measure the volatility of a stock market, and volatility modeling has made significant progress in the past decades. In contrast, relatively few stylized facts about correlation are commonly recognized. Especially when a large number of assets is considered, it seems hard to describe their correlation structure. This paper proposes a network-based approach to measuring the collective correlation of the market. The measure is statistically stable and incorporates the clustering effect in the market. Empirical investigation shows its potential as a risk measure akin to the volatility index.

The most important feature of correlation is that it increases during crises, a phenomenon usually referred to as correlation breakdown. Over the years, many articles have addressed this issue, for example Karolyi and Stulz (1996), Longin and Solnik (2001), Forbes and Rigobon (2002), Ang and Chen (2002), Campbell et al. (2002), and Dungey and Martin (2007). However, more recent research by Campbell et al. (2007) attributes the phenomenon to fat tails. Beyond the bull/bear market scheme, Becker and Schmidt (2010) investigated some stock pairs and showed that the correlations are either constant or increasing in bull markets.

In fact, correlations of asset prices have long been known to be time-varying. The Dynamic Conditional Correlation (DCC) model proposed by Engle (2002) and many of its variants are commonly used in academia and industry for forecasting and filtering correlations. So it does not seem too ambitious to ask questions like: Is there some functional relation between correlations and volatility in a stock market? To answer such a question, a measure of collective correlation must be built first. But what should it be?
The relation between correlations and market crashes provides the key. Some researchers have suggested that large market crashes can be explained by the behavior of large participants, for example Gabaix et al. (2003) and Khandani and Lo (2007). That is, position changes of large participants may trigger a large market crash if all stocks are tightly bound; under this view, tight integration of the market should be observed prior to the crash. On the other hand, recent theoretical work on the mechanism of market crashes has concentrated on the role of liquidity (Brunnermeier and Pedersen, 2009; Huang and Wang, 2009). In particular, Brunnermeier and Pedersen analyzed the links between an asset's market liquidity and traders' funding liquidity, and suggested that liquidity providers' funding ability is a critical driving force behind liquidity effects. Since liquidity providers basically execute contrarian trading strategies, their profitability and risk depend heavily on the distribution of asset returns (Conrad and Kaul, 1998). So the degree of integration of the market is closely related to liquidity providers' funding and thus to the possibility of a market crash.

Simply summarizing pairwise correlations is therefore not enough for a collective correlation measure. Basically, the measure should be statistically stable even with a large number of stocks and short time series. Most importantly, it should provide information about the tightness of the market, or the deviation of the correlations from their "usual" level. In addition, it should ideally capture the clustering effect that commonly exists in financial markets. Intuitively, analyzing the correlation matrix will help in gathering such information. However, as pointed out by Laloux et al. (1999), correlation matrices calculated from a large number of asset price returns are mainly composed of noise.
Thus, dimension reduction is essential in dealing with the correlations. Araújo and Louçã (2007) proposed a procedure to denoise the correlation matrix, determine the number of effective dimensions, and then compute the eigenvalues in the projected space as an index of market integration. They found that unusually high levels of the market structure index generally corresponded to major market crashes. The eigenvalue approach over the (denoised) correlation matrix provides a macroscopic view of the stock market: it indexes simultaneously the extent and the amplitude of correlations for a large number of stocks. However, it provides fewer insights into the microscopic phenomenon of how stocks interact with each other as they move closer toward a market crash. An alternative approach is to digitize all the correlations with respect to some threshold; a network, or an undirected graph, can then be obtained
through it. That is, by treating each stock as a vertex, an edge is established when two stocks are significantly correlated. Emmert-Streib and Dehmer (2010) proposed using the graph edit distance to quantify the changes between the networks of two consecutive periods, and found that abrupt changes in the graph edit distance correspond to market crashes. Peralta (2015) also studied undirected stock networks and suggested that some network-based measures could be leading indicators of distress in the stock market. In this context, researchers from various areas have proposed network-based portfolio selection and trading strategies, for example Peralta and Zareei (2016), Lee et al. (2018), Wen et al. (2018) and Zhao et al. (2018).

On the other hand, random graph theory may help with extracting information. When the number of stocks is large, the network may be approximated by an exponential random graph model (ERGM). The clustering coefficient, which represents the power to forecast the existence of edges from the local structure of the network, thus becomes a natural choice for aggregating correlations. In addition, the dispersion of the degree distribution is also important, since it reflects the tendency toward centralization in the structure of the market.

In this paper, we illustrate the network approach with a sample from the constituents of the S&P 500 index between 1996 and 2009. A simple monotonic relation between realized volatilities of the index and the clustering coefficient of the market network is easily observed. Furthermore, we find that the market usually tends to have some core stocks correlated with large proportions of the other stocks during calm periods, whereas the whole market looks homogeneous while volatility is high.

The rest of the study is organized as follows. Section 8.2 briefly introduces preliminary knowledge about random graphs. Section 8.3 discusses the construction and properties of the market network.
Section 8.4 provides empirical investigations and Section 8.5 presents concluding remarks.
8.2 Random Graph

A network or a graph G is composed of an ordered pair (V, E), where V is a set of vertices and E is a set of edges that join pairs of vertices. A graph can be either directed or undirected; in this paper, only undirected graphs are considered. A graph representation for Granger causality of multivariate time series can be found in Eichler (2007). In practical applications, each vertex represents a subject or a random variable, and an edge reveals a certain dependency relationship between each
pair of vertices. The network approach thus provides an easy way to visualize the correlation structure of a large-scale complex system.

Two vertices are neighbors if they share a common edge. The degree of a vertex is the number of edges it has. A graph is complete if any pair of its vertices is connected. A clique is a subset of vertices that induces a complete subgraph. A k-star refers to a subset in which k vertices are simultaneously neighbors of one specific vertex. A triangle corresponds to three vertices that are neighbors of each other.

A random network has edges that are randomly produced. The simplest random network is the Erdős–Rényi model, in which each edge is present independently of all other edges with a constant probability p. Thus, the probability for any vertex to have degree x is

P(X = x) = \binom{N-1}{x} p^x (1-p)^{N-1-x} \approx \frac{\lambda^x e^{-\lambda}}{x!},   (8.1)

where λ = (N − 1)p and the approximation relies on a large N.

When the edges are not independent, the probability function for the graph can be quite complex. A dependence graph D on the edge set E may help specify the dependence structure. Its vertex set consists of all pairs \{i, j\} \in \binom{V}{2}, and the edge between E_{ij} and E_{lm} exists when they are dependent conditional on the rest of G. Clearly, the dependence graph is a nonrandom graph. By the Hammersley–Clifford theorem, the probability function of a random graph G can be characterized as

\Pr(G) \propto \exp\Big( \sum_{A \subseteq G} \alpha_A \Big),   (8.2)

where α_A is some constant when A is a clique of D and equal to 0 otherwise. A Markov graph has a dependence graph D that contains no edge between disjoint E_{ij} and E_{uv} for distinct i, j, u, v. A random graph satisfying the Markov condition is a Markov random graph. By adding a homogeneity condition that the probability function Pr(G) is the same for all isomorphic graphs G, Frank and Strauss (1986) showed that the probability for any homogeneous undirected Markov graph can be reduced to

\Pr(G) \propto \exp\Big( \alpha_0 T + \sum_k \alpha_k S_k \Big),   (8.3)

where T is the number of triangles, S1 is the number of edges and Sk is the number of k-stars. This family of graph models is usually referred to as exponential family random graph (ERGM) or p* models.
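As a quick numerical check of the Poisson approximation in (8.1), the following sketch (standard library only; the values of N and p are illustrative) simulates an Erdős–Rényi graph and compares the empirical degree distribution with the Poisson mass at λ = (N − 1)p.

```python
import math
import random

def erdos_renyi_degrees(n, p, rng):
    """Sample an Erdos-Renyi graph G(n, p); return its degree sequence."""
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # each edge present independently with prob. p
                deg[i] += 1
                deg[j] += 1
    return deg

def poisson_pmf(lam, x):
    """Poisson approximation to the Binomial(N-1, p) degree distribution."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

rng = random.Random(0)
n, p = 500, 0.01                       # illustrative: lambda = 499 * 0.01 ~ 5
deg = erdos_renyi_degrees(n, p, rng)
lam = (n - 1) * p
mean_deg = sum(deg) / n                # should be close to lambda
frac_deg5 = sum(1 for d in deg if d == 5) / n
print(mean_deg, lam, frac_deg5, poisson_pmf(lam, 5))
```

With these settings the empirical mean degree lands near λ ≈ 5 and the fraction of vertices with degree 5 is close to the Poisson mass at 5.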
Note that a higher order component inevitably produces certain amounts of lower order components. For example, one triangle contains three 2-stars, and one k-star contains k (k − 1)-stars. In addition, the information set given by the numbers of edges and k-stars is exactly equivalent to that given by the numbers of vertices with degree x, x = 0, 1, . . . , N − 1. Thus, the ERGM (8.3) may take a more parsimonious form by incorporating some higher order statistics. Two commonly used ones are the alternating k-stars of Snijders et al. (2006),

u_\lambda = \sum_{k=2}^{N-1} (-1)^k \frac{S_k}{\lambda^{k-2}},

and the exponentially weighted degree (EWD) statistic of Hunter (2007),

u_\phi = e^{\phi} \sum_{k=1}^{N-1} \left( 1 - (1 - e^{-\phi})^k \right) D_k.

By observing the relation

S_k = \sum_{l=2}^{N-1} \binom{l}{k} D_l,

it is easy to see that the alternating k-stars statistic together with the number of edges is equivalent to the EWD. These statistics also suggest that the dispersion of the degree distribution is an important characteristic of the system.

In addition, another interesting property of a network is transitivity, or clustering. Consider three vertices Vi, Vj and Vk in a network. If Vj and Vk are simultaneously connected to Vi, then it is well expected that Vj and Vk are connected to each other too. So a clustering coefficient is defined as

C_{all} = \frac{3 \cdot \text{number of triangles}}{\text{number of 2-stars}},   (8.4)

which represents the probability for such an event to be realized. Among all graph models available, the ERGM gives good intuition and rationale for this coefficient.
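The quantities entering (8.4), together with the relation between k-stars and the degree counts Dl, can be computed directly from an adjacency matrix. A minimal sketch on an illustrative 5-vertex graph (a triangle with a pendant path):

```python
from itertools import combinations
from math import comb

def graph_stats(adj):
    """Return (#edges, #2-stars, #triangles, degree sequence) of an undirected graph."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    edges = sum(deg) // 2
    two_stars = sum(comb(d, 2) for d in deg)   # S_2: pairs of neighbors of each vertex
    triangles = sum(1 for i, j, k in combinations(range(n), 3)
                    if adj[i][j] and adj[j][k] and adj[i][k])
    return edges, two_stars, triangles, deg

# Illustrative graph: triangle (0,1,2) plus the path 2-3-4.
adj = [[0, 1, 1, 0, 0],
       [1, 0, 1, 0, 0],
       [1, 1, 0, 1, 0],
       [0, 0, 1, 0, 1],
       [0, 0, 0, 1, 0]]
edges, two_stars, triangles, deg = graph_stats(adj)
c_all = 3 * triangles / two_stars              # clustering coefficient (8.4)

# Check S_k = sum_l C(l, k) D_l for k = 2, with D_l = #vertices of degree l.
D = [deg.count(l) for l in range(len(adj))]
s2_from_degrees = sum(comb(l, 2) * D[l] for l in range(len(adj)))
print(edges, two_stars, triangles, c_all, s2_from_degrees)
```

For this graph there is one triangle and six 2-stars, so C_all = 3/6 = 0.5, and the degree-based count reproduces S_2 = 6.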
8.3 Constructing Market Network and Measuring Collective Correlation

Some candidates can be considered for measuring collective correlation. An intuitive one is the mean or median of pairwise correlations, even though these quantities are correlated as a whole. Another possibility is to find the largest several (nonlinear) principal components and evaluate their proportion of the total variation.
However, in evaluating these measures, some characteristics of correlations between assets should be taken into consideration. First, correlations are time varying, and overly long time series may lead to over-smoothing and will not provide adaptive information. This implies that principal components or their nonlinear variants may not be good candidates when the number of assets is large and the correlation matrix becomes singular. Second, the clustering effect generally exists in financial markets. Consider an extreme condition under which the correlation coefficient is ρ0 > 0 for all pairs within the same cluster and 0 for pairs between clusters. Clearly, under such a situation the mean or median will be some constant distinct from both ρ0 and 0 and may mislead our understanding of the structure of the market.

An alternative approach is to digitize all correlation coefficients. That is, we simply identify those pairs with significant comovements instead of specifying all coefficients precisely. This also removes the positive-definiteness constraint on the correlation matrix. Obviously, a network or an undirected graph will be obtained through this process.

8.3.1 Constructing market network

Emmert-Streib and Dehmer (2010) proposed a network-based approach to investigate the correlation structure. They built networks by testing the following three hypotheses on the correlation coefficient, respectively:

(A) H0: ρ = 0 vs H1: ρ ≠ 0;
(B) H0: ρ = 0 vs H1: ρ < 0;
(C) H0: ρ = 0 vs H1: ρ > 0.

The edge for each pair of stocks exists when the data fall into the rejection region. With networks constructed in this way, the authors calculated the graph edit distance (GED) for graphs from consecutive periods. The GED is the minimum transformation cost, in terms of adding or deleting vertices or edges, for one graph to become isomorphic to another. They used daily returns for the constituents of the Dow Jones Industrial Average Index from 1986 to December 2007.
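When the two graphs share the same vertex set, as with the stock networks of consecutive periods here, the GED reduces to counting edge additions and deletions, i.e., the size of the symmetric difference of the edge sets. A minimal sketch (toy edge lists; not the authors' implementation):

```python
def graph_edit_distance(edges_a, edges_b):
    """GED between two graphs on a common vertex set:
    number of edges to add plus number of edges to delete."""
    a = {frozenset(e) for e in edges_a}   # undirected edges as unordered pairs
    b = {frozenset(e) for e in edges_b}
    return len(a ^ b)                      # symmetric difference

g1 = [(0, 1), (1, 2), (2, 3)]
g2 = [(0, 1), (1, 3), (2, 3)]
print(graph_edit_distance(g1, g2))  # delete (1,2), add (1,3): distance 2
```

Between full vertex sets the general GED also charges for vertex operations, but for fixed constituents only the edge terms matter.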
They found that large jumps in GED correspond to major market crises. Note that using such a measure implicitly assumes exchangeability of the vertices (stocks). In the meantime, only the differences in structure between two networks are considered, while the information contained within one graph is totally ignored. In this paper, we modify the procedure of
Emmert-Streib and Dehmer and focus on the correlation structure revealed by the network of a given period. For processing the time series data, we use a GARCH(1,1) model to filter out standardized innovations. The parameters of the GARCH models are estimated at year-ends using two years of data prior to that day, the innovations for the next 12 months are obtained from those parameters, and the correlation coefficients are then calculated from these innovations.

For constructing the network, only hypothesis (C) is considered, for two reasons. First, by the CAPM, assets in one stock market are supposed to have positive correlations. Second, very few pairs with significantly negative correlations are actually observed. Furthermore, since there is a large number of stock pairs, this is typically a multiple testing problem. The general practice is to control the false discovery rate; detailed procedures can be found in, for example, Efron (2004, 2007) and Sun and Cai (2009). Hypothesis (C) can also be written as

(C') H0: ρ = ρ0 vs H1: ρ > ρ0,

so that we may reasonably balance the Type I and Type II errors. In fact, what really matters in further applications is the number of edges: either too many or too few edges lead to an almost trivial network. Furthermore, the normality assumption is not always necessary for constructing the market network. If the returns are assumed to follow some arbitrary distribution, we may approximate the joint distribution of pairs of stocks with a copula; testing for tail dependence may then be an appropriate alternative.

8.3.2 Properties of the market network

Consider a market with N assets. Let z_t = (z_{1t}, . . . , z_{Nt})', t = 1, . . . , T, be the standardized innovation vector at time t, and let the variance of z_t be Σ, which has identity diagonal elements. Let A be the scatter matrix

A = \sum_{t=1}^{T} (z_t - \bar{z})(z_t - \bar{z})',

where \bar{z} = \frac{1}{T}\sum_{t=1}^{T} z_t. Then A follows the Wishart distribution:

f(A; \Sigma, N, T) = \frac{|A|^{(T-N-1)/2} \exp\left(-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}A\right)}{2^{NT/2}\,|\Sigma|^{T/2}\,\Gamma_N(T/2)},   (8.5)
where Γ_N is the multivariate gamma function,

\Gamma_N(T/2) = \pi^{N(N-1)/4} \prod_{j=1}^{N} \Gamma\!\left( \frac{T}{2} + \frac{1-j}{2} \right).

From the term |A|, it is easily seen that in general a pair (a_{ij}, a_{uv}) will not be independent even when the standardized returns (z_i, z_j, z_u, z_v) are uncorrelated. This property can also be easily understood from a geometric viewpoint. Consider a market with four stocks, and suppose that the joint distribution of (z_1, z_2, z_3, z_4) is multivariate normal. Then each z_i, i = 1, . . . , 4, can be viewed as a point in a T-dimensional space, so we may map all the points onto a (T − 1)-dimensional sphere. The edge between the pair (i, j) exists when the angle between the vectors z_i and z_j is smaller than some threshold. Figure 8.1 illustrates this example. Now consider the conditional probability under the null hypothesis, P(E12, E34 | E13 = E14 = E23 = E24 = 0). Even though the two edges E12 and E34 share no common vertices, they are still dependent on each other given the rest of the edges. Given the non-existence of the other four edges, the two sets of vertices (1, 2) and (3, 4) should be some distance away from each other on the sphere. However, when E34 = 1, vertices 3 and 4 are close, the space available to allocate vertices 1 and 2 becomes larger, and thus the probability of E12 = 1 becomes lower, and vice versa. Thus, while we have a finite number of stocks, any pair of edges is dependent conditional on the rest of the edges.
Figure 8.1: Dependency of edges. The four stocks are mapped onto a (T − 1)-dimensional sphere. The two edges (1, 2) and (3, 4) are conditionally dependent given (E13, E14, E23, E24).
So, in general, the dependence graph D of the market network M is a complete graph: any pair of edges is dependent while there is a finite number of stocks. However, when the number of stocks under consideration becomes large, the angle between vertices i and j can be recovered simply from the number of vertices simultaneously connected to both, so the existence of E_{ij} depends essentially on the edges E_{iu} and E_{ju} only. More specifically, under the exponential family random graph models, the relations between any pair of edges fall into two types: with or without a common vertex. Clearly, edges of the form E_{iu} and E_{ju} are the major source of information about the existence of E_{ij}. Any element A in the conditional distribution for E_{ij},

\Pr((i, j) \mid \text{rest}) \propto \exp\Big\{ \sum_{A \ni (i,j)} \alpha_A \Big\},   (8.6)

will contain some E_{iu} or E_{ju} unless it is E_{ij} itself. Since there are 2\binom{N-2}{1} = 2(N-2) possible edges of the form E_{iu} or E_{ju}, the contribution of each of them to the total information about the existence of E_{ij} is roughly of order 1/N by symmetry. However, as there are \binom{N-2}{2} possible edges of the form E_{uv}, the contribution of each of them is roughly of order 1/N^2. This implies that disjoint pairs of edges E_{ij} and E_{uv} tend to be conditionally independent as N grows large. The market network can then be approximated by the ERGM, along with the exchangeability of stocks under the null hypothesis. Thus, the statistics to be investigated are just the numbers of edges, k-stars and triangles. This information set is equivalent to the numbers of vertices with different degrees together with the number of triangles. It also provides an alternative route to dimension reduction, since we need to pay attention to only N statistics. Most importantly, equation (8.6) takes the form of a partition function, so the result suggests a statistical mechanics approach to the study of financial markets.

The homogeneity condition used by Frank and Strauss (1986) could be a little controversial. Mathematically, it means the exchangeability of stocks in their joint distribution, which is correct under the null hypotheses. In the application here, we are mainly concerned with agents who use contrarian strategies. This type of trading strategy relies on the distribution
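The counting behind these orders is elementary and easy to verify: for a fixed pair (i, j) the 2(N − 2) edges sharing a vertex with it and the \binom{N-2}{2} disjoint ones together exhaust all other edges, and their per-edge shares of the total decay like 1/N and 1/N² respectively. A quick check (illustrative values of N):

```python
from math import comb

def edge_type_counts(n):
    """For a fixed pair (i, j) on n vertices, count the other possible edges
    that share a vertex with (i, j) and those disjoint from it."""
    sharing = 2 * (n - 2)        # edges E_iu or E_ju with u != i, j
    disjoint = comb(n - 2, 2)    # edges E_uv with {u, v} disjoint from {i, j}
    assert sharing + disjoint == comb(n, 2) - 1  # all edges except (i, j) itself
    return sharing, disjoint

for n in (10, 100, 1000):
    sharing, disjoint = edge_type_counts(n)
    # per-edge fractions of all pairs: O(1/n) vs O(1/n^2)
    print(n, sharing / comb(n, 2), disjoint / comb(n, 2))
```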
of stocks across the market and need not assume a distinct distribution for each individual stock. An example is Lo and MacKinlay (1990), in which only past returns are used to determine the weights in the portfolio.

8.3.3 Measuring collective correlation of stock market

A simple rule of thumb that has been widely applied in many e-businesses, for example Facebook, is as follows: two subjects simultaneously related to someone else have a higher probability of being related to each other. An analog for the stock market network would be that two stocks highly correlated with one stock tend to be highly correlated with each other. Such arguments, though intuitively reasonable, are frequently ignored, yet they can be useful for the modeling of correlations. The concept becomes meaningful when a clustering effect exists in the market; more specifically, there are always some highly correlated pairs of stocks and some uncorrelated pairs, while others change over time. In addition to counting the number of highly correlated pairs, a more elaborate approach is to investigate how many candidate pairs are actually significantly correlated. But which pairs of stocks can be considered candidates? Certainly, knowledge such as industry sectors may reveal some clues, but such static information is not sufficient for describing a dynamic system. The rule of thumb described above is an alternative choice. The clustering coefficient (8.4) of the network is actually a good realization of this concept. When calculated over the whole network, the number of 2-stars in the denominator represents exactly how many pairs of stocks are supposed to be candidates for significant correlation according to the time series data obtained; the numerator reflects the realized number among these candidates. Figure 8.2 shows the contour plot of the probability function P(E12 = 1, E13 = 1; ρ12, ρ13, ρ23) for different values of (ρ12, ρ13) with ρ23 = 0 or 0.5.
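The qualitative content of Figure 8.2 can be reproduced by a small Monte Carlo experiment. The sketch below (numpy; all parameter values are illustrative, and this is not the code used for the figure) estimates P(E12 = 1, E13 = 1) under a trivariate normal model, applying a one-sided Fisher z-test at α = 0.05 to each pair; here we fix ρ12 = ρ13 = 0.3 rather than scanning the full grid.

```python
import numpy as np

def joint_edge_prob(rho12, rho13, rho23, T=60, reps=4000, z_crit=1.645, seed=0):
    """Monte Carlo estimate of P(both one-sided tests for rho > 0 reject)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0,  rho12, rho13],
                    [rho12, 1.0,  rho23],
                    [rho13, rho23, 1.0]])
    hits = 0
    for _ in range(reps):
        x = rng.multivariate_normal(np.zeros(3), cov, size=T)
        c = np.corrcoef(x, rowvar=False)
        # Fisher z-statistics for the (1,2) and (1,3) pairs
        z12 = np.arctanh(c[0, 1]) * np.sqrt(T - 3)
        z13 = np.arctanh(c[0, 2]) * np.sqrt(T - 3)
        if z12 > z_crit and z13 > z_crit:
            hits += 1
    return hits / reps

p0 = joint_edge_prob(0.3, 0.3, 0.0)
p5 = joint_edge_prob(0.3, 0.3, 0.5)
print(p0, p5)
```

Even with ρ12 and ρ13 held fixed, the joint rejection probability generally varies with ρ23, which is the dependence the figure illustrates.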
Clearly, the weight in the clustering coefficient is put on those pairs that actually have high correlations. Note that the clustering coefficient not only measures the level of correlation but also reflects the clustering effect. That is, it also addresses the predictability of correlation relationships through common neighbors, a local property of the network. The clustering coefficient would be higher than the proportion of connected edges if the clustering effect actually existed; in other words, a market is highly integrated when the clustering effect is strong. For very large N and under a uniform correlation structure in which all ρij s are equal, the expected value and variance of Call can be
Figure 8.2: Probability function P(E12 = 1, E13 = 1; ρ12, ρ13, ρ23) with respect to different values of (ρ12, ρ13), with ρ23 = 0 or 0.5 and significance level α = 0.05.
approximated as

E\,C_{all} \approx \frac{P(E_{123})}{P(E_{12}E_{13})},   (8.7)

\mathrm{Var}(C_{all}) \approx \frac{P(E_{123}E_{456}) - P(E_{123})^2}{P(E_{12}E_{13})^2} + \frac{1}{4}\,\frac{P(E_{123})^2}{P(E_{12}E_{13})^4}\left[ P(E_{12}E_{13}E_{45}E_{46}) - P(E_{12}E_{13})^2 \right] - \frac{2P(E_{123})}{P(E_{12}E_{13})^3}\left[ P(E_{123}E_{45}E_{46}) - P(E_{123})P(E_{45}E_{46}) \right].   (8.8)
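The setting of this approximation can be simulated directly: generate N assets with a uniform correlation ρ, build the network from the one-sided test, and compute C_all via (8.4). The sketch below (numpy; the sample sizes, threshold and seed are illustrative, and this is not the simulation code behind Figure 8.3) shows C_all rising with ρ.

```python
import numpy as np
from math import comb

def clustering_coefficient(adj):
    """C_all = 3 * (#triangles) / (#2-stars) for a 0/1 symmetric adjacency matrix."""
    deg = adj.sum(axis=1)
    two_stars = sum(comb(int(d), 2) for d in deg)
    triangles = np.trace(adj @ adj @ adj) // 6   # trace(A^3) counts each triangle 6 times
    return 3 * triangles / two_stars if two_stars else 0.0

def simulate_c_all(n=50, T=100, rho=0.3, z_crit=2.326, seed=0):
    """One draw of C_all under a uniform correlation structure (one-factor model)."""
    rng = np.random.default_rng(seed)
    common = rng.standard_normal((T, 1))
    z = np.sqrt(rho) * common + np.sqrt(1 - rho) * rng.standard_normal((T, n))
    c = np.corrcoef(z, rowvar=False)
    fisher = np.arctanh(np.clip(c, -0.999, 0.999)) * np.sqrt(T - 3)
    adj = (fisher > z_crit).astype(int)          # one-sided test for rho > 0
    np.fill_diagonal(adj, 0)
    return clustering_coefficient(adj)

print(simulate_c_all(rho=0.1), simulate_c_all(rho=0.5))
```

The one-factor construction gives every pair of assets correlation ρ, so a single ρ parameter controls the whole matrix.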
Note that the variance does not converge to zero as N approaches infinity, since all pairwise correlations are dependent; instead, it approaches some constant. Figure 8.3 shows simulation results with different numbers of stocks and significance levels. The clustering coefficient changes along with the correlation coefficient, but the width of the confidence interval does not change significantly with the two parameters.

8.4 Empirical Investigations

In this study, we use daily returns of a sample of S&P 500 index components for numerical illustration. Stock prices are collected from Yahoo! Finance. The data consist of 3441 daily prices of 389 companies from January 2, 1996 to August 31, 2009. Each company has at least 3400 trading
days during the period. All returns are adjusted for dividends and splits, and missing returns are filled with 0.

Figure 8.3: Simulated clustering coefficient versus correlation coefficient for a multivariate normal distribution with uniform correlation structure. Panels correspond to significance levels α = 0.01 and 0.05 and sample sizes N = 100 and 400.

Figure 8.4 shows the trend chart for the clustering coefficient with the significance level set at 5%, 1% and 0.5%. The three time series have essentially similar patterns, but the one at the 5% level is slightly shifted upward. This implies that the additional edges, relative to those under the 1% level, tend to produce more triangles, which is indirect evidence for the clustering effect in the stock market. It is also seen that the trends at the 1% and 0.5% levels are quite close, so a significance level around these values may be a good choice; the optimal choice is, of course, still worth further investigation. Figure 8.5 shows the trend chart for the clustering coefficients of 20 random samples of size 100. In general, the dispersion of the 20 clustering coefficients becomes larger when the clustering coefficients are relatively low, implying that the market is less "uniform" when the clustering coefficient is low.
Figure 8.4: Clustering coefficient for market networks constructed with significance level set at 5% (dotted), 1% (solid) and 0.5% (dashed).
Figure 8.5: Trend chart for the clustering coefficients of 20 random samples of size 100. The solid line represents the clustering coefficient computed with all 389 stocks.
Figure 8.6 shows the trend chart for the level of the S&P 500 index, the clustering coefficient and the CBOE VIX. Generally speaking, the last two time series have similar trends and, most importantly, very similar timing and patterns of jumps, the most obvious feature of the collective correlation index. This generally corresponds to correlations increasing during market plunges. The average levels of the collective correlation index
Figure 8.6: Trend chart for the level of the S&P 500 index (solid), the clustering coefficient (dot-dashed), and the CBOE VIX (dashed).
subsequent to the NASDAQ bubble seem to be higher than those prior to the bubble. This phenomenon indicates changes in trading behavior before and after the year 2000. In addition, in the middle of 2007, both indexes had risen from their bottoms before the subprime crisis. This is essentially consistent with the results of Becker and Schmidt (2010) on the phenomenon that correlations may still increase in bull markets. Thus, it may be that as the market becomes more integrated, it approaches the brink of a crash.

Figure 8.7 shows the scatter plot of the clustering coefficient against realized volatility for each month. Across the whole period the two quantities appear to be positively correlated. When we divide the sample into before and after the year 2002, as the two right-hand panels show, the relation becomes especially clear after 2002. This is consistent with the argument of Khandani and Lo (2007) on the emergence of hedge funds after 2002 and that of Straetmans et al. (2008) on the extreme fluctuations in the US stock market after 9/11.

The clustering effect is another important feature of the stock market. Since the sufficient statistics can also be represented as the number of triangles and the numbers of vertices with different degrees, we may consider the 90% and 50% quantiles of the degree distribution. Figure 8.8 shows the scatter plot of the difference of the two log-quantiles against the clustering coefficient. When the difference is large, the degree distribution is wide and some stocks have more neighbors than expected; that is, the market tends to be less homogeneous. Clearly, Figure 8.8 points out that when the clustering coefficient is low, the
Figure 8.7: Scatter plot for clustering coefficient and realized volatility. In the left panel (1997–2009), the rectangles are for the period 1997–2002 and circles for 2003–2009.
Figure 8.8: Scatter plot for the difference of the logarithms of the 90% and 50% quantiles of degrees and the clustering coefficient. In the left panel (1997–2009), the rectangles are for the period 1997–2002 and circles for 2003–2009.
Figure 8.9: Scatter plot for the difference of the logarithms of the 90% and 50% quantiles of degrees and the clustering coefficient. The rectangles are obtained from the data in the respective period (1997–2002 and 2003–2009). The dots are obtained through simulation under the normality assumption and a uniform correlation structure.
market is inhomogeneous, with a threshold around 0.7. The cut becomes clearer for the post-2002 period. Note that the difference of log-quantiles is by nature inversely related to the clustering coefficient. Figure 8.9 compares the differences of log-quantiles from real data and from simulations under the normality assumption and a uniform correlation structure. It illustrates that the wide spread of the difference of log-quantiles at low clustering coefficients is quite significant.

8.5 Conclusion

This paper proposes a network approach to exploring the correlation structure of the stock market. By approximating the network with an exponential random graph model, it provides an alternative route to dimension reduction when a huge number of stocks is taken into consideration. In the meantime, it can be carried out without specific distributional assumptions. Most important of all, this is an attempt to build a link to complex systems: since the distribution function takes an exponential form, the normalizing constant in (8.3) can be interpreted as the partition function that plays a central role in statistical mechanics.
The clustering coefficient of the market network measures the predictive ability of significant correlations through neighborhood relations. It also implicitly reflects the profitability of arbitrageurs and thus the funding liquidity of market makers. Empirical investigations show that a positive relation between the clustering coefficient and volatility exists. This implies the potential for the clustering coefficient to become an alternative risk measure in addition to volatility. Furthermore, the network approach can indeed reveal much more information about the structure of the market. As shown in Section 8.4, quite different regimes exist in bull and bear markets. Developing testing methods to recognize the state of the market would be interesting work, and to make this idea applicable, the network should be constructed at higher frequency in future studies. Quite a few extensions of the network approach, under which the curse of dimensionality is relieved, can be envisioned. For example, determining the number of factors with short time series would be possible. Furthermore, network modeling provides the flexibility to accommodate possible structures for the stock market when the number of stocks is large. These quantitative techniques are useful for practical applications in trading or asset allocation.
Acknowledgment

The authors thank Professors Jin-Chuan Duan and Ruey S. Tsay for helpful discussions and suggestions. The authors also acknowledge the support of the Risk Management Institute at the National University of Singapore, the Ministry of Science and Technology in Taiwan, the National Center for Theoretical Sciences, and the Shing-Tung Yau Center at National Chiao Tung University in Taiwan.
Bibliography

Ang, A. and Chen, J. (2002). Asymmetric Correlations of Equity Portfolios. Journal of Financial Economics, 63, 443–494.
Araújo, T. and Louçã, F. (2007). The Geometry of Crashes. A Measure of the Dynamics of Stock Market Crises. Quantitative Finance, 7, 63–74.
Becker, C. and Schmidt, W. (2010). State-Dependent Dependencies: A Continuous-Time Dynamics for Correlations. Available at SSRN: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1553115.
Brunnermeier, M. and Pedersen, L. (2009). Market Liquidity and Funding Liquidity. Review of Financial Studies, 22, 2201–2238.
Campbell, R., Koedijk, K. and Kofman, P. (2002). Increased Correlation in Bear Markets. Financial Analysts Journal, 58, 87–94.
Campbell, R., Forbes, C., Koedijk, K. and Kofman, P. (2007). Increasing Correlations or Just Fat Tails? Journal of Empirical Finance, 15, 287–309.
Conrad, J. and Kaul, G. (1998). An Anatomy of Trading Strategies. Review of Financial Studies, 11, 489–519.
Dungey, M. and Martin, V. (2007). Unravelling Financial Market Linkages During Crises. Journal of Applied Econometrics, 22, 90–119.
Efron, B. (2004). Large-Scale Simultaneous Hypothesis Testing. Journal of the American Statistical Association, 99, 96–104.
Efron, B. (2007). Correlation and Large-Scale Simultaneous Significance Testing. Journal of the American Statistical Association, 102, 93–103.
Eichler, M. (2007). Granger-Causality and Path Diagrams for Multivariate Time Series. Journal of Econometrics, 137, 334–353.
Emmert-Streib, F. and Dehmer, M. (2010). Identifying Critical Financial Networks of the DJIA: Toward a Network-Based Index. Complexity, 16, 24–33.
Engle, R. (2002). Dynamic Conditional Correlation: A Simple Class of Multivariate Generalized Autoregressive Conditional Heteroskedasticity Models. Journal of Business & Economic Statistics, 20, 339–350.
Forbes, K. and Rigobon, R. (2002). No Contagion, Only Interdependence: Measuring Stock Market Comovements. Journal of Finance, 57, 2223–2261.
Frank, O. and Strauss, D. (1986). Markov Graphs. Journal of the American Statistical Association, 81, 832–842.
Gabaix, X., Gopikrishnan, P., Plerou, V. and Stanley, H. (2003). A Theory of Power-Law Distributions in Financial Market Fluctuations. Nature, 423, 267–270.
Granger, C. (2008). Nonlinear Models: Where Do We Go Next – Time Varying Parameter Models? Studies in Nonlinear Dynamics & Econometrics, 12, Article 1.
Hartmann, P., Straetmans, S. and de Vries, C. (2004). Asset Market Linkages in Crisis Periods. Review of Economics and Statistics, 86, 313–326.
Hong, Y., Tu, J. and Zhou, G. (2007). Asymmetries in Stock Returns: Statistical Tests and Economic Evaluation. Review of Financial Studies, 20, 1547–1581.
Huang, J. and Wang, J. (2009). Liquidity and Market Crashes. Review of Financial Studies, 22, 2607–2643.
Hunter, D. (2007). Curved Exponential Family Models for Social Networks. Social Networks, 29, 216–230.
Karolyi, G. and Stulz, R. (1996). Why Do Markets Move Together? An Investigation of US–Japan Stock Return Comovements. Journal of Finance, 51, 951–986.
Khandani, A. and Lo, A. (2007). What Happened to the Quants in August 2007? Journal of Investment Management, 5, 5–54.
Laloux, L., Cizeau, P., Bouchaud, J. and Potters, M. (1999). Noise Dressing of Financial Correlation Matrices. Physical Review Letters, 83, 1467–1470.
Lee, T., Cho, J., Kwon, D. and Sohn, S. (2018). Global Stock Market Investment Strategies Based on Financial Network Indicators Using Machine Learning Techniques. Expert Systems with Applications, 117, 228–242.
Lo, A. and MacKinlay, A. (1990). When Are Contrarian Profits Due to Stock Market Overreaction? Review of Financial Studies, 3, 175–205.
Longin, F. and Solnik, B. (2001). Extreme Correlation of International Equity Markets. Journal of Finance, 56, 649–676.
Newman, M., Strogatz, S. and Watts, D. (2001). Random Graphs with Arbitrary Degree Distributions and Their Applications. Physical Review E, 64, 026118.
Newman, M. (2003). The Structure and Function of Complex Networks. SIAM Review, 45, 167–256.
page 353
July 6, 2020
11:57
354
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch08
W.-F. Niu & H. H.-S. Lu
Okimoto, T. (2008). New Evidence of Asymmetric Dependence Structures in International Equity Markets. Journal of Financial and Quantitative Analysis, 43, 787–815. Pedersen, L. (2009). When Everyone Runs for the Exit. International Journal of Central Banking, 5, 177–199. Peralta, G. (2015). Network-Based Measures as Leading Indicators of Market Instability: The Case of the Spanish Stock Market. Journal of Network Theory in Finance, 1, 91–122. Peralta, G. and Zareei, A. (2016). A Network Approach to Portfolio Selection. Journal of Empirical Finance, 38, 157–180. Penzer, J., Schmid, F. and Schmidt, R. (2011). Measuring Large Comovements in Financial Markets. Quantitative Finance, 11, 1–13. Ramchand, L. and Susmel, R. (1998). Volatility and Cross Correlation Across Major Stock Markets. Journal of Empirical Finance, 5, 397–416. Rodriguez, J. (2006). Measuring Financial Contagion: A Copula Approach. Journal of Empirical Finance, 14, 401–423. Snijders, T., Pattison, P., Robins, G. and Handcock, M. (2006). New Specification for Exponential Random Graph Models. Sociological Methodology, 36, 99–153. Straetmans, S., Verschoor, W. and Wolff, C. (2008). Extreme US Stock Market Fluctuations in the Wake of 9/11. Journal of Applied Econometrics, 23, 17–42. Sun, W. and Cai, T. (2009). Large-Scale Multiple Testing Under Dependence. Journal of the Royal Statistical Society Series, B 71, 393–424. Tse, Y. and Tsui, A. (2002). A Multivariate Generalized Autoregressive Conditional Heteroscedasticity Model with Time-Varying Correlations. Journal of Business & Economic Statistics, 20, 351–362. Wen, D., Ma, C., Wang, G. and Wang, S. (2018). Investigating the Features of Pairs Trading Strategy: A Network Perspective on the Chinese Stock Market. Physica A: Statistical Mechanics and its Applications, 505, 903–918. Zhao, L., Wang, G., Wang, M., Bao, W., Li, W. and Stanley, H. (2018). Stock Market as Temporal Network. Physica A: Statistical Mechanics and its Applications, 506, 1104–1112.
page 354
Chapter 9
Key Borrowers Detected by the Intensities of Their Interactions

Fuad Aleskerov, Irina Andrievskaya, Alisa Nikitina and Sergey Shvydun
Contents
9.1 Introduction
9.2 Methodology for the Case with Long-Term Interactions
9.3 S-Long Range Interactions Index Based on Paths
9.4 Empirical Application — Country Assessment
9.5 Conclusions
Acknowledgments
Bibliography
Fuad Aleskerov
National Research University Higher School of Economics (HSE), V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences (ICS RAS)
e-mail: [email protected]

Irina Andrievskaya
National Research University Higher School of Economics (HSE), University Niccolo Cusano
e-mail: [email protected]

Alisa Nikitina
National Research University Higher School of Economics (HSE)
e-mail: [email protected]

Sergey Shvydun
National Research University Higher School of Economics (HSE), V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences (ICS RAS)
e-mail: [email protected]
Appendix 9A: Methodology for the Case with Short-Range Interactions
Appendix 9B: S-Long-Range Interactions Centrality Index Based on Simulations
Appendix 9C: Centrality Measures
Abstract

We propose a novel method to estimate the level of interconnectedness of a financial institution or system, as the measures currently suggested in the literature do not fully take into consideration an important aspect of interconnectedness — group interactions of agents. Our approach is based on the power index and centrality analysis and is employed to find a key borrower in a loan market. It has three distinctive features: it considers long-range interactions among agents, agents' attributes, and the possibility that an agent is affected by a group of other agents. This approach allows us to identify systemically important elements which cannot be detected by classical centrality measures or other indices. The proposed method is employed to analyze banking foreign claims as of 1Q 2015. Using our approach, we detect two types of key borrowers: (a) major players with high ratings and a positive credit history, and (b) intermediary players that sustain a large scale of financial activity by offering favorable investment conditions and a positive business climate.

Keywords: Power index • Key borrower • Systemic importance • Interconnectedness • Centrality • S-long-range interactions • Paths.
9.1 Introduction

In his testimony to the US Congress, Federal Reserve Chairman Ben Bernanke observed: "If the crisis has a single lesson, it is that the too-big-to-fail problem must be solved" (Bernanke, 2010). Thus, the identification of organizations, particularly banks, of systemic relevance is a crucial task for assessing financial stability and enhancing macroeconomic supervision. Systemic importance can be identified either by using an indicator-based approach (cf. BCBS, 2013; ECB, 2006; IMF, 2010) or by examining the contribution of an institution to systemic risk (cf. Acharya et al., 2010; Adrian and Brunnermeier, 2010; Bluhm and Krahnen, 2014; Chan-Lau, 2010; Drehmann and Tarashev, 2011; Huang et al., 2012; Lehar, 2005; Segoviano and Goodhart, 2009; Tarashev et al., 2010; Zhou, 2010). One of the most relevant indicators of systemic importance is considered to be interconnectedness (cf. Chan-Lau, 2010; ECB, 2010; IMF/BIS/FSB, 2009; Leon and Murcia, 2012; Brühl, 2017). The Basel Committee on Banking Supervision (BCBS) proposes an indicator-based approach to estimate
the level of interconnectedness of financial institutions (BCBS, 2013). Other examples of this approach include, among others, ECB (2014) and IMF (2015). An alternative approach to estimating the level of interconnectedness is to apply network theory. In this case, a network can be considered as a system of nodes (financial institutions) and links (flows of capital) among them. Network analysis has already been applied in the context of stock ownership networks (Garlaschelli et al., 2005), the emergence of contagion and systemic risk in the interbank market and in payment systems (Angelini et al., 1996; Furfine, 2003; Iori et al., 2008), and also in terms of how interconnected a financial system can be at the national and international level (Allen and Babus, 2009; Allen and Gale, 2000; Malik and Xu, 2017; Sun and Chan-Lau, 2017). For the purpose of measuring the degree of importance in networks, many centrality indices have been proposed (Bonacich, 1972; Barrat et al., 2004; von Peter, 2007; Newman, 2010; Peltonen et al., 2015). For example, IMF (2010) and von Peter (2007) use degree, closeness, betweenness, intermediation (employed only in von Peter, 2007) and prestige (or Bonacich centrality) within a global financial system framework to reveal the most interconnected banking sectors. These centrality measures are also widely used for interbank market investigation (Akram and Christophersen, 2010; Bech and Atalay, 2008; Cajueiroa and Tabak, 2008; Iori et al., 2008; and others) and for global banking network analysis (Minoiu and Reyes, 2013; Aldasoro and Alves, 2017). Other attempts to evaluate the degree of importance of elements in networks are based on simulation mechanisms. For example, a method of firm dynamics simulations was developed by applying game theory to a stochastic agent model in order to analyze the uncertainty in the business environment (Ikeda et al., 2007; Giovanetti, 2012).
Simulation procedures are also widely used to assess industrial transaction networks, property relations, and the dynamics of industrial and innovation clusters, as well as to model financial risks in the interbank market and payment systems. Many approaches to detecting key nodes in networks come from cooperative game theory. In that case, the network is interpreted as a set of interacting individuals who contribute to the total productive value of the network, and the problem is how to share the generated value among them. In (Myerson, 1977), a measure based on the Shapley–Shubik index (Shapley and Shubik, 1954) was proposed for communication games. The Myerson value is an allocation rule in the context of network games where the value of each
individual depends on the value generated by the network with and without that individual. Several attempts to employ power indices to find systemically important financial institutions were made in (Tarashev et al., 2010; Drehmann and Tarashev, 2011; Garratt et al., 2012). The approaches described above, however, do not fully incorporate the intensity of connections among agents, do not consider the influence of groups of agents on the system, and ignore the individual parameters of agents. Therefore, the aim of this paper is to propose an approach that addresses all these shortcomings. Our methodology is an extension of the approach developed in (Aleskerov et al., 2016), where the authors propose a preference-based power index — we call it the key borrower index (KBI). This index takes into account short-range interactions between each lender and its borrowers. We modify the index to take into account long-range interactions and propose to use the long-range interaction centrality (LRIC) measure. The contribution of our paper is two-fold. First, we contribute to the systemic importance literature by developing an approach to estimate the level of interconnectedness of an institution or system taking into account short-range and long-range interactions among market participants. Second, we contribute to the networks-in-finance literature by developing a centrality measure that can be used within financial system analysis. The paper is organized as follows. In the following two sections we explain our methodology. An empirical application of the proposed methodology is presented in Section 4. Section 5 concludes.
9.2 Methodology for the Case with Long-Term Interactions

In this section, we extend the approach proposed in (Aleskerov et al., 2016) to take into account long-term interactions among the system's participants (see Appendix 9A for more details on the methodology developed in (Aleskerov et al., 2016)). Our approach is based on a very simple observation. When we consider a network of interconnected lenders and borrowers, each creditor's sustainability is affected by its direct borrowers. In addition, bankruptcy of any of the direct borrowers may occur due to the bankruptcy of those to whom they in turn have given loans, i.e., both direct and indirect borrowers matter for the original creditor. In other words, our methodology allows us to consider the interaction between a lender and a borrower not just on the first level, but also on levels beyond it.
There are two different ideas on how to take into account long-range interactions between members of the network. The first is a distance-based approach, where all different paths are considered for each member and somehow aggregated into a single value. The second is based on the idea of simulations, where we analyze the influence of individual members and their combinations on the whole network (see Appendix 9B for further details on the simulation-based approach). Both ideas have an easy interpretation and can be applied to different areas. The aim of this section is to explain our approach in detail and demonstrate how it works on a numerical example (see Figure 9.1). We consider a complex system of interconnections arising from agents' lending activities. The values on the edges represent the amount of loans that one agent gives to another, and the network structure corresponds to the bow-tie representation. Moreover, this network is structurally closer to actually existing networks of financial interactions, as links between the agents are more diversified and there are fewer strongly connected components. As mentioned before, we propose two methods to find key borrowers in the system. The main difference from the KBI is that now s-long-range borrowers of each lender are taken into account. In many problems, interactions of indirect neighbors play a significant role in the whole system; hence, there is
Figure 9.1: Numerical example.
a need to consider these links. The parameter s defines how many "layers" are examined for each lender; it depends on the problem and in the general case can be left unspecified, so that all possible direct and indirect neighbors are taken into account. To describe the proposed approach, some definitions are given below.

Consider a set of members N, N = {1, ..., n}, and a matrix A = [a_ij], where i, j ∈ N and a_ij is the loan from member i to member j. For simplicity, suppose that the matrix A has already been transformed, i.e., if a_ij ≠ 0 then a_ji = 0. Denote by N_i the set of direct neighbors of the i-th member, i.e., N_i = {j ∈ N : a_ij ≠ 0}. Obviously, the total number of possible groups of direct neighbors of member i is equal to 2^|N_i|.

Definition 1. A group of direct neighbors of the i-th member, Ω(i) ⊆ N_i, is critical if Σ_{j∈Ω(i)} a_ij ≥ q_i, where q_i is a predefined threshold of the i-th member.

Definition 2. A member x ∈ Ω(i) is pivotal if Σ_{j∈Ω(i)\{x}} a_ij < q_i. Denote by Ω_p(i) the set of pivotal members in the group Ω(i), i.e., Ω_p(i) = {y ∈ Ω(i) : Σ_{j∈Ω(i)\{y}} a_ij < q_i}.

For our numerical example, there are 11 agents in the system (n = 11): 9 of them are both lenders and borrowers, while the two remaining elements are pure borrowers. So we can form the matrix A of size 11 × 11. The sets of direct neighbors N_i and the critical groups when q_i = 25% for each element are shown in Table 9.1. The pivotal members for element 1 when q_1 = 25% are provided in Table 9.2.

Table 9.1: Direct borrowers and critical groups for the numerical example.

Element i | Direct borrowers N_i | Critical groups Ω(i), q = 25%
1         | {2, 3, 4}            | {2}, {2, 3}, {2, 4}, {3, 4}, {2, 3, 4}
2         | {5, 6, 8}            | {6}, {5, 6}, {5, 8}, {6, 8}, {5, 6, 8}
3         | {2, 4, 5}            | {4}, {2, 4}, {2, 5}, {4, 5}, {2, 4, 5}
4         | {5, 7, 9}            | {7}, {5, 7}, {5, 9}, {7, 9}, {5, 7, 9}
5         | ∅                    | ∅
6         | {10, 11}             | {11}, {10, 11}
7         | {10, 11}             | {11}, {10, 11}
8         | {10, 11}             | {11}, {10, 11}
9         | {10, 11}             | {11}, {10, 11}
10        | {1}                  | {1}
11        | ∅                    | ∅
Table 9.2: Critical groups and pivotal members for element 1.

Critical groups Ω(1) | Pivotal members Ω_p(1)
{2}                  | {2}
{2, 3}               | {2}
{2, 4}               | {2}
{3, 4}               | {3, 4}
{2, 3, 4}            | ∅
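As a minimal sketch of Definitions 1 and 2, the following snippet (illustrative, not the authors' code) enumerates the critical groups of element 1 and the pivotal members of each; the loan amounts a12 = 60, a13 = 16, a14 = 24 are inferred from the worked example in Section 9.3, and it reproduces the first row of Table 9.1 and all of Table 9.2:

```python
from itertools import combinations

# Loans of lender 1 to its direct borrowers; amounts inferred from the
# worked example in Section 9.3 (hypothetical units).
loans = {2: 60, 3: 16, 4: 24}
q = 0.25 * sum(loans.values())  # threshold q1 = 25% of total lending

critical, pivotal = [], {}
for r in range(1, len(loans) + 1):
    for group in combinations(sorted(loans), r):
        total = sum(loans[j] for j in group)
        if total >= q:  # Definition 1: the group is critical
            critical.append(set(group))
            # Definition 2: x is pivotal if removing it breaks criticality
            pivotal[frozenset(group)] = {x for x in group if total - loans[x] < q}

print(critical)                       # [{2}, {2, 3}, {2, 4}, {3, 4}, {2, 3, 4}]
print(pivotal[frozenset({3, 4})])     # {3, 4}
print(pivotal[frozenset({2, 3, 4})])  # set()
```

Note that in the largest critical group {2, 3, 4} no member is pivotal, because removing any single borrower still leaves a critical group.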
9.3 S-Long Range Interactions Index Based on Paths

Let us construct a matrix C = [c_ij] with respect to the matrix A and the predefined thresholds as

c_ij = a_ij / min_{Ω(i) ⊆ N_i : j ∈ Ω_p(i)} Σ_{l∈Ω(i)} a_il,  if j ∈ Ω_p(i) for some Ω(i) ⊆ N_i,
c_ij = 0,  otherwise,

where Ω(i) is a critical group of direct neighbors of element i, Ω(i) ⊆ N_i, and Ω_p(i) is the pivotal group of element i, Ω_p(i) ⊆ Ω(i). The construction of matrix C is closely related to the methodology described in the previous sections, as it is necessary to consider each element of the system separately as a lender while the other participants of the system are regarded as borrowers. The only difference here is that only groups of direct neighbors are considered. The interpretation of matrix C is rather simple. If c_ij = 1, then borrower j has a maximal influence on lender i, i.e., the loan amount to borrower j is critical for lender i. On the contrary, if c_ij = 0, then borrower j does not directly influence lender i. Finally, a value 0 < c_ij < 1 indicates the impact level of borrower j on the bankruptcy of lender i. Let us construct the matrix C = [c_ij] for the numerical example with threshold value q = 25% according to the approach based on paths. For example, when we want to estimate the direct influence of the borrowers of element 1, we search for a minimal critical group, i.e., a critical group with the lowest total loan from element 1 in which a particular borrower is pivotal, and then estimate the direct influence c_1j. According to Table 9.1, element 1 has 3 direct borrowers; hence, the minimal critical group for element 2 is {2} and c_12 = 60/60 = 1. For elements 3 and 4 the minimal critical group is {3, 4}, and c_13 = 16/(16 + 24) = 0.4, c_14 = 24/(16 + 24) = 0.6.
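The construction of c_1j for lender 1 can be sketched as follows (illustrative code under the same assumed loan amounts a12 = 60, a13 = 16, a14 = 24 as above):

```python
from itertools import combinations

# Direct influences c_{1j} for lender 1, with q1 = 25% of total lending.
loans = {2: 60, 3: 16, 4: 24}
q = 0.25 * sum(loans.values())

def direct_influence(j):
    """c_1j = a_1j divided by the smallest total loan of a critical group
    in which borrower j is pivotal (0 if j is never pivotal)."""
    best = None
    for r in range(1, len(loans) + 1):
        for group in combinations(loans, r):
            total = sum(loans[x] for x in group)
            if total >= q and j in group and total - loans[j] < q:
                best = total if best is None else min(best, total)
    return loans[j] / best if best else 0.0

print([direct_influence(j) for j in (2, 3, 4)])  # [1.0, 0.4, 0.6]
```

The output matches the values c_12 = 1, c_13 = 0.4 and c_14 = 0.6 computed in the text.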
Figure 9.2: The network for matrix C.
A graphical representation of the matrix is shown in Figure 9.2. In addition, we can see in Figure 9.2 that element 10 does not influence any other element in the system. Thus, we have evaluated the first-level direct influence of each element in the system. To define the indirect influence between two elements, let us give a definition of the ρ-path. Denote by ρ a binary relation constructed as iρj ⇔ c_ij > 0. A pair (i, j) such that iρj is called a ρ-step. A path from i to j is an ordered sequence of steps starting at i and ending at j, such that the second element in each step coincides with the first element of the next step. If all steps in a path belong to the same relation ρ, we call it a ρ-path, i.e., a ρ-path is an ordered sequence of elements i, j_1, ..., j_k, j such that iρj_1, j_1ρj_2, ..., j_{k−1}ρj_k, j_kρj. The number of steps in a path is called the path's length. To define the indirect influence between any two elements, consider all ρ-paths between them of length less than some parameter s. Each path should not contain any cycles, i.e., no element may occur in the ρ-path more than once. For instance, there are only two paths between elements 8 and 1 in the numerical example (see Figure 9.3): via element 2 (8ρ2ρ1, dashed lines) and via elements 2 and 3 (8ρ2ρ3ρ1, dotted lines). Denote by P^{ij} = {P_1^{ij}, P_2^{ij}, ..., P_m^{ij}} the set of unique ρ-paths from i to j, where m is the total number of paths, and denote by n(k) = |P_k^{ij}|, where
Figure 9.3: Paths between elements 8 and 1 (via element 2 — dashed lines, via elements 2 and 3 — dotted lines).
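The enumeration of cycle-free ρ-paths can be sketched with a simple depth-first search. The edges below are limited to the 8-to-1 sub-network of Figure 9.3 (8→2, 2→1, 2→3, 3→1, read "borrower influences lender"); this is illustrative code, not the authors' implementation:

```python
# Cycle-free rho-paths from element 8 to element 1 on the sub-network of
# Figure 9.3; edges are assumed from the matrix C of the example.
edges = {8: [2], 2: [1, 3], 3: [1], 1: []}

def all_paths(src, dst, path=None):
    path = (path or []) + [src]
    if src == dst:
        return [path]
    found = []
    for nxt in edges.get(src, []):
        if nxt not in path:  # no element may occur twice, so no cycles
            found += all_paths(nxt, dst, path)
    return found

print(all_paths(8, 1))  # [[8, 2, 1], [8, 2, 3, 1]]
```

The search recovers exactly the two paths named in the text: 8ρ2ρ1 and 8ρ2ρ3ρ1.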
k = 1, ..., m, is the length of the k-th path. Then we can define the indirect influence f(P_k^{ij}) between elements i and j via the k-th ρ-path as

f(P_k^{ij}) = c_{i j(1,k)} · c_{j(1,k) j(2,k)} · ... · c_{j(n(k),k) j},   (9.1)

or

f(P_k^{ij}) = min(c_{i j(1,k)}, c_{j(1,k) j(2,k)}, ..., c_{j(n(k),k) j}),   (9.2)

where j(l, k), l = 1, ..., n(k), is the l-th element which occurs on the k-th ρ-path from i to j. The interpretation of formulae (9.1) and (9.2) is the following. According to formula (9.1), the total influence of element j on element i via the k-th ρ-path P_k^{ij} is calculated as the aggregated value of the direct influences between the elements on the k-th ρ-path between i and j, while formula (9.2) defines the total influence as the minimal direct influence between any elements on the k-th ρ-path. A simple example of indirect influence estimation between two elements is provided in Figure 9.4. In the first case, the influence is proportional to the losses (risks) from the bankruptcy of each borrower on the path, while in the second case the influence is equal to the minimal risk of bankruptcy among the borrowers on the path between elements 1 and 2. It is necessary to mention that in some cases there is no need to consider all possible paths between elements i and j, i.e., we can assume that starting from some path length s, indirect interactions do not influence the initial
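Formulae (9.1) and (9.2) can be illustrated on the path 11ρ8ρ2ρ1 of the example; the direct influences on its edges (1.0 for 11 on 8, 0.8 for 8 on 2, 1.0 for 2 on 1) are read off the matrix C of the numerical example:

```python
from math import prod

# Direct influences on the edges of the rho-path 11->8->2->1.
edges = [1.0, 0.8, 1.0]

f_mult = prod(edges)  # formula (9.1): product of direct influences
f_min = min(edges)    # formula (9.2): weakest link on the path

print(f_mult, f_min)  # 0.8 0.8
```

Both formulas give 0.8 here, which agrees with the path influence quoted for 11ρ8ρ2ρ1 later in the text.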
Figure 9.4: Indirect influence: (a) multiplication of direct influences and (b) minimal direct influence.
member. Thus, we introduce the parameter s, which defines how many layers (path lengths) are taken into account. For example, consider all paths between elements 11 and 1 in the numerical example (see Figure 9.2). There are four paths between these elements: 11ρ8ρ2ρ1, 11ρ9ρ4ρ1, 11ρ8ρ2ρ3ρ1 and 11ρ9ρ4ρ3ρ1. If the parameter s = 3, we are interested only in the first two paths, whereas the others are not taken into account. Since there can be many paths between two elements of the system, there is a problem of aggregating the influence of different paths. To estimate the aggregated indirect influence, several methods are proposed. The aggregated results form a new matrix C*(s) = [c*_ij(s)].

(1) The indirect influence as the sum of path influences:

c*_ij(s) = min(1, Σ_{k: n(k)≤s} f(P_k^{ij})).   (9.3)

(2) The indirect influence as the maximal path influence:

c*_ij(s) = max_{k: n(k)≤s} f(P_k^{ij}).   (9.4)

(3) The indirect influence by the threshold rule.   (9.5)
The threshold aggregation rule was proposed in (Aleskerov et al., 2007), and the idea of the rule is rather simple. Suppose we have a set of elements, each evaluated by n grades that may take m different values. Then for each element we can calculate the values v_1(k), v_2(k), ..., v_m(k), which record how many grades of each kind (i = 1, ..., m) the element received. According to the threshold rule, element x V-dominates element y if v_1(x) < v_1(y), or if there exists d ≤ m such that v_h(x) = v_h(y) for all h = 1, ..., d − 1, and v_d(x) < v_d(y). In other words,
first, the numbers of worst grades are compared; if these numbers are equal, then the numbers of second-worst grades are compared, and so on. The element which is not dominated by any other element via V is considered the best one. Taking the threshold rule as one of the possible ways to evaluate the indirect influence, we propose the following aggregation procedure:

c*_ij(s) = f(P_z^{ij}),   (9.6)

where

z = argmin_{k: n(k)≤s} v(P_k^{ij}),   (9.7)

and

v(P_k^{ij}) = Σ_{l=1}^{m} v_l(P_k^{ij}) · (s + 1)^{m−l} + s − n(k).
Note that if there is no path between elements i and j, then c*_ij(s) = 0. For example, consider the paths between elements 11 and 1. According to formula (9.1), the influence of the path 11ρ8ρ2ρ1 (path length 3) is equal to 0.8, the influence of the path 11ρ8ρ2ρ3ρ1 (path length 4) is equal to 0.56, the influence of the path 11ρ9ρ4ρ1 (path length 3) is equal to 0.426, and the influence of the path 11ρ9ρ4ρ3ρ1 (path length 4) is equal to 0.284. If s = 3, only the two paths of length 3 are taken into account. Thus, the total influence of element 11 on element 1 is equal to 1 according to the sum of path influences (min{1, 0.8 + 0.426}) and equal to 0.8 according to the maximal path influence (max{0.8, 0.426}). For the threshold rule the calculations are not so obvious, since they depend on the system of grades, which will be described later. The interpretation of formulae (9.3)–(9.5) in terms of borrowers is the following. The sum of path influences corresponds to the most pessimistic case of indirect influence, where we take into account all possible channels of risk from a particular borrower to the creditor. The maximal path influence and the influence assessment by the threshold rule help us to find the most vulnerable risk transmission channel. Thus, we can define the indirect influence between elements i and j via all possible paths between these elements. The path influences can be evaluated by formulae (9.1)–(9.2) and aggregated into a single value by formulae (9.3)–(9.5). Thus, 6 combinations are possible for the construction of matrix C*(s) (see Table 9.3). In our opinion, all combinations of formulae make sense except the combination of formulae (9.2) and (9.3).
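The aggregation of the 11-to-1 example above can be sketched as follows (path influences taken from the text; illustrative code only):

```python
# Aggregating the influence of element 11 on element 1 over rho-paths of
# length <= s = 3; per-path influences via formula (9.1) from the text.
path_influence = {
    ("11", "8", "2", "1"): 0.8,
    ("11", "9", "4", "1"): 0.426,
    ("11", "8", "2", "3", "1"): 0.56,
    ("11", "9", "4", "3", "1"): 0.284,
}
s = 3

admissible = [f for path, f in path_influence.items() if len(path) - 1 <= s]
sum_paths = min(1.0, sum(admissible))  # formula (9.3)
max_path = max(admissible)             # formula (9.4)

print(sum_paths, max_path)  # 1.0 0.8
```

Only the two length-3 paths survive the filter, so the sum rule saturates at 1 and the max rule returns 0.8, as in the text.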
Table 9.3: Possible combinations of methods for indirect influence path aggregation.

Path influence \ Paths aggregation   | Sum of path influences | Maximal path influence | Threshold rule
Multiplication of direct influences  | SumPaths               | MaxPath                | MultT
Minimal direct influence             | —                      | MaxMin                 | MaxT

Table 9.4: Possible paths between elements 5 and 1.

ID | Path          | Multiplication of path influences | Minimal direct influence
1  | 5 → 2 → 1     | 0.2                               | 0.2
2  | 5 → 3 → 1     | 0.16                              | 0.4
3  | 5 → 4 → 1     | 0.174                             | 0.29
4  | 5 → 2 → 3 → 1 | 0.048                             | 0.2
5  | 5 → 4 → 3 → 1 | 0.116                             | 0.29
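Table 9.4 can be recomputed directly from the direct influences on the edges of each path (edge values read off the example network; illustrative code):

```python
from math import prod

# Direct influences on the edges of the five rho-paths from 5 to 1.
paths = {
    "5-2-1": [0.2, 1.0],
    "5-3-1": [0.4, 0.4],
    "5-4-1": [0.29, 0.6],
    "5-2-3-1": [0.2, 0.6, 0.4],
    "5-4-3-1": [0.29, 1.0, 0.4],
}

# For each path: (multiplication of direct influences, weakest link).
results = {p: (round(prod(w), 3), round(min(w), 3)) for p, w in paths.items()}
for p, (mult, weakest) in results.items():
    print(p, mult, weakest)
```

The printed pairs match the two influence columns of Table 9.4 row by row.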
For our numerical example, the matrix C*, which represents the total influence and is used for aggregating the influences into a single vector with respect to the weights, can be constructed in different ways. For instance, let us calculate the influence of element 5 on element 1. Consider all possible paths from element 5 to element 1; they are shown in Table 9.4. We can now aggregate this information into a single value by the different methods. To compare different paths by the threshold rule, the following grades of direct influence were developed:

Grades:
0. c_ij = 0;
1. 0 < c_ij ≤ 0.25;
2. 0.25 < c_ij ≤ 0.5;
3. 0.5 < c_ij ≤ 0.8;
4. 0.8 < c_ij ≤ 1.
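The grade scale above can be written as a small binning function (illustrative code; boundaries inclusive on the left bins, as in the list):

```python
import bisect

# Map a direct influence c in [0, 1] to grades 0-4 via the scale above.
def grade(c):
    return 0 if c == 0 else bisect.bisect_left([0.25, 0.5, 0.8], c) + 1

print([grade(c) for c in (0, 0.2, 0.29, 0.4, 0.6, 1.0)])  # [0, 1, 2, 2, 3, 4]
```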
Now we can compare the paths between elements 5 and 1 according to the threshold rule. Note that for the threshold rule the values on the edges are replaced by the grades proposed above. The results are provided in Table 9.5, and the overall results in Table 9.6. Similarly, we can estimate the influence of any other elements and construct the matrix C* according to the different methods; the results are provided in Tables 9.7–9.11. The aggregation of matrix C*(s) into a single vector that shows the total influence of each element of the system can be done with respect to the weights (importance) of each element, as is done in Section 9.2. As a result, we can see that elements 6, 7, and 11 are the most pivotal in the system, while the influence of element 5 exceeds the influence of elements 1 and 10.

Table 9.5: Paths aggregation by the threshold rule, s = 3.

ID k | Path          | Grades on edges | Path influence v(P_k^{15})
1    | 5 → 2 → 1     | 1, 4            | 66
2    | 5 → 3 → 1     | 2, 2            | 33
3    | 5 → 4 → 1     | 2, 3            | 21*
4    | 5 → 2 → 3 → 1 | 1, 3, 2         | 84
5    | 5 → 4 → 3 → 1 | 2, 4, 2         | 33

(* marks the minimal score, i.e., the path selected by the threshold rule.)
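The scores in Table 9.5 follow from the formula for v(P_k^{ij}) with m = 5 grades and s = 3, where v_l counts the edges of the (l−1)-th grade on the path (grade 0 being the worst). A minimal sketch (illustrative, not the authors' code):

```python
# Threshold-rule score v(P) = sum_l v_l * (s+1)^(m-l) + s - n(k).
def v_score(grades, s=3, m=5):
    counts = [grades.count(g) for g in range(m)]  # v_1 .. v_m
    return sum(v * (s + 1) ** (m - 1 - g) for g, v in enumerate(counts)) + s - len(grades)

# Edge grades of the five 5-to-1 paths, as in Table 9.5.
paths = {1: [1, 4], 2: [2, 2], 3: [2, 3], 4: [1, 3, 2], 5: [2, 4, 2]}
scores = {k: v_score(g) for k, g in paths.items()}
print(scores)                       # {1: 66, 2: 33, 3: 21, 4: 84, 5: 33}
print(min(scores, key=scores.get))  # 3  (the path selected by the rule)
```

Path 3 has the smallest score, which matches the starred entry of Table 9.5 and the MultT/MaxT rows of Table 9.6.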
Table 9.6: The total influence of element 5 on element 1 by different methods.

Method   | Considered path IDs | Influence
SumPaths | 1–5                 | 0.698
MaxPath  | 1                   | 0.2
MaxMin   | 2                   | 0.4
MultT    | 3                   | 0.174
MaxT     | 3                   | 0.29
Table 9.7: Matrix C* for the numerical example, SumPaths.

i \ j |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10  |  11  | Weights
1     |  0   |  1   | 0.40 |  1   | 0.70 |  1   |  1   | 0.99 | 0.71 |  0   |  1   |  0.11
2     |  0   |  0   |  0   |  0   | 0.20 |  1   |  0   | 0.80 |  0   |  0   |  1   |  0.11
3     |  0   | 0.60 |  0   |  1   | 0.81 | 0.60 |  1   | 0.48 | 0.71 |  0   |  1   |  0.11
4     |  0   |  0   |  0   |  0   | 0.29 |  0   |  1   |  0   | 0.71 |  0   |  1   |  0.11
5     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
6     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
7     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
8     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
9     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
10    |  1   |  1   | 0.40 |  1   | 0.70 |  1   |  1   | 0.99 | 0.71 |  0   |  1   |  0.11
11    |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
Total              | 0.11 | 0.29 | 0.09 | 0.33 | 0.30 | 0.40 | 0.44 | 0.36 | 0.31 |  0   |  1
Total (normalized) | 0.03 | 0.08 | 0.02 | 0.09 | 0.08 | 0.11 | 0.12 | 0.10 | 0.09 |  0   | 0.27
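The "Total" rows of Tables 9.7–9.11 appear to weight each column of C*(s) by the element weights (0.11 for each of the nine lending elements, 0 for elements 5 and 11, which lend nothing); this reading of the tables is an assumption, sketched here for column j = 2 of Table 9.7, whose nonzero entries sit in rows 1, 3 and 10:

```python
# Nonzero entries c*_{i2} of column 2 in Table 9.7 (SumPaths).
col2 = {1: 1.0, 3: 0.60, 10: 1.0}
# Hypothetical element weights: 0.11 for the nine lenders, 0 for 5 and 11.
weights = {i: 0.11 for i in (1, 2, 3, 4, 6, 7, 8, 9, 10)}

total2 = sum(weights.get(i, 0) * c for i, c in col2.items())
print(round(total2, 2))  # 0.29
```

The result 0.29 matches the Total entry for column 2 in Table 9.7.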
Table 9.8: Matrix C* for the numerical example, MaxPath.

i \ j |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10  |  11  | Weights
1     |  0   |  1   | 0.40 | 0.60 | 0.20 |  1   | 0.60 | 0.80 | 0.42 |  0   |  1   |  0.11
2     |  0   |  0   |  0   |  0   | 0.20 |  1   |  0   | 0.80 |  0   |  0   |  1   |  0.11
3     |  0   | 0.60 |  0   |  1   | 0.40 | 0.60 |  1   | 0.48 | 0.71 |  0   |  1   |  0.11
4     |  0   |  0   |  0   |  0   | 0.29 |  0   |  1   |  0   | 0.71 |  0   |  1   |  0.11
5     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
6     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
7     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
8     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
9     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
10    |  1   |  1   | 0.40 | 0.60 | 0.20 |  1   | 0.60 | 0.80 | 0.42 |  0   |  1   |  0.11
11    |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
Total              | 0.11 | 0.29 | 0.09 | 0.24 | 0.14 | 0.40 | 0.36 | 0.32 | 0.25 |  0   |  1
Total (normalized) | 0.03 | 0.09 | 0.03 | 0.08 | 0.04 | 0.12 | 0.11 | 0.10 | 0.08 |  0   | 0.31

Table 9.9: Matrix C* for the numerical example, MaxMin.

i \ j |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10  |  11  | Weights
1     |  0   |  1   | 0.40 | 0.60 | 0.40 |  1   | 0.60 | 0.80 | 0.60 |  0   |  1   |  0.11
2     |  0   |  0   |  0   |  0   | 0.20 |  1   |  0   | 0.80 |  0   |  0   |  1   |  0.11
3     |  0   | 0.60 |  0   |  1   | 0.40 | 0.60 |  1   | 0.60 | 0.71 |  0   |  1   |  0.11
4     |  0   |  0   |  0   |  0   | 0.29 |  0   |  1   |  0   | 0.71 |  0   |  1   |  0.11
5     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
6     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
7     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
8     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
9     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
10    |  1   |  1   | 0.40 | 0.60 | 0.40 |  1   | 0.60 | 0.80 | 0.60 |  0   |  1   |  0.11
11    |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
Total              | 0.11 | 0.29 | 0.09 | 0.24 | 0.19 | 0.40 | 0.36 | 0.33 | 0.29 |  0   |  1
Total (normalized) | 0.03 | 0.09 | 0.03 | 0.07 | 0.06 | 0.12 | 0.11 | 0.10 | 0.09 |  0   | 0.30
Table 9.10: Matrix C* for the numerical example, MultT.

i \ j |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10  |  11  | Weights
1     |  0   |  1   | 0.40 | 0.60 | 0.18 |  1   | 0.60 | 0.80 | 0.42 |  0   |  1   |  0.11
2     |  0   |  0   |  0   |  0   | 0.20 |  1   |  0   | 0.80 |  0   |  0   |  1   |  0.11
3     |  0   | 0.60 |  0   |  1   | 0.40 | 0.60 |  1   | 0.48 | 0.71 |  0   |  1   |  0.11
4     |  0   |  0   |  0   |  0   | 0.29 |  0   |  1   |  0   | 0.71 |  0   |  1   |  0.11
5     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
6     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
7     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
8     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
9     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
10    |  1   |  1   | 0.40 | 0.60 | 0.18 |  1   | 0.60 | 0.80 | 0.42 |  0   |  1   |  0.11
11    |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
Total              | 0.11 | 0.29 | 0.09 | 0.24 | 0.14 | 0.40 | 0.36 | 0.32 | 0.25 |  0   |  1
Total (normalized) | 0.03 | 0.09 | 0.03 | 0.07 | 0.04 | 0.12 | 0.11 | 0.10 | 0.08 |  0   | 0.31

Table 9.11: Matrix C* for the numerical example, MaxT.

i \ j |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10  |  11  | Weights
1     |  0   |  1   | 0.40 | 0.60 | 0.29 |  1   | 0.60 | 0.80 | 0.60 |  0   |  1   |  0.11
2     |  0   |  0   |  0   |  0   | 0.20 |  1   |  0   | 0.80 |  0   |  0   |  1   |  0.11
3     |  0   | 0.60 |  0   |  1   | 0.40 | 0.60 |  1   | 0.60 | 0.71 |  0   |  1   |  0.11
4     |  0   |  0   |  0   |  0   | 0.29 |  0   |  1   |  0   | 0.71 |  0   |  1   |  0.11
5     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
6     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
7     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
8     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
9     |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0.11
10    |  1   |  1   | 0.40 | 0.60 | 0.29 |  1   | 0.60 | 0.80 | 0.60 |  0   |  1   |  0.11
11    |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0
Total              | 0.11 | 0.29 | 0.09 | 0.24 | 0.16 | 0.40 | 0.36 | 0.33 | 0.29 |  0   |  1
Total (normalized) | 0.03 | 0.09 | 0.03 | 0.08 | 0.05 | 0.13 | 0.11 | 0.10 | 0.09 |  0   | 0.31
9.4 Empirical Application — Country Assessment

In this section, the model outlined above is applied to evaluate the level of interconnectedness of banking systems. We try to detect the countries with the most interconnected financial systems, taking into consideration the intensities of the interactions between countries' banking systems. At the same time, we understand the limitations of the existing data: the analysis of cross-border country exposures relies primarily on data aggregated at the country level and hence overlooks bank-level heterogeneity.
F. Aleskerov et al.
The data are taken from the Bank for International Settlements (BIS) statistics.1 More precisely, we use the BIS consolidated banking statistics on an ultimate risk basis. For example, suppose that a bank from country A extends a loan to a company from country B and the loan is guaranteed by a bank from country C. On an ultimate risk basis, this loan would be reported as a claim on country C because, if the company from B were unable to meet its obligations, the bank from A would ultimately be exposed to the bank from C that guaranteed the loan. In other words, claims are allocated to the country where the final risk lies. Foreign claims in the BIS statistics are designed to analyze the exposure of internationally active banks to individual countries and sectors. The sectoral classification consists of (a) banks; (b) the official sector, which includes general government, central banks, and international organizations; and (c) the non-bank private sector, including non-bank financials. Thus, our figures indicate the ith banking system's foreign claims on borrowers from different sectors in country j, which include its worldwide consolidated direct cross-border claims on country j plus the positions booked by its affiliates (subsidiaries and branches) in country j vis-à-vis residents of country j. The data cover on-balance sheet claims as well as some off-balance sheet exposures of banks headquartered in the reporting country and provide a measure of country credit risk exposures consonant with banks' own risk management systems. The reporting countries comprise the G10 countries (Belgium, Canada, France, Germany, Japan, Netherlands, Sweden, Switzerland, UK, and USA) plus Australia, Austria, Chile, Finland, Greece, India, Ireland, South Korea, Portugal, Spain, and Turkey.
Apart from information about banking foreign claims, the BIS consolidated banking statistics include aggregated data on regional country groupings such as the regional residuals of developed countries, offshore centers, Africa and the Middle East, Asia-Pacific, Europe, Latin America and the Caribbean, and "Unallocated" claims. These positions are used in the BIS statistics in cases when the balance-of-payments concept of residence could not be applied to both the reporting bank and its counterparty. In this paper, we analyze only cross-country relationships, so we exclude these groupings from the database, as well as the positions of international financial organizations.
1 Table 9D, "Foreign claims by nationality of reporting banks, ultimate risk basis" (http://www.bis.org/statistics/r_qa1509_hanx9d_u.pdf).
As a result, we obtained a database covering 22 countries that have bank foreign claims and 198 countries that have obligations at the end of Q1 2015. Thus, the network considered on the basis of these data includes all information about international borrowings, except for transactions between non-reporting countries. According to the BIS Statistical Bulletin, our network covers about 94% of total foreign claims and other potential exposures on an ultimate risk basis. An important aspect of the analysis is the choice of the critical loan amount (threshold level) for each country. One possible way to define it is to follow the recommendations of the Basel Committee (BCBS, 2013) on large exposure limits (25% of Tier 1 capital). At the international level, when we deal with a banking system's borrowings, choosing an appropriate threshold level (critical loan amount) is not so obvious. We decided to put on the edges of the network not the loans themselves, but the ratio of loans to the gross domestic product (GDP) of the lending country, in order to take into account the relative size of the loan. In this paper, nominal GDP is used; however, the GDP measure can be replaced by estimates of total banking system assets or capital. So, suppose that the threshold q is 10% of nominal GDP. Another important issue here is to assign grades to the direct influence values for the threshold procedure. In Table 9.12, we propose a system of grades which, in our opinion, is reasonable for this case: the highest grade corresponds to the highest influence value, and the lowest grade indicates the absence of influence between elements. Thus, we can calculate the influence value of each borrower according to our methodology. Table 9.13 contains a list of the countries that were in the TOP-10 by at least one of the LRIC indices.

Table 9.12: Grades for direct influence values.

Grade  Condition            Description
7      cij = 1              ultimately high influence
6      0.92 ≤ cij < 1       very high influence (by analogy with the capital adequacy ratio for banks: a loss of more than 92% of assets drives the bank's capital below zero)
5      0.85 ≤ cij < 0.92    high influence (according to the upper value of the capital adequacy ratio, a standard procedure of the Bank of Russia)
4      0.75 ≤ cij < 0.85    average influence
3      0.5 ≤ cij < 0.75     moderate influence
2      0.25 ≤ cij < 0.5     low influence
1      0 < cij < 0.25       very low influence
0      cij = 0              no influence
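The grade assignment of Table 9.12 is a simple threshold lookup. A minimal sketch (the function name is ours; the boundaries are those of the table):

```python
# Sketch: map a direct influence value c in [0, 1] to the grades of Table 9.12.
def influence_grade(c):
    """Return the grade (0-7) for a direct influence value c."""
    if c == 1:
        return 7   # ultimately high influence
    if c >= 0.92:
        return 6   # very high influence
    if c >= 0.85:
        return 5   # high influence
    if c >= 0.75:
        return 4   # average influence
    if c >= 0.5:
        return 3   # moderate influence
    if c >= 0.25:
        return 2   # low influence
    if c > 0:
        return 1   # very low influence
    return 0       # no influence

print(influence_grade(1.0), influence_grade(0.95), influence_grade(0.1))  # 7 6 1
```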
Table 9.13: Power of countries as borrowers. For each country, the table reports the values of the KBI and of the LRIC indices (SumPaths, MaxPath, MaxMin, Simul, MaxT, MultT) together with the corresponding ranks. The countries listed are the United States, Hong Kong SAR, China, the United Kingdom, Singapore, the Cayman Islands, Brazil, Luxembourg, Poland, Germany, Mexico, the Czech Republic, Japan, Norway, Finland, Denmark, Italy, and Austria.
Figure 9.5: A graphical representation of the banking foreign claims network.
A graphical representation of the modeled network is shown in Figure 9.5. We have also calculated the Key Borrower Index and compared the results with our methods. The countries with the largest values of the index are considered the most pivotal/interconnected ones in the market. All five path-based versions of LRIC (SumPaths, MaxPath, MaxMin, MaxT, MultT) give almost identical rankings, whereas LRIC based on simulations (Simul; for more details on this measure, please see Appendix 9B) demonstrates some differences. However, the main differences start from the middle of the TOP-10 countries. The TOP-2 positions are stable according to all methods and are occupied by the United States of America (USA) and Hong Kong. The LRIC index based on simulations also groups the rankings of some countries, forming regional clusters (Scandinavian countries, Baltic countries, Australia and New Zealand). The results allowed us to distinguish two types of countries. First of all, the highest ratings are typical for large and strong economies such as the USA, UK, and China. They have developed financial systems with a high level of trustworthiness and high sovereign ratings. As a result, their financial products (banking deposits or securities) attract a large number of investors. These results are in line with the findings of (IMF, 2015) and could be a good basis for a "too big to fail" policy: the financial sectors of these countries could be a source of global systemic risk and should be more closely monitored.
However, in contrast to the previous section, we can identify a group of countries that are not so large in terms of the size of their economies but also received the highest LRIC values. Countries like Hong Kong, the Cayman Islands, Singapore, and Luxembourg could be good examples of "too interconnected to fail" economies. Due to their attractive business environment, well-developed infrastructure, human capital, and positive reputation, these countries encourage investors to place their assets in their financial systems, which makes these countries important borrowers. The appearance of these countries at the top of the ranking may look surprising at first sight, but it is in line with our initial hypothesis that the greatest influence is exerted not only by the largest market participants, but also by the most interconnected ones. In other words, for these countries each individual cash flow is not so significant, but their combination can be critical for the stability of the financial system as a whole. For example, in the case of the elimination of such a country from the network, we will most likely not see a chain of cascading failures (because the volumes of interaction with each country are not so great), but it will lead to the redirection of financial flows to other countries, which will affect overall financial stability. In this regard, there is an interesting question about the sensitivity of the results to changes in the network structure. According to our estimates, the LRIC method allows determining the key elements in networks of any configuration, and it can also be used to analyze the dynamics of the network configuration. We have also estimated the level of country interconnectedness using a broad range of existing centrality measures: weighted degree centrality, closeness centrality, betweenness centrality, PageRank, and eigenvector centrality. These measures are described in Appendix 9C. The results of the centrality index calculations are shown in Table 9.14.
In order to compare rankings, we used a correlation analysis. Since a position in a ranking is an ordinal variable, rank correlation coefficients, rather than the traditional Pearson coefficient, should be used to assess the consistency of different orderings. In our work, we apply the idea of the Kendall metric (Kendall, 1970), which counts the number of pairwise disagreements between two ranking lists. We also used the Goodman–Kruskal γ rank coefficient, which shows the similarity of the orderings of the data when ranked by each of the quantities (Goodman and Kruskal, 1954). This coefficient is calculated as γ = (NS − ND)/(NS + ND), where NS is the number of pairs of cases ranked in the same order on both variables (the number of concordant pairs) and ND is the number of pairs of cases ranked in reversed order on both variables (the number of discordant pairs).
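The pair counting behind both coefficients can be sketched as follows (a minimal implementation for rankings without ties; the function name is ours):

```python
from itertools import combinations

# Sketch: count concordant (ns) and discordant (nd) pairs between two rankings
# of the same items, then form the Goodman-Kruskal gamma and the Kendall tau.
def gamma_and_tau(rank_a, rank_b):
    ns = nd = 0
    n = len(rank_a)
    for i, j in combinations(range(n), 2):
        s = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if s > 0:
            ns += 1   # pair ordered the same way in both rankings
        elif s < 0:
            nd += 1   # pair ordered in the opposite way
    gamma = (ns - nd) / (ns + nd)
    tau = (ns - nd) / (n * (n - 1) / 2)  # tau-a: no correction for ties
    return gamma, tau

print(gamma_and_tau([1, 2, 3, 4], [1, 2, 4, 3]))  # both coefficients = 2/3 here
```

For tie-free rankings γ and the uncorrected τ-a coincide, as in this example; with ties, tie-corrected τ variants would differ from γ.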
Table 9.14: Rankings by centrality measures.

Name              WInDeg  WOutDeg  WDeg  WDDif  Betw  Clos  PageRank  EigenVec
United States          1        3     1    198     2     1         1         1
United Kingdom         2        1     2      3     1     2         2         2
Germany                3        5     5      6    14    58         3         5
France                 4        4     1      2    16   105         4         4
Japan                  6        2     2      1    11    54         5         3
Netherlands           10        9     5      9    15   151         6         9
Cayman Islands         5       59     4    197    22   157         7        10
China                  9       61     3    195    24   133         8        14
Hong Kong SAR          8       97     9    196    25   125         9        13
Italy                  7       12    13     13     8    79        10        11
Spain                 11        6    15      4     4   158        11         8
Canada                16        8    14      7    10    61        12         6
Luxembourg            13       13    10    194    27    48        13        17
Singapore             14       14     6    192    28    64        14        16
Brazil                15       15     8    193    29   197        15        19
Australia             12       11    18     10     3     9        16        12
Switzerland           17        7    22      5    17   190        17         7
Poland                19       19    20    188    30    42        19        28
Mexico                21       21    11    191    33   116        21        24
Czech Republic        22       22     7    180    34    70        22        41
Belgium               24       17    27    174     5    12        24        21
India                 25       27    23    186     6    62        25        22
Austria               27       16    33     11     9    11        27        25
Denmark               28       28    17    189    35   108        28        34
Sweden                32       10    24      8     7   182        29        15
Norway                29      144    16    185    36   188        30        37
Finland               30       29    26    183    19     3        33        35
The results are provided below (Tables 9.14 and 9.15). We can see that, according to our estimations, the rankings of the LRIC indices are highly related to the results of PageRank. This fact is confirmed by both correlation coefficients (the Kendall τ and the Goodman–Kruskal γ-coefficient). It should be noted that weighted degree centrality also gives similar rankings. As for the other centrality measures, their correlation coefficients are relatively high, except for the betweenness centrality and weighted out-degree centrality measures, for which the correlation coefficient is less than 0.5 (Kendall τ) or less than 0.4 (γ-coefficient).
Table 9.15: Kendall τ-coefficient and Goodman–Kruskal γ-coefficient. The table reports the pairwise rank correlations (Kendall τ in the first panel, Goodman–Kruskal γ in the second) among the rankings produced by WInDeg, WOutDeg, WDeg, WDDif, Clos, PageRank, EigenVec, Betw, Simul, SumPaths, MaxPath, MaxMin, MaxT, and MultT.
However, although the correlation coefficients between most centrality measures and the LRIC indices are quite high, it was shown in Table 9.15 that, in contrast to the LRIC indices, classical centrality measures are worse at detecting systemically important countries of the second group (e.g., the Cayman Islands, Luxembourg, Hong Kong).

9.5 Conclusions

The network approach can be applied to different segments of the financial system in order to characterize systemic risk. In our work, network models of systemic risk have been applied to a specific section of the financial market, the international credit market. We explore two approaches to measuring systemic importance: measurement of the intensities of long-range interactions based on paths, and measurement of the intensities of long-range interactions based on simulations. This network model aims to help regulators identify the financial elements that are too big or too interconnected to fail during any specific crisis. The proposed methodology makes it possible to identify countries which at first sight do not have a high level of systemic importance, but which have a significant impact on the stability of the system as a whole. Also, the LRIC index based on simulations could be a useful instrument for the identification of regional financial clusters. We carried out estimations for hypothetical examples and presented an empirical analysis of cross-border country exposures to demonstrate the feasibility of the proposed methodology. The empirical results based on our methodology are in line with the conclusions made by the IMF and other international financial institutions. In addition, these results draw attention to the importance of countries which, due to their intermediation role in global finance, can have a strong influence on the stability of the entire system.
Acknowledgments

We thank Professors Jane M. Binner, Logan Kelly, Richard Anderson, and Dr. John Duca for their invitation to the conference "Financial Services Indices, Liquidity and Economic Activity," jointly hosted by the Bank of England and the University of Birmingham's Financial Resilience Research Centre on 23–24 May 2017, and the participants of that conference for many useful comments. Sections 9.2–9.3 were funded by the Russian Science Foundation under grant No. 17-18-01651, National Research University Higher School of Economics.
Bibliography

Acharya, V. V., Pedersen, L. H., Philippon, T. and Richardson, M. (2010). Measuring Systemic Risk. Federal Reserve Bank of Cleveland, Working Paper No. 02.
Adrian, T. and Brunnermeier, M. (2010). CoVaR. Federal Reserve Bank of New York, Staff Report No. 348.
Akram, Q. F. and Christophersen, C. (2010). Interbank Overnight Interest Rates — Gains from Systemic Importance. Norges Bank, Working Paper No. 11.
Aldasoro, I. and Alves, I. (2017). Multiplex Interbank Networks and Systemic Importance: An Application to European Data. BIS Working Papers No. 603.
Aleskerov, F., Andrievskaya, I. and Permjakova, E. (2016). Key Borrowers Detected by the Intensities of Their Short-Range Interactions. In Models, Algorithms, and Technologies for Network Analysis: From the 4th International Conference on Network Analysis, edited by V. A. Kalyagin, P. Koldanov and P. M. Pardalos. New York: Springer International Publishing.
Aleskerov, F. T., Yuzbashev, D. V. and Yakuba, V. I. (2007). Threshold Aggregation for Three-Graded Rankings. Automation and Remote Control (in Russian), No. 1, 147–152.
Aleskerov, F. T. (2006). Power Indices Taking into Account Agents' Preferences. In Simeone, B. and Pukelsheim, F. (eds.), Mathematics and Democracy, Springer, Berlin, 1–18.
Allen, F. and Gale, D. (2000). Financial Contagion. Journal of Political Economy, 108(1), 1–33.
Allen, F. and Babus, A. (2009). Networks in Finance. In Kleindorfer, P. and Wind, J. (eds.), Network-based Strategies and Competencies, 367–382.
Angelini, P., Maresca, G. and Russo, D. (1996). Systemic Risk in the Netting System. Journal of Banking and Finance, 20, 853–868.
Barrat, A., Barthelemy, M., Pastor-Satorras, R. and Vespignani, A. (2004). The Architecture of Complex Weighted Networks. Proceedings of the National Academy of Sciences, 101(11), 3747–3752.
Basel Committee on Banking Supervision (BCBS) (2013).
Global Systemically Important Banks: Updated Assessment Methodology and the Higher Loss Absorbency Requirement Consultative Document. Bech, M. L. and Atalay, E. (2008). The Topology of the Federal Funds Market. Federal Reserve Bank of New York, Staff Report No. 354. Bernanke, B. (2010). Causes of the Recent Financial and Economic Crisis. Testimony before the Financial Crisis Inquiry Commission (FCIC), September 2, 2010, Washington D.C. Bluhm, M. and Krahnen, J. P. (2014). Systemic Risk in An Interconnected Banking System with Endogenous Asset Markets. Journal of Financial Stability, 13, 75–94. Bonacich, P. (1972). Technique for Analyzing Overlapping Memberships. Sociological Methodology, 4, 176–185. Bruhl, V. (2017). How to Define a Systemically Important Financial Institution — A New Perspective. Intereconomics, March 2017, 52(2), 107–110. Cajueiro, D. O. and Tabak, B. M. (2008). The Role of Banks in the Brazilian Interbank Market: Does Bank Type Matter? Physica A: Statistical Mechanics and its Applications, 387(27), 6825–6836. Chan-Lau, J. A. (2010). The Global Financial Crisis and its Impact on the Chilean Banking System. IMF Working Paper No. 108. Drehmann, M. and Tarashev, N. (2011). Measuring the Systemic Importance of Interconnected Banks. Bank for International Settlements. Working Paper No. 342.
ECB (2010). Financial Stability Review, June 2010.
ECB (2006). Financial Stability Review, December 2006.
ECB (2014). Financial Stability Review.
Furfine, C. (2003). Interbank Exposures: Quantifying the Risk of Contagion. Journal of Money, Credit and Banking, 35, 111–128.
Garlaschelli, D., Battiston, S., Castri, M., Servedio, V. D. P. and Caldarelli, G. (2005). The Scale-Free Topology of Market Investments. Physica A, 350, 491–499.
Garratt, R., Webber, L. and Willison, M. (2012). Using Shapley's Asymmetric Power Index to Measure Banks' Contributions to Systemic Risk. Bank of England, Working Paper No. 468.
Giovannetti, A. (2012). Financial Contagion in Industrial Clusters: A Dynamical Analysis and Network Simulation. University of Siena, Working Paper.
Goodman, L. A. and Kruskal, W. H. (1954). Measures of Association for Cross Classifications. Journal of the American Statistical Association, 49(268), 732–764.
Huang, X., Zhou, H. and Zhu, H. (2012). Assessing the Systemic Risk of a Heterogeneous Portfolio of Banks During the Recent Financial Crisis. Journal of Financial Stability, 8(3), 193–205.
Ikeda, Y., Kubo, O. and Kobayashi, Y. (2007). Firm Dynamics Simulation Using Game-theoretic Stochastic Agents. Physica A, 382, 138–148.
IMF/BIS/FSB (2009). Guidance to Assess the Systemic Importance of Financial Institutions, Markets and Instruments: Initial Considerations. Report to the G-20 Finance Ministers and Central Bank Governors.
IMF (2010). Integrating Stability Assessments Under the Financial Sector Assessment Program into Article IV Surveillance: Background Material.
IMF (2015). Global Financial Stability Report. A Report by the Monetary and Capital Markets Department on Market Developments and Issues.
Iori, G., de Masi, G., Precup, O. V., Gabbi, G. and Caldarelli, G. (2008). A Network Analysis of the Italian Overnight Money Market. Journal of Economic Dynamics and Control, 32(1), 259–278.
Kendall, M. (1970).
Rank Correlation Methods, 4th Edition, Griffin, London.
Lehar, A. (2005). Measuring Systemic Risk: A Risk Management Approach. Journal of Banking and Finance, 29(10), 2577–2603.
Leon, C. and Murcia, A. (2012). Systemic Importance Index for Financial Institutions: A Principal Component Analysis Approach. Banco de la República (Central Bank of Colombia) Working Papers, No. 741.
Malik, S. and Xu, T. T. (2017). Interconnectedness of Global Systemically-Important Banks and Insurers. IMF Working Papers, No. 17/210.
Minoiu, C. and Reyes, J. A. (2013). A Network Analysis of Global Banking: 1978–2010. Journal of Financial Stability, 9(2), 168–184.
Myerson, R. B. (1977). Graphs and Cooperation in Games. Mathematics of Operations Research, 2, 225–229.
Newman, M. E. J. (2010). Networks: An Introduction. Oxford University Press, Oxford, UK.
Peltonen, T. A., Rancan, M. and Sarlin, P. (2015). Interconnectedness of the Banking Sector as a Vulnerability to Crises. ECB Working Paper Series, No. 1866.
Segoviano, M. A. and Goodhart, C. (2009). Banking Stability Measures. IMF Working Paper No. 4.
Shapley, L. S. and Shubik, M. (1954). A Method for Evaluating the Distribution of Power in a Committee System. American Political Science Review, 48, 787–792.
Sun, A. J. and Chan-Lau, J. (2017). Financial Networks and Interconnectedness in An Advanced Emerging Market Economy. Quantitative Finance, 17(12), Systemic Risk Analytics. Tarashev, N., Borio, C. and Tsatsaronis, K. (2010). Attributing Systemic Risk to Individual Institutions. BIS Working Papers No. 308. von Peter, G. (2007). International Banking Centres: A Network Perspective. BIS Quarterly Review. Zhou, C. (2010). Are Banks Too Big to Fail? Measuring Systemic Importance of Financial Institutions. International Journal of Central Banking.
Appendix 9A: Methodology for the Case with Short-Range Interactions

In this section, we describe the methodology for the case of short-range interactions proposed in (Aleskerov et al., 2016). This method is based on the power index analysis worked out in (Aleskerov, 2006) and adjusted for network theory. The index is called the Key Borrower Index (KBI) and is employed to find the most pivotal borrower in a loan market, taking into account some specific characteristics of financial interactions. For the "one pure lender, many borrowers/lenders" case, the KBI is calculated for each lender individually in order to determine the influence of each borrower with respect to that lender. The distinctive feature of the proposed index is that it takes into account short-range interactions between each lender and its borrowers. In other words, only direct neighbors are considered when estimating the direct and indirect influence on a specific lender. The methodology includes several steps. First, some key terms are defined as follows:

• A critical group is a group whose default may lead to the default of the lender (while the lender is able to cover its losses from the distress of members outside the critical group). Thus, a group is "critical" if the total amount of its members' borrowings is greater than or equal to a predefined threshold.
• A borrower is pivotal in a group if his/her exclusion from the critical group makes it non-critical.
• The most pivotal borrower is the one that becomes pivotal in more critical groups than any other borrower.

The next step is to estimate the intensity of agents' interconnection in the following way (Aleskerov et al., 2016):

f(i, w_l) = \frac{P_{li} + P'_{li}}{\sum_j P_{lj}},    (9A.1)
where f(i, w_l) is the intensity of agents' interconnection, w_l is a critical group with respect to a lender l in which a borrower i is pivotal, P_{li} is the total direct loan taken by borrower i from creditor l, and P'_{li} is the total indirect loan taken by borrower i from creditor l. At this stage, only indirect connections of the first order (between a lender and a borrower) are considered. The third step is to calculate the intensity of indirect connections between a lender L and a borrower B_i through a borrower B_j (denoted p_{ji}) as follows (Aleskerov et al., 2016):

p_{ji} = \begin{cases}
\dfrac{P_{ji}}{\sum_k P_{Lk}}, & P_{ji} < P_{Lj}, \\[1ex]
\dfrac{P_{Lj}}{\sum_k P_{Lk}}, & P_{ji} \geq P_{Lj},
\end{cases}
\quad i \neq j,    (9A.2)

where k = 1, \dots, 3 runs over the borrowers of lender L, and the intensity of the direct connection between L and B_i as

p_{Li} = \frac{P_{Li}}{\sum_k P_{Lk}}.

As a result, the total intensity of the connection between L and B_i is

f_{li} = \sum_j p_{ji} + p_{Li}.    (9A.3)

In such a way, the intensity of connection for a borrower i is calculated separately for each critical group and then aggregated over all possible critical groups as

\chi_i = \sum_{w_l} \frac{f(i, w_l)}{N_w},    (9A.4)

where N_w is the number of borrowers in the group. The KBI is then calculated as

\mathrm{KBI}_i = \chi_i \Big/ \sum_j \chi_j.    (9A.5)

The value of the index for each borrower reflects the magnitude of his/her pivotal role in the group: the higher the value, the more pivotal the agent. The most pivotal borrower is the one that becomes pivotal in more critical groups than any other borrower. For the "many lenders, many borrowers" case, the KBI is aggregated over all lenders, taking into account the size of each lender's total loans.
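As a toy illustration of these steps, the following sketch computes the KBI for a single lender using direct loans only (the indirect loans P' are set to zero; the loan amounts and threshold are hypothetical, not the chapter's example):

```python
from itertools import combinations

# Sketch: Key Borrower Index for one lender, direct loans only (P' = 0).
# A group is critical if its borrowings reach the threshold; a borrower is
# pivotal if removing it makes the group non-critical; intensities follow
# (9A.1) with P' = 0, aggregation follows (9A.4)-(9A.5).
def kbi(loans, threshold):
    """loans: dict borrower -> amount lent to it by the lender."""
    total = sum(loans.values())
    chi = {b: 0.0 for b in loans}
    borrowers = list(loans)
    for r in range(1, len(borrowers) + 1):
        for group in combinations(borrowers, r):
            if sum(loans[b] for b in group) < threshold:
                continue  # not a critical group
            for b in group:  # b is pivotal if the rest falls below the threshold
                if sum(loans[x] for x in group if x != b) < threshold:
                    chi[b] += (loans[b] / total) / len(group)
    s = sum(chi.values())
    return {b: chi[b] / s for b in chi}

print(kbi({"A": 50, "B": 30, "C": 20}, threshold=60))
```

With these numbers, A is pivotal in {A, B}, {A, C}, and {A, B, C}, while B and C are pivotal only once each, so A receives by far the largest index value.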
Appendix 9B: S-Long-Range Interactions Centrality Index Based on Simulations

Another approach to estimating the power of each element in the system is based on the following idea. Suppose that some borrowers are not capable of returning their loans. Will they form a critical group? Will it lead to their creditors, in turn, not covering their loans to other creditors? More formally, let us construct a matrix C = [c_{ij}] with respect to the matrix A and a predefined threshold as

c_{ij} = \begin{cases}
1, & a_{ij} \geq q_i, \\
a_{ij}/q_i, & 0 < a_{ij} < q_i, \\
0, & a_{ij} = 0.
\end{cases}    (9B.1)

In other words, the matrix C indicates what share of the threshold value (critical loan amount) the element i gave to the element j. The matrix C is used to evaluate the long-range influence between elements of the system through simulations. A graphical representation of the matrix is shown in Figure 9.6. Similarly, let us say that the group \Omega(i) \subseteq N_i is critical if \sum_{k \in \Omega(i)} c_{ik} \geq 1 and every element in the group \Omega(i) is incapable of returning its loan. If such a group \Omega(i) exists, then i is not capable of returning the loan to its own creditor.
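Equation (9B.1) translates directly into code; a minimal sketch with illustrative numbers (the exposure matrix and thresholds are ours, not the chapter's example):

```python
# Sketch of (9B.1): build C from an exposure matrix A and per-lender thresholds q.
def build_c(a, q):
    """a[i][j]: loan from i to j; q[i]: critical loan amount for lender i."""
    n = len(a)
    return [[1.0 if a[i][j] >= q[i] else (a[i][j] / q[i] if a[i][j] > 0 else 0.0)
             for j in range(n)] for i in range(n)]

a = [[0, 120, 30],
     [0,   0, 50],
     [80,  0,  0]]
q = [100, 100, 100]
print(build_c(a, q))  # [[0.0, 1.0, 0.3], [0.0, 0.0, 0.5], [0.8, 0.0, 0.0]]
```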
Figure 9.6: A graphical representation of the matrix C for the simulations approach.
Assume now that some borrowers (k_0 in total) are not capable of returning their loans. Then we can define a list of borrowers (k_1 in total) for which the k_0 borrowers form critical groups. Similarly, we can define a list of borrowers (k_2 in total) for which the k_0 + k_1 borrowers form critical groups. The procedure continues until the predefined limit on the number of stages s is reached, or until there is a stage r (in the worst case, when the parameter s is undefined, r is less than or equal to the diameter of the network) such that k_r = 0. Thus, we derive a list of borrowers (\sum_{l=1}^{\min(r,s)} k_l in total) that are not capable of returning their loans if we assume that the k_0 borrowers cannot return theirs. For instance, suppose that elements 5, 6, and 9 are not capable of returning their loans. Then we can define which elements in turn will not be able to cover their loans (see Table 9.16). Thus, the bankruptcy of elements 5, 6, and 9 leads to the bankruptcy of five other elements. Similarly, we can assume any other combination of borrowers that cannot return their loans and define the list of all bankrupted borrowers. The results form a new matrix C∗(s) = [c∗_{ij}(s)], which shows in what percentage of cases the borrower i could not return its loans if we assume that the borrower j is not capable of returning his own loans. The interpretation of the matrix C∗(s) is rather simple: if the value c∗_{ij} is close to 1, then borrower j is very critical for borrower i; on the contrary, if the value is close to 0, then borrower j hardly influences borrower i. The results can be aggregated into a single vector that shows the total influence of each element in the system. Thus, we can construct the matrix C∗(s) = [c∗_{ij}(s)] for the numerical example (see Table 9.17). One of the key advantages of this approach is that it accurately takes into account all chain reactions in the system, the so-called domino or contagion effect.
Table 9.16: Simulation procedure for the combination {5, 6, 9}.

Step   Bankrupted elements   Total number of elements
0      {5, 6, 9}             k0 = 3
1      {2, 4}                k1 = 2
2      {1, 3}                k2 = 2
3      {10}                  k3 = 1
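The stage-wise procedure can be sketched in code. This is an illustrative sketch only: the exposure matrix and the threshold rule that decides when the already-defaulted borrowers are critical for a lender are hypothetical stand-ins for the chapter's critical-group machinery.

```python
def default_cascade(exposures, thresholds, initial_defaults, max_stages=None):
    """Propagate defaults stage by stage.

    exposures[i][j] -- amount lender i has lent to borrower j (hypothetical data).
    thresholds[i]   -- fraction of i's total lending that, once lost, makes i
                       unable to return its own loans (hypothetical rule).
    max_stages      -- optional predefined limit s on the number of stages.
    Returns the list of per-stage default sets [K0, K1, ..., Kr].
    """
    n = len(exposures)
    defaulted = set(initial_defaults)
    stages = [set(initial_defaults)]
    while max_stages is None or len(stages) <= max_stages:
        new = set()
        for i in range(n):
            if i in defaulted:
                continue
            total = sum(exposures[i])
            if total == 0:
                continue
            lost = sum(exposures[i][j] for j in defaulted)
            if lost / total >= thresholds[i]:
                new.add(i)
        if not new:          # a stage r with k_r = 0 has been reached
            break
        defaulted |= new
        stages.append(new)
    return stages
```

The total number of affected borrowers is then `sum(len(s) for s in stages)`, mirroring the sum Σ k_l over all stages.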
F. Aleskerov et al.
Table 9.17: Matrix C* for the numerical example (simulations).

                     1    2      3    4    5      6      7      8      9      10     Weights
1                    0    1      0    0    1      1      0.4    0.2    1      1      0.27
2                    0    0      0    0    0      1      0      0      1      0      0.05
3                    0    0      0    0    0      1      0      0      0      0      0.04
4                    0    0      0    0    0      1      0      0      0      0      0.02
5                    0    0      0    0    0      1      0.4    0.58   0.47   1      0.30
6                    0    0      0    0    0      0      0      0      0      0      0.00
7                    0    0      0    0    0      0      0      0      1      1      0.27
8                    0    0      0    0    0      0      0      0      0      1      0.04
9                    0    0      0    0    0      0      0      0      0      0      0.00
10                   0    0      0    0    0      0      0      0      0      0      0.00
Total                0    0.27   0    0    0.27   0.69   0.23   0.23   0.74   0.89
Total (normalized)   0    0.085  0    0    0.085  0.211  0.072  0.071  0.216  0.261
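Transcribing Table 9.17 makes the aggregation checkable. The snippet below tests one natural reading, namely that each entry of the Total row is the Weights-weighted column sum of C*; this reading is consistent, up to rounding, with the table's own figures, but it is our interpretation rather than something the text states explicitly.

```python
import numpy as np

# Matrix C* and the Weights column, transcribed from Table 9.17.
C_star = np.array([
    [0, 1, 0, 0, 1, 1, 0.4, 0.2,  1,    1],
    [0, 0, 0, 0, 0, 1, 0,   0,    1,    0],
    [0, 0, 0, 0, 0, 1, 0,   0,    0,    0],
    [0, 0, 0, 0, 0, 1, 0,   0,    0,    0],
    [0, 0, 0, 0, 0, 1, 0.4, 0.58, 0.47, 1],
    [0, 0, 0, 0, 0, 0, 0,   0,    0,    0],
    [0, 0, 0, 0, 0, 0, 0,   0,    1,    1],
    [0, 0, 0, 0, 0, 0, 0,   0,    0,    1],
    [0, 0, 0, 0, 0, 0, 0,   0,    0,    0],
    [0, 0, 0, 0, 0, 0, 0,   0,    0,    0],
])
weights = np.array([0.27, 0.05, 0.04, 0.02, 0.30, 0.00, 0.27, 0.04, 0.00, 0.00])

total = weights @ C_star   # weighted total influence of each column (element) j
reported = np.array([0, 0.27, 0, 0, 0.27, 0.69, 0.23, 0.23, 0.74, 0.89])
```

Any mismatch beyond rounding error between `total` and `reported` would falsify this reading; for the transcribed figures the largest discrepancy is about 0.01.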
One of the key questions of this approach is which borrowers should be chosen in the first stage. For the general case, we can choose each borrower the same number of times. However, in real-life problems this is not always the case since different elements have different probability of default. It means that these probabilities can be taken into account at the simulation stage. Another important issue of this approach is how we should define pivotal borrowers for each lender. Obviously, it is possible that for the lender i there is a group of borrowers of size k∗ WOutDeg) or net borrower (WInDeg < WOutDeg). Low values of this measure in case of banking foreign claims analysis can be explained by two factors: attractive conditions for direct foreign investments or the realization of government financial assistance programs. In both cases, the incoming flow will be significantly higher than outgoing flows. Betweenness centrality shows often the vertex is placed on the shortest path between any two nodes in the network. Betweenness centrality for node i is the sum of the proportions for all pairs of actors, j and k, in which actor i is involved in a pair’s geodesic(s), i.e., σjk (ni ) , (9C.2) Cib (g) = σjk j T , the sample variance-covariance matrix of asset returns becomes singular and is not invertible (Knight and Satchell, 2005). To get around this restriction, portfolios are formed from securities, thereby ensuring that N < T . Grouping securities into portfolios to satisfy the conditions of the GRS test does not always serve this purpose well. The GRS tests easily rejected all models tested in Fama and French (2015, 2016). Another motivation for grouping stocks into portfolios is to reduce the errors-in-variables (EIV) problem inherent in estimating betas. Since measurement errors in betas across securities are not perfectly correlated, grouping would likely decrease the asymptotic bias in OLS estimators (Litzenberger and Ramaswamy, 1979). 
Errors-in-variables bias would be reduced and should approach zero as the number of securities (N) grows without bound, since portfolios would have less idiosyncratic risk (Jegadeesh et al., 2015). However, forming portfolios by grouping stocks results in a reduction of efficiency caused by the loss of information regarding the cross-sectional behavior of individual stocks unrelated to the portfolio groupings. Portfolios diversify away and mask the relevant risk and return features of individual stocks. Connor and Korajczyk (1988) point out that such a grouping procedure may conceal important deviations from the model under review if the deviations are related to the attributes used to assign assets into portfolios.

2 Researchers have used hundreds of factors to explain the cross-sectional variation of expected returns. See Harvey et al. (2016) for further details on the factors used and a survey of related literature.

While creating portfolios reduces the sampling variability of the estimates of factor loadings, the standard errors of factor risk premia actually increase (see Roll, 1977; Ang et al., 2009, 2010; Hwang and Satchell, 2014). Ang et al. (2010) show analytically and demonstrate empirically that the smaller standard errors of beta estimates from creating portfolios do not lead to smaller standard errors of cross-sectional coefficient estimates. The standard errors of factor risk premia estimates are determined by the cross-sectional distributions of factor loadings and residual risk. Creating portfolios destroys information by shrinking the cross-sectional dispersion of betas and inflates the standard errors of estimated coefficients (Ang et al., 2010). Furthermore, Lo and MacKinlay (1990) note that tests of asset pricing models may yield misleading inferences when the tests are based on returns to portfolios constructed by sorting on empirically motivated characteristics such as size, beta, or book-to-market. They argue that these are examples of "data-snooping," and that this kind of data-snooping could lead to rejection of the null hypothesis too often, an increase in type I error. Kim (1995) notes that the formation of portfolios might cause a loss of valuable information about cross-sectional behavior among individual assets, since cross-sectional variations would be smoothed out. He considers using all available individual stocks (without forming portfolios) in the estimation process as an efficient way to avoid this loss of information. Ferson et al.
(1999) point out that many empirical studies use the returns of attribute-sorted portfolios of common stocks to represent risk factors in tests of asset pricing models, where the attributes are selected on the basis of empirically observed relationships to the cross-section of stock returns. They argue that such attribute-sorted portfolios may appear to be useful risk factors even when the attributes are completely unrelated to risk exposure. Jegadeesh et al. (2015) point out that when using portfolios, rather than individual assets, there is an immediate issue of test power since dimensionality is reduced, i.e., there are unavoidably fewer explanatory variables with portfolios than with individual assets. Berk (2000) shows that the standard practice of grouping stocks into portfolios to test asset pricing inferences introduces a bias in rejecting the model under consideration. A direct and more effective method to deal with these problems is to use individual stocks instead of characteristics- or attribute-sorted portfolios in testing asset pricing models (Hwang and Lu, 2007). This study is an attempt to empirically explore the explanatory power of Fama and French's (2015) 5-factor model, an extended version of the 4-factor
offspring of Fama and French's (1993) 3-factor model by Carhart (1997), and a few other competing multi-factor models motivated by the works of Amihud and Mendelson (1986), Pastor and Stambaugh (2003), Korajczyk and Sadka (2008), Ibbotson et al. (2013), Asness (2014), Asness et al. (2015a), Ibbotson and Idzorek (2014), and Ibbotson and Kim (2015). We apply these models to individual security or asset returns, which are readily observed in the marketplace and which reflect the interaction of rational as well as irrational, and informed as well as uninformed, traders and investors. Using individual stocks permits more efficient tests of whether a particular factor carries a significant price of risk. In order to test the validity of the competing asset pricing models, we use the multivariate average F-test developed by Hwang and Satchell (2014) instead of the GRS test. A key advantage of the multivariate average F-test is that it allows us to test the explanatory power of each competing model against all individual securities at once without introducing a data-snooping bias. Therefore, our conclusions speak directly to the explanatory power of the factors for individual securities. This is important because an asset pricing model's performance with individual securities cannot be inferred from that model's performance vis-à-vis portfolios. Ang et al. (2010) note that using individual stocks or portfolios can result in very different conclusions regarding whether a particular factor carries a significant price of risk. The results with portfolios are useful only to measure the ex-post performance of managed portfolios or to detect market anomalies. Consequently, these results will not be useful to investors who select securities based on their exposure to the factor(s).
Compared to the extant literature, the major contributions of our work are to examine the relative performance of competing models for individual securities and to allow investors to select stocks based on our findings. Our study fills a gap in the literature by testing the comparative performance of competing models directly using individual securities.3 Our results show that a parsimonious 6-factor model with the market, size, orthogonal value/growth, profitability, investment, and momentum factors outperforms other models, such as the 5-factor model of Fama and French, a 5-factor model without the value/growth factor, a 5-factor model with the momentum and liquidity factors, and a 6-factor model with the momentum and an alternative measure of the value/growth factor. We also find that the average F-test has superior power to discriminate among competing models and does not reject the tested models altogether, unlike the GRS test.

3 It is worth noting that recently Chordia et al. (2001), Gagliardini et al. (2016), and Jegadeesh et al. (2015) used individual stock data in testing asset pricing models. Chordia et al. (2015) and Gagliardini et al. (2016) focused on risk premium estimation from cross-sectional regressions of individual stock returns on factor loadings (betas) and firm characteristics. On the contrary, we present a simple and intuitive way of examining the relative performance of alternative models by focusing on the magnitude and significance of the intercept from regression (the pricing error). The key difference between our study and Jegadeesh et al. (2015) will be discussed in a later section.

The rest of the chapter is organized as follows. Section 10.2 discusses the alternative multi-factor asset pricing models that we consider in this chapter, Section 10.3 discusses the data and methodology, and Section 10.4 presents the empirical analysis in detail. Section 10.5 concludes the chapter.

10.2 Multi-Factor Models

Next, we briefly describe several models that examine the relationship between asset pricing and factors, which we use as a comparison in our study. First, Fama and French (1993) developed a 3-factor model to explain the cross-sectional variation in stock returns. In their model, the excess return on a portfolio or security is explained by sensitivity to three factors: the broad market excess return, a factor based on market capitalization, and a factor based on the book-to-market equity ratio (B/M):

Rit − Rft = ai + bi(Rmt − Rft) + ci SMBt + di HMLt + eit,
(10.1)
where Rit is the return on security or portfolio i for period t, Rft is the risk-free return, Rmt is the return on the value-weighted market portfolio, SMBt is the difference between the returns on diversified portfolios of small- and large-cap stocks, HMLt is the difference between the returns on diversified portfolios of high-B/M stocks (value stocks) and low-B/M stocks (growth stocks), and eit is a zero-mean disturbance term. Jegadeesh and Titman (1993) show that intermediate-term returns tend to persist; stocks with higher returns in the prior year tend to have higher future returns. Carhart (1997) developed a 4-factor model by adding a factor to incorporate Jegadeesh and Titman's (1993) one-year momentum anomaly into Fama and French's (1993) 3-factor model:

Rit − Rft = ai + bi(Rmt − Rft) + ci SMBt + di HMLt + si MOMt + eit,   (10.2)

where MOMt is the difference between the returns on diversified portfolios of past 12-month winners (i.e., stocks with high returns) and losers (i.e.,
stocks with low returns). This model is a response to the inability of Fama and French's (1993) 3-factor model to capture the part of the variation in asset returns resulting from a momentum investment strategy, as Fama and French (1996) acknowledged. Amihud and Mendelson (1986), Amihud (2002), and many other researchers confirm the impact of liquidity on stock returns. Brennan and Subrahmanyam (1996) and Chordia et al. (2001) demonstrate that there is a significant relationship between required return and measures of liquidity risk after controlling for the Fama and French (1993) risk factors and the momentum factor. Pastor and Stambaugh (2003), Acharya and Pedersen (2005), and Sadka (2006) note that liquidity is a priced risk factor in explaining asset returns. Pastor and Stambaugh (2003) find that expected stock returns are related cross-sectionally to the sensitivities of returns to fluctuations in aggregate liquidity. Korajczyk and Sadka (2008) find evidence that liquidity is a priced factor in the cross-section of security returns even after the inclusion of other equity characteristics and risk factors, such as size, value/growth, and momentum. Ben-Rephael et al. (2015) point out that markets can experience systematic shocks to liquidity, and stocks whose returns are more sensitive to such shocks earn higher returns than stocks that exhibit lower sensitivity.4 High-margin assets have higher required returns, especially during times of funding illiquidity (Frazzini and Pedersen, 2014). Ashcraft et al. (2010) find that prices rise when central bank lending activities reduce margins and increase liquidity in the market. Acharya and Pedersen (2005) find that in a liquidity-adjusted CAPM, a security's return depends on its expected liquidity, as well as on the covariance of its own return and liquidity with the market return and liquidity.
Sadka (2006) finds a positive relationship between measures of liquidity and momentum for US stocks and notes that the addition of liquidity factors to the CAPM and the Fama and French 3-factor model seems to significantly increase the adjusted R2. Asness et al. (2013) find significant evidence that liquidity risk is negatively related to value and positively related to momentum globally across asset classes.4

4 Ben-Rephael et al. (2015) distinguish between two types of liquidity premium: (i) a characteristic liquidity premium, associated with the transaction costs of trading the security, and (ii) a systematic liquidity premium, associated with the sensitivity of the stock returns to shocks in market liquidity. Systematic liquidity is driven by the uncertainty that stock prices will decline when market liquidity is low. They show that the characteristic liquidity premium of US stocks has significantly declined over the past four decades following a series of technological and regulatory changes, such as decimalization. By contrast, systematic liquidity has not been trending down and is still significantly priced, especially among NASDAQ stocks.

Asness et al. (2015b) note that there are other sources of return, one of which is liquidity. Similar to Asness (2014), Asness et al. (2015a), and Fama and French (2015), Ibbotson et al. (2013) and Ibbotson and Kim (2015) regress a liquidity factor, represented by the difference between the returns on portfolios of the least liquid and most liquid quartiles of stocks (LMH), separately on Fama and French's (1993) three factors and Carhart's (1997) four factors. In these regressions, the estimated intercept or alpha is large, positive, and statistically significant, indicating that the impact of the liquidity factor in explaining asset returns is not subsumed by the other factors. Motivated by the significant findings of the importance of liquidity in asset pricing, we augment Carhart's (1997) 4-factor model by adding a liquidity factor as constructed by Ibbotson et al. (2013) and Ibbotson and Kim (2015):

Rit − Rft = ai + bi(Rmt − Rft) + ci SMBt + di HMLt + qi MOMt + si LMHt + eit,
(10.3)
where LMHt is the monthly return of a long–short portfolio in which the return of the most liquid quartile of stocks is subtracted from the return of the least liquid quartile of stocks. Ibbotson et al. (2013) and Ibbotson and Kim (2015) separate stocks into quartiles based on their turnover rate, which is the number of shares traded during the year divided by the number of shares outstanding. The stocks with the highest (lowest) turnover rates are the most (least) liquid.5 The rationale for adding the liquidity factor to the model also comes from Carhart (1997), who used a version of the liquidity factor while examining mutual fund performance. His liquidity factor-mimicking portfolio (VLMH) is the spread between the returns on low- and high-trading-volume stocks, orthogonalized to the four factors of market, size, value/growth, and momentum. He finds that the VLMH-loading estimates on mutual fund portfolios are strongly related to performance. The best one-year-return portfolios load significantly and negatively on VLMH, indicating relatively more liquid stocks in the funds, while the worst portfolios load significantly and positively, indicating relatively more illiquid stocks.

5 Another commonly used measure of liquidity is the Pastor and Stambaugh (2003) liquidity factor, which is based on a stock's within-month daily returns and trading volume. Ibbotson and Kim (2015) preferred the turnover-based liquidity measure because turnover is simple, easy to measure, and has a significant impact on returns. Ibbotson et al. (2013) termed it a "before the fact" measure of liquidity. Idzorek et al. (2012) show that turnover exhibits greater explanatory power for US mutual fund returns.
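A one-period version of the turnover-sorted LMH factor can be sketched as follows. Equal weighting within quartiles and the interpolated quartile cutoffs are illustrative assumptions, not the exact Ibbotson et al. (2013) construction:

```python
import numpy as np

def lmh_return(turnover, next_period_returns):
    """One-period LMH: mean return of the least liquid (lowest-turnover)
    quartile minus mean return of the most liquid (highest-turnover) quartile.

    turnover            -- shares traded / shares outstanding, per stock
    next_period_returns -- the stocks' returns over the following period
    """
    turnover = np.asarray(turnover, dtype=float)
    rets = np.asarray(next_period_returns, dtype=float)
    q1, q3 = np.quantile(turnover, [0.25, 0.75])
    least_liquid = rets[turnover <= q1]   # long leg
    most_liquid = rets[turnover >= q3]    # short leg
    return least_liquid.mean() - most_liquid.mean()
```

In the actual factor, the sort is refreshed periodically and the series is compounded over months; the sketch only shows a single cross-section.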
Fama and French (2015), motivated by the evidence in Titman et al. (2004), Aharoni et al. (2013), Novy-Marx (2013), and Hou et al. (2015b), among others, note that Fama and French's (1993) 3-factor model of equation (10.1) misses much of the variation in stock returns due to profitability and investment. They developed a 5-factor model by adding two factors to Fama and French's (1993) 3-factor model:

Rit − Rft = ai + bi(Rmt − Rft) + ci SMBt + di HMLt + pi RMWt + qi CMAt + eit,
(10.4)
where RMWt is the difference between the returns on diversified portfolios of stocks with robust and weak operating profitability, and CMAt is the difference between the returns on diversified portfolios of stocks of low-investment (conservative) and high-investment (aggressive) firms. Fama and French (2015) find that a 4-factor model without HML performs as well as the 5-factor model with HML when applied to portfolios formed on size, B/M, profitability, and investment. They note that HML is redundant for explaining average returns, in the sense that its impact is fully captured by the other factors. The redundancy of HML is based on regressing each of the five factors on the other four factors (Table 6 in Fama and French, 2015). The intercept of the regression of HML on the other factors is small, negative, and not statistically significant. They also suggest substituting orthogonal HML (OHML) for HML, which captures the additional impact of the value premium that is not captured by the other factors. We are also motivated to explain one of the anomalies that Fama and French (1993, 2015) did not explicitly model, even though Fama and French (2008) call momentum "the center stage anomaly of recent years . . . an anomaly that is above suspicion . . . the premier market anomaly" (Antonacci, 2015). They observe that the abnormal returns associated with momentum are pervasive. Schwert (2003) explores all known market anomalies and declares momentum the only one that has been persistent and has survived since its disclosure. We test a variation of Fama and French's (2015) 5-factor model by dropping the HML factor and adding momentum and an orthogonal HML factor, which is the sum of the intercept and residual from the regression of HML on Rm − Rf, SMB, RMW, CMA, and MOM:

HMLt = αi + bi(Rmt − Rft) + ci SMBt + pt RMWt + qt CMAt + st MOMt + μit,
(10.5)
page 399
July 6, 2020
11:57
400
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch10
S. Rahman & M. J. Schneider
Rit − Rft = ai + bi(Rmt − Rft) + ci SMBt + pi RMWt + qi CMAt + di OHMLt + si MOMt + eit,
(10.6)
where OHMLt is αi + μit. Asness (2014) and Asness et al. (2015a) note that long-standing value, as proxied by HML, being captured by other sources, such as the combination of profitability and investment, does not make value useless or redundant. Asness et al. (2015a) argue that two factors contributed to this conclusion: (1) Fama and French's (2015) HML factor uses a price that is highly lagged, and (2) Fama and French (2015) fail to embrace the momentum factor despite the overwhelming evidence that it contributes to explaining returns. Asness and Frazzini (2013), Asness (2014), and Asness et al. (2015a) note that Fama and French's (1993, 2015) method of constructing the HML factor updates value once a year on June 30, using book value and market price as of the prior December 31. Both book value and market price are 6 months old upon portfolio formation and 18 months old on the rebalancing date of the following June 30. Asness and Frazzini (2013) point out that this method was reasonable in the early days, when momentum trading was not common. They argue that the widespread use of momentum trading by investors now makes the method suboptimal. Asness and Frazzini (2013) construct an alternative to Fama and French's (1993, 2015) HML factor, known as HML DEV, using book value from the prior December 31 and the current price on the rebalancing date of June 30. Thereafter, HML DEV is revised monthly using the current price.6 Asness (2014) replicates Table 6 of Fama and French (2015), adding a sixth factor, momentum, and replacing the standard HML of Fama and French (1993, 2015) with HML DEV. He finds that the intercepts of the separate regressions of HML DEV and the momentum factor on the other five factors are large, positive, and statistically significant, and concludes that, in search of the most parsimonious yet effective asset pricing model, a 6-factor model is clearly warranted.
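The orthogonalization behind OHML (and behind Carhart's VLMH) is simply "intercept plus residual" from an auxiliary regression of one factor on the others; a minimal sketch with simulated series:

```python
import numpy as np

def orthogonalize(y, X):
    """Return intercept-plus-residual of the OLS regression of y on [1, X],
    i.e., the part of y not explained by the columns of X."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return beta[0] + resid   # alpha + residual series
```

By construction, the resulting series is uncorrelated in sample with every column of X, so it isolates the incremental contribution of the orthogonalized factor.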
In the spirit of Asness (2014), we propose to test a 6-factor model consisting of Fama and French's (2015) five factors plus the momentum factor, substituting the HML DEV factor of Asness and Frazzini (2013) for standard HML:

Rit − Rft = ai + bi(Rmt − Rft) + ci SMBt + di HML DEVt + pi RMWt + qi CMAt + si MOMt + eit.
(10.7)

6 See Asness and Frazzini (2013) for details.
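Each of the models above is estimated security by security as a time-series OLS regression. A minimal sketch (simulated data, hypothetical factor count) that yields the alpha and residual variance needed later by the average F-test:

```python
import numpy as np

def fit_security(excess_returns, factors):
    """OLS of one security's excess returns on an intercept and K factors.
    Returns (alpha_hat, betas, residual variance with T - K - 1 df)."""
    T, K = factors.shape
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    resid = excess_returns - X @ coef
    sigma2 = resid @ resid / (T - K - 1)
    return coef[0], coef[1:], sigma2
```

If a model fully captures expected returns, the estimated alpha should be statistically indistinguishable from zero for every security.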
Regardless of which of the models under consideration here emerges as the winner, i.e., has the best explanatory power, factor models in general give some indication of the relationship between risk and expected return despite their many empirical and theoretical challenges. Factor models are regularly employed by researchers to evaluate the investment performance of managed portfolios, such as mutual funds and pension funds (see, for example, Carhart, 1997; Kosowski et al., 2006; Barras et al., 2010; Fama and French, 2010), and equity institutional products managed by investment management firms (Busse et al., 2010). Factor models have also been used for explaining anomalies that asset pricing models fail to capture (Fama and French, 1996, 2016; Hou et al., 2015b). Factor models also have some value as a guide to the strength and significance of different portfolio strategies. The intercept or alpha from a factor model and its statistical significance are useful for evaluating the effectiveness of trading strategies relative to appropriate benchmarks. Return-based style analysis, introduced by Sharpe (1988, 1992), uses factor models to detect the exposure of mutual funds and hedge funds to various investment styles, such as large versus small cap, value versus growth, emerging markets, etc.

10.3 Data and Research Methodology

This section provides information on the data used in our study and describes in detail the multivariate test used to detect pricing error in the alternative models.

10.3.1 Data description

In this study, we apply five alternative multi-factor models to explain the cross-sectional variation in returns for a sample of nonfinancial stocks from the Russell 3000 Index, which covers approximately 98% of the investable US equity market. Fama and French (1992) note that high leverage, which is standard for financial firms, may not have the same implication as for nonfinancial firms, where high leverage is more likely an indication of financial distress.
Our sample consists of the 407 stocks from the Russell 3000 Index with no missing monthly returns for the sample period January 1990 to December 2014 (300 monthly observations).7

7 To facilitate the computation of the average F-test statistic discussed in the following section, we must have the same number of time-series observations (T) for each of the stocks in the sample. This necessitates excluding stocks with missing observations within the sample period.

The list of stocks in the
Russell 3000 Index comes from Russell Investments. Monthly returns on the sampled stocks were collected from the CRSP database. Monthly returns on factor portfolios and risk-free rates came from the online data library of Professor French, and monthly observations on the alternative HML factor portfolio of Asness and Frazzini (2013) came from the online data library of AQR Capital Management.8 We generated monthly observations on the liquidity factor portfolio following the methodology of Ibbotson et al. (2013) and Ibbotson and Kim (2015).

8 Source: Online data library of AQR Capital Management (www.AQR.com/library).

10.3.2 Balanced versus unbalanced data panel

One criticism of our data may emerge from using a balanced panel, i.e., a sample of stocks with no missing observations over the entire sample period, which is 407 securities in this study. An alternative would be to use all the securities with no missing data for each year of the sample period; in other words, we would let N vary from year to year. One of the requirements for using a multivariate test (GRS or average F-test) to detect average pricing error in a model (as in Fama and French, 2015, 2016; Hou et al., 2011, 2015a, 2015b) is that the number of assets (N) remains the same during the sample period. The above-mentioned studies accomplished this objective by including all the securities with no missing observations in a given sample year and then grouping them into a fixed number of attribute-based portfolios in each year of the sample period. Although they theoretically used an unbalanced panel of securities in each year, they gave up considerable security-specific information by forming portfolios, resulting in a balanced panel. We trade off an unbalanced panel of assets for a balanced panel to avoid the loss of security-specific information. Inferences on average pricing errors in the alternative models drawn from several hundred individual securities are statistically more meaningful than those drawn from several dozen portfolios subject to data snooping.

10.3.3 The average F-test

The focus of this study will be on comparing alternative models in terms of their ability to reduce alpha for individual securities. Fama and French (2016) point out that if an asset pricing model fully captures expected returns, the
intercept or alpha is indistinguishable from zero in the time-series regression of any asset's excess return on the model's factor returns, regardless of whether it is an individual security or a portfolio. Our sample of 407 stocks (N) exceeds the number of time-series observations of 300 months (T), and we employ Hwang and Satchell's (2014) average F-test to detect pricing errors in the alternative models. To resolve the inability of the GRS statistic to test an asset pricing model when the number of stocks (N) exceeds the number of time-series observations (T), Hwang and Satchell (2014) developed an average F-test for testing linear asset pricing models by relaxing the restriction of T > N. Since N is allowed to be greater than T, the average F-test can be applied to thousands of individual stocks. This allows us to avoid grouping stocks into portfolios and to get around the data-snooping biases in portfolio formation. We now describe the average F-test of Hwang and Satchell (2014) and examine the validity of the alternative models using the average F-test. The average F-test statistic is given by

AF = (T/N) Σ_{i=1}^{N} α̂_i² / (σ̂_i² C),    (10.8)

where α̂_i² is the square of the estimated intercept of security i, σ̂_i² is the estimated residual variance of security i, C is a 1 × 1 scalar equal to [1 + (r̄1, r̄2, …, r̄K) Ω̂⁻¹ (r̄1, r̄2, …, r̄K)′], (r̄1, r̄2, …, r̄K) is the vector of average factor returns, and Ω̂ is the estimated variance-covariance matrix of the factor returns. The average F-test statistic is distributed as the average of N F-distributed variables:

AF ∼ (1/N) Σ_{i=1}^{N} F_i(1, T − K − 1),    (10.9)

where F_i(1, T − K − 1) is an F-distribution with 1 degree of freedom in the numerator and T − K − 1 degrees of freedom in the denominator, and K is the number of factors. The average F-test is a multivariate F-test in which the residuals of any two securities are assumed to be uncorrelated. It is a special case of a test for mispricing in the APT model in Connor and Korajczyk (1988), who place a prior restriction on the covariance of residuals by assuming that residuals or idiosyncratic returns are temporally independent but possibly cross-sectionally dependent. Unlike the GRS test, the average F-test does not require T > N, because the off-diagonal elements of the variance-covariance matrix of the error terms are assumed to be zero (Hwang and Satchell, 2014). Although this relaxation assumes that errors are uncorrelated cross-sectionally (which could be a problem for two stocks with high covariance), it overcomes the shortcomings
of the non-invertible covariance matrix in the GRS test. However, we consider the issue of cross-sectionally dependent residuals or idiosyncratic returns within a sector or industry in the following section.

The exclusion of covariance between residuals of securities (the off-diagonal elements) can be justified based on the evidence in Litzenberger and Ramaswamy (1979) and Kim (1995), who proposed test procedures that emphasize the diagonal elements and individual stocks. Litzenberger and Ramaswamy (1979) pointed out that the variance-covariance matrix of residuals between securities is block diagonal, with the off-diagonal blocks being zero. Their conclusion is based on the assumption (as in Fama, 1973; Gibbons, 1982) that security returns are serially uncorrelated, so that E(e_{it} e_{js}) = 0 for t ≠ s, i, j = 1, . . . , N, and t, s = 1, . . . , T.9 Hwang and Satchell (2014) noted that the covariances are possibly less significant for individual stocks than for portfolios. They added that modern portfolio theory indicates that the variances of idiosyncratic errors (the diagonal elements) are expected to contribute more for individual stocks, compared with portfolios, which could be driven primarily by the covariances. Kim (1995) found strong support for the assumption of weak cross-sectional dependence between residuals of individual stocks.

Furthermore, two securities are related to each other via their relation with the market in a single-index model, or with factors in a multiple-index model. Within the framework of the CAPM, this relationship is evident in the equation for the covariance between the returns of two securities (σ_{ij}),
$$ \sigma_{ij} = \beta_i \beta_j \sigma_m^2, \qquad (10.10) $$
where β_i is the measure of systematic risk of security i, and σ_m² is the variance of the market return. What is left after capturing the systematic risk of a security is its company-specific or unsystematic risk. This is the random or unpredictable variation in a security's return, or residual variance. Examples of sources of such variation are the health of the company's CEO, poor management, a management change, a plant shutdown, a power failure, a strike by employees, a product recall, a supply chain disruption, etc. Variations in the returns of two securities due to these sources are unlikely to be correlated, because of their random nature, even if the securities are from the same industry.
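Equation (10.10) can be checked numerically with a quick simulation. The sketch below, in Python with illustrative betas and volatilities that are not taken from the chapter, generates returns from a single-index model with independent idiosyncratic terms and compares the sample covariance of the two securities with β_i β_j σ_m²:

```python
import numpy as np

rng = np.random.default_rng(0)

T = 100_000                      # number of simulated return observations
beta_i, beta_j = 1.2, 0.8        # illustrative betas (assumed, not from the chapter)
sigma_m = 0.05                   # assumed market volatility

r_m = rng.normal(0.0, sigma_m, T)        # market returns
e_i = rng.normal(0.0, 0.02, T)           # independent idiosyncratic terms,
e_j = rng.normal(0.0, 0.03, T)           # so only the market links the two stocks
r_i = beta_i * r_m + e_i
r_j = beta_j * r_m + e_j

implied = beta_i * beta_j * sigma_m ** 2   # equation (10.10)
sample = np.cov(r_i, r_j)[0, 1]            # sample covariance of the two returns
```

With independent residuals, the sample covariance matches the model-implied value up to sampling error, which is the sense in which the off-diagonal residual covariances can be excluded.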
9 In empirical tests of asset pricing models with individual assets using the instrumental variables technique, Jagadeesh et al. (2015) also assume that the factors and residuals are stationary processes, and that the residuals are cross-sectionally and serially uncorrelated and uncorrelated with the factors.
Application of the Multivariate Average F-Test
Intuitively, the average F-test determines whether a group of factors can adequately explain the returns of all stocks. It tests this hypothesis by comparing the average pricing error of the stocks to the natural variability of the average pricing error of zero-intercept stocks. The average pricing error is a weighted-average statistic of all the stocks' intercept values, where the intercept value for each stock is estimated by regressing stock returns on the factor returns. If the average pricing error of the stocks is too high, then the average F-test concludes that, on average, the factors in the specific model cannot fully explain excess returns on individual stocks. More formally, the factors are the predictors in the multi-factor model, the average pricing error of the stocks is the t-statistic in equation (10.8), and the natural variability of the average pricing error of zero-intercept stocks is fully characterized by the null distribution defined in equation (10.9). Since the average pricing error is influenced by the number of stocks (N), the number of observations (T), and the number of factors (K), the null distribution is conditional on N, T, and K. The t-statistic is deemed "too high" when its value falls in the right 10%, 5%, 1%, or 0.1% tail of the null distribution. An advantage of the average F-test is that one cannot simply conclude from large intercepts alone that the factors fail to explain the average pricing error of the stocks (i.e., reject the null hypothesis). In any linear regression, there are two estimates that are unexplained by the factors: the intercept (the pricing error of a stock) and the residual variance (the mean square error, or MSE). The average F-test's calculation of the t-statistic is desirable because the squared magnitude of the pricing error of any single stock is weighted by one over the magnitude of the MSE.
Thus, an intercept with a large residual variance has less strength in rejecting the null hypothesis than an identical intercept with a small residual variance. This is because a pricing error with a smaller residual variance implies a higher degree of belief in that pricing error, and the t-statistic gives credit for that. Also, squaring the intercepts gives equal credit toward rejecting the null hypothesis regardless of whether the pricing error is positive or negative.

10.3.4 Simulation analysis

Since the average F-test requires the fitting of an ordinary least squares (OLS) model for each stock, we also require the number of factors (K) to be less than the number of time-series observations (T). To calculate the average F-statistic, the residual-variance-weighted squared intercepts across the N securities are first averaged. Then, in order to complete the calculation,
multiplicative adjustments are made for the sample size (T), a constant (C) based on the K factors' mean vector, and their variance-covariance matrix. This t-statistic in equation (10.8) is then compared to the probability distribution of the null hypothesis in equation (10.9), which assumes that stocks have zero intercepts in the K-factor model. The probability distribution is an average of N F-distributions with degrees of freedom 1 and T − K − 1, where the N F-statistics are independent of each other. However, since this probability distribution does not have a known closed-form or analytical solution (see Hwang and Satchell, 2014, p. 467), we simulate 100,000 values for each T, K, and N using the R programming language. Further details on the simulation can be found in Hwang and Satchell (2014).

Typically, for each model l, 10,000 or 100,000 values are drawn at random from the probability distribution in equation (10.9). We illustrate the procedure for one draw, S_{h_l}, where h_l = 1, 2, . . . , 100,000. For each h_l, a random matrix M_l of dimension (T_l − K_l) × N_l is generated by sampling (T_l − K_l)N_l independent and identically distributed standard normal random variables with mean 0 and standard deviation 1. Each entry m_{i_l, j_l} in M_l is squared, and the entries in each column j_l = 1, 2, . . . , N_l, excluding the last row, are averaged,
$$ \bar{m}^2_{j_l} = \frac{1}{T_l - K_l - 1} \sum_{i_l = 1}^{T_l - K_l - 1} m^2_{i_l, j_l}. $$
Then, the last entry in each column is divided by this column average, $m^2_{T_l - K_l, j_l} / \bar{m}^2_{j_l}$, for each column j_l. Finally, these ratios are averaged across all N_l columns,
$$ S_{h_l} = \frac{1}{N_l} \sum_{j_l = 1}^{N_l} \frac{m^2_{T_l - K_l, j_l}}{\bar{m}^2_{j_l}}, $$
where S_{h_l} represents one sample from the distribution $AF_l \sim \frac{1}{N_l} \sum_{n_l = 1}^{N_l} F_{n_l}(1, T_l - K_l - 1)$. S_{h_l} thus represents a sample average F-statistic if the null hypothesis were true, that is, if there were no pricing error.

Once the above simulation is performed 100,000 times, the draws S_{1_l}, S_{2_l}, . . . , S_{100,000_l} are rank-ordered from low to high. This rank ordering allows us to compare the t-statistic in the average F-test, AF_l, to the quantiles of the null probability distribution. Correspondingly, a p-value for a one-sided test emerges for each model l, equal to
$$ \frac{\sum_{h_l = 1}^{100{,}000} I(AF_l < S_{h_l})}{100{,}000}, $$
where I(AF_l < S_{h_l}) is the indicator function, equal to 1 if the value of the average F-statistic is less than the sample S_{h_l}, and 0 otherwise. If the p-value is lower than our type I error rate (e.g., usually 5% or 10% for a false positive rate preference), we reject the null hypothesis and conclude that the factors in model l cannot fully explain excess returns on individual stocks during the sample period.

The average F-test has some other advantages over the GRS test. First, the average F-test is robust to small changes in the relationship between
assets, which is useful when the covariance matrix between stocks is unstable, as is often found in financial markets. Second, the average F-test accounts for the degree of belief in the pricing error of a stock by weighting it with the residual variance unexplained by the model. Finally, and perhaps most importantly, the average F-test avoids the selection bias of the GRS test that arises when grouping assets together in order to force N to be less than T so that the covariance matrix is invertible (Knight and Satchell, 2005).

10.4 Empirical Analysis

This section provides information on the summary statistics for monthly returns of the stocks in our sample, describes the various measures calculated to compare and contrast the relative performance and explanatory power of alternative models, and presents and analyzes our main empirical results. Although we adopted the average F-test of Hwang and Satchell (2014) in our empirical analysis to overcome the shortcomings of the standard GRS test, the focus of our empirical analyses is quite different from theirs. The focus of Hwang and Satchell (2014) was solely on examining the superiority of the average F-test over the GRS test for the Fama and French (1993) 3-factor model, the 4-factor model of Carhart (1997), and the principal components approach (PCA) model of Connor and Korajczyk (1988). The focus of our empirical analysis is on examining the relative performance of several asset pricing models, motivated by the most up-to-date advances and developments in asset pricing model research, using several measures as well as the average F-test. We use the average F-test simply as a tool, in conjunction with other measures, to examine the explanatory power of alternative models.

10.4.1 Summary statistics for individual security returns

We calculate descriptive statistics for the stocks in our sample and present them in Table 10.1.
Average monthly returns for our sample of 407 stocks for the period January 1990 to December 2014 range from 0.2324% to 3.5513%, with an average of 1.4121%. The standard deviation of monthly returns varies from 2.8247% to 34.2172%, with an average of 10.4165%. It appears that the sampled stocks are very widely distributed (i.e., diverse) across the risk-return spectrum. This diversity of stocks across the sample helps minimize the impact of survivorship bias. Stocks that disappeared because of merger or acquisition and bankruptcy or liquidation are excluded from our sample. These stocks are likely to have lower or higher returns, and stocks in
Table 10.1: Summary statistics for 407 stocks. January 1990–December 2014: 300 months.

                                          (Percent)
Average monthly return
  Mean                                    1.4121
  Minimum                                 0.2324
  Maximum                                 3.5513
Standard deviation of monthly return
  Mean                                    10.4165
  Minimum                                 2.8247
  Maximum                                 34.2172
our sample in the lower and higher ends of the return distribution spectrum will represent these missing stocks. We investigate the extent of pricing error and compare the performance of five multi-factor asset pricing models using individual security returns. These models are the Fama and French (2015) 5-factor model (FF5), the Fama–French 5-factor model without HML (FF4), the Fama–French 5-factor model with the momentum factor of Jagadeesh and Titman (1993) and orthogonal HML (OHML), the Fama–French 5-factor model with the HML of Asness and Frazzini (2013) and the momentum factor (HML DEV), and the augmented Fama and French (1993) 3-factor model with the liquidity factor of Ibbotson et al. (2013) and the momentum factor (LIQUID). Table 10.2 lists the factors included in the alternative models.

10.4.2 Model performance measures

We calculate a number of measures, including those employed in Fama and French (2015), in order to compare and contrast the relative performance and explanatory power of alternative multi-factor asset pricing models. Similar to Fama and French (2015), A|αi| is the average absolute value of the intercept from the regressions across all securities in the sample. It is a measure of the average pricing error. We also calculate the ratio of the absolute value of the intercept to the residual variance. The former is an estimate of the average excess return on a security over and above the excess return required to compensate investors for the systematic risk of the security. The latter is a proxy for the unsystematic or diversifiable risk of the security that is not explained by the model. This ratio is designed to examine the magnitude of the intercept relative to a security's residual variance. It measures how well a model explains the cross-sectional variation in security returns. We call this the modified appraisal ratio. It is a variation of the appraisal ratio calculated for
Table 10.2: Factors in alternative models.

                                Models
Factors       FF5    FF4    OHML    HML DEV    LIQUID
RM             √      √       √        √          √
SMB            √      √       √        √          √
HML            √                                  √
OHML                          √
HML DEV                                √
RMW            √      √       √        √
CMA            √      √       √        √
MOM                           √        √          √
LMH                                               √

Note: The table lists factors in alternative models. These factors are the market factor (RM), the size factor (SMB), the value factor (HML), the profitability factor (RMW), the investment factor (CMA), the momentum factor (MOM), the liquidity factor (LMH), the alternative value factor (HML DEV), and the orthogonal value factor (OHML). The alternative models are the Fama–French 5-factor model (FF5), the Fama–French 5-factor model without the value factor (FF4), the Fama–French 5-factor model with the momentum and orthogonal value factors (OHML), the Fama–French 5-factor model with the momentum and alternative value factors (HML DEV), and the augmented Fama–French 3-factor model with the liquidity and momentum factors (LIQUID).
managed portfolios, such as mutual funds and hedge funds, which reveals the average value added by a manager, in excess of the systematic risk-based excess return, per unit of unsystematic risk.10 If a model fully captures all elements of the systematic risk of a security, the intercept will be relatively small compared to the unsystematic risk or residual variance. Summary statistics for A|αi|, the modified appraisal ratio, the intercept, and R² for competing models are presented in Table 10.3.
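The pricing-error measures just described can be sketched as follows. This is a hypothetical Python illustration on synthetic data, not the chapter's sample: each stock's excess returns are regressed on the factors with an intercept, and A|α_i| and the modified appraisal ratio are computed from the estimated intercepts and residual variances.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, K = 300, 50, 5                    # months, stocks, factors (synthetic sizes)
F = rng.normal(0.5, 2.0, (T, K))        # factor returns (percent), assumed
alphas = rng.normal(0.0, 0.3, N)        # true pricing errors, assumed
betas = rng.normal(1.0, 0.5, (K, N))
R = alphas + F @ betas + rng.normal(0.0, 5.0, (T, N))   # excess returns

# OLS of each stock's returns on the factors, with an intercept column
X = np.column_stack([np.ones(T), F])
coef, *_ = np.linalg.lstsq(X, R, rcond=None)
resid = R - X @ coef
intercepts = coef[0]                               # alpha_i for each stock
mse = (resid ** 2).sum(axis=0) / (T - K - 1)       # residual variance per stock

avg_abs_alpha = np.abs(intercepts).mean()          # A|alpha_i|
modified_appraisal = np.abs(intercepts) / mse      # |alpha_i| / residual variance
```

The modified appraisal ratio divides by the residual variance (rather than the residual standard deviation of the usual appraisal ratio), which changes only the scale, not the ranking of stocks.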
10 Unlike the appraisal ratio for managed portfolios, we use the residual variance instead of the residual standard deviation in the denominator in order to be consistent with the calculation of the average F-statistic. This simply changes the scale and does not alter the ranking of stocks based on our calculated ratio.

10.4.3 Results from the average F-test and pricing errors

We calculate the t-statistic for the average F-test along with p-values (i.e., the probability of getting an average F-test statistic larger than the one
Table 10.3: Relative performance of alternative asset pricing models. January 1990–December 2014: 300 months.

Alternative models                        FF5        FF4        OHML       HML DEV    LIQUID

R²
  Average                                 0.2655     0.2515     0.2770     0.2755     0.2683
  Minimum                                 0.0099     0.0028     0.0104     0.0131     0.0151
  Maximum                                 0.5657     0.5566     0.5659     0.5637     0.5654

Intercept (Alpha)
  Average                                 0.2197     0.1391     0.2932     0.2331     0.4321
  Minimum                                −1.2483    −1.6296    −1.1009    −1.2786    −0.8809
  Maximum                                 3.8124     3.9803     3.5908     3.6352     2.9391
  Significant negative @ 0.05 level       2          7          1          4          0
  Significant negative @ 0.01 level       1          2          0          2          0
  Significant negative @ 0.001 level      0          0          0          0          0
  Significant positive @ 0.05 level       23         26         25         30         41
  Significant positive @ 0.01 level       9          11         10         13         14
  Significant positive @ 0.001 level      3          5          2          4          3
  Average t-statistic                     0.3260     0.1709     0.4365     0.3095     0.7354
  A|αi|                                   0.4878     0.5380     0.4876     0.5256     0.5392

Modified Appraisal Ratio
  Average                                 0.0063     0.0066     0.0060     0.0063     0.0069
  Minimum                                 0.0000     0.0000     0.0000     0.0000     0.0000
  Maximum                                 0.0337     0.0307     0.0326     0.0313     0.0353

Average F-test
  Test statistic                          1.1147     1.3308     1.0943     1.2714     1.4251
  Critical value thresholds for 100,000 simulations
    0.10 level                            1.1267     1.1258     1.1266     1.1272     1.1252
    0.05 level                            1.1512     1.1513     1.1510     1.1511     1.1501
    0.01 level                            1.2017     1.1994     1.1994     1.1996     1.2003
    0.001 level                           1.2571     1.2546     1.2558     1.2594     1.2571
  P-value                                 0.06846    0.00005    0.11065    0.00024    0.00000

Note: The table presents summary statistics for the Fama–French 5-factor model (FF5), the Fama–French 5-factor model without HML (FF4), the Fama–French 5-factor model with the momentum factor of Jagadeesh and Titman (1993) and orthogonal HML (OHML), the Fama–French 5-factor model with the HML of Asness and Frazzini (2013) and the momentum factor (HML DEV), and the augmented Fama–French 3-factor model with the liquidity factor of Ibbotson et al. (2013) and the momentum factor (LIQUID). Following Fama and French (2015), A|αi| is the average absolute value of the intercept from the regressions across all securities in the sample. The modified appraisal ratio is the absolute value of the intercept over the residual variance.
observed if the null hypothesis of no pricing error is true) for each model. Following the methodology of Hwang and Satchell (2014), we run 100,000
Monte Carlo simulations for each model to generate the probability distribution against which the computed average F-test statistic is compared. We then generate the critical value thresholds for rejecting the null hypothesis at the 0.001, 0.01, 0.05, and 0.10 significance levels for each model. We also ran 10,000 and 1,000 Monte Carlo simulations for each model, and the results, not reported in the chapter, are similar to those of the 100,000 simulations. The average F-statistic, as well as the p-value and critical value thresholds from the 100,000 simulations, for all models are presented in Table 10.3. The average F-test fails to reject the null hypothesis of no pricing error for the 6-factor model with market, size, orthogonal value/growth, profitability, investment, and momentum factors (OHML model) and Fama and French's (2015) five-factor (FF5) model, with p-values of 0.11065 and 0.06846, respectively, and rejects the null hypothesis of no pricing error for the 6-factor model with momentum and an alternative measure of the value/growth factor (HML DEV), the 5-factor model without the value/growth factor (FF4), and the augmented Fama and French (1993) 3-factor model with the liquidity and momentum factors (LIQUID), with p-values of 0.00024, 0.00005, and 0.00000 (i.e., the t-statistic is greater than the largest sampled value from the null distribution of 100,000 samples), respectively, at any significance level. The results of our study show that the average F-test has superior power to discriminate among competing models and does not reject the tested models altogether, unlike other powerful multivariate tests, such as the GRS test, which considers asset pricing models to be incomplete descriptions of, or simplified propositions about, expected returns and rejects them all (as in Fama and French, 2015, 2016).
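As a minimal illustration of the simulation procedure of Section 10.3.4 (the chapter's own simulations use R; the sketch below uses Python, far fewer draws than the 100,000 reported, and the FF5 dimensions T = 300, K = 5, N = 407), each null draw averages N independent F(1, T − K − 1) variates built from squared standard normals, and the one-sided p-value is the fraction of null draws exceeding the observed statistic.

```python
import numpy as np

def simulate_null(T, K, N, draws=1_000, seed=42):
    """Simulate draws from (1/N) * sum of N independent F(1, T-K-1) variates,
    each built from squared i.i.d. standard normals as in the matrix
    construction of Hwang and Satchell (2014)."""
    rng = np.random.default_rng(seed)
    out = np.empty(draws)
    for h in range(draws):
        M = rng.standard_normal((T - K, N)) ** 2   # squared normals, (T-K) x N
        denom = M[:-1].mean(axis=0)                # chi2_{T-K-1}/(T-K-1) per column
        out[h] = (M[-1] / denom).mean()            # average of N F(1, T-K-1) ratios
    return out

null = simulate_null(T=300, K=5, N=407, draws=1_000)
af_stat = 1.1147                        # FF5 average F-statistic from Table 10.3
p_value = (af_stat < null).mean()       # one-sided p-value: share of null draws above
```

With only 1,000 draws the p-value is a rough estimate; the chapter's 100,000-draw version simply repeats the same construction more times.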
Additionally, and more importantly, our results are free from the information loss and data-snooping biases associated with grouping securities into portfolios in order to reduce the errors-in-variables problem and to get around the restriction of the GRS test. We also performed robustness checks of the simulation results across sub-periods within the entire sample period. In order to test how sensitive the average F-test results from simulations are to the length of the sample period, we divided the most recent 24 years of monthly observations into three-year windows and ran 10,000 simulations to calculate the average F-statistic and p-values for the competing models in each three-year rolling window. These results are presented in Table 10.4. The results are fairly consistent across sub-periods and with the full sample period. The average F-test fails to reject the null hypothesis of no pricing error in the OHML model, the best-performing model in the full sample period, in the largest
Table 10.4: Average F-test for 3-year rolling windows. January 1991–December 2014: 288 months.

Alternative models                FF5        FF4        OHML       HML DEV    LIQUID

1991–1993
  Test statistic                  1.3532     1.3076     1.2336     1.4377     1.7066
  0.10 level                      1.2044     1.2028     1.2095     1.2086     1.2040
  0.05 level                      1.2295     1.2284     1.2354     1.2389     1.2288
  0.01 level                      1.2861     1.2845     1.2906∗    1.2900     1.2849
  0.001 level                     1.3445     1.3490∗    1.3392∗    1.3715     1.3511
  p-value                         0.0005     0.0030     0.0243     0.0000     0.0000

1994–1996
  Test statistic                  1.1739     1.1731     1.1260     1.1404     1.2885
  0.10 level                      1.2047∗    1.2041∗    1.2095∗    1.2094∗    1.2056
  0.05 level                      1.2317∗    1.2271∗    1.2346∗    1.2390∗    1.2327
  0.01 level                      1.2937∗    1.1208∗    1.2906∗    1.2985∗    1.2878
  0.001 level                     1.3651∗    1.3843∗    1.3560∗    1.3683∗    1.3519∗
  p-value                         0.1018     0.0983     0.2502     0.2020     0.0049

1997–1999
  Test statistic                  1.0969     1.2721     0.9798     0.9677     0.8858
  0.10 level                      1.2031∗    1.2047     1.2083∗    1.2089∗    1.2053∗
  0.05 level                      1.2325∗    1.2312     1.2364∗    1.2345∗    1.2324∗
  0.01 level                      1.2843∗    1.2864∗    1.2923∗    1.2891∗    1.2914∗
  0.001 level                     1.3497∗    1.3511∗    1.3615∗    1.3408∗    1.3473∗
  p-value                         0.3586     0.0086     0.8912     0.9095     0.9929

2000–2002
  Test statistic                  1.5371     1.4713     1.6067     1.5416     1.2365
  0.10 level                      1.2046     1.2023     1.2080     1.2079     1.2035
  0.05 level                      1.2314     1.2328     1.2333     1.2338     1.2297
  0.01 level                      1.2916     1.2847     1.2923     1.2957     1.2872∗
  0.001 level                     1.3561     1.3547     1.3459     1.3751     1.3442∗
  p-value                         0.0000     0.0000     0.0000     0.0000     0.0203

2003–2005
  Test statistic                  1.1525     1.0259     1.2364     1.2416     1.5867
  0.10 level                      1.2033∗    1.2028∗    1.2123∗    1.2055     1.2045
  0.05 level                      1.2306∗    1.2299∗    1.2405∗    1.2329     1.2304
  0.01 level                      1.2834∗    1.2851∗    1.2953∗    1.2863∗    1.2908
  0.001 level                     1.3382∗    1.3347∗    1.3617∗    1.3434∗    1.3428
  p-value                         0.1523     0.6964     0.0280     0.0185     0.0000

2006–2008
  Test statistic                  0.9484     0.9368     0.8809     0.7885     1.0633
  0.10 level                      1.2039∗    1.2029∗    1.2058∗    1.2090∗    1.2032∗
  0.05 level                      1.2301∗    1.2327∗    1.2308∗    1.2379∗    1.2283∗
  0.01 level                      1.2799∗    1.2857∗    1.2807∗    1.2907∗    1.2914∗
  0.001 level                     1.3531∗    1.3521∗    1.3523∗    1.3534∗    1.3699∗
  p-value                         0.9444     0.9588     0.9942     0.9999     0.5254

2009–2011
  Test statistic                  1.2248     1.0665     1.5475     1.3385     1.2838
  0.10 level                      1.2075     1.2048∗    1.2088     1.2102     1.2052
  0.05 level                      1.2331∗    1.2344∗    1.2356     1.2383     1.2338
  0.01 level                      1.2783∗    1.2858∗    1.2842     1.3011     1.2965∗
  0.001 level                     1.3386∗    1.3412∗    1.3556     1.3823     1.3508∗
  p-value                         0.0309     0.4999     0.0000     0.0019     0.0063

2012–2014
  Test statistic                  1.2686     1.3540     1.2144     1.1574     1.9865
  0.10 level                      1.2062     1.2038     1.2086     1.2066∗    1.2045
  0.05 level                      1.2341     1.2290     1.2378∗    1.2322∗    1.2321
  0.01 level                      1.2823∗    1.2819     1.2936∗    1.2835∗    1.2889
  0.001 level                     1.3443∗    1.3536     1.3485∗    1.3602∗    1.3650
  p-value                         0.0075     0.0005     0.0442     0.1418     0.0000

Notes: ∗ Fails to reject the null hypothesis of no pricing error at this level of significance. The table presents the average F-statistic for 3-year rolling windows for the Fama–French 5-factor model (FF5), the Fama–French 5-factor model without HML (FF4), the Fama–French 5-factor model with the momentum factor of Jagadeesh and Titman (1993) and orthogonal HML (OHML), the Fama–French 5-factor model with the HML of Asness and Frazzini (2013) and the momentum factor (HML DEV), and the augmented Fama–French 3-factor model with the liquidity factor of Ibbotson et al. (2013) and the momentum factor (LIQUID).
number of sub-periods at all levels of significance, but the FF5 and FF4 models are a close second, with the HML DEV and LIQUID models in fourth and fifth place, respectively.
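The sub-period exercise above can be sketched as follows, assuming, as Table 10.4 suggests, that the 288 months are split into consecutive, non-overlapping 36-month windows; the `statistic` argument is a hypothetical placeholder for the average F-statistic computation, not an implementation of it.

```python
import numpy as np

def three_year_windows(returns, factors, statistic):
    """Apply a test-statistic function to consecutive, non-overlapping
    36-month windows of a T x N return matrix and T x K factor matrix."""
    T = returns.shape[0]
    out = []
    for start in range(0, T - 35, 36):
        out.append(statistic(returns[start:start + 36],
                             factors[start:start + 36]))
    return out

# Placeholder statistic (here just the window length) on dummy data
# shaped like the chapter's sample (288 months, 407 stocks, 5 factors):
rets = np.zeros((288, 407))
facs = np.zeros((288, 5))
window_stats = three_year_windows(rets, facs, lambda r, f: r.shape[0])
```

With 288 months this produces the eight windows of Table 10.4, 1991–1993 through 2012–2014.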
10.4.4 Relative performance of models

We are also interested in the relative performance of competing models. Following Fama and French (2015) and Hou et al. (2015b), we now turn our attention to evaluating the relative performance of alternative asset pricing models based on the various measures reported in Table 10.3, in an attempt to identify the model that best explains the cross-sectional variation in security returns. The average R² of the models ranges from 0.2515 to 0.2770. The OHML model has the highest average R², but the HML DEV model is a close second, followed by the LIQUID model in third place. Overall, these average R² values across models are hardly distinguishable. To identify the model that fares best in explaining the cross-sectional variation in asset returns, we now consider the magnitude of the average intercept, the average absolute value of the intercept, the number of significant intercepts, and especially the modified appraisal ratio. If an asset pricing model's performance is adequate, the intercept should be economically small and statistically insignificant from zero for all assets (Hou et al., 2015b). We see that the model with an average intercept nearest to zero is the FF4 model, which is the Fama and French (2015) 5-factor model without the value factor (HML). In second place is the HML DEV model, which is the Fama and French (2015) 5-factor model with the HML of Asness and Frazzini (2013) and the momentum factor. In third and fourth places are FF5 and OHML, respectively, and in last place is the LIQUID model. The average intercept value ranges from 0.1391 for the FF4 model to 0.4320 for the LIQUID model. The average absolute value of the intercept, A|αi|, is lowest for the OHML model, with FF5 a close second. The HML DEV, FF4, and LIQUID models are in third, fourth, and fifth place, respectively.
The OHML and FF5 models share the smallest number of significant intercepts at the various levels of significance, followed by the FF4, HML DEV, and LIQUID models in third, fourth, and fifth places, respectively. Looking at the modified appraisal ratio, OHML has the lowest average, while the FF5 and HML DEV models tie for second place, followed by the FF4 and LIQUID models in fourth and fifth place, respectively. It is apparent that the three measures of the average absolute value of the intercept, the number of significant intercepts, and the modified appraisal ratio are more indicative of the extent of pricing error in a model than the average intercept value. OHML performs best when judged by these three measures, and it has the highest R², the lowest F-statistic, and the highest p-value of all competing models. Additionally, the OHML model includes the
well-recognized momentum factor and resolves the redundancy of the value factor by replacing HML with the orthogonal OHML. Fama and French (2015) noted that there is a large value premium in average returns that is often targeted by money managers, even after the exposure to other factors is considered. It is worth noting that Fama and French (2016) use Fama and French's (2015) 5-factor model of equation (10.4) and the 6-factor OHML model of equation (10.6) to explain several anomalies that plague asset pricing models. Despite overwhelming evidence in the extant literature of the importance of the momentum factor in asset pricing, Fama and French's (2015) 5-factor model excludes the momentum factor. It is plausible that Fama and French (2015) considered the profitability factor a surrogate for the momentum factor. Hou et al. (2015b) find that the profitability factor, a factor much like Fama and French's (2015) RMW factor, has a correlation of 0.50 with the momentum factor, implying that it plays a role similar to the momentum factor in explaining average returns. However, the online data library of Professor French estimates and reports the momentum factor along with the five factors, thereby acknowledging the usefulness of the momentum factor in asset pricing. Moreover, Asness (2014) regressed the momentum factor on Fama and French's (2015) five factors using data from Professor French's online data library, and the intercept was large, positive, and statistically significant, unlike the intercept from the regression of HML on the other four factors in Fama and French (2015). This lends additional weight to the unique role of the momentum factor in asset pricing, even in the presence of the profitability and value factors. Asness (2014), however, notes that Fama and French (2015) see the negative correlation of momentum with value as justification, along with trying to limit dimensionality, for leaving momentum out (see Asness (2016) for details).
Professor Fama also indicates that it is more costly to implement a momentum rather than a lower-turnover strategy, and he believes that the higher turnover of momentum makes it implausible that risk can explain its high average returns (White, 2015). Asness et al. (2015a) also note that momentum is a higher turnover strategy than some other strategies (for example, value), and hence the question arises as to whether the premium for momentum covers trading costs. Korajczyk and Sadka (2004) examine the profitability of long positions in winner-based momentum strategies after accounting for the cost of trading. In particular, they estimate the size of a momentum-based fund that could be achieved before abnormal returns are either statistically insignificant or driven to zero. They investigate several trading cost models and momentum portfolio strategies and find that the
estimated excess returns of some momentum strategies disappear only after an initial investment of around $4.5 to $5.0 billion by a single fund manager. Berk and Binsbergen (2015) argue that because of high turnover, the associated high transaction costs, and the time and effort required to implement a momentum strategy, momentum index funds did not exist until recently. AQR Capital Management launched a momentum index fund with a $1 million minimum investment in 2009 and charges 75 basis points, which is close to the average fee of a sample of actively managed funds in Berk and Binsbergen (2015). Frazzini et al. (2015) estimate real-world trading costs for momentum, value, and size-based strategies using trades from a large institutional investor over a long period of time. Their conclusion is that per-dollar trading costs for momentum are quite low, and thus, despite the higher turnover, momentum easily survives transaction costs. Today's even lower trading costs would, of course, make momentum more viable and even cheaper to implement going forward (Asness, 2016). Asness (2014) and Asness et al. (2015a) note that both value and momentum work, but they work best together due to their negative correlation. A well-constructed value strategy diversifies momentum (and vice versa) so well that a combination strategy of the two is far better than either alone (Asness and Frazzini, 2013). Even if the OHML model adds more dimensionality to the cross-sectional variation in asset returns, it is a parsimonious yet effective multifactor asset pricing model. Since the late 1990s, Carhart's (1997) model has been the standard tool used to analyze and explain the performance of investment managers and investment strategies (Swedroe and Berkin, 2015). Carhart's (1997) 4-factor model includes the recognized momentum factor in explaining the variation in asset returns that was left out of Fama and French's (1993) 3-factor model.
The importance of the momentum factor in explaining asset prices has been acknowledged by Fama and French (1996, 2008), Schwert (2003), Asness (2014, 2016), and Asness et al. (2015a), among others. Although the momentum factor is a desirable addition to the market, size, and value/growth factors, it is not by itself sufficient for an equilibrium asset pricing model. The OHML model is a simple extension of Carhart's (1997) 4-factor model to include the orthogonal HML and Fama and French's (2015) profitability and investment factors. It is worth noting that no more than 10% of the intercepts for any of the competing models are significant at the 0.01, 0.001, and 0.0001 levels. The average t-statistic for the intercept ranges from 0.1709 for the FF4 model to 0.7354 for the LIQUID model. Many investors, analysts, and traders use
a positive intercept from the asset pricing model regression (alpha) as an initial screening tool for selecting securities for further security analysis. It is a measure of the abnormal return on a security in excess of what would be predicted by an equilibrium asset pricing model. In this regard, most of the competing models will be useful to investors, analysts, and traders for alpha-based preliminary security analysis.

Our results with respect to Fama and French's (2015) 5-factor model are not consistent with those of Jagadeesh et al. (2015) and Hou et al. (2015a). Jagadeesh et al. (2015) used a methodology that allows for using individual stocks, without grouping them into portfolios, in testing asset pricing models. To reduce the errors-in-variables (EIV) problem associated with estimating betas, which is seemingly more severe with individual stocks than with portfolios, they used the instrumental variable (IV) technique. Their results indicate that in Fama and French's (2015) 5-factor model, market, SMB, and HML risks are priced, while RMW and CMA risks are not priced in the cross-section of individual stock returns, and the intercept is statistically significant at the 5% level. Hou et al. (2015b) developed an empirical model consisting of the market factor, a size factor, an investment factor, and a profitability factor to explain the cross-sectional variation in stock returns in general, and to capture most (but not all) of the anomalies that plagued Fama and French's (1993) 3-factor model in particular. Hou et al. (2015a) compared the Hou et al. (2015b) model and Fama and French's (2015) 5-factor model both theoretically and empirically. In their empirical analysis, the 4-factor Hou et al. (2015b) model outperformed Fama and French's (2015) 5-factor model in capturing price and earnings momentum and profitability anomalies. However, one weakness in the Hou et al.
(2015a) is that securities are grouped into portfolios by attributes which leads to a loss of information. We did not test the explanatory power of the Hou et al. (2015b) model using individual stocks, but Jagadeesh et al. (2015) did. They show that in the Hou et al. (2015b) model, market and profitability risks are priced, while size and investment risks are not priced in the cross-section of individual stock returns, and the intercept is statistically significant at the 5% level. The pricing of market and profitability risks disappears when firm-level characteristics of size and B/M are included in the second stage cross-sectional regression. A few comments are in order about the methodology of Jagadeesh et al. (2015). First, the IV technique requires that an instrumental variable has certain properties in order to obtain consistent estimates of the parameters. Jagadeesh et al. (2015) used different sample observations of the same
Handbook of Financial Econometrics, . . . (Vol. 1)
S. Rahman & M. J. Schneider
variable as an instrument. If an explanatory variable is subject to measurement error, i.e., an errors-in-variables problem, it is likely correlated with the error term. Different sample observations of the same variable will suffer from the same problem as the original variable, and this weakness will prevent them from being good instruments. Second, unlike the GRS test for portfolios (as in Fama and French, 2015, 2016; Hou et al., 2011, 2015a, 2015b) or the average F-test for individual stocks (as in our study), there is no comparable multivariate test to judge the explanatory power of the model, or to estimate the magnitude of its pricing errors across all assets (i.e., average pricing errors), in a two-stage regression using instrumental variables. This is because the second-stage regression with instrumental variables uses cross-sectional data. One has to rely on summary statistics of the time series of the estimated intercepts from the second-stage cross-sectional regressions to draw conclusions about statistical significance.

Our results are to some extent similar to those of Barillas and Shanken (2018), who find that the models of Hou, Xue and Zhang (2015a, 2015b) and Fama and French (2015, 2016) are both dominated by a variety of models that include a momentum factor, along with value and profitability factors, thus rejecting the hypothesis of redundancy of the value and momentum factors. Nonetheless, it is worth noting that the methodology in this study is quite different from that of Barillas and Shanken (2018), who examined the models using portfolios sorted on size and momentum or on book-to-market and investment, with the unintended consequence of losing information at the individual security level.

10.4.5 Variance inflation factor analysis

The variance inflation factor (VIF) is a commonly used indicator of multicollinearity. It measures how much the variance of an estimated regression coefficient is inflated, yielding unreliable and unstable estimates, when the predictor variables, or regressors, are highly correlated. The VIF is estimated for each predictor variable by regressing it on all the other predictors and obtaining the R²; the VIF is the inverse of one minus that R². The larger the VIF of a regressor, the more collinear, and hence problematic, it is. As a rule of thumb, if the VIF of a variable exceeds 10, that variable is considered highly collinear (Belsley et al., 1980; Gujarati, 2004). However, others consider VIF values exceeding 5 alarming (Allison, 2012). We calculated the VIF for each regressor in all five models; the values are presented in Table 10.5. Only the LMH factor in the LIQUID model has a VIF exceeding 5. All other VIFs were less than three except
Table 10.5: Variance Inflation Factor (VIF) of regressors of alternative asset pricing models. January 1990–December 2014: 300 months.

Factor             FF5      FF4      OHML     HML DEV   LIQUID   FF5 PLUS
Market (RM)        1.4123   1.3556   1.4464   1.4871    2.4902   2.5228
Value (HML)        2.3021   –        1.0725   3.6855    2.1753   2.9815
Size (SMB)         1.3312   1.3311   1.3399   1.3460    1.4762   1.6098
Profit (RMW)       1.8273   1.4982   1.4982   1.6086    –        2.1022
Inv (CMA)          2.1979   1.1744   1.1765   2.0230    –        2.3749
Momentum (MOM)     –        –        1.1514   2.9518    1.2283   1.2370
Liquidity (LMH)    –        –        –        –         5.0376   5.8177

Note: Alternative models are the Fama–French 5-factor model (FF5), the Fama–French 5-factor model without HML (FF4), the Fama–French 5-factor model with the momentum factor of Jegadeesh and Titman (1993) and orthogonal HML (OHML), the Fama–French 5-factor model with the HML of Asness and Frazzini (2013) and the momentum factor (HML DEV), and the augmented Fama–French 3-factor model with the liquidity factor of Ibbotson et al. (2013) and the momentum factor (LIQUID). FF5 PLUS is FF5 with the momentum and liquidity factors. The VIF of a factor is the inverse of one minus the R² from a regression of that predictor variable on all other predictor variables in the model.
the value factor in the HML DEV model, which has a VIF of 3.6855. We added the liquidity and momentum factors to the FF5 model (listed as the FF5 PLUS model in the table) and calculated the VIF for the regressors. The VIFs of the liquidity and momentum factors were 5.81778 and 1.23701, respectively. We earlier noted a positive relationship between liquidity and momentum for US stocks (Sadka, 2006), as well as a negative relationship between value and liquidity and a positive relationship between liquidity and momentum globally across asset classes (Asness et al., 2013). These relationships of liquidity to value and momentum, along with the high VIF of the LMH factor in the LIQUID model, considerably weaken the effectiveness of the liquidity factor in explaining the cross-sectional variation in stock returns. On the other hand, the relatively low VIF of the momentum factor justifies its solid position in an asset pricing model.

Hou et al. (2015b) found a high correlation between the profitability factor and the momentum factor. They also traced a tight economic link between investment and book-to-market and found a significant correlation between HML and the investment factor. Additionally, Novy-Marx (2013) found some relationship between book-to-market and profitability: profitable firms tend to be growth firms, and unprofitable firms tend to be value firms. These findings seem to suggest interdependence among factor portfolios leading up to
multicollinearity and, consequently, distorted coefficient estimates and factor loadings. However, the predictor variables in the best-performing OHML model, as well as in Fama and French's (2015) 5-factor model, have low overall VIF values, which argues against the presence of multicollinearity or interdependence among the factors.

Given that the extant literature has ample evidence on the importance of the liquidity factor in asset pricing, we attempted to resurrect the LIQUID model as a viable model for explaining the cross-sectional variation in individual security returns. To resolve the alarming VIF value of the liquidity factor (LMH) in the LIQUID model, we followed Carhart (1997) and orthogonalized LMH: we regressed it on the four factors of market, size, value/growth, and momentum, and then replaced it with the sum of the intercept and residual from that regression, the orthogonal LMH (OLMH). This dropped the VIF from 5.0376 for LMH to 1.0 for OLMH, but it did not move the needle. The results, not reported in the chapter, show that substituting OLMH for LMH in the LIQUID model produces no improvement in model performance. The average F-test rejects the null hypothesis of no pricing error in the revised LIQUID model with an F-statistic of 1.4251, a p-value of 0.0000, and critical value thresholds of 1.1264, 1.1505, 1.1991, and 1.2588 for the 0.10, 0.05, 0.01, and 0.001 significance levels, respectively, from 100,000 simulations. These results complement those of Fama and French (2015, 2016), who tested the 5-factor model augmented with the liquidity factor of Pastor and Stambaugh (2003): the attribute-sorted portfolios had trivial loadings on the liquidity factor and negligible improvement in the regression intercepts, implying no noticeable difference in model performance.
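The VIF computation and the orthogonalization procedure described above can be sketched in Python. This is a minimal illustration with simulated factor returns; the function and variable names are ours, not the chapter's:

```python
import numpy as np

def vif(F):
    """VIF of each column of a (T, K) factor-return matrix.

    Each factor is regressed on the remaining K - 1 factors (plus an
    intercept); VIF_k = 1 / (1 - R^2_k) from that regression.
    """
    T, K = F.shape
    vifs = []
    for k in range(K):
        y = F[:, k]
        X = np.column_stack([np.ones(T), np.delete(F, k, axis=1)])
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        r2 = 1.0 - ((y - X @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

def orthogonalize(target, others):
    """Replace a factor with intercept + residual from regressing it on
    the other factors, as done for OLMH in the text."""
    T = target.shape[0]
    X = np.column_stack([np.ones(T), others])
    beta = np.linalg.lstsq(X, target, rcond=None)[0]
    return beta[0] + (target - X @ beta)  # intercept + residual
```

By construction the orthogonalized factor is uncorrelated in-sample with the regressors it was purged of, so its VIF falls to 1, mirroring the drop from 5.0376 to 1.0 reported above.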
10.4.6 Factor loadings and forces behind model performance

To reflect upon the forces behind the relative performance of the alternative models, we examine the factor loadings. Descriptive statistics for the regression coefficients of the factors (i.e., the factor loadings or factor betas) in the competing models are presented in Table 10.6. The regression coefficients for the market factor in the alternative models are mostly positive and highly significant and are not shown. The loadings for the size factor (SMB) are indistinguishable across the five models. Approximately half of the SMB factor loadings are significantly positive, and a small number are significantly negative, in all the models. The average loading for the value/growth factor (HML or its variations OHML and HML DEV) ranges from 0.1923 to 0.3044
Table 10.6: Regression coefficients of alternative asset pricing models. January 1990–December 2014: 300 months.

                                 FF5        FF4        OHML       HML DEV    LIQUID
Size (SMB)
  Average                        0.4616     0.4638     0.4811     0.4745     0.4206
  Minimum                       −0.8111    −0.8148    −0.8088    −0.7881    −0.7552
  Maximum                        2.4239     2.4401     2.6319     2.5505     1.7504
  Significant positive           195 (170)  195 (169)  202 (178)  200 (176)  171 (149)
  Significant negative           19 (12)    18 (12)    21 (10)    22 (13)    15 (11)

Value/Growth (HML/OHML/HML DEV)
  Average                        0.2807     –          0.2049     0.1923     0.3044
  Minimum                       −2.0574     –         −2.2133    −2.3706    −1.1855
  Maximum                        2.2996     –          2.0465     2.1098     2.5748
  Significant positive           113 (89)   –          97 (65)    77 (55)    119 (89)
  Significant negative           22 (13)    –          24 (15)    17 (9)     4 (2)

Profitability (RMW)
  Average                        0.2771     0.4034     0.4032     0.3554     –
  Minimum                       −2.8177    −3.4751    −3.4741    −3.1454     –
  Maximum                        1.4252     2.0888     2.0861     1.5971     –
  Significant positive           114 (94)   187 (161)  190 (164)  165 (126)  –
  Significant negative           9 (6)      15 (11)    16 (11)    10 (9)     –

Investment (CMA)
  Average                        0.1022     0.3848     0.3722     0.2021     –
  Minimum                       −1.8343    −2.0581    −2.0086    −1.8761     –
  Maximum                        1.6031     2.5408     2.5258     1.9548     –
  Significant positive           24 (11)    147 (118)  147 (118)  44 (24)    –
  Significant negative           5 (3)      11 (8)     12 (8)     4 (2)      –

Momentum (MOM)
  Average                        –          –         −0.1262    −0.0442    −0.1202
  Minimum                        –          –         −1.4011    −1.5675    −1.2355
  Maximum                        –          –          0.5556     0.8655     0.7139
  Significant positive           –          –          10 (4)     23 (8)     9 (6)
  Significant negative           –          –          86 (63)    29 (17)    77 (58)

Liquidity (LMH)
  Average                        –          –          –          –          0.0695
  Minimum                        –          –          –          –         −2.2857
  Maximum                        –          –          –          –          1.2723
  Significant positive           –          –          –          –          69 (41)
  Significant negative           –          –          –          –          31 (23)

Note: The table presents regression coefficients for the Fama–French 5-factor model (FF5), the Fama–French 5-factor model without HML (FF4), the Fama–French 5-factor model with the momentum factor of Jegadeesh and Titman (1993) and orthogonal HML (OHML), the Fama–French 5-factor model with the HML of Asness and Frazzini (2013) and the momentum factor (HML DEV), and the augmented Fama–French 3-factor model with the liquidity factor of Ibbotson et al. (2013) and the momentum factor (LIQUID). The test of significance is at the 0.01 level. The number of t-statistics larger than 3.0, which corresponds to a significance level of approximately 0.002, is in parentheses.
for the four models. There are more significant positive than significant negative HML factor loadings in all four models. It appears that the size factor is roughly twice as powerful as the value/growth factor in explaining the excess return on securities and is the most powerful of all factors, except for the market factor, in all the models. The average loading for the profitability factor (RMW) varies from 0.2771 to 0.4034, and that for the investment factor (CMA) from 0.1022 to 0.3848, across the four models. There are more significant positive than significant negative factor loadings for both RMW and CMA in all four models. The average loading for the momentum factor is negative across the three applicable models and ranges from −0.1262 to −0.0442. There are more significant negative than significant positive momentum factor loadings across all three models, implying that there are relatively fewer intermediate-term winners (i.e., stocks with high returns over the past 12 months) than losers (i.e., stocks with low returns) in our sample. The average loading for the liquidity factor (LMH) in the LIQUID model is 0.0695, which is very small relative to the other factor loadings in that model, and there are more significant positive than significant negative liquidity factor loadings, indicating relatively more liquid stocks in our sample. The liquidity factor is half as powerful as the momentum factor and substantially less powerful than the value/growth and size factors in explaining excess security returns in the LIQUID model. Overall, the OHML model has the highest number of significant factor loadings for the size, profitability, investment, and momentum factors. Only the FF5 model has more significant factor loadings for the value factor than the OHML model.
This makes sense because in the OHML model we replaced HML with orthogonal HML, which captures the incremental impact of the value factor beyond what is captured by the other factors. The OHML model is a refinement of Fama and French's (2015) five-factor model (FF5) and preserves most aspects of the FF5 model, apart from adding the momentum factor and replacing the observed value factor with the orthogonal value factor. The FF5 model is in second place in our empirical analysis for individual stocks and, judged by various measures, it outperforms the other three competing models. The FF5 model has the second-largest p-value and the second-lowest F-statistic, which fails to reject the null hypothesis of no pricing error in the model.

We further investigated the relative performance of the competing asset pricing models using the criterion recommended in Harvey et al. (2016). The authors note that hundreds of papers and hundreds of factors attempt to explain the cross-section of expected returns. Given this extensive data
mining, it does not make sense to use the usual statistical significance cutoffs (e.g., a t-statistic exceeding 2.0) in asset pricing tests, regardless of whether the factor is derived from theory or from a pure empirical exercise. They argue that a newly discovered factor needs to clear a much higher hurdle, with a t-statistic greater than at least 3.0, which corresponds to a p-value of 0.27%. We tested the competing models against this raised bar.11 As shown in Table 10.6, the OHML model has the largest number of statistically significant factor loadings (except for the value factor) when we use a t-statistic cutoff of 3.0. This, along with the smallest number of significant intercepts at the 0.001 level of significance (which corresponds to a t-statistic of 3.291), as shown in Table 10.3, the lowest F-statistic, and the highest p-value, makes a strong case for the OHML model. Our empirical results show that the OHML model is a viable candidate to replace Carhart's (1997) 4-factor model as an equilibrium asset pricing model and as a standard tool for evaluating the investment performance of managed portfolios. Simply adding the momentum factor and replacing the traditional value factor with the orthogonal value factor in Fama and French's (2015) 5-factor model of equation (10.4) produces a nontrivial improvement in model performance and statistically significant factor loadings for a large number of securities, as shown in Table 10.6.

10.4.7 GRS test of intra-industry securities

One assumption of the average F-test is that idiosyncratic returns are uncorrelated across securities, so that the variance-covariance matrix of residuals is diagonal, with all off-diagonal elements equal to zero. We now relax this assumption to allow the residuals of securities within an industry or sector to be cross-sectionally dependent, so that the matrix becomes block diagonal. However, firms in different sectors or industries are still assumed to have uncorrelated idiosyncratic returns.
To accomplish this, we employ the GRS test, which, unlike the average F-test, takes into account the full variance-covariance matrix of residuals when calculating the test statistic. We calculate the GRS statistic for each sector with a sizeable sample of securities to detect average pricing errors in the competing models in that sector. The GRS statistic is given by

F_{GRS} = \frac{T-N-K}{N} \, \frac{\hat{\alpha}' \hat{\Sigma}^{-1} \hat{\alpha}}{1 + \hat{\mu}_K' \hat{\Omega}_K^{-1} \hat{\mu}_K},    (10.11)

where \hat{\alpha} is the vector of OLS estimates of the intercepts of the N assets, \hat{\Sigma} is the estimated variance-covariance matrix of the residuals of securities in a sector, \hat{\mu}_K is the vector of sample means of the K factor returns, \hat{\Omega}_K is the estimated variance-covariance matrix of the factor returns, T is the number of time-series observations, N is the number of securities in a sector, and K is the number of factors. Under the null, F_{GRS} follows a central F distribution with N degrees of freedom in the numerator and T − N − K degrees of freedom in the denominator. F_{GRS} is a linear combination of the estimated second moments of the pricing errors, weighted by their variances and covariances (Knight and Satchell, 2005). Gibbons et al. (1989) note that the test gives a weighted measure of how much the whole set of assets deviates from their correct prices. Comparing the average F-statistic of equation (10.8) with the GRS statistic of equation (10.11) shows that the latter replaces the residual variances used in the former with the full variance-covariance matrix of residuals when calculating the statistic for detecting average pricing errors. However, the GRS test is inapplicable to our full sample, where N > T, but applicable to sectors or industries, where N < T.

Similar to Connor and Korajczyk (1988), we place securities into sectors using the two-digit NAICS code,12 which replaced the SIC industry code in 1997, and run OLS regressions for individual securities to estimate the intercepts. The GRS statistics for the alternative models in various sectors are presented in Table 10.7. The results in the table indicate that the GRS test fails to reject the null hypothesis of no pricing error for all competing models in all sectors except one, where the GRS statistics are significant at the 10%, 5%, and 1% levels for the FF4, HML DEV, and LIQUID models and at the 10% and 5% levels for the FF5 and OHML models, resulting in the rejection of the null at those levels.

11 While their focus is on t-statistics in cross-sectional regressions, Harvey et al. (2016) pointed out that their message applies to many different areas of finance. They cited Welch and Goyal (2008), who examined equity premium prediction, and Novy-Marx (2014), who proposed unconventional variables to predict anomaly returns, both using time-series regressions. We apply the same criterion of Harvey et al. (2016) to our time-series regressions.
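The GRS statistic of equation (10.11) can be sketched in numpy as follows. This is a simplified illustration (np.cov here uses the ddof = 1 convention, whereas the exact finite-sample GRS test uses maximum-likelihood covariance estimates), and the function name is ours:

```python
import numpy as np

def grs_statistic(alpha, resid, factors):
    """GRS F-statistic of equation (10.11).

    alpha   : (N,) OLS intercept estimates for the N securities in a sector
    resid   : (T, N) OLS residuals from the time-series factor regressions
    factors : (T, K) factor returns
    Requires T > N + K; under the null of zero pricing errors the statistic
    follows a central F(N, T - N - K) distribution.
    """
    T, N = resid.shape
    K = factors.shape[1]
    Sigma = np.cov(resid, rowvar=False)                    # N x N residual covariance
    Omega = np.atleast_2d(np.cov(factors, rowvar=False))   # K x K factor covariance
    mu = factors.mean(axis=0)                              # K-vector of mean factor returns
    quad_alpha = alpha @ np.linalg.solve(Sigma, alpha)     # alpha' Sigma^{-1} alpha
    quad_mu = mu @ np.linalg.solve(Omega, mu)              # mu' Omega^{-1} mu
    return (T - N - K) / N * quad_alpha / (1.0 + quad_mu)
```

Because \hat{\Sigma} must be inverted, the computation needs N < T, which is why the test is applied sector by sector rather than to the full sample, where N > T.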
The results further show that the FF5 model has the lowest pricing errors and rejection rate of the null hypothesis across all sectors using the GRS statistic, followed by the OHML model. The HML DEV and LIQUID models share third place, and the FF4 model is in fifth place. However, the OHML model has a lower GRS statistic than the FF5 model in more sectors: the OHML model has a lower GRS statistic in six sectors, while the FF5 model has a lower GRS statistic in the other four. These results do not significantly alter the position of the alternative models in terms of their relative performance based on the average F-test

12 Source: https://www.census.gov/eos/www/naics/.
Table 10.7: GRS statistic for alternative models in sectors. January 1990–December 2014: 300 months.

NAICS sector code   FF5        FF4         OHML       HML DEV     LIQUID
21                  1.0329     1.0097      0.9568     0.9191      0.9039
22                  0.4982     0.4983      0.4479     0.4247      0.4349
31                  0.4043     0.4554      0.4232     0.4364      0.5004
32                  0.8623     0.9561      0.7917     1.0175      0.8392
33                  0.8763     0.9138      0.8748     0.9722      0.9483
42                  0.7108     0.8312      0.9241     0.9806      1.0743
44                  0.9040     0.9962      1.0552     1.1720      1.3830
48                  1.4375     1.1079      1.3139     0.9555      1.3236
51                  1.8367∗∗   2.0591∗∗∗   1.9276∗∗   2.1461∗∗∗   2.0591∗∗∗
52                  0.6645     0.7247      0.5468     0.5590      0.4917

Notes: ∗∗∗ significant at the 0.10, 0.05, and 0.01 levels; ∗∗ significant at the 0.10 and 0.05 levels. The table shows the GRS statistic for alternative models in various sectors. Alternative models are the Fama–French 5-factor model (FF5), the Fama–French 5-factor model without HML (FF4), the Fama–French 5-factor model with the momentum factor of Jegadeesh and Titman (1993) and orthogonal HML (OHML), the Fama–French 5-factor model with the HML of Asness and Frazzini (2013) and the momentum factor (HML DEV), and the augmented Fama–French 3-factor model with the liquidity factor of Ibbotson et al. (2013) and the momentum factor (LIQUID).
and other measures used in this study. However, these results indicate that the GRS test is less powerful than the average F-test in detecting pricing errors in competing models, at least at the individual security level, and the former appears to be less discriminating than the latter in distinguishing the relative explanatory power of these models. Our results from the GRS test at the individual security level, which captures cross-sectionally correlated idiosyncratic returns within an industry or sector, are not consistent with those of Fama and French (2015, 2016) and Hou et al. (2011, 2015a, 2015b) at the portfolio level. Unlike these studies, we do not group securities into portfolios, and therefore no information regarding the cross-sectional behavior of individual securities is lost.

10.5 Conclusions

A large number of studies attempt to explain the cross-sectional variation in asset returns using multi-factor asset pricing models. Harvey et al. (2016) noted that various combinations of at least 316 factors have been used to explain the cross-section of expected returns over the last 10 years. We attempt to answer the question of which characteristics (factors) provide the best information about average returns and which are most important. To do
this, we empirically explore the explanatory power of the 5-factor model of Fama and French (2015); an extended version of the 4-factor offspring of the 3-factor model of Fama and French (1993), including the momentum and liquidity factors; a 5-factor model without the value/growth factor; a 6-factor model with momentum and orthogonal value/growth factors; and a 6-factor model with momentum and an alternative measure of the value/growth factor. We found that the augmented version of Fama and French's (2015) 5-factor model that includes the momentum and orthogonal value/growth factors outperformed all other models based on the average F-test and a number of other measures.

To reduce the errors-in-variables problem associated with beta estimates, prior studies tested asset pricing models by grouping securities into portfolios, which introduced a bias: a reduction of efficiency caused by the loss of information about the cross-sectional behavior of individual stocks. Our approach eliminates this bias by applying the alternative models directly to individual securities. We are able to do this by using the alternative multivariate F-test developed by Hwang and Satchell (2014) instead of the conventional multivariate F-test proposed by Gibbons et al. (1989). Importantly, our implementation allows the number of assets to exceed the number of time-series observations. These findings are supplementary to Fama and French (1993, 2015) and Hou et al. (2015a, 2015b), because these authors judge asset pricing models using only portfolios formed on attributes, resulting in a loss of information about individual securities. Our study fills a gap in the literature by testing the comparative performance of the models directly on individual securities.
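The per-security estimation underlying this approach, time-series OLS of each security's excess returns on the factor returns to obtain intercepts (alphas) and loadings, can be sketched as follows. This is a schematic with hypothetical inputs, not the authors' code:

```python
import numpy as np

def estimate_alphas_betas(excess_returns, factors):
    """Time-series OLS for each security: r_it = alpha_i + b_i' f_t + e_it.

    excess_returns : (T, N) excess returns of N individual securities
    factors        : (T, K) factor returns
    Returns (alphas (N,), betas (K, N), residuals (T, N)).
    """
    T = factors.shape[0]
    X = np.column_stack([np.ones(T), factors])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    resid = excess_returns - X @ coef
    return coef[0], coef[1:], resid
```

The estimated intercepts are what the average F-test evaluates, and the loadings are what Table 10.6 summarizes; no portfolio grouping is involved, so the cross-sectional behavior of each security is preserved.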
Our results show that the OHML model, which is the simple Fama and French 5-factor model augmented with momentum as per Jegadeesh and Titman (1993) and Carhart (1997) and an orthogonal value/growth factor, has the best explanatory power. Compared to the other models tested, the OHML model best explains the asset returns of individual securities under the average F-test, since it has the lowest average F-statistic of all five models. The OHML model also has intercepts across the sample securities bordering on zero, as per the average F-test, and the best modified appraisal ratio, which is an important measure of uniqueness. Furthermore, the OHML model has the smallest number of significant intercepts at the 0.01 and 0.001 significance levels. Therefore, we conclude that the variability of stock returns is best explained by the market, size, orthogonal value/growth, profitability, investment, and momentum factors. Our finding that the OHML model with momentum outperforms Fama and French's (2015) 5-factor model without momentum raises questions about the usefulness of the latter. As per Asness
(2016), momentum adds considerable return to the 5-factor model. Momentum is negatively correlated with value and largely independent of the other factors. For performance attribution purposes and for building real-world portfolios, momentum should be a useful addition to any factor model of stock returns. Moreover, our finding that the OHML model outperforms Fama and French's (2015) 5-factor model without HML is intriguing and raises questions about the justification for dropping HML in explaining the cross-sectional variation in asset returns. Furthermore, the low VIF and the large number of significant factor loadings (97 positive and 24 negative out of 407) for the HML factor in the OHML model confirm the incremental and unique contribution of the orthogonal value factor in explaining individual stock returns. Our results, based on the returns of a large sample of individual stocks, provide some clarity that is missing from the standard analysis of portfolios.
Bibliography

Acharya, V. and Pedersen, L. (2005). Asset Pricing with Liquidity Risk. Journal of Financial Economics, 77, 375–410.
Aharoni, G., Grundy, B. and Zeng, Q. (2013). Stock Returns and the Miller Modigliani Valuation Formula: Revisiting the Fama French Analysis. Journal of Financial Economics, 110, 347–357.
Allison, P. (2012). When Can You Safely Ignore Multicollinearity? Statistical Horizons, http://www.statisticalhorizons.com.
Amihud, Y. (2002). Illiquidity and Stock Returns: Cross-Section and Time-Series Effects. Journal of Financial Markets, 5, 31–56.
Amihud, Y. and Mendelson, H. (1986). Asset Pricing and the Bid-Ask Spread. Journal of Financial Economics, 17, 223–249.
Ang, A., Hodrick, R., Xing, Y. and Zhang, X. (2009). High Idiosyncratic Volatility and Low Returns: International and Further US Evidence. Journal of Financial Economics, 91, 1–23.
Ang, A., Liu, J. and Schwarz, K. (2010). Using Stocks or Portfolios in Tests of Factor Models. http://finance.wharton.upenn.edu/~kschwarz/Portfolios.pdf.
Antonacci, G. (2015). Dual Momentum Investing: An Innovative Strategy for Higher Returns with Lower Risk. McGraw-Hill Education, New York.
Ashcraft, A., Garleanu, N. and Pedersen, L. (2010). Two Monetary Tools: Interest Rates and Haircuts. NBER Macroeconomics Annual, 25, 143–180.
Asness, C. (2014). Our Model Goes to Six and Saves Value From Redundancy Along the Way. Working Paper, AQR Capital Management.
Asness, C. (2016). Fama on Momentum. Working Paper, AQR Capital Management.
Asness, C. and Frazzini, A. (2013). The Devil in HML's Details. Journal of Portfolio Management, 39, 49–67.
Asness, C., Frazzini, A., Israel, R. and Moskowitz, T. (2015a). Fact, Fiction, and Value Investing. Journal of Portfolio Management, 42, 34–52.
Asness, C., Ilmanen, A., Israel, R. and Moskowitz, T. (2015b). Investing with Style. Journal of Investment Management, 13, 27–63.
Asness, C., Moskowitz, T. and Pedersen, L. (2013). Value and Momentum Everywhere. Journal of Finance, 68, 929–985.
Barras, L., Scaillet, O. and Wermers, R. (2010). False Discoveries in Mutual Fund Performance: Measuring Luck in Estimated Alphas. Journal of Finance, 65, 179–216.
Barillas, F. and Shanken, J. (2018). Comparing Asset Pricing Models. Journal of Finance, 73, 715–754.
Belsley, D., Kuh, E. and Welsch, R. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley and Sons, New York.
Ben-Rephael, A., Kadan, O. and Wohl, A. (2015). The Diminishing Liquidity Premium. Journal of Financial and Quantitative Analysis, 50, 197–229.
Berk, J. (2000). Sorting Out Sorts. Journal of Finance, 55, 407–427.
Berk, J. and Binsbergen, J. (2015). Measuring Skill in the Mutual Fund Industry. Journal of Financial Economics, 118, 1–20.
Brennan, M. and Subrahmanyam, A. (1996). Market Microstructure and Asset Pricing: On the Compensation for Illiquidity in Stock Returns. Journal of Financial Economics, 41, 441–464.
Busse, J., Goyal, A. and Wahal, S. (2010). Performance and Persistence in Institutional Investment Management. Journal of Finance, 65, 765–790.
Carhart, M. M. (1997). On Persistence in Mutual Fund Performance. Journal of Finance, 52, 57–82.
Chen, N. F. and Ingersoll, J. (1983). Exact Pricing in Linear Factor Models with Finitely Many Assets: A Note. Journal of Finance, 38, 985–988.
Chen, N. F., Roll, R. and Ross, S. A. (1986). Economic Forces and the Stock Market. Journal of Business, 59, 383–403.
Chordia, T., Subrahmanyam, A. and Anshuman, V. (2001). Trading Activity and Expected Stock Returns. Journal of Financial Economics, 59, 3–32.
Chordia, T., Goyal, A. and Shanken, J. (2015). Cross-Sectional Asset Pricing with Individual Stocks: Betas versus Characteristics. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2549578.
Connor, G. (1984). A Unified Beta Pricing Theory. Journal of Economic Theory, 34, 13–31.
Connor, G. and Korajczyk, R. (1988). Risk and Return in an Equilibrium APT: Application of a New Test Methodology. Journal of Financial Economics, 21, 255–289.
DeMuth, P. (2014). What's Up with Fama and French's New 5-Factor Model? The Mysterious New Factor V. Forbes Online, (January 20). http://www.forbes.com/sites/phildemuth/2014/01/20/whats-up-with-fama-frenchs-new-5-factor-model-the-mysterious-new-factor-v.
Fama, E. F. (1973). A Note on the Market Model and the Two Parameter Model. Journal of Finance, 28, 1181–1185.
Fama, E. F. and French, K. R. (1992). The Cross-Section of Expected Stock Returns. Journal of Finance, 47, 427–465.
Fama, E. F. and French, K. R. (1993). Common Risk Factors in the Returns on Stocks and Bonds. Journal of Financial Economics, 33, 3–56.
Fama, E. F. and French, K. R. (1996). Multifactor Explanations of Asset Pricing Anomalies. Journal of Finance, 51, 55–84.
Fama, E. F. and French, K. R. (2008). Dissecting Anomalies. Journal of Finance, 63, 1653–1678.
Fama, E. F. and French, K. R. (2010). Luck versus Skill in the Cross-Section of Mutual Fund Returns. Journal of Finance, 65, 1915–1947.
page 428
July 6, 2020
11:57
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch10
Application of the Multivariate Average F-Test
429
Fama, E. F. and French, K. R. (2015). A Five-Factor Asset Pricing Model. Journal of Financial Economics, 116, 1–22. Fama, E. F. and French, K. R. (2016). Dissecting Anomalies with a Five-Factor Model. Review of Financial Studies, 29, 69–103. Ferson, W., Sarkissian, S. and Simin, T. (1999). The Alpha Factor Asset Pricing Model: A Parable. Journal of Financial Markets, 2, 49–68. Frazzini, A. and Pederson, L. (2014). Betting Against Beta. Journal of Financial Economics, 111, 1–25. Frazzini, A., Israel, R. and Moskowitz, T. (2015). Trading Costs of Asset Pricing Anomalies. Working Paper, University of Chicago. Gagliardini, P., Ossola, E. and Scaillet, O. (2016). A Time-Varying Risk Premium in Large Cross-Sectional Equity Data Sets. Econometrica, 84, 985–1046. Gibbons, M. (1982). Multivariate Tests of Financial Models: A New Approach. Journal of Financial Economics, 10, 3–27. Gibbons, M., Ross, S. and Shanken, J. (1989). A Test of the Efficiency of a Given Portfolio. Econometrica, 57, 1121–1152. Greene, W. (2012). Econometric analysis, 7th Edition, Pearson Education, Inc., New Jersey. Gujarati and Damodar (2004). Basic econometrics, 4th Edition, McGraw-Hill, New York. Harvey, C., Liu, Y. and Zhu, H. (2016). · · · and the Cross-Section of Expected Returns. Review of Financial Studies, 29, 5–68. Hou, K., Karolyi, A. and Kho, B. (2011). What Factors Drive Global Stock Returns? Review of Financial Studies, 24, 2528–2548. Hou, K., Xue, C. and Zhang, L. (2015a). A Comparison of New Factor Models. Fisher College of Business Working Paper, Columbus, Ohio State University. Hou, K., Xue, C. and Zhang, L. (2015b). Digesting Anomalies: An Investment Approach. Review of Financial Studies, 28, 650–705. Huberman, G. (1982). A Simple Approach to Arbitrage Pricing Theory. Journal of Economic Theory, 28, 183–191. Hwang, S. and Lu, C. (2007). Too Many Factors! Do We Need Them All? http://papers. ssrn.com/sol3/papers.cfm?abstract id=972022. Hwang, S. and Satchell, S. (2014). 
Testing Linear Factor Models on Individual Stocks Using the Average F-test. European Journal of Finance, 20, 463–498. Ibbotson, R., Chen, Z., Kim, D. and Hu, W. (2013). Liquidity as an Investment Style. Financial Analyst Journal, 69, 30–44. Ibbotson, R. and Idzorek, T. (2014). Dimensions of Popularity. Journal of Portfolio Management, 40, 68–74. Ibbotson, R. and Kim, D. (2015). Liquidity as an Investment Style: 2015 Update, Unpublished Paper. Zebra Capital Management. Idzorek, T., Xiong, J. and Ibbotson, R. (2012). The Liquidity Style of Mutual Funds. Financial Analysts Journal, 68, 38–53. Ingersoll, J. (1984). Some Results in the Theory of Arbitrage Pricing. Journal of Finance, 39, 1021–1039. Jegadeesh, N., Noh, J., Pukthuanthong, K., Roll, R. and Wang, J. (2015). Empirical Tests of Asset Pricing Models with Individual Assets: Resolving the Errors-in-Variables Bias in Risk Premium Estimation. http://papers.ssrn.com/sol3/papers.cfm?abstract id= 2664332. Jegadeesh, N. and Titman, S. (1993). Returns to Buying winners and Selling Losers: Implications for Stock Market Efficiency. Journal of Finance, 48, 93–130. Kim, D. (1995). The Errors in the Variables Problem in the Cross-Section of Expected Stock Returns. Journal of Finance, 50, 1605–1634.
page 429
July 6, 2020
11:57
430
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch10
S. Rahman & M. J. Schneider
Knight, J. and Satchell, S. (2005). Linear Factor Models in Finance. Elsevier Butterworth Heinemann, Amsterdam. Korajczyk, R. and Sadka, R. (2004). Are Momentum Profits Robust to Trading Costs? Journal of Finance, 59, 1039–1082. Korajczyk, R. and Sadka, R. (2008). Pricing the Commodity Across Measures of Liquidity. Journal of Financial Economics, 87, 45–72. Kosowski, R., Timmermann, A., Wermers, R. and White, H. (2006). Can Mutual Fund “Stars” Really Pick Stocks? New Evidence from a Bootstrap Analysis. Journal of Finance, 61, 2551–2595. Lintner, J. (1965). The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets. Review of Economics and Statistics, 47, 13–37. Litzenberger, R. and Ramaswamy, K. (1979). The Effect of Personal Taxes and Dividends on Capital Asset Prices: Theory and Empirical Evidence. Journal of Financial Economics, 7, 163–196. Lo, A. and MacKinlay, C. (1990). Data Snooping Biases in Tests of Financial Asset Pricing Models. Review of Financial Studies, 3, 431–468. Maddala, G. and Lahiri, K. (2010). Introduction to econometrics, 4th Edition, John Wiley and Sons, Inc., New Jersey. Mossin, J. (1966). Equilibrium in a Capital Asset Market. Econometrica, 34, 768–783. Novy-Marx, R. (2013). The Other Side of Value: The Gross Profitability Premium. Journal of Financial Economics, 108, 1–28. Novy-Marx, R. (2014). Predicting Anomaly Performance with Politics, the Weather, Global Warming, Sunspots, and the Stars. Journal of Financial Economics, 112, 137–146. Pastor, L. and Stambaugh, R. (2003). Liquidity Risk and Expected Stock Returns. Journal of Political Economy, 111, 642–685. Roll, R. (1977). A Critique of the Asset Pricing Theory’s Tests: Part 1: On Past and Present Potential Testability of the Theory. Journal of Financial Economics, 4, 129–176. Ross, S. A. (1976). The Arbitrage Theory of Capital Asset Pricing. Journal of Economic Theory, 13, 341–360. Ross, S. A. (1977). Risk, Return, and Arbitrage. 
In Risk and Return in Finance, edited by I. Friend and J. Bicksler, Vol. 1, Ballinger, Cambridge, MA. Sadka, R. (2006). Momentum and Post-Earnings-Announcement Drift Anomalies: The Role of Liquidity Risk. Journal of Financial Economics, 80, 309–349. Schwert, G. W. (2002). Anomalies and Market Efficiency. NBER Working Paper No. 9277, National Bureau of Economic Research, Cambridge, Massachusetts. Sharpe, W. (1964). Capital Asset Prices, A Theory of Market Equilibrium Under Conditions of Risk. Journal of Finance, 19, 425–442. Sharpe, W. (1988). Determining a Fund’s Effective Asset Mix. Investment Management Review, 2, 59–69. Sharpe, W. (1992). Asset Allocation: Management Style and Performance Measurement. Journal of Portfolio Management, 18, 7–19. Swedroe, L. and Berkin, A. (2015). Is Outperforming the Market Alpha or Beta? AAII Journal, 37, 11–15. Titman, S., Wei, K. and Xie, F. (2004). Capital Investments and Stock Returns. Journal of Financial and Quantitative Analysis, 39, 677–700. Welch, I. and Goyal, A. (2008). A Comprehensive Look at the Empirical Performance of Equity Premium Prediction. Review of Financial Studies, 21, 1455–1508. White, A. (2015). Investors from the Moon: Fama. http://www.top1000funds.com/ featured-homepage-posts/2015/12/11/investors-from-the-moon-fama.
page 430
July 6, 2020
11:57
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch11
Chapter 11
Hedge Ratio and Time Series Analysis*

Sheng-Syan Chen (National Chengchi University, email: [email protected])
Cheng Few Lee (Rutgers University, email: cfl[email protected])
Keshab Shrestha (Monash University, email: [email protected])

*This chapter is an update and expansion of the paper "Futures hedge ratios: A review", which was published in Quarterly Review of Economics and Finance, Vol. 43, pp. 433–465 (2003).

Contents
11.1 Introduction
11.2 Alternative Theories for Deriving the Optimal Hedge Ratio
    11.2.1 Static case
    11.2.2 Dynamic case
    11.2.3 Case with production and alternative investment opportunities
11.3 Alternative Methods for Estimating the Optimal Hedge Ratio
    11.3.1 Estimation of the minimum-variance (MV) hedge ratio
    11.3.2 Estimation of the optimum mean-variance and Sharpe hedge ratios
    11.3.3 Estimation of the maximum expected utility hedge ratio
    11.3.4 Estimation of mean extended-Gini (MEG) coefficient based hedge ratios
    11.3.5 Estimation of generalized semivariance (GSV)-based hedge ratios
11.4 Hedging Horizon, Maturity of Futures Contract, Data Frequency, and Hedging Effectiveness
11.5 Conclusion
Bibliography
Appendix 11A: Theoretical Models
Appendix 11B: Empirical Models
Appendix 11C: Monthly Data of S&P500 Index and its Futures
Abstract

This chapter discusses both static and dynamic hedge ratios in detail. In the static analysis, we discuss the minimum-variance hedge ratio, the Sharpe hedge ratio, and the optimum mean-variance hedge ratio. In addition, several time series analysis methods, such as the multivariate skew-normal distribution method, the autoregressive conditional heteroskedasticity (ARCH) and generalized autoregressive conditional heteroskedasticity (GARCH) methods, the regime-switching GARCH model, and the random coefficient method, are used to show how the hedge ratio can be estimated.

Keywords

Hedge ratio • Minimum variance hedge ratio • CARA utility function • Optimum mean variance hedge ratio • Sharpe hedge ratio • Maximum mean extended-Gini coefficient hedge ratio • Optimum mean MEG hedge ratio • Minimum generalized semivariance hedge ratio • Minimum value-at-risk hedge ratio • Multivariate skew-normal distribution method • ARCH method • GARCH method • Regime-switching GARCH method • Random coefficient method • Co-integration and error correction method • Hedging effectiveness.
11.1 Introduction

One of the best uses of derivative securities such as futures contracts is in hedging. In the past, both academicians and practitioners have shown great interest in the issue of hedging with futures. This is quite evident from the large number of papers written in this area. One of the main theoretical issues in hedging involves the determination of the optimal hedge ratio. However, the optimal hedge ratio depends on the particular objective function to be optimized. Many different objective
functions are currently being used. For example, one of the most widely-used hedging strategies is based on the minimization of the variance of the hedged portfolio (e.g., see Johnson, 1960; Ederington, 1979; Myers and Thompson, 1989). This so-called minimum-variance (MV) hedge ratio is simple to understand and estimate. However, the MV hedge ratio completely ignores the expected return of the hedged portfolio. Therefore, this strategy is in general inconsistent with the mean-variance framework unless the individuals are infinitely risk-averse or the futures price follows a pure martingale process (i.e., the expected futures price change is zero). Other strategies that incorporate both the expected return and risk (variance) of the hedged portfolio have been recently proposed (e.g., see Howard and D'Antonio, 1984; Cecchetti et al., 1988; Hsin et al., 1994). These strategies are consistent with the mean-variance framework. However, it can be shown that if the futures price follows a pure martingale process, then the optimal mean-variance hedge ratio will be the same as the MV hedge ratio. Another aspect of the mean-variance-based strategies is that even though they are an improvement over the MV strategy, for them to be consistent with the expected utility maximization principle, either the utility function needs to be quadratic or the returns should be jointly normal. If neither of these assumptions is valid, then the hedge ratio may not be optimal with respect to the expected utility maximization principle. Some researchers have solved this problem by deriving the optimal hedge ratio based on the maximization of the expected utility (e.g., see Cecchetti et al., 1988; Lence, 1995, 1996). However, this approach requires the use of a specific utility function and a specific return distribution. Attempts have been made to eliminate these specific assumptions regarding the utility function and return distributions.
Some of them involve the minimization of the mean extended-Gini (MEG) coefficient, which is consistent with the concept of stochastic dominance (e.g., see Cheung et al., 1990; Kolb and Okunev, 1992, 1993; Lien and Luo, 1993a; Shalit, 1995; Lien and Shaffer, 1999). Shalit (1995) shows that if the prices are normally distributed, then the MEG-based hedge ratio will be the same as the MV hedge ratio. Recently, hedge ratios based on the generalized semivariance (GSV) or lower partial moments have been proposed (e.g., see De Jong et al., 1997; Lien and Tse, 1998, 2000; Chen et al., 2001). These hedge ratios are also consistent with the concept of stochastic dominance. Furthermore, these GSV-based hedge ratios have another attractive feature whereby they measure portfolio risk by the GSV, which is consistent with the risk perceived by managers, because of its emphasis on the returns below the target return
(see Crum et al., 1981; Lien and Tse, 2000). Lien and Tse (1998) show that if the futures and spot returns are jointly normally distributed and if the futures price follows a pure martingale process, then the minimum-GSV hedge ratio will be equal to the MV hedge ratio. Finally, Hung et al. (2006) have proposed a related hedge ratio that minimizes the value-at-risk (VaR) of the hedged portfolio. This hedge ratio will also be equal to the MV hedge ratio if the futures price follows a pure martingale process.

Most of the studies mentioned above (except Lence (1995, 1996)) ignore transaction costs as well as investments in other securities. Lence (1995, 1996) derives the optimal hedge ratio where transaction costs and investments in other securities are incorporated in the model. Using a CARA utility function, Lence finds that under certain circumstances the optimal hedge ratio is zero; i.e., the optimal hedging strategy is not to hedge at all.

In addition to the use of different objective functions in the derivation of the optimal hedge ratio, previous studies also differ in terms of the dynamic nature of the hedge ratio. For example, some studies assume that the hedge ratio is constant over time. Consequently, these static hedge ratios are estimated using unconditional probability distributions (e.g., see Ederington, 1979; Howard and D'Antonio, 1984; Benet, 1992; Kolb and Okunev, 1992, 1993; Ghosh, 1993). On the other hand, several studies allow the hedge ratio to change over time. In some cases, these dynamic hedge ratios are estimated using conditional distributions associated with models such as autoregressive conditional heteroskedasticity (ARCH) and generalized autoregressive conditional heteroskedasticity (GARCH) (e.g., see Cecchetti et al., 1988; Baillie and Myers, 1991; Kroner and Sultan, 1993; Sephton, 1993a). The GARCH-based method has recently been extended by Lee and Yoder (2007), where a regime-switching model is used.
Alternatively, the hedge ratios can be made dynamic by considering a multi-period model where the hedge ratios are allowed to vary for different periods. This is the method used by Lien and Luo (1993b). When it comes to estimating the hedge ratios, many different techniques are currently being employed, ranging from simple to complex ones. For example, some of them use such a simple method as the ordinary least squares (OLS) technique (e.g., see Ederington, 1979; Malliaris and Urrutia, 1991; Benet, 1992). However, others use more complex methods such as the conditional heteroscedastic (ARCH or GARCH) method (e.g., see Cecchetti et al., 1988; Baillie and Myers, 1991; Sephton, 1993a), the random coefficient method (e.g., see Grammatikos and Saunders, 1983), the cointegration
method (e.g., see Ghosh, 1993; Lien and Luo, 1993b; Chou et al., 1996), or the cointegration-heteroscedastic method (e.g., see Kroner and Sultan, 1993). Recently, Lien and Shrestha (2007) have suggested the use of wavelet analysis to match the data frequency with the hedging horizon. Finally, Lien and Shrestha (2010) also suggest the use of the multivariate skew-normal distribution in estimating the minimum-variance hedge ratio.

It is quite clear that there are several different ways of deriving and estimating hedge ratios. In this chapter, we review these different techniques and approaches and examine their relations. The chapter is divided into five sections. In Section 11.2, alternative theories for deriving the optimal hedge ratios are reviewed. Various estimation methods are discussed in Section 11.3. This section is divided into five subsections. Section 11.3.1 discusses estimation of the MV hedge ratio, Section 11.3.2 discusses estimation of the optimum mean-variance and Sharpe hedge ratios, Section 11.3.3 discusses estimation of the maximum expected utility hedge ratio, Section 11.3.4 discusses estimation of the mean extended-Gini coefficient-based hedge ratio, and Section 11.3.5 discusses estimation of the GSV-based hedge ratio. Section 11.4 presents a discussion of the relationship among the length of the hedging horizon, maturity of the futures contract, data frequency, and hedging effectiveness. Finally, in Section 11.5 we provide a summary and conclusion.
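As a concrete illustration of the simplest of the estimation approaches listed above, the sketch below computes an OLS-based minimum-variance hedge ratio on simulated spot and futures returns, together with a moving-window version that lets the estimate vary over time. All data and parameter values here are hypothetical, purely for illustration; in practice the returns would come from market prices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated daily spot and futures returns (hypothetical data).
n = 750
Rf = rng.normal(0.0, 0.02, n)                 # futures returns
Rs = 0.9 * Rf + rng.normal(0.0, 0.01, n)      # spot returns, correlated with Rf

# Static MV hedge ratio: OLS slope of Rs on Rf, i.e., Cov(Rs, Rf)/Var(Rf).
h_static = np.cov(Rs, Rf)[0, 1] / np.var(Rf, ddof=1)

# Moving-window version: re-estimate over the most recent `window` returns,
# giving a simple time-varying hedge ratio.
window = 125
h_rolling = np.full(n, np.nan)
for t in range(window, n):
    rs_w, rf_w = Rs[t - window:t], Rf[t - window:t]
    h_rolling[t] = np.cov(rs_w, rf_w)[0, 1] / np.var(rf_w, ddof=1)
```

The moving-window estimator is the crudest form of the conditional (dynamic) approaches discussed above; the ARCH/GARCH and regime-switching methods replace the flat window with a parametric model of the conditional second moments.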
11.2 Alternative Theories for Deriving the Optimal Hedge Ratio

The basic concept of hedging is to combine investments in the spot market and futures market to form a portfolio that will eliminate (or reduce) fluctuations in its value. Specifically, consider a portfolio consisting of Cs units of a long spot position and Cf units of a short futures position.1 Let St and Ft denote the spot and futures prices at time t, respectively. Since the futures contracts are used to reduce the fluctuations in spot positions, the resulting portfolio is known as the hedged portfolio. The return on the hedged portfolio, Rh, is given by

Rh = (Cs St Rs − Cf Ft Rf)/(Cs St) = Rs − h Rf,  (11.1a)

1 Without loss of generality, we assume that the size of the futures contract is one.
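To make these definitions concrete, here is a small numerical check, with purely hypothetical prices and positions, that the two forms of equation (11.1a) agree:

```python
import numpy as np

# Hypothetical positions and prices (illustrative numbers only).
Cs, Cf = 100.0, 60.0        # units held: long spot, short futures
S0, S1 = 50.0, 52.0         # spot price at t and t+1
F0, F1 = 51.0, 52.5         # futures price at t and t+1

Rs = (S1 - S0) / S0         # one-period spot return
Rf = (F1 - F0) / F0         # one-period futures return
h = (Cf * F0) / (Cs * S0)   # hedge ratio h = Cf*Ft / (Cs*St)

# Return on the hedged portfolio, computed both ways in (11.1a).
Rh_direct = (Cs * S0 * Rs - Cf * F0 * Rf) / (Cs * S0)
Rh_via_h = Rs - h * Rf
assert np.isclose(Rh_direct, Rh_via_h)
# For these numbers, h is about 0.612 and Rh about 0.022.
```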
where

h = (Cf Ft)/(Cs St)

is the so-called hedge ratio, and

Rs = (St+1 − St)/St  and  Rf = (Ft+1 − Ft)/Ft

are the so-called one-period returns on the spot and futures positions, respectively. Sometimes, the hedge ratio is discussed in terms of price changes (profits) instead of returns. In this case the profit on the hedged portfolio, ΔVH, and the hedge ratio, H, are respectively given by

ΔVH = Cs ΔSt − Cf ΔFt  and  H = Cf/Cs,  (11.1b)

where ΔSt = St+1 − St and ΔFt = Ft+1 − Ft.

The main objective of hedging is to choose the optimal hedge ratio (either h or H). As mentioned above, the optimal hedge ratio will depend on the particular objective function to be optimized. Furthermore, the hedge ratio can be static or dynamic. In Sections 11.2.1 and 11.2.2, we will discuss the static hedge ratio and then the dynamic hedge ratio.

It is important to note that in the above setup, the cash position is assumed to be fixed and we only look for the optimum futures position. Most of the hedging literature assumes that the cash position is fixed, a setup that is suitable for financial futures. However, when we are dealing with commodity futures, the initial cash position becomes an important decision variable that is tied to the production decision. One such setup, considered by Lence (1995, 1996), will be discussed in Section 11.2.3.

11.2.1 Static case

We consider the hedge ratio to be static if it remains the same over time. The static hedge ratios reviewed in this chapter can be divided into nine categories, as shown in Table 11.1. We will discuss each of them in the following subsections.

11.2.1.1 Minimum-variance hedge ratio

The most widely-used static hedge ratio is the MV hedge ratio. Johnson (1960) derives this hedge ratio by minimizing the portfolio risk, where the
Table 11.1: A list of different static hedge ratios.
Hedge ratio                                     Objective function
• MV hedge ratio                                Minimize variance of Rh
• Optimum mean-variance hedge ratio             Maximize E(Rh) − (A/2) Var(Rh)
• Sharpe hedge ratio                            Maximize [E(Rh) − RF]/√Var(Rh)
• Maximum expected utility hedge ratio          Maximize E[U(W1)]
• Minimum MEG coefficient hedge ratio           Minimize Γv(Rh; v)
• Optimum mean-MEG hedge ratio                  Maximize E[Rh] − Γv(Rh; v)
• Minimum GSV hedge ratio                       Minimize Vδ,α(Rh)
• Maximum mean-GSV hedge ratio                  Maximize E[Rh] − Vδ,α(Rh)
• Minimum VaR hedge ratio over a
  given time period τ                           Minimize Zα σh √τ − E[Rh]τ

Notes:
1. Rh = return on the hedged portfolio, E(Rh) = expected return on the hedged portfolio, Var(Rh) = variance of return on the hedged portfolio, σh = standard deviation of return on the hedged portfolio, Zα = negative of the left percentile at α for the standard normal distribution, A = risk aversion parameter, RF = return on the risk-free security, E[U(W1)] = expected utility of end-of-period wealth, Γv(Rh; v) = mean extended-Gini coefficient of Rh, Vδ,α(Rh) = generalized semivariance of Rh.
2. With W1 given by equation (11.17), the maximum expected utility hedge ratio includes the hedge ratio considered by Lence (1995, 1996).
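Several of the objective functions in Table 11.1 have closed-form optima, but all of them can also be located numerically by a one-dimensional search over h. A minimal sketch for the first two rows (minimum variance, and mean-variance with risk aversion parameter A), using simulated returns and hypothetical parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated spot and futures returns (hypothetical data); the futures
# return has a small positive mean, so the two optima differ.
m = 20000
Rf = rng.normal(0.001, 0.02, m)
Rs = 0.8 * Rf + rng.normal(0.0, 0.01, m)

h_grid = np.linspace(-0.5, 2.0, 501)   # candidate hedge ratios
A = 10.0                               # risk aversion parameter (hypothetical)

def hedged(h):
    return Rs - h * Rf

# Row 1: minimize Var(Rh).
h_mv = h_grid[np.argmin([np.var(hedged(h), ddof=1) for h in h_grid])]

# Row 2: maximize E(Rh) - (A/2) Var(Rh); the positive E(Rf) pulls the
# optimum below the MV ratio (a speculative component enters).
h_mvu = h_grid[np.argmax([np.mean(hedged(h)) - 0.5 * A * np.var(hedged(h), ddof=1)
                          for h in h_grid])]
```

The grid search is deliberately naive; it is only meant to show that every row of the table reduces to a scalar optimization in h once the joint distribution of (Rs, Rf) is pinned down.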
risk is given by the variance of changes in the value of the hedged portfolio as follows:

Var(ΔVH) = Cs² Var(ΔS) + Cf² Var(ΔF) − 2 Cs Cf Cov(ΔS, ΔF).

The MV hedge ratio, in this case, is given by

HJ* = Cf/Cs = Cov(ΔS, ΔF)/Var(ΔF).  (11.2a)

Alternatively, if we use definition (11.1a) and use Var(Rh) to represent the portfolio risk, then the MV hedge ratio is obtained by minimizing Var(Rh), which is given by:

Var(Rh) = Var(Rs) + h² Var(Rf) − 2h Cov(Rs, Rf).
In this case, the MV hedge ratio is given by

hJ* = Cov(Rs, Rf)/Var(Rf) = ρ (σs/σf),  (11.2b)

where ρ is the correlation coefficient between Rs and Rf, and σs and σf are the standard deviations of Rs and Rf, respectively.

The attractive features of the MV hedge ratio are that it is easy to understand and simple to compute. However, in general the MV hedge ratio is not consistent with the mean-variance framework since it ignores the expected return on the hedged portfolio. For the MV hedge ratio to be consistent with the mean-variance framework, either the investors need to be infinitely risk-averse or the expected return on the futures contract needs to be zero.

11.2.1.2 Optimum mean-variance hedge ratio

Various studies have incorporated both risk and return in the derivation of the hedge ratio. For example, Hsin et al. (1994) derive the optimal hedge ratio that maximizes the following utility function:

Max_{Cf} V(E(Rh), σ; A) = E(Rh) − 0.5 A σh²,  (11.3)

where A represents the risk aversion parameter. It is clear that this utility function incorporates both risk and return. Therefore, the hedge ratio based on this utility function would be consistent with the mean-variance framework. The optimal number of futures contracts and the optimal hedge ratio are then given by

h2 = −(Cf* F)/(Cs S) = −[E(Rf)/(A σf²) − ρ (σs/σf)].  (11.4)

One problem associated with this type of hedge ratio is that in order to derive the optimum hedge ratio, we need to know the individual's risk aversion parameter. Furthermore, different individuals will choose different optimal hedge ratios, depending on the values of their risk aversion parameter.

Since the MV hedge ratio is easy to understand and simple to compute, it is interesting and useful to know under what condition the above hedge ratio would be the same as the MV hedge ratio. It can be seen from equations (11.2b) and (11.4) that if A → ∞ or E(Rf) = 0, then h2 would be equal to the MV hedge ratio hJ*. The first condition is simply a restatement of infinite risk aversion. However, the second condition does not impose any condition on risk-averseness, and this is important. It
implies that even if the individuals are not infinitely risk-averse, the MV hedge ratio would be the same as the optimal mean-variance hedge ratio if the expected return on the futures contract is zero (i.e., futures prices follow a simple martingale process). Therefore, if futures prices follow a simple martingale process, then we do not need to know the risk aversion parameter of the investor to find the optimal hedge ratio.

11.2.1.3 Sharpe hedge ratio

Another way of incorporating the portfolio return in the hedging strategy is to use the risk-return tradeoff (Sharpe measure) criterion. Howard and D'Antonio (1984) consider the optimal level of futures contracts by maximizing the ratio of the portfolio's excess return to its volatility:

Max_{Cf} θ = [E(Rh) − RF]/σh,  (11.5)

where σh² = Var(Rh) and RF represents the risk-free interest rate. In this case the optimal number of futures positions, Cf*, is given by

Cf* = −Cs (S σs/(F σf)) × [(σs/σf)(E(Rf)/(E(Rs) − RF)) − ρ] / [1 − (σs/σf)(E(Rf) ρ/(E(Rs) − RF))].  (11.6)

From the optimal futures position, we can obtain the following optimal hedge ratio:

h3 = −(σs/σf) [(σs/σf)(E(Rf)/(E(Rs) − RF)) − ρ] / [1 − (σs/σf)(E(Rf) ρ/(E(Rs) − RF))].  (11.7)

Again, if E(Rf) = 0, then h3 reduces to

h3 = ρ (σs/σf),  (11.8)
which is the same as the MV hedge ratio h∗J . As pointed out by Chen et al. (2001), the Sharpe ratio is a highly nonlinear function of the hedge ratio. Therefore, it is possible that equation (11.7), which is derived by equating the first derivative to zero, may lead to the hedge ratio that would minimize, instead of maximizing, the Sharpe ratio. This would be true if the second derivative of the Sharpe ratio with respect to the hedge ratio is positive instead of negative. Furthermore, it is possible
that the optimal hedge ratio may be undefined, as in the case encountered by Chen et al. (2001), where the Sharpe ratio monotonically increases with the hedge ratio.

11.2.1.4 Maximum expected utility hedge ratio

So far we have discussed the hedge ratios that incorporate only risk as well as the ones that incorporate both risk and return. The methods, which incorporate both the expected return and risk in the derivation of the optimal hedge ratio, are consistent with the mean-variance framework. However, these methods may not be consistent with the expected utility maximization principle unless either the utility function is quadratic or the returns are jointly normally distributed. Therefore, in order to make the hedge ratio consistent with the expected utility maximization principle, we need to derive the hedge ratio that maximizes the expected utility. However, in order to maximize the expected utility, we need to assume a specific utility function. For example, Cecchetti et al. (1988) derive the hedge ratio that maximizes the expected utility where the utility function is assumed to be the logarithm of terminal wealth. Specifically, they derive the optimal hedge ratio that maximizes the following expected utility function:

∫_Rs ∫_Rf log[1 + Rs − h Rf] f(Rs, Rf) dRs dRf,

where the density function f(Rs, Rf) is assumed to be bivariate normal. A third-order linear bivariate ARCH model is used to get the conditional variance and covariance matrix, and a numerical procedure is used to maximize the objective function with respect to the hedge ratio.2

11.2.1.5 Minimum mean extended-Gini coefficient hedge ratio

This approach of deriving the optimal hedge ratio is consistent with the concept of stochastic dominance and involves the use of the MEG coefficient. Cheung et al. (1990), Kolb and Okunev (1992), Lien and Luo (1993a), Shalit (1995), and Lien and Shaffer (1999) all consider this approach. It minimizes the MEG coefficient Γν(Rh) defined as follows:

Γν(Rh) = −ν Cov(Rh, (1 − G(Rh))^(ν−1)),  (11.9)

2 Lence (1995) also derives the hedge ratio based on the expected utility. We will discuss it later in Section 11.2.3.
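The maximization just described has no closed form, so it is done numerically. The sketch below mimics the idea with plain Monte Carlo draws in place of the fitted ARCH conditional moments (all parameter values are hypothetical). The futures returns are demeaned in-sample so that E(Rf) = 0, in which case the expected-utility-maximizing ratio should come out close to the MV hedge ratio ρσs/σf:

```python
import numpy as np

rng = np.random.default_rng(2)

# Draws from a bivariate normal for (Rs, Rf); hypothetical moments.
sig_s, sig_f, rho = 0.02, 0.025, 0.9
cov = [[sig_s**2, rho * sig_s * sig_f],
       [rho * sig_s * sig_f, sig_f**2]]
Rs, Rf = rng.multivariate_normal([0.01, 0.0], cov, 20000).T
Rf -= Rf.mean()            # impose the martingale condition E(Rf) = 0 in-sample

def expected_log_utility(h):
    # Sample analogue of E[log(1 + Rs - h*Rf)].
    return np.mean(np.log1p(Rs - h * Rf))

# Numerical maximization over h by a simple grid search.
h_grid = np.linspace(0.0, 1.5, 1501)
h_star = h_grid[np.argmax([expected_log_utility(h) for h in h_grid])]

h_mv = np.cov(Rs, Rf)[0, 1] / np.var(Rf, ddof=1)   # MV benchmark
```

With returns this small, the log-utility optimum differs from the MV ratio only through higher-order moments, so the two estimates should nearly coincide here.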
where G is the cumulative probability distribution and ν is the risk aversion parameter. Note that 0 ≤ ν < 1 implies risk seekers, ν = 1 implies risk-neutral investors, and ν > 1 implies risk-averse investors. Shalit (1995) has shown that if the futures and spot returns are jointly normally distributed, then the minimum-MEG hedge ratio would be the same as the MV hedge ratio.

11.2.1.6 Optimum mean-MEG hedge ratio

Instead of minimizing the MEG coefficient, Kolb and Okunev (1993) alternatively consider maximizing the utility function defined as follows:

U(Rh) = E(Rh) − Γν(Rh).  (11.10)
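Neither (11.9) nor (11.10) requires a distributional assumption to estimate: in practice G is replaced by the empirical CDF (ranks) and the optimum is found numerically. A sketch on simulated data (hypothetical parameters; with jointly normal returns the minimum-MEG ratio should land near the MV ratio, consistent with Shalit (1995)):

```python
import numpy as np

def meg(rh, v):
    """Empirical mean extended-Gini coefficient, equation (11.9),
    with G replaced by the empirical CDF (ranks)."""
    n = len(rh)
    g = (np.argsort(np.argsort(rh)) + 1) / n       # G(rh_i) ~ rank_i / n
    return -v * np.cov(rh, (1.0 - g) ** (v - 1.0))[0, 1]

rng = np.random.default_rng(3)
Rf = rng.normal(0.0, 0.02, 4000)
Rs = 0.7 * Rf + rng.normal(0.0, 0.01, 4000)

h_grid = np.linspace(0.0, 1.5, 301)
v = 5.0                                            # risk aversion (v > 1)

# Minimum-MEG hedge ratio (11.9) and mean-MEG hedge ratio (11.10).
h_meg = h_grid[np.argmin([meg(Rs - h * Rf, v) for h in h_grid])]
h_mmeg = h_grid[np.argmax([np.mean(Rs - h * Rf) - meg(Rs - h * Rf, v)
                           for h in h_grid])]
```

Because the simulated futures returns have (approximately) zero mean, the mean term barely moves the optimum, so h_mmeg stays close to h_meg, illustrating the martingale equivalence noted in the text.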
The hedge ratio based on the utility function defined by equation (11.10) is denoted as the M-MEG hedge ratio. The difference between the MEG and M-MEG hedge ratios is that the MEG hedge ratio ignores the expected return on the hedged portfolio. Again, if the futures price follows a martingale process (i.e., E(Rf) = 0), then the MEG hedge ratio would be the same as the M-MEG hedge ratio.

11.2.1.7 Minimum generalized semivariance hedge ratio

In recent years, a new approach for determining the hedge ratio has been suggested (see De Jong et al., 1997; Lien and Tse, 1998, 2000; Chen et al., 2001). This new approach is based on the relationship between the GSV and expected utility as discussed by Fishburn (1977) and Bawa (1978). In this case the optimal hedge ratio is obtained by minimizing the GSV given as follows:

Vδ,α(Rh) = ∫_{−∞}^{δ} (δ − Rh)^α dG(Rh),  α > 0,  (11.11)
where G(Rh ) is the probability distribution function of the return on the hedged portfolio Rh . The parameters δ and α (which are both real numbers) represent the target return and risk aversion, respectively. The risk is defined in such a way that the investors consider only the returns below the target return (δ) to be risky. It can be shown (see Fishburn, 1977) that α < 1 represents a risk-seeking investor and α > 1 represents a risk-averse investor. The GSV, due to its emphasis on the returns below the target return, is consistent with the risk perceived by managers (see Crum et al., 1981; Lien and Tse, 2000). Furthermore, as shown by Fishburn (1977) and Bawa (1978),
the GSV is consistent with the concept of stochastic dominance. Lien and Tse (1998) show that the GSV hedge ratio, which is obtained by minimizing the GSV, would be the same as the MV hedge ratio if the futures and spot returns are jointly normally distributed and if the futures price follows a pure martingale process.

11.2.1.8 Optimum mean-generalized semivariance hedge ratio

Chen et al. (2001) extend the GSV hedge ratio to a mean-GSV (M-GSV) hedge ratio by incorporating the mean return in the derivation of the optimal hedge ratio. The M-GSV hedge ratio is obtained by maximizing the following mean-risk utility function, which is similar to the conventional mean-variance-based utility function (see equation (11.3)):

U(R_h) = E[R_h] − V_{δ,α}(R_h).  (11.12)
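Equations (11.11) and (11.12) can be taken to data by replacing the integral with a sample average and searching over candidate hedge ratios. The Python sketch below uses simulated price changes, not the chapter's data, and all names are hypothetical; it also illustrates the martingale result stated in the text, since the futures series is de-meaned so that E(R_f) = 0 in-sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated percentage price changes (hypothetical data, for illustration only)
rf = rng.normal(0.0, 1.0, 5000)
rf -= rf.mean()                                   # impose E(Rf) = 0 (martingale case)
rs = 0.9 * rf + rng.normal(0.0, 0.5, 5000)

def sample_gsv(h, delta=0.0, alpha=2.0):
    """Sample analogue of eq. (11.11) for the hedged return R_h = Rs - h*Rf."""
    r_h = rs - h * rf
    return np.mean(np.maximum(delta - r_h, 0.0) ** alpha)

grid = np.linspace(0.0, 2.0, 2001)
gsv_values = np.array([sample_gsv(h) for h in grid])

# GSV hedge ratio: minimize the sample GSV, eq. (11.11)
h_gsv = grid[np.argmin(gsv_values)]

# M-GSV hedge ratio: maximize U(R_h) = E[R_h] - V_{delta,alpha}(R_h), eq. (11.12)
utility = np.array([np.mean(rs - h * rf) for h in grid]) - gsv_values
h_mgsv = grid[np.argmax(utility)]

print(h_gsv, h_mgsv)  # with E(Rf) = 0 the two ratios coincide, as the text notes
```

Because the simulated futures series has a zero sample mean, the expected-return term in (11.12) does not depend on h, so the grid search returns the same ratio for both criteria.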
This approach to the hedge ratio does not use the risk aversion parameter to multiply the GSV as done in conventional mean-risk models (see Hsin et al., 1994, and equation (11.3)). This is because the risk aversion parameter is already included in the definition of the GSV, V_{δ,α}(R_h). As before, the M-GSV hedge ratio would be the same as the GSV hedge ratio if the futures price follows a pure martingale process.

11.2.1.9 Minimum value-at-risk hedge ratio

Hung et al. (2006) suggest a new hedge ratio that minimizes the VaR of the hedged portfolio. Specifically, the hedge ratio h is derived by minimizing the following VaR of the hedged portfolio over a given time period τ:

VaR(R_h) = Z_α σ_h √τ − E[R_h]τ.  (11.13)

The resulting optimal hedge ratio, which Hung et al. (2006) refer to as the zero-VaR hedge ratio, is given by

h^VaR = ρ (σ_s/σ_f) − E[R_f] (σ_s/σ_f) √[(1 − ρ²)/(Z_α² σ_f² − E[R_f]²)].  (11.14)

It is clear that, if the futures price follows a martingale process, the zero-VaR hedge ratio would be the same as the MV hedge ratio.

11.2.2 Dynamic case

We have up to now examined the situations in which the hedge ratio is fixed at the optimum level and is not revised during the hedging period. However, it could be beneficial to change the hedge ratio over time. One way to allow
Hedge Ratio and Time Series Analysis
the hedge ratio to change is by recalculating the hedge ratio based on the current (or conditional) information on the covariance (σ_sf) and variance (σ_f²). This involves calculating the hedge ratio based on conditional information (i.e., σ_sf|Ω_{t−1} and σ_f²|Ω_{t−1}) instead of unconditional information. In this case, the MV hedge ratio is given by

h_1|Ω_{t−1} = − σ_sf|Ω_{t−1} / σ_f²|Ω_{t−1}.
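One simple way to form the conditional moments σ_sf|Ω_{t−1} and σ_f²|Ω_{t−1} is to re-estimate them at each date from a moving window of recent observations. A minimal Python sketch (simulated price changes; all names hypothetical) computes the ratio σ_sf/σ_f² at each date; the futures position would then be the negative of this ratio, per the sign convention of h_1|Ω_{t−1}.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
d_f = rng.normal(0.0, 1.0, n)                  # simulated futures price changes
d_s = 0.8 * d_f + rng.normal(0.0, 0.5, n)      # simulated spot price changes

def rolling_mv_hedge(ds, df, window=100):
    """At each date t, estimate sigma_sf / sigma_f^2 from the last `window`
    observations only, i.e., a moving-window analogue of conditioning on
    the information set Omega_{t-1}."""
    ratios = []
    for t in range(window, len(ds)):
        s, f = ds[t - window:t], df[t - window:t]
        cov = np.cov(s, f)[0, 1]
        var = np.var(f, ddof=1)
        ratios.append(cov / var)
    return np.array(ratios)

h_t = rolling_mv_hedge(d_s, d_f)
print(h_t.mean())  # fluctuates around the population ratio 0.8 in this design
```

Each date thus gets its own hedge ratio, which is the simplest of the dynamic schemes discussed next (ARCH/GARCH and regime-switching models refine the same idea with explicit conditional-variance dynamics).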
The adjustment to the hedge ratio based on new information can be implemented using such conditional models as ARCH and GARCH (to be discussed later) or using the moving-window estimation method. Another way of making the hedge ratio dynamic is by using the regime-switching GARCH model (to be discussed later) as suggested by Lee and Yoder (2007). This model assumes two different regimes, where each regime is associated with a different set of parameters, and the probabilities of regime switching must also be estimated when implementing such a method. Alternatively, we can allow the hedge ratio to change during the hedging period by considering multi-period models, which is the approach used by Lien and Luo (1993b). Lien and Luo (1993b) consider hedging with a T-period planning horizon and minimize the variance of the wealth at the end of the planning horizon, W_T. Consider the situation where C_{s,t} is the spot position at the beginning of period t and the corresponding futures position is given by C_{f,t} = −b_t C_{s,t}. The wealth at the end of the planning horizon, W_T, is then given by

W_T = W_0 + Σ_{t=0}^{T−1} C_{s,t}[S_{t+1} − S_t − b_t(F_{t+1} − F_t)]
    = W_0 + Σ_{t=0}^{T−1} C_{s,t}[ΔS_{t+1} − b_t ΔF_{t+1}].  (11.15)

The optimal b_t's are given by the following recursive formula:

b_t = Cov(ΔS_{t+1}, ΔF_{t+1})/Var(ΔF_{t+1})
    + Σ_{i=t+1}^{T−1} (C_{s,i}/C_{s,t}) Cov(ΔF_{t+1}, ΔS_{i+1} + b_i ΔF_{i+1})/Var(ΔF_{t+1}).  (11.16)
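The recursion in equation (11.16) can be implemented backward from t = T − 1, where the cross-period sum is empty. A Python sketch on simulated, serially independent price changes (hypothetical data and positions); with no serial correlation the cross-period covariances are near zero and every b_t stays close to the single-period MV ratio.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_paths = 4, 20000
# Serially independent changes: Delta F_t i.i.d. across t,
# with Delta S_t correlated only with the same-period Delta F_t
d_f = rng.normal(0.0, 1.0, (n_paths, T))
d_s = 0.6 * d_f + rng.normal(0.0, 0.8, (n_paths, T))

c_s = np.ones(T)          # spot positions C_{s,t}, all equal for simplicity

# Backward recursion of eq. (11.16): column t stands for the change over period t+1
b = np.zeros(T)
for t in range(T - 1, -1, -1):
    var_f = np.var(d_f[:, t], ddof=1)
    b[t] = np.cov(d_s[:, t], d_f[:, t])[0, 1] / var_f
    for i in range(t + 1, T):
        cross = np.cov(d_f[:, t], d_s[:, i] + b[i] * d_f[:, i])[0, 1]
        b[t] += (c_s[i] / c_s[t]) * cross / var_f

print(b)  # each b_t is close to the single-period ratio 0.6 in this design
```

Introducing serial correlation between current futures changes and future spot or futures changes would make the second term nonzero and the b_t's time-varying, which is exactly the point made in the text.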
It is clear from equation (11.16) that the optimal hedge ratio b_t will change over time. The multi-period hedge ratio will differ from the single-period
hedge ratio due to the second term on the right-hand side of equation (11.16). However, it is interesting to note that the multi-period hedge ratio would be different from the single-period one only if the changes in current futures prices are correlated with the changes in future futures prices or with the changes in future spot prices.

11.2.3 Case with production and alternative investment opportunities

All the models considered in Sections 11.2.1 and 11.2.2 assume that the spot position is fixed or predetermined, and thus production is ignored. As mentioned earlier, such an assumption may be appropriate for financial futures. However, when we consider commodity futures, production should be considered, in which case the spot position becomes one of the decision variables. In an important chapter, Lence (1995) extends the model with a fixed or predetermined spot position to a model where production is included. In his model, Lence (1995) also incorporates the possibility of investing in a risk-free asset and other risky assets, borrowing, as well as transaction costs. We briefly discuss the model considered by Lence (1995) as follows.

Lence (1995) considers a decision maker whose utility is a function of terminal wealth U(W_1), such that U′ > 0 and U′′ < 0. At the decision date (t = 0), the decision maker will engage in the production of Q commodity units for sale at the terminal date (t = 1) at the random cash price P_1. At the decision date, the decision maker can lend L dollars at the risk-free lending rate (R_L − 1), borrow B dollars at the borrowing rate (R_B − 1), invest I dollars in a different activity that yields a random rate of return (R_I − 1), and sell X futures at the futures price F_0. The transaction cost for the futures trade is f dollars per unit of the commodity traded, to be paid at the terminal date. The terminal wealth (W_1) is therefore given by

W_1 = W_0 R = P_1 Q + (F_0 − F_1)X − f|X| − R_B B + R_L L + R_I I,  (11.17)

where R is the return on the diversified portfolio. The decision maker will maximize the expected utility subject to the following restrictions:

W_0 + B ≥ v(Q)Q + L + I,
0 ≤ B ≤ k_B v(Q)Q,  k_B ≥ 0,
L ≥ k_L F_0|X|,  k_L ≥ 0,
I ≥ 0,

where v(Q) is the average cost function, k_B is the maximum amount (expressed as a proportion of his initial wealth) that the agent can borrow, and k_L is the safety margin for the futures contract.
Using this framework, Lence (1995) introduces two opportunity costs: the opportunity cost of an alternative (sub-optimal) investment (c_alt) and the opportunity cost of estimation risk (e_Bayes).³ Let R_opt be the return of the expected-utility maximizing strategy and let R_alt be the return on a particular alternative (sub-optimal) investment strategy. The opportunity cost of the alternative investment strategy c_alt is then given by

E[U(W_0 R_opt)] = E[U(W_0 R_alt + c_alt)].  (11.18)

In other words, c_alt is the minimum certain net return required by the agent to invest in the alternative (sub-optimal hedging) strategy rather than in the optimum strategy. Using the CARA utility function and some simulation results, Lence (1995) finds that the expected-utility maximizing hedge ratios are substantially different from the MV hedge ratios. He also shows that under certain conditions the optimal hedge ratio is zero; i.e., the optimal strategy is not to hedge at all.

Similarly, the opportunity cost of the estimation risk (e_Bayes) is defined as follows:

E_ρ[E[U(W_0 R_opt(ρ) − e_Bayes)]] = E_ρ[E[U(W_0 R_opt^Bayes)]],  (11.19)

where R_opt(ρ) is the expected-utility maximizing return where the agent knows with certainty the value of the correlation between the futures and spot prices (ρ), R_opt^Bayes is the expected-utility maximizing return where the agent only knows the distribution of the correlation ρ, and E_ρ[·] is the expectation with respect to ρ. Using simulation results, Lence (1995) finds that the opportunity cost of the estimation risk is negligible and thus the value of the use of sophisticated estimation methods is negligible.

11.3 Alternative Methods for Estimating the Optimal Hedge Ratio

In Section 11.2, we discussed different approaches to deriving the optimum hedge ratios. However, in order to apply these optimum hedge ratios in practice, we need to estimate them. There are various ways of estimating them. In this section, we briefly discuss these estimation methods.
³ Our discussion of the opportunity costs is very brief. We would like to refer interested readers to Lence (1995) for a detailed discussion. We would also like to point to the fact that production can be allowed to be random, as is done in Lence (1996).
11.3.1 Estimation of the minimum-variance (MV) hedge ratio

11.3.1.1 OLS method

The conventional approach to estimating the MV hedge ratio involves the regression of the changes in spot prices on the changes in futures prices using the OLS technique (e.g., see Junkus and Lee, 1985). Specifically, the regression equation can be written as

ΔS_t = a_0 + a_1 ΔF_t + e_t,  (11.20)

where the estimate of the MV hedge ratio, H_J, is given by a_1. The OLS technique is quite robust and simple to use. However, for the OLS technique to be valid and efficient, the assumptions associated with the OLS regression must be satisfied. One case where the assumptions are not completely satisfied is when the error term in the regression is heteroscedastic. This situation will be discussed later. Another problem with the OLS method, as pointed out by Myers and Thompson (1989), is the fact that it uses unconditional sample moments instead of conditional sample moments, which use currently available information. They suggest the use of the conditional covariance and conditional variance in equation (11.2a). In this case, the conditional version of the optimal hedge ratio (equation (11.2a)) will take the following form:

H_J^* = C_f/C_s = Cov(ΔS, ΔF)|Ω_{t−1} / Var(ΔF)|Ω_{t−1}.  (11.20a)

Suppose that the current information set (Ω_{t−1}) includes a vector of variables (X_{t−1}) and the spot and futures price changes are generated by the following equilibrium model:

ΔS_t = X_{t−1}α + u_t,
ΔF_t = X_{t−1}β + v_t.

In this case the maximum likelihood estimator of the MV hedge ratio is given by (see Myers and Thompson, 1989)

ĥ|X_{t−1} = σ̂_uv / σ̂_v²,  (11.21)

where σ̂_uv is the sample covariance between the residuals u_t and v_t, and σ̂_v² is the sample variance of the residual v_t. In general, the OLS estimator obtained from equation (11.20) would be different from the one given by
equation (11.21). For the two estimators to be the same, the spot and futures prices must be generated by the following model:

ΔS_t = α_0 + u_t,
ΔF_t = β_0 + v_t.

In other words, if the spot and futures prices follow a random walk, with or without drift, then the two estimators will be the same. Otherwise, the hedge ratio estimated from the OLS regression (11.20) will not be optimal. We now show how SAS can be used to estimate the hedge ratio by the OLS method.

SAS Applications: Based upon the monthly S&P 500 index and its futures presented in Appendix 11C, we use the OLS method based on equation (11.20) to estimate the hedge ratio in this section. Using the SAS regression procedure, we obtain the following program code and empirical results.

proc reg data = sp500monthly;
  model C_spot = C_future;
  ods output ParameterEstimates = estimates_ols;
run;
Variable    DF   Estimate   StdErr    tValue   Probt
Intercept   1    0.10961    0.28169   0.39     0.6976
C_future    1    0.98795    0.00535   184.51   <.0001
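For readers without SAS, the same OLS slope, together with the Myers and Thompson (1989) conditional estimator of equation (11.21), can be computed in a few lines of Python. The sketch below uses simulated price changes rather than the Appendix 11C data, so the numbers are illustrative only and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(0.0, 1.0, n)                 # a conditioning variable in X_{t-1}
v = rng.normal(0.0, 1.0, n)
d_f = 0.3 * x + v                           # futures price changes
u = 0.95 * v + rng.normal(0.0, 0.4, n)
d_s = 0.5 * x + u                           # spot price changes

def ols_slope(y, z):
    """Slope a_1 from the regression y = a_0 + a_1 z + e, as in eq. (11.20)."""
    Z = np.column_stack([np.ones_like(z), z])
    return np.linalg.lstsq(Z, y, rcond=None)[0][1]

def residuals(y, z):
    """Residuals from regressing y on a constant and z (the X_{t-1} variables)."""
    Z = np.column_stack([np.ones_like(z), z])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ beta

# Unconditional OLS hedge ratio, eq. (11.20)
h_ols = ols_slope(d_s, d_f)

# Conditional (Myers-Thompson) estimator, eq. (11.21): residual covariance ratio
u_hat = residuals(d_s, x)
v_hat = residuals(d_f, x)
h_cond = np.cov(u_hat, v_hat)[0, 1] / np.var(v_hat, ddof=1)

print(h_ols, h_cond)  # h_cond recovers 0.95; h_ols is pulled away by X_{t-1}
```

Because the simulated changes load on X_{t−1}, the unconditional slope and the residual-based estimator differ, which is the point made in the text; with pure random-walk changes the two would coincide.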
Thus, the absolute indices reveal that the proposed model is appropriate (Hair et al., 1998; McDonald and Ho, 2002). In addition, when referring to incremental and parsimonious fit indices, all indices are above a common threshold value except for AGFI and RFI. As a result, these findings indicate that the proposed model is valid.
16.4.2 Validity and reliability

The measurement model examines the factor loadings with which the observed variables reflect the latent variables, and t-tests for the factor loadings are used to test the convergent validity and assess whether each observed variable (i.e., question item) effectively reflects its latent variable (i.e., factor construct) (Anderson and Gerbing, 1988; Raine-Eudy, 2000). As shown in Table 16.2, each factor loading of the constructs shows highly significant t-statistics, thereby suggesting that all question items provide good measures of their respective constructs in the measurement model. Average variance extracted (AVE) and composite reliability (CR) are employed to assess the construct reliability of the measurement model (Bagozzi and Yi, 1988). AVE evaluates the amount of variance captured by the construct, and CR reflects the internal consistency of the question items measuring each construct. As a result, all AVE values are over the recommended value of 0.5, and all CR values are over the common threshold of 0.7. In addition, Table 16.3 shows that discriminant validity is satisfied. The square root of the AVE for each construct is
Application of Structural Equation Modeling in Behavioral Finance

Table 16.2: Validity and reliability.

Latent/Observed variable                                                              Factor loading   AVE    CR
MA                                                                                                     0.54   0.78
  MA01: integration of two losses                                                     0.87***
  MA02: segregation of two gains                                                      0.76***
  MA04: integration of a big gain and a small loss                                    0.85***
RA                                                                                                     0.60   0.81
  RA01: regret for selling winning stocks too early                                   0.81***
  RA02: regret for holding losing stocks too long                                     0.96***
  RA04: more regret for holding losing stocks                                         0.65***
SC                                                                                                     0.71   0.91
  SC01: a stop-loss order                                                             0.80***
  SC02: a stop-gain order                                                             0.80***
  SC03: a stop loss                                                                   0.90***
  SC04: a stop gain                                                                   0.87***
DE                                                                                                     0.54   0.78
  DE02: experience of holding losing stocks and not realizing losses                  0.79 (–)a
  DE03: more experience of selling winning stocks than that of selling losing stocks  0.76***
  DE04: experience of selling winning stocks and holding losing stocks                0.93***
PF                                                                                                     0.70   0.87
  PF01: realized gains                                                                0.69 (–)a
  PF02: paper gains                                                                   0.68***
  PF03: paper and realized gains                                                      0.96***
Notes: MA: mental accounting; RA: regret avoidance; SC: self-control; DE: disposition effect; PF: investment performance. a As the first endogenous observed variable of each endogenous latent variable is used to standardize the other factor loadings in the same factor, its t-value does not exist. ∗∗∗ : significance at the 1% level.
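The AVE and CR figures in Table 16.2 can be reproduced directly from the standardized factor loadings with the standard formulas (AVE = mean squared loading; CR = (Σλ)² / [(Σλ)² + Σ(1 − λ²)]). A short Python check using the SC construct's loadings from the table:

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances]."""
    s = sum(loadings)
    errors = sum(1 - l * l for l in loadings)
    return s * s / (s * s + errors)

sc = [0.80, 0.80, 0.90, 0.87]   # SC loadings from Table 16.2
print(round(ave(sc), 2), round(composite_reliability(sc), 2))  # 0.71 0.91, as in the table
```

The same two functions applied to the other constructs' loadings reproduce the remaining AVE and CR columns of Table 16.2.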
greater than the squared correlation between it and the other constructs, indicating that each construct is distinct (Hair et al., 1998). As shown in Table 16.2, the MA construct shows that respondents prefer "integration of two losses", "segregation of two gains", and "integration of a big gain and a small loss". These preferences imply that investors implicitly use mental accounting to make decisions and maximize their value function (Kahneman and Tversky, 1979; Thaler, 1985, 1999). The RA construct
H.-H. Chang
Table 16.3: Discriminant validity.

      MA       RA       SC       DE       PF
MA    0.74a
RA    0.41     0.77
SC    0.22     0.38     0.84
DE    0.48     0.37     0.14     0.73
PF    −0.07    −0.05    −0.02    −0.15    0.84

Notes: MA: mental accounting; RA: regret avoidance; SC: self-control; DE: disposition effect; PF: investment performance. a The diagonal values represent the square root of the average variance extracted (AVE) for each construct. Off-diagonal values are the squared correlations among the constructs.
reveals that respondents feel "regret for selling winning stocks too early", "regret for holding losing stocks too long", and "more regret for holding losing stocks". These feelings imply that investors are easily affected by the feeling of regret and tend to avoid regret (Shefrin and Statman, 1985; Fogel and Berry, 2006). The SC construct reports that respondents are more likely to set "a stop-loss order" and "a stop-gain order", and to perform "a stop loss" and "a stop gain". These rules indicate that investors try to mitigate self-control bias through self-imposed rules. The question items used to measure the disposition effect consist of having "experience of holding losing stocks and not realizing losses", "more experience of selling winning stocks than that of selling losing stocks", and "experience of selling winning stocks and holding losing stocks". These experiences imply that investors have the tendency to sell winning stocks too early and retain losing stocks too long. In the performance construct, respondents display their levels of realized gains and paper gains.

16.4.3 Path relationship

Path coefficients are used to assess whether the structural model is substantiated. As shown in Table 16.4 and Figure 16.2, MA and RA each have a significantly positive influence on DE. SC has a negative but insignificant impact on DE. Therefore, the findings of this study could not reject Hypotheses 16.1 and 16.2. MA has a direct influence on DE (path coefficient = 0.41, t-value = 4.71), which indicates that investors consider gains and losses as different mental accounts and realize their winning stocks at a faster rate than their losing stocks. Regret avoidance has a direct influence on DE (path coefficient = 0.19, t-value = 2.83), which implies that regret avoidance generates a
Table 16.4: Path relationship.

Path relationship   Coefficient   t-value   Hypotheses
MA → DE             0.41***       4.71      Acceptance
RA → DE             0.19***       2.83      Acceptance
SC → DE             −0.03         −0.59     Rejection
DE → PF             −0.20***      −3.08     Acceptance

Notes: MA: mental accounting; RA: regret avoidance; SC: self-control; DE: disposition effect; PF: investment performance. ***: significance at the 1% level.
Figure 16.2: The results of the conceptual model.
resistance to holding winning stocks and selling losing stocks. However, the emotional doer within investors is stronger than the rational planner, which results in the difficulty of executing self-imposed rules. The influence of DE on PF is significantly negative (path coefficient = −0.20, t-value = −3.08), which indicates that the disposition effect results in underperformance. This result could not reject Hypothesis 16.4, which implies that investors should try to mitigate the disposition effect.

In sum, the results of this study validate Shefrin and Statman's (1985) inference that mental accounting and regret avoidance result in the disposition effect. In addition, MA is the most significant bias that influences the disposition effect, which implies that prospect theory (an S-shaped value
function) can be an alternative to expected utility theory (an expected utility function) in accounting for investors' behavior.
16.4.4 The moderating result

Table 16.5 shows that all values of RMSEA and χ²/df are less than the threshold values of 0.085 and 5, respectively, and all GFI values are above 0.9, which implies that the data-model fit is appropriate for each subgroup. The path coefficients in the female (married) group are set equal to those in the male (single) group to repeat the estimation procedure. A new χ² statistic for the female (married) group is obtained and compared with the original χ² of the female (married) group. A moderating effect exists if the Δχ² is significant (Johnson, 1999). For the female group, the Δχ² (Δdf = 3) is 14.48 (new χ² = 291.39 and original χ² = 276.91, respectively), larger than the threshold value (Δχ²_{0.05,3} = 7.815). Thus, gender might moderate the link between cognitive/emotional biases and the disposition effect. For the married group, the Δχ² (Δdf = 3) is 18.79 (new χ² = 430.68 and original χ² = 411.89, respectively), larger than the threshold value (Δχ²_{0.05,3} = 7.815). Thus, marital status might moderate the link between cognitive/emotional biases and the disposition effect. This study also provides evidence that both female investors and married ones (path coefficients = 0.51 and 0.44, respectively) are more likely to use
Table 16.5: Summary of moderating effects.

                      Male (N = 336)      Female (N = 425)    Single (N = 353)    Married (N = 408)
RMSEA                 0.019               0.066               0.053               0.083
χ²/df                 108.63/97 = 1.12    276.91/97 = 2.85    194.44/97 = 2.01    411.89/97 = 4.20
GFI                   0.93                0.92                0.90                0.91
MA → DE               0.21* (1.91)        0.51*** (4.82)      0.37*** (2.89)      0.44*** (4.23)
RA → DE               0.34*** (2.96)      0.19** (2.10)       0.27* (1.89)        0.19** (2.13)
SC → DE               −0.06 (−0.75)       −0.03 (−0.36)       −0.07 (−0.73)       −0.05 (−0.71)
Total direct effect   0.49                0.67                0.57                0.58
χ²/df (new)                               291.39/100 = 2.91                       430.68/100 = 4.31
Δχ² (Δdf)                                 14.48 (3)                               18.79 (3)

Notes: MA: mental accounting; RA: regret avoidance; SC: self-control; DE: disposition effect; PF: investment performance. t-values are shown in parentheses. *, **, ***: significance at the 10%, 5%, and 1% levels, respectively.
mental accounting to make their decisions, whereas both male investors and single ones (path coefficients = 0.34 and 0.27, respectively) are more easily affected by the regret emotion. Furthermore, when this study combines the impacts of cognitive/emotional biases on the disposition effect, the total direct effect shows that female investors exhibit a greater disposition effect than male ones (0.67 vs. 0.49), which is consistent with Da Costa et al. (2008), Rau (2014), and Kohsaka et al. (2017). However, the disposition effect may be indifferent between single investors and married ones (0.58 vs. 0.57).

16.5 Conclusion and Strategic Implications

16.5.1 Concluding remarks

Studies on behavioral finance use investor-level data or lab experiments to examine the disposition effect. These studies find that the disposition effect exists in financial markets and employ cognitive/emotional biases to account for the behavioral bias. Unlike previous studies using investor-level data or lab experiments, this study collects quantitative data through a questionnaire survey and employs the SEM approach to examine the relationships among cognitive/emotional biases, the disposition effect, and investment performance. The results of this study show that the proposed model has excellent goodness of fit, convergent validity, construct reliability, and discriminant validity, which implies that the proposed model is valid. The findings of this study provide evidence that mental accounting and regret avoidance each have a significant and positive influence on the disposition effect. In addition, the disposition effect has a significant and negative impact on investment performance. These findings imply that cognitive/emotional biases result in the behavioral bias, which consequently results in underperformance.
Because mental accounting plays the most important role in influencing the disposition effect, prospect theory might be an alternative to expected utility theory in accounting for investors' behavior. However, the results provide no evidence that self-imposed rules work efficiently. The results of the moderating analysis indicate that female investors and married ones are more likely to use mental accounting to make decisions, whereas male investors and single ones are more easily affected by the feeling of regret. When combining the influences of mental accounting and regret avoidance on the disposition effect, this study finds that female investors have a greater tendency than male ones to sell winning stocks too early and hold losing stocks too long.
16.5.2 Strategic implications

16.5.2.1 Investors

The results of this study provide a few strategic implications for investors. Although self-imposed rules can be employed to mitigate the disposition effect, investors are likely to encounter difficulties in executing these rules because the emotional doer is often stronger than the rational planner. Therefore, investors, especially female ones, can select products embedded with contractual obligations, such as contracts with options, to help them limit the doer's discretion. In addition, investors, especially male ones, are advised to invest in products consisting of a variety of stocks, because a product with diversified characteristics can reduce return volatility and alleviate investors' feeling of regret (Statman, 1995). For example, exchange-traded funds (ETFs) may be better choices for investors to reduce their feeling of regret (Statman, 1995; Redhead, 2010). Based on the results of this study, investors are easily affected by mental accounting, which implies that investors should judge things from different perspectives when making decisions.

16.5.2.2 Financial institutions

Financial institutions should know their customers' context (i.e., know your customer's context; KYCC) and provide appropriate products for their customers. To respond to investors' need to reduce cognitive/emotional biases, financial institutions can provide products that help investors exercise self-control and reduce the regret emotion. In addition, financial institutions should provide potential customers with multi-function products, because investors frame different functions as different mental accounts (gains) and prefer the segregation of multiple gains, which makes them happiest (Thaler, 1985, 1999). This may be the reason why insurance companies often provide a life insurance policy with additional functions, such as a saving function, and banks are more likely to issue a combo card rather than a single-function card.
The results of this study provide evidence that investors, especially female ones, may prefer multi-function products, which implies that financial institutions should provide their clients with a package of products and services. Investors' regret emotion is influenced by the amount of their responsibility (Statman, 1995), which indicates that investors bear less responsibility when investing in mutual funds than when self-investing (Chang et al., 2016). Therefore, financial institutions can design financial products that reduce investors' responsibility in order to mitigate their regret emotion.
16.5.2.3 The regulatory authorities

The results of this study have important implications for the authorities as well. For example, securities exchanges can launch sustainability indices, such as a corporate social responsibility (CSR) index, for investors and financial institutions to follow. In particular, financial institutions can use these indices to create ETF products. As mentioned above, ETFs can help investors mitigate their regret emotion because of lower fluctuation risk. The regulatory authorities should encourage financial institutions to design products that help investors mitigate their self-control bias. Moreover, the increase in self-investing highlights the role of the regulatory authorities in making investors aware of their trading biases.

This study successfully detected the cognitive/emotional biases influencing the disposition effect. Because the questionnaire survey was launched in March 2009, most investors had incurred considerable losses during the 2008 financial crisis. Future extensions of this study could investigate investors' behavior in a bull market to identify the relationship among cognitive/emotional biases, the disposition effect, and investment performance.
Bibliography

Ammann, M., Ising, A. and Kessler, S. (2012). Disposition Effect and Mutual Fund Performance. Applied Financial Economics, 22, 1–19.
Anderson, J. C. and Gerbing, D. W. (1988). Structural Equation Modeling in Practice: A Review and Recommended Two-Step Approach. Psychological Bulletin, 103(2), 411–423.
Bagozzi, R. P. and Yi, Y. (1988). On the Evaluation of Structural Equation Models. Journal of the Academy of Marketing Science, 16(1), 76–94.
Barber, B. M. and Odean, T. (1999). The Courage of Misguided Convictions. Financial Analysts Journal, 55(6), 41–55.
Barber, B. M. and Odean, T. (2001). Boys Will Be Boys: Gender, Overconfidence, and Common Stock Investment. Quarterly Journal of Economics, 116(1), 261–289.
Barber, B. M., Lee, Y. T., Liu, Y. J. and Odean, T. (2007). Is the Aggregate Investor Reluctant to Realize Losses? Evidence from Taiwan. European Financial Management, 13(3), 423–447.
Barberis, N. and Huang, M. (2001). Mental Accounting, Loss Aversion, and Individual Stock Returns. Journal of Finance, 56(4), 1247–1292.
Byrne, B. M. (1998). Structural Equation Modeling with LISREL, PRELIS, and SIMPLIS: Basic Concepts, Applications, and Programming. Lawrence Erlbaum Associates: Mahwah, NJ.
Chang, C. J., Torkzadeh, G. and Dhillon, G. (2004). Re-examining the Measurement Models of Success for Internet Commerce. Information & Management, 41(5), 577–584.
Chang, T. Y., Solomon, D. H. and Westerfield, M. M. (2016). Looking for Someone to Blame: Delegation, Cognitive Dissonance, and the Disposition Effect. The Journal of Finance, 71(1), 267–302.
Chen, C. C., Lin, M. M. and Chen, C. M. (2012). Exploring the Mechanisms of the Relationship Between Website Characteristics and Organizational Attraction. The International Journal of Human Resource Management, 23(4), 867–885.
Choe, H. and Eom, Y. (2009). The Disposition Effect and Investment Performance in the Futures Market. The Journal of Futures Markets, 29(6), 496–522.
Chong, F. (2009). Disposition Effect and Flippers in the Bursa Malaysia. The Journal of Behavioral Finance, 10, 152–157.
Cici, G. (2012). The Prevalence of the Disposition Effect in Mutual Funds' Trades. Journal of Financial and Quantitative Analysis, 47(4), 795–820.
Connors, J. J. and Elliot, J. (1994). Teacher Perceptions of Agriscience and Natural Resources Curriculum. Journal of Agricultural Education, 35(4), 15–19.
Croonenbroeck, C. and Matkovskyy, R. (2014). Demand for Investment Advice Over Time: The Disposition Effect Revisited. Applied Financial Economics, 24(4), 235–240.
Czarnitzki, D. and Stadtmann, G. (2005). The Disposition Effect: Empirical Evidence on Purchases of Investor Magazines. Applied Financial Economics Letters, 1, 47–51.
Da Costa Jr., N., Mineto, C. and Da Silva, S. (2008). Disposition Effect and Gender. Applied Economics Letters, 15, 411–416.
Da Costa Jr., N., Goulart, M., Cupertino, C., Macedo Jr., J. and Da Silva, S. (2013). The Disposition Effect and Investor Experience. Journal of Banking & Finance, 37, 1669–1675.
Dacey, R. and Zielonka, P. (2008). A Detailed Prospect Theory Explanation of the Disposition Effect. The Journal of Behavioral Finance, 9(1), 43–50.
Dhar, R. and Zhu, N. (2006). Up Close and Personal: An Individual Level Analysis of the Disposition Effect. Management Science, 52(5), 726–740.
Fernando, A. R. and Albena, P. (2015). An Empowerment Model of Youth Financial Behavior. Journal of Consumer Affairs, 49(3), 550–575.
Fischbacher, U., Hoffmann, G. and Schudy, S. (2014). The Causal Effect of Stop-Loss and Take-Gain Orders on the Disposition Effect. Research Paper Series, No. 89, Thurgau Institute of Economics and Department of Economics at the University of Konstanz, 1–34.
Fogel, S. O. and Berry, T. (2006). The Disposition Effect and Individual Investor Decisions: The Roles of Regret and Counterfactual Alternatives. The Journal of Behavioral Finance, 7(2), 107–116.
Frino, A., Lepone, G. and Wright, D. (2015). Investor Characteristics and the Disposition Effect. Pacific-Basin Finance Journal, 31, 1–12.
Garvey, R. and Murphy, A. (2004). Are Professional Traders Too Slow to Realize Their Losses? Financial Analysts Journal, 60(4), 35–43.
Hair, J. F. Jr., Anderson, R. E., Tatham, R. L. and Black, W. C. (1998). Multivariate Data Analysis, 5th Edition. Prentice-Hall: Englewood Cliffs, NJ.
Huang, W. H. and Zeelenberg, M. (2012). Investor Regret: The Role of Expectation in Comparing What Is to What Might Have Been. Judgment and Decision Making, 7(4), 441–451.
Hung, M. W. and Yu, H. Y. (2006). A Heterogeneous Model of Disposition Effect. Applied Economics, 38(18), 2147–2157.
Johnson, J. L. (1999). Strategic Integration in Industrial Distribution Channels: Managing the Inter-Firm Relationship as a Strategic Asset. Journal of the Academy of Marketing Science, 27(1), 4–18.
page 622
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
Application of Structural Equation Modeling in Behavioral Finance
b3568-v1-ch16
623
Kahneman, D. and Tversky, A. (1979). Prospect Theory: An Analysis of Decision Under Risk. Econometrica, 47(2), 263–292.
Kaustia, M. (2004). Market-Wide Impact of the Disposition Effect: Evidence from IPO Trading Volume. Journal of Financial Markets, 7(2), 207–235.
Kleijnen, M., Wetzels, M. and Ruyter, K. D. (2004). Consumer Acceptance of Wireless Finance. Journal of Financial Services Marketing, 8(3), 206–217.
Kohsaka, Y., Mardyla, G., Takenaka, S. and Tsutsui, Y. (2017). Disposition Effect and Diminishing Sensitivity: An Analysis Based on a Simulated Experimental Stock Market. Journal of Behavioral Finance, 18(2), 189–201.
Lai, H. W., Chen, C. W. and Huang, C. S. (2010). Technical Analysis, Investment Psychology, and Liquidity Provision: Evidence from the Taiwan Stock Market. Emerging Markets Finance & Trade, 46(5), 18–38.
Leal, C. C., Loureiro, G. and Armada, M. J. R. (2018). Selling Winners, Buying Losers: Mental Decision Rules of Individual Investors on Their Holdings. European Financial Management, 24, 362–386.
Lin, C. H., Huang, W. H. and Zeelenberg, M. (2006). Multiple Reference Points in Investor Regret. Journal of Economic Psychology, 27, 781–792.
Locke, P. R. and Mann, S. C. (2005). Professional Trader Discipline and Trade Disposition. Journal of Financial Economics, 76, 401–444.
Lu, Y., Ray, S. and Teo, M. (2016). Limited Attention, Marital Events and Hedge Funds. Journal of Financial Economics, 122, 607–624.
MacCallum, R. C. and Austin, J. T. (2000). Applications of Structural Equation Modeling in Psychological Research. Annual Review of Psychology, 51, 201–226.
McDonald, R. P. and Ho, M. R. (2002). Principles and Practice in Reporting Structural Equation Analyses. Psychological Methods, 7, 64–82.
Muhl, S. and Talpsepp, T. (2018). Faster Learning in Troubled Times: How Market Conditions Affect the Disposition Effect. The Quarterly Review of Economics and Finance, 68, 226–236.
Odean, T. (1998). Are Investors Reluctant to Realize Their Losses? Journal of Finance, 53(5), 1775–1798.
Olorunniwo, F., Hsu, M. K. and Udo, G. F. (2006). Service Quality, Customer Satisfaction, and Behavioral Intentions in the Service Factory. Journal of Services Marketing, 20(1), 59–72.
Ploner, M. (2017). Hold On to It? An Experimental Analysis of the Disposition Effect. Judgment and Decision Making, 11(2), 117–128.
Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y. and Podsakoff, N. P. (2003). Common Method Biases in Behavioral Research: A Critical Review of the Literature and Recommended Remedies. Journal of Applied Psychology, 88(5), 879–903.
Pompian, M. M. (2006). Behavioral Finance & Wealth Management. John Wiley & Sons.
Rabin, M. and Thaler, R. H. (2001). Anomalies: Risk Aversion. Journal of Economic Perspectives, 15(1), 219–232.
Raine-Eudy, R. (2000). Using Structural Equation Modeling to Test for Differential Reliability and Validity: An Empirical Demonstration. Structural Equation Modeling, 7(1), 124–141.
Rau, H. A. (2014). The Disposition Effect and Loss Aversion: Do Gender Differences Matter? Economics Letters, 123, 33–36.
Redhead, K. (2010). Behavioral Perspectives on Index Funds. Journal of Financial Service Professionals, 64(4), 54–61.
Richards, D. W., Rutterford, J., Kodwani, D. G. and Fenton-O'Creevy, M. (2017). Stock Market Investors' Use of Stop Losses and the Disposition Effect. The European Journal of Finance, 23(2), 130–152.
H.-H. Chang
Richards, D. W., Fenton-O'Creevy, M., Rutterford, J. and Kodwani, D. G. (2018). Is the Disposition Effect Related to Investors' Reliance on System 1 and System 2 Processes or Their Strategy of Emotion Regulation? Journal of Economic Psychology, 66, 79–92.
Shefrin, H. and Statman, M. (1985). The Disposition to Sell Winners Too Early and Ride Losers Too Long: Theory and Evidence. Journal of Finance, 40(3), 777–790.
Shiller, R. J. (1999). Human Behavior and the Efficiency of the Financial System. In Taylor, J. and Woodford, M. (Eds.), The Handbook of Macroeconomics. North-Holland, Amsterdam, 1305–1340.
Soureli, M., Lewis, B. R. and Karantinou, M. K. (2008). Factors that Affect Consumers' Cross-Buying Intention: A Model for Financial Services. Journal of Financial Services Marketing, 13(1), 5–16.
Statman, M. (1995). A Behavioral Framework for Dollar-Cost Averaging. The Journal of Portfolio Management, 22(1), 70–78.
Summers, B. and Duxbury, D. (2012). Decision-Dependent Emotions and Behavioral Anomalies. Organizational Behavior and Human Decision Processes, 118(2), 226–238.
Thaler, R. H. (1985). Mental Accounting and Consumer Choice. Marketing Science, 4(3), 199–214.
Thaler, R. H. (1999). Mental Accounting Matters. Journal of Behavioral Decision Making, 12, 183–206.
Thaler, R. H. and Shefrin, H. M. (1981). An Economic Theory of Self-Control. Journal of Political Economy, 89(2), 392–405.
Thaler, R. H. and Shefrin, H. M. (1988). The Behavioral Life-Cycle Hypothesis. Economic Inquiry, 26(4), 609–643.
Thaler, R. H. and Benartzi, S. (2004). Save More Tomorrow: Using Behavioral Economics to Increase Employee Saving. Journal of Political Economy, 112(1), S164–S187.
Theron, E., Terblanche, N. S. and Boshoff, C. (2008). The Antecedents of Relationship Commitment in the Management of Relationships in Business-to-Business (B2B) Financial Services. Journal of Marketing Management, 24(9–10), 997–1010.
Tu, T. T., Chang, H. H. and Chiu, Y. H. (2011). Investigation of the Factors Influencing the Acceptance of Electronic Cash Stored-Value Cards. African Journal of Business Management, 5(1), 108–120.
Weber, M. and Camerer, C. (1998). The Disposition Effect in Securities Trading: An Experimental Analysis. Journal of Economic Behavior and Organization, 33, 167–184.
Wong, A. S., Carducci, B. J. and White, A. J. (2006). Asset Disposition Effect: The Impact of Price Patterns and Selected Personal Characteristics. Journal of Asset Management, 7, 291–300.
Wong, E. and Kwong, J. (2007). The Role of Anticipated Regret in Escalation of Commitment. Journal of Applied Psychology, 92(2), 545–554.
Wulfmeyer, S. (2016). Irrational Mutual Fund Managers: Explaining Differences in Their Behavior. Journal of Behavioral Finance, 17(2), 99–123.
Appendix 16A: Structural Equation Model

A full SEM consists of two measurement models and a structural model. The measurement models are specified by equations (16A.1) and (16A.2), respectively (Byrne, 1998):

x = Λx ξ + δ,    (16A.1)
where x is a q × 1 vector of exogenous observed variables, and q is the number of x variables associated with ξ. ξ denotes an n × 1 vector of exogenous latent variables, and n denotes the number of ξ. Λx is a q × n matrix of regression coefficients of x on ξ, and δ is a q × 1 vector of measurement errors in x.

y = Λy η + ε,    (16A.2)
where y is a p × 1 vector of endogenous observed variables, and p is the number of y variables associated with η. η denotes an m × 1 vector of endogenous latent variables, and m denotes the number of η. Λy is a p × m matrix of regression coefficients of y on η, and ε is a p × 1 vector of measurement errors in y.

The structural model is used to examine the path relationships between latent variables and is specified by equation (16A.3):

η = Bη + Γξ + ζ,    (16A.3)
where B is an m × m matrix of coefficients of the η-variables in the structural relationship, with zeros on the diagonal, and I − B must be non-singular. Γ denotes an m × n matrix of coefficients of the ξ-variables, and ζ denotes an m × 1 vector of equation errors in the structural relationship between η and ξ. In addition, the random components in the structural equation model are presumed to satisfy the following minimal assumptions: δ is uncorrelated with ξ, ε is uncorrelated with η, ζ is uncorrelated with ξ, and ζ, ε, and δ are mutually uncorrelated. The covariance matrices are cov(ξ) = Φ (n × n), cov(ζ) = Ψ (m × m), cov(δ) = Θδ (q × q), and cov(ε) = Θε (p × p). The covariance matrix of the observations implied by the SEM is represented as follows:
Σ = Cov(y, x)
  = | Λy A(ΓΦΓ′ + Ψ)A′Λy′ + Θε    Λy AΓΦΛx′   |
    | Λx ΦΓ′A′Λy′                 Λx ΦΛx′ + Θδ |,    (16A.4)
where A = (I − B)⁻¹.

In this study, the measurement model consists of 21 observed variables: 14 exogenous observed variables (x) and seven endogenous observed variables (y). The structural model consists of five latent variables: three exogenous latent variables (i.e., ξ = MA, RA, and SC) and two endogenous latent variables (i.e., η = DE and PF).
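To make equation (16A.4) concrete, the following sketch computes the model-implied covariance matrix for a toy model with one exogenous latent variable, one endogenous latent variable, and two indicators each; all parameter values are illustrative assumptions, not estimates from the chapter's 21-indicator model.

```python
# A numerical sketch of equation (16A.4): compute the model-implied
# covariance matrix of (y, x) for a toy model. All parameter values are
# illustrative assumptions, not the chapter's estimates.
import numpy as np

Lx = np.array([[1.0], [0.8]])   # Lambda_x: loadings of x on xi (q x n)
Ly = np.array([[1.0], [0.9]])   # Lambda_y: loadings of y on eta (p x m)
B = np.array([[0.0]])           # B: eta-on-eta coefficients (m x m)
G = np.array([[0.5]])           # Gamma: eta-on-xi coefficients (m x n)
Phi = np.array([[1.0]])         # cov(xi)
Psi = np.array([[0.3]])         # cov(zeta)
Te = 0.2 * np.eye(2)            # Theta_epsilon: cov(epsilon)
Td = 0.2 * np.eye(2)            # Theta_delta: cov(delta)

A = np.linalg.inv(np.eye(B.shape[0]) - B)   # A = (I - B)^(-1)

# The four blocks of equation (16A.4)
Syy = Ly @ A @ (G @ Phi @ G.T + Psi) @ A.T @ Ly.T + Te
Syx = Ly @ A @ G @ Phi @ Lx.T
Sxx = Lx @ Phi @ Lx.T + Td
Sigma = np.block([[Syy, Syx], [Syx.T, Sxx]])
print(Sigma)  # 4 x 4 symmetric implied covariance matrix
```

SEM estimation then chooses the parameter matrices so that Sigma is as close as possible to the sample covariance matrix of the observed variables.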
Appendix 16B: Questionnaire

MA01 If I have two stocks, one with a NT$15 loss and the other with a NT$5 loss, I feel that I have a total loss of NT$20.
MA02 If I have two stocks, each with a NT$10 gain, I feel that it is better than one stock with a NT$20 gain.
MA03 If I have two stocks, one with a NT$3 gain and the other with a NT$10 loss, I feel lucky that I still have a NT$3 gain.
MA04 If I have two stocks, one with a NT$10 gain and the other with a NT$4 loss, I feel I have a total NT$6 gain only (preferring the integration of a big gain and a small loss).
RA01 I regret selling winning stocks too early.
RA02 I regret holding losing stocks too long.
RA03 If I invest in stock A and obtain a great return but find that stock B has a better return than stock A, I feel the stock investment was the wrong decision.
RA04 I feel regret for holding losing stocks much more than for selling winning stocks.
SC01 I think a stock investment has to be set with a stop-loss order.
SC02 I think a stock investment has to be set with a stop-gain order.
SC03 I think I can execute a stop loss at a proper price.
SC04 I think I can execute a stop gain at a proper price.
SC05 I think I can perform a stop-loss strategy.
SC06 I think I can perform a stop-gain strategy.
DE01 When the prices of winning stocks have increased for a period of time, I have the experience of selling winning stocks and realizing gains.
DE02 When the prices of losing stocks have decreased for a period of time, I have the experience of holding losing stocks and not realizing losses.
DE03 I have more experience of selling winning stocks than of selling losing stocks.
DE04 I have the experience of selling winning stocks and holding losing stocks.
PF01 My unrealized investment performance in the past six months is: A. below −20%; B. −20% to −10%; C. −10% to 0%; D. even; E. 0% to 10%; F. 10% to 20%; G. above 20%.
PF02 My realized investment performance in the past six months is: A. below −20%; B. −20% to −10%; C. −10% to 0%; D. even; E. 0% to 10%; F. 10% to 20%; G. above 20%.
PF03 My unrealized and realized investment performance in the past six months is: A. below −20%; B. −20% to −10%; C. −10% to 0%; D. even; E. 0% to 10%; F. 10% to 20%; G. above 20%.
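Before the PF items can serve as observed variables in the SEM, their lettered categories must be coded numerically. A minimal sketch, assuming a simple 1–7 ordinal coding (the chapter does not state its coding scheme here, so both the mapping and the function name are hypothetical):

```python
# Hypothetical ordinal coding of the A-G performance categories in
# PF01-PF03. The 1-7 scale is an assumption for illustration only.
PF_SCALE = {
    "A": 1,  # below -20%
    "B": 2,  # -20% to -10%
    "C": 3,  # -10% to 0%
    "D": 4,  # even
    "E": 5,  # 0% to 10%
    "F": 6,  # 10% to 20%
    "G": 7,  # above 20%
}

def code_pf(responses):
    """Map a sequence of lettered answers to ordinal scores."""
    return [PF_SCALE[r] for r in responses]

print(code_pf(["A", "D", "G"]))  # -> [1, 4, 7]
```

An ordinal coding like this preserves the ranking of the performance brackets while treating the distance between adjacent categories as equal, which is a common simplification when such items feed a covariance-based SEM.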
Chapter 17
External Financing Needs and Early Adoption of Accounting Standards: Evidence from the Banking Industry∗

Sophia I-Ling Wang

Contents

17.1 Introduction . . . . . . . . . . . . 628
17.2 Related Early Adoption Literature . . . . . . . . . . . . 634
17.3 Hypotheses . . . . . . . . . . . . 635
17.4 Research Design . . . . . . . . . . . . 640
     17.4.1 The sample . . . . . . . . . . . . 640
     17.4.2 Early-adoption decision models and predictions: Hypotheses 1–3 . . . . . . . . . . . . 641
     17.4.3 Comparisons between early and late adopters in financing activities: Hypothesis 4 . . . . . . . . . . . . 646
Sophia I-Ling Wang
California State University
e-mail: [email protected]

∗ This paper is based upon my dissertation, "Early Adoption of Accounting Standards in the Banking Industry," completed at the University of Illinois at Urbana-Champaign. I am indebted to Theodore Sougiannis (chair) and the members of my dissertation committee, George Deltas, James Gong, Ganapathi Narayanamoorthy, and Raghunathan Venugopalan, for their much appreciated and helpful guidance and feedback. I also thank Dr. Cheng Few Lee, Rashad Abdel-khalik, Fatima Alali, Clara Chen, Wei Jiang, Richard Lu, Mark Peecher, Padmakumar Sivadasan, Myungsoo Son, Flora Zhou, Hui Zhou, and seminar participants at the University of Illinois at Urbana-Champaign and California State University, Fullerton for their useful comments. Special thanks to Andrew Cheng for his assistance in collecting the data set.
17.5 Results . . . . . . . . . . . . 647
     17.5.1 Descriptive statistics . . . . . . . . . . . . 647
     17.5.2 Findings related to likelihood of early adopting accounting standards . . . . . . . . . . . . 650
     17.5.3 Comparisons of the financing activities between early and late adopters . . . . . . . . . . . . 656
     17.5.4 Additional analyses . . . . . . . . . . . . 661
17.6 Conclusion . . . . . . . . . . . . 663
Bibliography . . . . . . . . . . . . 664
Appendix 17A: Accounting Standards from January 1995 to March 2008 . . . . . . . . . . . . 669
Appendix 17B: Logit Model for Early Adoption Decisions . . . . . . . . . . . . 671
Appendix 17C: Summary of Research Design . . . . . . . . . . . . 673
Abstract

Economic intuition and theory suggest that banks are motivated to voluntarily disclose information and signal their quality, for example through early adoption of accounting standards, in order to better access capital markets. Examining accounting standards issued from January 1995 to March 2008, I find that US bank holding companies (BHCs) with lower profitability and higher risk profiles are more likely to choose early adoption. This evidence is consistent with a BHC's incentive to better access external financing through information disclosure and signaling. Moreover, this study is the first to identify a counter-signaling effect in decisions not to early adopt, because early-adopting BHCs are not necessarily the least risky and the most profitable. I also find the counter-signaling effect to be most evident when an accounting standard has no effect on the financial statements proper (i.e., it imposes only disclosure requirements). This finding complements prior research showing that managers treat recognition and disclosure differently and that financial statement users weigh recognized values more heavily than disclosed values. Finally, the results show that early adopters generally experience higher fund growth in uninsured debts than matched late adopters during economic expansions, times when BHCs are most motivated to obtain funds. This finding is consistent with the bank capital structure literature showing that banks have shifted towards nondeposit debts to finance their balance sheet growth.

Keywords: Early adoption • Banks • Disclosure and counter-signaling • External financing.
17.1 Introduction

Early adoption of accounting standards (hereafter, early adoption) serves as a form of voluntary disclosure, and prior research documents that some firms select early adoption for this purpose. Prior studies also document that managers choose early adoption to maximize accounting-based compensation bonuses, avoid violations of debt covenants, minimize political
costs, and increase bank regulatory capital (reviewed in Section 17.2). This study extends the early-adoption literature and examines another motivation: whether US bank holding companies (hereafter, BHCs or banks) disclose information and signal their quality through early adoption to better access external financing.1 In addition, it examines whether early-adopting BHCs experience better external fund growth subsequent to early adoption.

The association between a firm's voluntary disclosure and its motivation for better access to capital markets is economically intuitive and has been theoretically established (e.g., Leland and Pyle, 1977; Myers and Majluf, 1984; Suijs, 2007). The extant information disclosure literature shows that a firm in need of external financing can better access capital markets through credible disclosure, which reduces the adverse selection costs caused by information asymmetry.2 As a form of voluntary disclosure, early adoption enables a firm to credibly disclose, in external financial reporting, information associated with new accounting pronouncements sooner than required. Hence, firms may elect early adoption of accounting standards to better access capital markets.

The rationale for focusing on the early-adoption behavior of the banking industry is threefold. First, enhancing the information transparency of the banking industry has been emphasized by banking regulators to facilitate market discipline and valuation of the financial sector (e.g., Bushman and Williams, 2012; Estrella, 2004; Pasiouras et al., 2006). Extant research, however, has been limited in examining the determinants and capital market consequences of bank information disclosure.3 Recent studies on bank capital structure have shown that banks considerably resemble their nonbanking counterparts in their financing decisions.4 Gropp and Heider (2010) show that banks have shifted away from deposits towards nondeposit liabilities to finance their balance sheet growth.
As banks are inherently opaque institutions and are constantly in need

1. A bank holding company is defined as "a company that owns and/or controls one or more US banks or one that owns, or has controlling interest in, one or more banks. A bank holding company may also own another bank holding company, which in turn owns or controls a bank." (Source: http://www.ffiec.gov/)
2. Although the existing disclosure research does not treat banks as its subject matter, the insights from the literature still apply to banks. The fact that banks are scrutinized by regulators does not preclude banks from better accessing external financing with voluntary disclosure techniques. More discussion is provided in Section 17.3.
3. This limited investigation is partly attributable to the conventional wisdom that bank capital regulation overarches in banks' financing decisions and that banks function as information-insensitive debt creators (e.g., Admati et al., 2011; Gorton, 2010).
4. See, for example, Berlin (2009), Flannery and Rangan (2008), and Gropp and Heider (2010).
of external funds by nature, these findings prompt an immediate question on the role of bank information disclosure in facilitating banks' external financing activities.

Second, a majority of the accounting pronouncements allowing for early adoption during the past decade were related to financial instruments. Because financial instruments are used extensively in banking activities, early adoption of these pronouncements provides a natural setting to study bank information disclosure and its immediate capital market consequences.5 Third, early adoption by the banking industry is intriguing given the longstanding debate on the increasing use of fair value measures in accounting standards.6 Despite bank opposition towards the increasing use of fair value measures in accounting standards, some banks still choose to early adopt accounting standards associated with fair value measures (Beatty, 1995). Hence, the types of banks that benefit from early adoption merit further investigation.

Tapping capital markets for funds is essential to banks, and information disclosure has been theoretically associated with a firm's motivation for better access to capital markets. Existing theoretical and empirical voluntary disclosure research suggests firm profitability as a proxy for a firm's motivation for better access to capital markets and posits it to be negatively associated with disclosure decisions driven by such a motivation (e.g., Myers and Majluf, 1984; Frank and Goyal, 2008; Francis et al., 2005; Suijs, 2007). These studies argue that high profit firms have few concerns about acquiring funds in capital markets because they are financially healthy. They can either generate funds through internal growth or raise funds externally at a low cost of capital without additional disclosure. On the other hand, low profit firms have limited ability to generate funds internally and thus are relatively reliant on external financing.
High profit firms, therefore, are not as motivated as low
5. I discuss banks' early-adoption decisions based on the types of accounting standards (those related to financial instruments and those unrelated to financial instruments) in Section 17.5.4.
6. A majority of the banking sector has expressed concerns about the increasing application of the fair value model to accounting items, particularly financial instruments, over the past two decades. For example, the American Bankers Association (ABA) and bank regulators raise issues about the reliability, verifiability, and auditability of fair value estimates. They are also concerned with the unintended consequences of the use of fair value accounting for financial instruments in bank risk management practices. They believe a mismatch exists between the fair value model and banks' relationship-based business model. See the ABA letter to the Securities and Exchange Commission (SEC) dated November 13, 2008.
profit firms to choose disclosure to gain better access to external financing. Such a prediction is also consistent with the bank capital structure literature, which finds that bank profitability is negatively associated with bank leverage (e.g., Berlin, 2009; Gropp and Heider, 2010). On a similar note, counter-signaling theory suggests that high quality firms choose not to signal, or to counter-signal, their quality through additional noisy information in order to differentiate themselves from medium quality firms (e.g., Feltovich et al., 2002). Medium quality firms, however, choose to signal their quality through additional noisy information to separate themselves from low quality firms. Since early-adoption decisions may not be the only information signal available to banks, high quality banks may choose to counter-signal (i.e., not to early adopt) to separate themselves from medium quality banks for better access to external financing. Taken together, the above discussion suggests that the likelihood of early adoption, a form of voluntary disclosure, is negatively associated with bank profitability.

Besides profitability, voluntary disclosure studies suggest firm risk as another proxy for a firm's incentive to better access external financing and posit it to be positively associated with disclosure decisions. Lang and Lundholm (1993) argue that if firm performance variability proxies for information asymmetry between investors and managers, riskier firms are more likely to choose disclosure to reduce the adverse selection costs arising from information asymmetry. Consequently, risky firms can better access external financing at a lower cost of capital with voluntary disclosure than without it. Suijs (2007) theoretically suggests that the likelihood of disclosure increases with the uncertainty in firms' future performance (i.e., firm risk profiles). The above discussion suggests that the likelihood of early adoption is positively associated with bank risk profiles.
I further examine whether banks' motivation for better access to external financing (proxied by bank profitability and risk profiles) through information disclosure and signaling/counter-signaling is conditional on the way information is revealed (i.e., recognition versus disclosure). Prior studies suggest that managers and financial statement users treat recognition differently from disclosure due to contract effects and enhanced audits (e.g., Amir and Ziv, 1997b; Mitra and Hossain, 2009; Choudhary, 2011; Cheng and Smith, 2013). In particular, accounting standards with recognition rules result in income effects and increase managers' incentives to opportunistically report values used in contracts, whereas rigorous audits of recognized values reduce such incentives. Since bank profitability measures could be affected by managerial opportunism through the income effects of recognition
rules but not disclosure rules, bank profitability is expected to better represent a bank's need for access to external financing in the case of disclosure rules. Hence, I expect a bank's motivation for better access to external financing through early adoption to be most evident for accounting standards with only disclosure requirements (i.e., both the negative (positive) relationship between the likelihood of early adoption and bank profitability (risk profiles) should hold).

I use logit regressions to test the association of bank profitability and bank risk profiles with early-adoption decisions by examining the early-adoption decisions of 462 US BHCs for 16 accounting standards issued between January 1995 and March 2008. Consistent with a bank's motive to better access external financing through disclosure, I find a negative (positive) relationship between bank profitability (bank risk profiles) and the likelihood of early adoption. The findings are also consistent with the idea that high quality banks choose not to early adopt accounting standards in order to counter-signal their quality to external financing providers. I further find that the counter-signaling effect is best supported in the context of early-adoption decisions on accounting standards with only disclosure requirements. These results complement prior research showing that managers treat recognition and disclosure differently and that financial statement users weigh recognized values more heavily than disclosed values. An additional analysis shows that the motive to better access capital markets is particularly evident for early adoption of accounting standards related to financial instruments, which are deemed more critical to the banking industry.
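A logit design of this kind can be sketched as follows on synthetic data. The proxy names (ROA for profitability, an earnings-volatility measure for risk) and all coefficient values are illustrative assumptions rather than the chapter's estimated specification; the fit uses a plain Newton-Raphson routine.

```python
# Sketch of a logit model of early-adoption decisions on synthetic data.
# Proxies and coefficients are assumptions, not the chapter's model.
import numpy as np

rng = np.random.default_rng(0)
n = 462                                  # sample size, as in the chapter
roa = rng.normal(0.01, 0.005, n)         # profitability proxy (assumed)
risk = rng.normal(0.02, 0.01, n)         # risk-profile proxy (assumed)
X = np.column_stack([np.ones(n), roa, risk])

# Assumed data-generating process: lower profitability and higher risk
# raise the probability of early adoption.
true_b = np.array([-1.5, -200.0, 80.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_b))))

def fit_logit(X, y, iters=25):
    """Fit a binary logit by Newton-Raphson (iteratively reweighted LS)."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ b, -30, 30)))
        grad = X.T @ (y - p)                       # score vector
        hess = (X * (p * (1 - p))[:, None]).T @ X  # information matrix
        b = b + np.linalg.solve(hess, grad)
    return b

b_hat = fit_logit(X, y)
print(b_hat)  # slope on ROA should be negative, slope on risk positive
```

Under the hypotheses above, the estimated slope on the profitability proxy should be negative and the slope on the risk proxy positive, which is the sign pattern the chapter reports for the actual BHC sample.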
To provide evidence on the capital market consequences following early adoption, I examine whether early bank adopters generally experience higher growth of funds than their matched late bank adopters between a standard's issuance date and its effective date (i.e., the test period), controlling for bank profitability and risk profiles. The results show that early adopters generally experience higher fund growth than matched late adopters in test periods that take place during economic expansions, times in which banks are most motivated to access external financing. The differences in the growth of uninsured debts between early adopters and matched late adopters appear to account for the findings, consistent with the bank capital structure literature (e.g., Bolton and Freixas, 2006; Ashcraft, 2008; Gropp and Heider, 2010). Collectively, the results are consistent with the idea that banks' motive to better access capital markets plays a role in their early-adoption decisions.

This research extends and complements the bank literature and prior early-adoption studies in several ways. First, this research introduces
incentives to better access external financing as a new reason to early adopt and documents the role of bank disclosure in bank financing activities. In contrast to the motivation of increasing bank regulatory capital through an accrual effect (i.e., unrealized holding gains and losses) (Beatty, 1995), I investigate whether banks early adopt accounting standards to disclose information and signal their quality to better access external financing in the form of either debt or equity capital. This study shows that early adoption of accounting standards is rare for banks, with the rate of early adoption ranging from zero to 20%. Despite this rarity, ex post evidence in this study on early adopters' immediate improvement in accessing capital markets reinforces the idea that banks' incentive to better access external financing is important in explaining their early-adoption decisions.

Second, this study investigates several accounting standards that allow for early adoption, whereas Beatty (1995) and prior studies examine only one standard at a time. To the best of my knowledge, this is the first research to utilize a large set of early-adoption data on BHCs and systematically examine BHCs' early-adoption decisions. The multiple-standard setting enables me to provide evidence of variations in bank early-adoption decisions and subsequent bank financing activities, given different income effects of accounting standards, standard-specific characteristics, and macroeconomic conditions.7

Finally, the evidence that bank profitability is negatively associated and bank risk profiles are positively associated with early-adoption decisions suggests a counter-signaling effect (Feltovich et al., 2002) of early-adoption decisions. Prior accounting research has associated signaling theory (Spence, 1973) with managers' accounting method choices but yields mixed results (e.g., Eakin and Gramlich, 2000; Aboody et al., 2004).
One potential explanation for the mixed results lies in how well an accounting method choice signals a firm's quality, as suggested by counter-signaling theory. This study is the first to identify counter-signaling effects of banks' early-adoption decisions because it documents that early-adopting banks are not necessarily the least risky and the most profitable.

Section 17.2 reviews the related early-adoption literature and Section 17.3 develops the hypotheses. Section 17.4 describes the research design and
7. In this study, I use bank profitability and risk profiles to proxy for banks' external financing incentive and to examine the effect of that incentive on early-adoption decisions. The multiple-standard setting therefore enables a powerful test and generalizability of the results.
empirical predictions. Section 17.5 presents findings and additional analyses. Section 17.6 concludes.

17.2 Related Early Adoption Literature

Extant work on early adoption of accounting standards focuses on issues such as motivations for early adoption and the economic consequences of, or stock market reactions to, early adoption. A number of studies apply the positive theory of accounting (Watts and Zimmerman, 1986) to explain managerial motives for early adoption of an accounting standard. For example, Scott (1991) and Langer and Lev (1993) examine and document the influence of political costs, management compensation contracts, and the magnitude of the income effect of adoption on the adoption-timing choice for pension accounting. In the context of the standard on income taxes, Gujarathi and Hoskin (2003) find that the positive theory of accounting is significant in explaining management's early-adoption decisions.

Beyond the contractual motivations for early adoption, some studies examine an informational effect of early adoption. For example, Amir and Livnat (1996) show that firms early adopt SFAS No. 106 (FASB, 1990) to correct the market's perception of the magnitude of the postretirement benefit obligation (PRB). Amir and Ziv (1997a, 1997b) develop a theoretical model that shows how firms use adoption timing and recognition/disclosure choices to convey their private information to the market, and they empirically document the predictions in the context of early adoption of SFAS No. 106. In addition, they find that early adopters generally experience positive market reactions to their adoption announcements.

Beatty (1995) is the only study that examines the determinants of early-adoption decisions in the banking industry. She finds that BHCs choose early adoption of SFAS No.
115 (FASB, 1993) to increase regulatory capital through an accrual effect (i.e., net unrealized gains on available-for-sale securities).8 According to her findings, early adopters tend to reduce holdings in investment securities and the maturity of the investment securities to minimize regulatory capital costs when potential volatility in recognized fair values and in regulatory capital ratios increases. In summary, there are few empirical investigations into the informational effect of early-adoption decisions. In addition, few studies examine 8
The effect of the standard on Tier 1 capital, however, was excluded by the Federal Reserve Board three quarters after the standard became mandatorily effective in November 1994 (Beatty, 1995).
page 634
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch17
External Financing Needs and Early Adoption of Accounting Standards
635
early-adoption decisions on accounting standards that only require disclosure because the positive theory of accounting is unable to explain management’s early-adoption decisions on accounting standards without income implications or contract effects. The broader voluntary disclosure literature is also short of evidence in the banking industry. With an increasing number of accounting standards related to financial instruments and calls for more enhanced transparency of the financial sector by market participants, examining the determinants of and consequences to banks’ voluntary disclosure is of utmost importance. This research aims to complement the early-adoption literature and thus the broader voluntary disclosure literature by examining BHCs’ incentive to early adopt accounting standards to disclose information, signal their quality, for the ultimate goal of better accessing external financing (i.e., determinants and capital market consequences of early adoption).
17.3 Hypotheses

I consider banks' motivation for early adoption from an information disclosure perspective because banks are inherently opaque institutions and constantly in need of funds. Recent studies on bank capital structure have shown that banks closely resemble their nonbanking counterparts in their financing decisions (e.g., Flannery and Rangan, 2008; Gropp and Heider, 2010; Berlin, 2011). However, banks are subject to regulatory supervision; they need to maintain adequate capital and show bank regulators their capability of effectively managing bank balance sheets and related risks (e.g., Berger et al., 1995; Estrella, 2004; Berger et al., 2008; Greenlaw et al., 2008).9 Under regulation, banks have every incentive to stay well-capitalized for economic benefits such as less regulatory scrutiny and more operational flexibility. When a bank's financial position deteriorates, even before it is classified as undercapitalized, it faces increasingly severe regulatory sanctions and the threat of early closure under the Federal Deposit Insurance Corporation Improvement Act of 1991. In addition, it is often costly to raise equity capital quickly if a bank is not well capitalized (Furlong and Kwan, 2007; Gropp and Heider, 2010). As a result, maintaining capital and managing risk are essential to banks. To this end, banks are motivated to disclose information or signal their quality so as to better access capital markets when necessary and ensure the continuing growth of their operations.10

Banks vary in their need to raise funds externally. Therefore, the net expected benefits and costs of voluntary disclosure aimed at better access to external financing differ according to a bank's need for external funds. Prior voluntary disclosure research suggests that firm profitability is associated with a firm's disclosure decisions. Specifically, firm profitability can proxy for a firm's incentive to convey favorable information or to better access external financing. Regarding the incentive to convey favorable information, prevalent disclosure theories and empirical studies suggest that a firm is more likely to choose disclosure to reduce information asymmetry as its private information becomes more favorable (e.g., Dye, 1986; Lang and Lundholm, 1993, 2000). A general explanation for the theoretical prediction and empirical findings is nonzero disclosure costs.11 Regarding the incentive to better access external financing, prior studies suggest that firm profitability is negatively associated with this incentive and thus negatively associated with voluntary disclosure (e.g., Myers and Majluf, 1984; Francis et al., 2005). Intuitively, firms can either generate cash flows internally or acquire capital externally to fund their operations or investments.

9 The supervisory rating system for bank holding companies is known as BOPEC (Bank subsidiaries, Other nonbank subsidiaries, Parent company, Earnings, and Capital adequacy) (Hirtle and Lopez, 1999). Although the Federal Reserve emphasized risk management in its supervisory processes, this component was not directly reflected in the name BOPEC. Therefore, in December 2004, the Federal Reserve revised the rating system as RFI/C(D) (Risk Management (R); Financial Condition (F); potential Impact (I) of the parent company and nondepository subsidiaries (collectively nondepository entities) on the subsidiary depository institution(s); Composite rating (C) based on an evaluation and rating of its managerial and financial condition and an assessment of future potential risk to its subsidiary depository institution(s); Depository Institution (D)). The revised rating system became effective on January 1, 2005 (Federal Reserve Board, 2004).
Low profit firms, however, have limited ability to generate cash flows internally, making the cost of external financing particularly important to them. Therefore, low profit firms are relatively motivated, compared to high profit firms, to choose disclosure to lower the cost of external financing. Suijs (2007) models how a firm varies its voluntary disclosure policy to acquire more capital from an investor when the firm is uncertain how the investor will respond to the disclosure. The model suggests a partial disclosure equilibrium in which a firm hides high profit and discloses low profit. The intuition is that a high profit firm can easily raise funds in capital markets without disclosure compared to a low profit firm.12 As a result, low profit firms are more motivated than high profit firms to choose voluntary disclosure. Along similar lines, counter-signaling theory suggests that high (medium) quality firms choose not (choose) to signal their quality through additional noisy information to differentiate themselves from medium (low) quality firms (Feltovich et al., 2002). The underlying mechanism of the equilibrium is that high quality firms are confident enough that they will not be mistaken for low quality firms without the additional noisy information, whereas medium quality firms find it costly to mimic high quality firms' strategy and risk being mistaken for low quality firms. Since early-adoption decisions may not be the only information signal available to banks, high quality banks may choose to counter-signal (i.e., not to early adopt) to separate themselves from medium quality banks for better access to external financing.

The aforementioned studies are based on industrial firms in need of equity financing. The insights they offer are readily applicable to banks that intend to better access external financing. Banks can access either insured deposits or uninsured debts as a means of debt financing. However, since insured deposits are explicitly guaranteed by deposit insurance, these debt instruments are essentially free from banks' default risk and hence informationally insensitive (e.g., Admati et al., 2011).

10 Extant disclosure theories and empirical research suggest that firms are motivated to disclose information voluntarily to better access external financing (e.g., Leland and Pyle, 1977; Myers and Majluf, 1984; Frankel et al., 1995; Lang and Lundholm, 2000; Francis et al., 2005; Suijs, 2007).

11 On the other hand, some empirical studies argue and document that firms choose to disclose preemptive bad news to minimize potential litigation liability or management reputation costs (e.g., Cao and Narayanamoorthy, 2011; Skinner, 1994).
Uninsured debts, on the other hand, are not free from banks' default risk and may be compromised by adverse selection problems arising from information asymmetry (Lucas and McDonald, 1992). Prospective or current uninsured debt providers may hesitate to purchase or renew the uninsured debt securities sold by banks out of concern that banks hold superior private information. Hence, the higher a bank's profitability, the fewer concerns capital providers have regarding the bank's ability to service debts. In an information environment where early adoption is not the only information signal, high profit banks are thus less motivated than low profit banks to choose early adoption to better access external financing.

12 In addition, high profit firms avoid any chance of a negative reaction from information receivers by not disclosing private information that may fall below information receivers' expectations of their profitability.

In summary, the preceding discussion suggests a negative (positive) association between voluntary disclosure and firm profitability when firm profitability stands for an incentive to better access external financing (to convey favorable information). Although extant empirical studies report conflicting results on the association between firm disclosure and firm profitability, they still support the idea that voluntary disclosures arise when the expected benefits exceed the costs. Consequently, whether a bank's incentive to better access external financing, as proxied by bank profitability, drives its early-adoption decisions is an empirical issue. I state the hypothesis on bank profitability in alternative form as follows:

H1: The likelihood of early adoption is negatively related to bank profitability.

Voluntary disclosure research also motivates the association between firm risk and a firm's incentive to better access external financing. In an empirical study, Lang and Lundholm (1993) argue that if firm performance variability (i.e., firm risk) proxies for information asymmetry between investors and managers, firms with high performance variability are more likely to choose disclosure to relieve that asymmetry. Moreover, studies find that firms can improve external financing through voluntary disclosure by lowering the costs of debt and equity capital (e.g., Botosan, 1997; Sengupta, 1998; Botosan and Plumlee, 2002). Extant bank capital structure literature has established a negative relationship between bank risk and capital structure.
In particular, bank risk is expected to be negatively associated with bank leverage: for risky banks facing greater adverse selection costs of debt, maintaining a greater capital buffer and avoiding debt financing, given the greater information asymmetry about risk, is critical (e.g., Estrella, 2004; Barth et al., 2008; Gropp and Heider, 2010; Halov and Heider, 2011). Therefore, to better access external financing, banks with higher risk profiles are more likely to disclose information to reduce information asymmetry about risk. The discussion thus implies a positive association between firm risk profiles and voluntary disclosure. Because early adoption is a form of voluntary disclosure, banks with higher risk profiles are expected to be more likely to choose early adoption to better access external financing. I state the hypothesis on bank risk profiles in alternative form as follows:

H2: The likelihood of early adoption is positively related to bank risk profiles.

Thus far, the discussion of the association between early-adoption decisions and banks' need for better access to external financing (proxied by bank profitability and risk profiles) has been unconditional on how information is revealed (recognition or disclosure). Prior studies, however, suggest that management treats recognized and disclosed values differently because recognized values are more rigorously audited than disclosed values and might affect current or future earnings (e.g., Amir and Ziv, 1997b; Mitra and Hossain, 2009; Choudhary, 2011; Cheng and Smith, 2013). In particular, two opposing effects on management opportunism in reported earnings arise in response to new accounting standards that require recognition. On the one hand, management incentives to opportunistically report (either downward or upward) recognized values upon which contracts are based might increase, whereas disclosed values are less subject to managerial opportunism because they carry no contract effects. On the other hand, management opportunism in recognized values might be reduced because recognized values are audited more rigorously than disclosed values. Since bank profitability measures could be affected by managerial opportunism through the income effects of recognition rules, the extent to which bank profitability represents the need for better access to external financing through information disclosure and signaling is unclear. Therefore, I expect a bank's motivation for better access to external financing through early adoption to be most evident when an accounting standard has no effect on the financial statements proper.
I state the third hypothesis in alternative form as follows:

H3: A bank's motivation to disclose information and signal its quality to better access external financing is most evident in early adoption of accounting standards with only disclosure requirements.

Given that a bank in need of funds could choose early adoption for better access to external financing, an intriguing empirical issue is whether an early-adopting bank actually has better access to external financing following early adoption (i.e., capital market consequences). Consequently, I state the fourth hypothesis in alternative form as follows:

H4: A bank that chooses early adoption can better access external financing than a similar bank that does not choose early adoption.
17.4 Research Design

17.4.1 The sample

To hand collect the adoption status of each sample accounting standard disclosed in 10Q/10K filings for each BHC, all 10Qs/10Ks filed by banks between 1994 and 2008 were searched using the Morningstar Document Research database.13 A total of 581 BHCs filed 10Qs/10Ks during the period, yielding 5,595 available BHC-standard observations. As a final step, observations without the required information from the Bank Regulatory database are eliminated so that all independent variables can be calculated at the bank level. This step yields a final sample of 3,640 BHC-standard observations (462 unique BHCs).

An overview of the 16 sample accounting standards examined in this chapter is provided in Appendix 17A.14 In particular, seven sample accounting standards relate to financial instruments. In terms of the potential income effect of adoption, two accounting standards are expected to have income-increasing effects (SFAS Nos. 122 and 146), five income-decreasing effects (SFAS Nos. 121, 123, 143, 144, and 123R), five ex ante undetermined income effects (SFAS Nos. 133, 142, 155, 156, and 159), and four no impact on income (SFAS Nos. 132, 132R, 157, and 161). Three accounting standards prompt changes in the determination of regulatory capital (SFAS Nos. 122, 133, and 155).

13 The 10Qs and 10Ks are only available in electronic form starting on January 1, 1994. Therefore, the earliest accounting standard allowing for early adoption that I could examine with a complete adoption period (i.e., the period from a standard's announcement date to its effective date) is SFAS No. 121, "Accounting for the impairment of long-lived assets and for long-lived assets to be disposed of", which was issued in March 1995 and effective on and after December 15, 1995. The sample period ends in March 2008 because SFAS No. 161, "Disclosures about derivative instruments and hedging activities — an amendment of SFAS No. 133", was the most recent accounting standard allowing for early adoption at the time I searched 10Qs and 10Ks for banks' early-adoption decisions. Only BHCs whose parent companies are identified as from the banking industry (SIC: 6020) are included in the investigation.

14 The diversity of standard characteristics in this set of sample accounting standards enables me to examine features representing costs and benefits that discourage or encourage early adoption across standards. A discussion is provided in Section 17.4.2. Inclusion of these costs and benefits in the analyses yields a more complete picture of banks' early-adoption behavior.
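The sample screen described above can be sketched as follows. This is a hypothetical illustration only: the frame, column names (bhc_id, standard, early, has_reg_data), and values are stand-ins, not the chapter's data.

```python
import pandas as pd

# Hypothetical sketch of the sample screen: start from hand-collected
# BHC-standard adoption records and keep only observations whose
# regulatory data allow all independent variables to be computed.
obs = pd.DataFrame({
    "bhc_id":   [1001, 1001, 1002, 1003, 1003],
    "standard": ["SFAS121", "SFAS122", "SFAS121", "SFAS121", "SFAS122"],
    "early":    [0, 1, 0, 1, 0],
    "has_reg_data": [True, True, False, True, True],
})

final = obs[obs["has_reg_data"]]     # drop rows lacking Bank Regulatory data
n_obs = len(final)                   # BHC-standard observations kept
n_bhc = final["bhc_id"].nunique()    # unique BHCs in the final sample
print(n_obs, n_bhc)
```

In the study itself the analogous step reduces 5,595 BHC-standard observations to 3,640 (462 unique BHCs).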
page 640
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch17
External Financing Needs and Early Adoption of Accounting Standards
641
17.4.2 Early-adoption decision models and predictions: Hypotheses 1–3

To examine Hypotheses 1 and 2, I estimate the following logit model using maximum likelihood estimation:

EARLYij = β0 + β1 ROAij + β2 RPCERTAINTYij + β3 ROA RPCERTAINTYij + β4 AFSOFINVij + β5 LOANRISKij + β6 EXPOSUREDERij + β7 CREDITRISKDERij + β8 NONINTCHGij + β9 BADNEWSij + β10 CONTRACTij + β11 SIZEij + β12 LEVERAGEij + β13 T1LEVij + β14 DROEij + β15 NIEFFj + β16 DROE NIEFFij + β17 MKTHERFij + β18 DISCLOSUREj + β19 REG POSj + β20 REG UNj + β21 PAGEj + ΣN βN YEARN + εij.  (17.1)

The dependent variable, EARLYij, is equal to 1 if BHC i early adopted accounting standard j and 0 otherwise. YEARN is an indicator variable equal to 1 if the accounting standard was issued in year N and 0 otherwise. I include year fixed effects to control for the effects of macroeconomic conditions surrounding the pronouncements of accounting standards. All other variables are discussed below and defined in Appendix 17B.

Given Hypothesis 1, the first set of test variables relates to bank profitability. I measure bank profitability using ROA, the ratio of income before extraordinary items to total assets, adjusted for the average ROA of peer banks, at the first quarter-end after the adoption of a new standard.15 I expect a negative relationship between early adoption and ROA.

15 I adjust the return on assets for the average return on assets of US commercial banks of similar asset size to remove the effect of macroeconomic conditions specific to the time period. To determine the group of US commercial banks of similar asset size, I sort US commercial banks into three groups based on total assets: Peer 1 includes commercial banks with total assets less than $1 billion; Peer 2, commercial banks with total assets between $1 billion and $10 billion; and Peer 3, commercial banks with total assets greater than $10 billion. If a BHC's total assets fall between $1 billion and $10 billion, I use the average return on assets of Peer 2 commercial banks to make the adjustment. This adjusted measure also proxies for information not yet expected by the market. A drawback of this measure is that the time-series property of a bank's profitability measure is not controlled for.
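A minimal sketch of fitting a logit model in the spirit of equation (17.1) by maximum likelihood is shown below. Everything here is illustrative: the data are synthetic, only a peer-adjusted ROA term, one generic risk proxy, and year dummies are included, and the Newton-Raphson routine is a generic fitter, not the study's software.

```python
import numpy as np

# Synthetic design matrix: intercept, peer-adjusted ROA, a stand-in risk
# proxy, and year fixed effects for three pronouncement years.
rng = np.random.default_rng(0)
n = 500
roa = rng.normal(0.0, 0.01, n)
risk = rng.normal(0.0, 1.0, n)
year = rng.integers(0, 3, n)
X = np.column_stack([
    np.ones(n), roa, risk,
    (year == 1).astype(float),        # year fixed effects
    (year == 2).astype(float),
])
true_beta = np.array([-0.5, -20.0, 0.8, 0.2, -0.1])
p = 1.0 / (1.0 + np.exp(-X @ true_beta))
y = (rng.random(n) < p).astype(float)  # EARLY indicator

# Newton-Raphson maximization of the logit log-likelihood.
beta = np.zeros(X.shape[1])
for _ in range(25):
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    W = mu * (1.0 - mu)                # logistic variance weights
    step = np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (y - mu))
    beta += step
    if np.max(np.abs(step)) < 1e-10:   # converged
        break

print(np.round(beta, 2))               # ML estimates of beta0..beta4
```

With real data, X would carry all the regressors of equation (17.1); Hypothesis 1 corresponds to a negative estimated coefficient on the ROA column.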
I also examine whether high-profit banks are more likely to choose early adoption when they are more certain of receiving a positive response to the disclosure, as suggested by Suijs (2007). That is, to examine whether the negative relationship between early adoption and ROA is moderated as the level of response certainty increases, I include a proxy for response certainty and interact it with ROA. Response certainty, RPCERTAINTY, is measured as the percentage of times that a bank's return on assets exceeds the average return on assets of peer banks during the 12 quarters before the announcement date of an accounting standard.16 A higher level of response certainty means that a bank is more certain of receiving a positive response following the disclosure of its performance. Therefore, conditional on knowing its performance, a bank is more likely to disclose its profitability when it is more certain of receiving a positive response following the disclosure. Consequently, I predict a positive relationship between early adoption and RPCERTAINTY as well as ROA RPCERTAINTY (the interaction between ROA and RPCERTAINTY).

Given Hypothesis 2, the second set of test variables relates to bank risk profiles. I evaluate bank risk profiles along four dimensions: (1) interest rate risk (AFSOFINV), (2) credit risk on loan portfolios (LOANRISK), (3) exposures and credit risk on derivatives (EXPOSUREDER and CREDITRISKDER), and (4) operational risk on noninterest income (NONINTCHG). AFSOFINV is the ratio of available-for-sale securities (excluding equity securities) to total investment securities, multiplied by (−1) for ease of interpretation. Banks with a higher proportion of available-for-sale securities in total investment securities have greater financial flexibility in managing interest rate risk and thus lower interest rate risk (e.g., Beatty, 1995; Hodder et al., 2002; Papiernik et al., 2004). LOANRISK is the ratio of the allowance for loan losses to average loans, multiplied by (−1) for ease of interpretation. It measures the buffer that bank management creates for the whole loan portfolio in case of future write-offs. Banks with a higher ratio of loan loss allowance to average loans are associated with lower credit risk, higher loan quality, and less delay in expected loss recognition (e.g., Eccher et al.,

16 The results are similar regardless of the length of the period over which response certainty is measured (ranging from 4 to 20 quarters). The average profitability of US commercial banks of similar size is chosen as a benchmark because it is a legitimate proxy for investors' prior expectations of a bank's profitability. Analysts' EPS forecasts could be another benchmark for constructing response certainty; however, using analysts' forecasts would eliminate more than half of the sample observations. Therefore, I use the average return on assets of US commercial banks as the benchmark.
1996, 103; Docking et al., 2000; Kanagaretnam et al., 2003; Yang, 2009; Beatty and Liao, 2011). EXPOSUREDER is the ratio of the gross notional amount of derivatives other than purchased options to total assets.17 CREDITRISKDER is an indicator variable equal to 1 if the gross notional amount of over-the-counter (OTC) derivatives exceeds that of exchange-traded derivatives and 0 otherwise.18 NONINTCHG is the average quarterly growth in noninterest income over the past six quarters. Noninterest income growth is much more volatile than net interest income growth, largely because of volatile trading revenues. In addition, studies show that the higher the share of income derived from noninterest activities, the lower the risk-adjusted returns and the higher the insolvency risk (e.g., Stiroh, 2004; Demirgüç-Kunt and Huizinga, 2010; Chunhachinda and Li, 2014). Therefore, banks with a higher share of income derived from noninterest activities have higher operational risk (e.g., Zhao and He, 2014). All risk proxies except CREDITRISKDER are adjusted for the corresponding average risk proxies of US commercial banks of similar asset size. This adjustment controls for the effect of macroeconomic conditions on banks' risk management or risk-taking behavior. I expect a positive relationship between early adoption and bank risk profiles.

I control for several variables suggested by the prior banking, early adoption, and voluntary disclosure literature. Specifically, I control for a bank's incentive to disclose preemptive bad news (BADNEWS), whether a bank holds interest rate derivatives (CONTRACT),19 bank size (SIZE),20 a bank's propensity to access capital markets (LEVERAGE),21 a bank's regulatory

17 Purchased options are excluded from the calculation because the financial exposures of purchased options are limited to their book values accounted for in the financial statements (Nissim and Penman, 2007).

18 Exchange-traded derivative contracts have trivial credit risk because the exchanges act as the counterparty to each contract (Nissim and Penman, 2007). As a result, banks with a greater proportion of OTC derivatives relative to exchange-traded derivatives are positively associated with credit risk on derivatives.

19 Banks can hold derivatives to manage interest rate risk or to speculate (e.g., Chen et al., 2008; Siregar et al., 2013; Chang et al., 2018).

20 The effect of size is most commonly attributed to political visibility (e.g., Moyer, 1990; Scott, 1991; Ali and Kumar, 1994). Differences in accounting choices between large and small firms could also be due to other reasons such as compliance costs, information production costs, and litigation risk (e.g., Sami and Welsh, 1992; Cao and Narayanamoorthy, 2011).

21 LEVERAGE is the ratio of total liabilities to total assets. The easier it is for banks to access capital markets, the lower the benefits associated with early adoption of an accounting standard (e.g., Aboody et al., 2004).
capital adequacy (T1LEV),22 bank management's incentives to maximize current and future compensation (DROE and DROE NIEFF), the expected income effect of adoption (NIEFF),23 and market competition/proprietary costs (MKTHERF).24 I also control for three dimensions of standard characteristics in the early-adoption decision model: (1) disclosure requirements (DISCLOSURE), (2) impact on the calculation of regulatory capital (REG POS and REG UN), and (3) implementation costs (PAGE).

Disclosure requirements speak to whether an accounting standard requires only footnote disclosure or recognition in the financial statements proper. Prior studies have shown that financial statement users place less weight on disclosures in footnotes than on amounts recognized in the financial statements, because footnote disclosures are only subject to a standard audit and do not affect current or future earnings (e.g., Amir and Ziv, 1997a, 1997b; Espahbodi et al., 2002; Choudhary, 2011). In addition, banks do not gain flexibility in managing their regulatory capital by early adopting standards that require only footnote disclosures. Alternatively, for better access to capital markets, banks may benefit more from

22 Findings in Beatty (1995) and Hodder et al. (2002) suggest that the higher the regulatory capital ratio, the less likely banks are to early adopt accounting standards to facilitate regulatory capital management. In addition, the more significant the equity buffer held by banks, the less likely bank debts are to be informationally sensitive (Admati et al., 2011). Banks with higher capital ratios are thus less motivated to disclose more information than those with lower ratios.

23 Following prior early-adoption studies (e.g., Ali and Kumar, 1994; Gujarathi and Hoskin, 2003), I use DROE NIEFF as a proxy for bank management's incentives to delay early adoption of income-increasing accounting standards to maximize current and future compensation, and NIEFF to capture the expected income effects of adoption. Managers of banks with extreme ROE are less likely to early adopt an income-increasing accounting standard, as it is less likely to make a difference in the calculation of bonus compensation. I expect a positive (negative) relationship between early adoption and NIEFF (DROE NIEFF).

24 MKTHERF is included in the model for two reasons. First, market competition may explain the negative relationship between bank profitability and early adoption. Specifically, banks that operate in highly concentrated markets (i.e., with monopoly power) tend to have high profitability, whereas banks that operate in competitive markets tend to have lower profitability (e.g., Özyıldırım, 2010). This suggests a positive relationship between profitability and market concentration and therefore a potential correlated omitted variable problem. To relieve the problem, I include a measure of market concentration (MKTHERF) in the model. Second, the voluntary disclosure literature suggests that proprietary costs may keep firms from making disclosures (e.g., Graham et al., 2005; Beyer et al., 2010). One commonly used measure of proprietary costs is the level of market concentration (e.g., Botosan and Harris, 2000; Sengupta, 2004; Berger, 2011).
adopting early standards requiring only disclosures because disclosed values are less subject to opportunistic management than recognized values upon which contracts are based. However, recognized values are not necessarily less accurate due to a more rigorous audit than disclosed values (Amir and Ziv, 1997b; Choudhardy, 2011). Given the ambiguous benefits of early adopting standards requiring only disclosures, I do not expect a signed relationship between early adoption and DISCLOSURE. New accounting standards may lead to changes in the calculation of regulatory core capital under bank regulators’ discretion.25 Banks have every incentive to stay well capitalized and are expected to hold a buffer of capital to limit the chances of falling below the well-capitalized cutoff (Beatty et al., 1996; and Furlong and Kwan, 2007, 10). Hence, I expect that banks are more (less) likely to early adopt accounting standards that lead to positive (undetermined) changes in the calculation of regulatory capital under bank regulators’ discretion for the benefit of adjusting their capital to an optimal level. The coefficients on REG POS and REG UN are expected to be positive and negative, respectively. Implementation costs speak to the continuing move by the FASB toward a principles-based approach. The FASB intends to smoothly converge to the International Financial Reporting Standards (IFRSs) and to reduce complexity in accounting standards and firm costs of applying new accounting standards (Choi and McCarthy, 2003; Schipper, 2003). Therefore, I predict that it is more costly for banks to early adopt a more complex standard than a less complex one. I expect a negative relation between early adoption and PAGE, a proxy for standard complexity. Given Hypothesis 3, I estimate the following logit model separately for accounting standards with (1) income-decreasing effects, (2) incomeincreasing effects, (3) ex ante undetermined income effects, and (4) only disclosure requirements. 
In particular, I expect that both Hypotheses 1 and 2 are supported in the case of accounting standards with only disclosure requirements, whereas only Hypothesis 2 is supported for standards with income effects. Estimating the logit
25 Since the Financial Institutions Reform, Recovery and Enforcement Act of 1989, banks have been required to adopt generally accepted accounting principles (GAAP) (Furlong and Kwan, 2007). In addition, the Federal Deposit Insurance Corporation Improvement Act of 1991 (FDICIA) requires that regulatory accounting standards be at least as strict as GAAP (Beatty et al., 1996; Ramesh and Revsine, 2001). In situations where concerns about the calculation of regulatory capital exist with the implementation of new accounting pronouncements, bank regulators develop interim capital rules as they see fit in a timely fashion and publish them in the Federal Register.
Handbook of Financial Econometrics, Mathematics, Statistics, and Machine Learning (Vol. 1)
S. I-L. Wang
model separately for the above four scenarios shows whether managerial opportunism induced by recognition rules affects bank profitability as a proxy for a bank’s need for better access to external financing.

EARLY_ij = β_0 + β_1 ROA_ij + β_2 RPCERTAINTY_ij + β_3 ROA RPCERTAINTY_ij + β_4 AFSOFINV_ij + β_5 LOANRISK_ij + β_6 EXPOSUREDER_ij + β_7 CREDITRISKDER_ij + β_8 NONINTCHG_ij + β_9 BADNEWS_ij + β_10 CONTRACT_ij + β_11 SIZE_ij + β_12 LEVERAGE_ij + β_13 T1LEV_ij + β_14 DROE_ij + β_15 MKTHERF_ij + Σ_N β_N YEAR_N + ε_ij.   (17.2)

All variables are as defined earlier. Coefficient predictions are the same as discussed before across the four income-effect scenarios except for T1LEV and DROE. In particular, I do not make a directional prediction on T1LEV for accounting standards with only disclosure requirements, because banks do not benefit from managing their regulatory capital flexibly by early adopting this type of standard. I expect a positive (negative) relationship between early adoption and DROE for accounting standards with income-decreasing (income-increasing) effects, to reflect bank managers’ incentives to maximize bonus compensation. However, I do not make a directional prediction on DROE for accounting standards with ex ante undetermined income effects or with only disclosure requirements.

17.4.3 Comparisons between early and late adopters in financing activities: Hypothesis 4

As bank profitability and risk profiles characterize banks’ motivation for information disclosure and thus better access to external financing, a question that naturally follows is whether early bank adopters do ex post experience higher growth of funds. I examine my fourth hypothesis during the period between the issued date and the effective date of an accounting standard (i.e., the test period), in which adoption of accounting standards is considered voluntary disclosure.
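Equations (17.1) and (17.2) are standard binary logit models. As a minimal sketch of how such a model is fit by maximum likelihood, here via Newton-Raphson on simulated data with a single ROA-style regressor rather than the chapter’s bank panel (all names and values below are illustrative), assuming only NumPy:

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Fit a binary logit by Newton-Raphson; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(X)), X])        # prepend a constant for beta_0
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))          # P(EARLY = 1 | X)
        grad = X.T @ (y - p)                         # score vector
        hess = X.T @ ((p * (1.0 - p))[:, None] * X)  # observed information
        beta += np.linalg.solve(hess, grad)          # Newton step
    return beta

# Simulated data mimicking Hypothesis 1: a negative "ROA" effect on the
# log-odds of early adoption (true intercept -0.5, true slope -1.0).
rng = np.random.default_rng(0)
roa = rng.normal(0.0, 1.0, 5000)
prob = 1.0 / (1.0 + np.exp(0.5 + roa))               # logistic with beta = (-0.5, -1.0)
y = (rng.uniform(size=5000) < prob).astype(float)
beta = fit_logit(roa[:, None], y)
print(beta)  # approximately [-0.5, -1.0]
```

The chapter additionally clusters standard errors at the bank level (Table 17.3) or uses heteroskedasticity-robust standard errors (Table 17.4); those corrections affect inference, not the point estimates sketched here.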
Hypotheses 1 and 2 suggest that banks with high profitability and low risk profiles are less likely to voluntarily disclose information because they have few concerns about raising funds. Consequently, it is likely that late bank
adopters with high profitability and low risk profiles experience financing activity growth comparable, without early adoption, to that of early bank adopters. To address this issue, I match each early adopter to a late adopter with comparable profitability and risk profiles measured at the most recent quarter-end before standard issued dates.26

In addition, banks’ financing strategies may change with macroeconomic conditions in general (e.g., Bolton and Freixas, 2006). For instance, loan demand increases during economic expansions and hence banks’ demand for funds increases; conversely, loan demand decreases during economic contractions and hence banks’ demand for funds decreases. I therefore expect early adopters to experience higher growth of funds than matched late adopters during test periods that fall in economic expansions. A summary of the research methodology is presented in Appendix 17C, and results are discussed in Section 17.5.3.

17.5 Results

17.5.1 Descriptive statistics

Table 17.1 presents the sample distribution by bank size (small, mid-sized, and large) and the early adoption status of each sample accounting standard. Small BHCs form the biggest group of the total sample (51%), followed by mid-sized BHCs (38%) and large BHCs (11%). On average, 3.5% of small BHCs, 3.8% of mid-sized BHCs, and 10.4% of large BHCs choose to early adopt sample accounting standards. Differences in early adoption rates exist between peers and between standards. For example, no large BHCs early adopted SFAS No. 133 (FASB, 1998), while 22% of small BHCs and 16% of mid-sized BHCs chose to early adopt the standard. SFAS No. 123R (FASB, 2004) presents another extreme case, as only 0.9% of small BHCs chose early adoption. As to SFAS No. 157 (FASB, 2006), 16.7% of large BHCs chose to early adopt the standard, while approximately 5% of small and mid-sized BHCs chose early adoption. Table 17.2, Panel A presents descriptive statistics for the variables used in the analyses.
In general, approximately 4.4% of BHC-standard observations choose early adoption. ROA is distributed with no skew and a mean26
I use the return on assets and the risk measures (i.e., proxies for the interest rate risk, the credit risk on loan portfolios, exposures and credit risk of derivatives, and the operational risk on noninterest income activities) examined in this research as the matching parameters. All measures are raw and not adjusted for the average measures of peer US commercial banks.
Table 17.1: Early adoption by standard and bank size.

              Peer 1                  Peer 2                  Peer 3                  Total
Standard   Late  Early  Total      Late  Early  Total      Late  Early  Total      Late  Early  Total
121          55      0     55        27      2     29        10      1     11        92      3     95
122          75      6     81        26     10     36         5     11     16       106     27    133
123          84      0     84        41      0     41        15      0     15       140      0    140
132          62      0     62        36      0     36        12      0     12       110      0    110
133         109     31    140        54     10     64        20      0     20       183     41    224
142         176      1    177        96      1     97        26      0     26       298      2    300
143          85      4     89        39      0     39        17      0     17       141      4    145
144         132      0    132        79      0     79        23      0     23       234      0    234
146         117      0    117        72      1     73        24      1     25       213      2    215
132R         55      0     55        45      0     45        16      0     16       116      0    116
123R        228      2    230       138      0    138        33      0     33       399      2    401
155         102      1    103       101      0    101        28      5     33       231      6    237
156         120      5    125       114      6    120        23     11     34       257     22    279
157         146      7    153       156      8    164        35      7     42       337     22    359
159         135      6    141       162     10    172        39      6     45       336     22    358
161          97      1     98       146      5    151        44      1     45       287      7    294
Total     1,778     64  1,842     1,332     53  1,385       370     43    413     3,480    160  3,640
# of Different BHCs
            355     53    356       217     40    217        49     18     49       462    105    462

Note: Peer 1 includes BHCs with total assets less than $1 billion. Peer 2 includes BHCs with total assets between $1 and $10 billion. Peer 3 includes BHCs with total assets greater than $10 billion.
Table 17.2: Descriptive statistics.
Panel A: Sample statistics (N = 3,640)

Variables          Mean     Std. dev.   25th pctl.   Median    75th pctl.
EARLY             0.044      0.205        0           0          0
ROA              −0.0003     0.003       −0.001      −0.0001     0.001
RPCERTAINTY       0.412      0.258        0           0.375      0.625
AFSOFINV         −0.015      0.225       −0.152      −0.091      0.062
LOANRISK          0.002      0.005        0.000       0.003      0.005
EXPOSUREDER      −0.010      1.972       −0.040      −0.003     −0.001
CREDITRISKDER     0.396      0.489        0           0          1
NONINTCHG        −0.032      0.437       −0.076      −0.042     −0.010
BADNEWS           0.538      0.499        0           1          1
CONTRACT          0.388      0.487        0           0          1
SIZE             11.150      1.685       10.050      10.850     11.940
LEVERAGE          0.908      0.024        0.899       0.911      0.922
T1LEV             0.084      0.028        0.067       0.082      0.096
DROE              0.498      0.500        0           0          1
MKTHERF           0.134      0.088        0.047       0.122      0.217

Panel B: Sample statistics (means and medians) by early vs. late adopters

                 Early adopters (N = 160)   Late adopters (N = 3,480)    T-test    Wilcoxon test
Variables           Mean       Median          Mean       Median        p-value      p-value
ROA               −0.0007     −0.0003        −0.0003     −0.0001         0.088        0.060
RPCERTAINTY        0.384       0.313          0.413       0.375          0.160        0.117
AFSOFINV           0.043      −0.036         −0.018      −0.094          0.001        0.000
LOANRISK           0.003       0.003          0.002       0.0025         0.097        0.068
EXPOSUREDER        1.014      −0.008         −0.057      −0.003          0.000        0.005
CREDITRISKDER      0.588       1              0.388       0              0.000        0.000
NONINTCHG          0.103      −0.038         −0.039      −0.042          0.000        0.220
BADNEWS            0.625       1              0.534       1              0.024        0.024
CONTRACT           0.556       1              0.380       0              0.000        0.000
SIZE              11.760      10.930         11.120      10.840          0.000        0.140
LEVERAGE           0.909       0.913          0.908       0.911          0.688        0.576
T1LEV              0.078       0.076          0.085       0.082          0.002        0.001
DROE               0.456       0              0.500       0              0.283        0.282
MKTHERF            0.127       0.074          0.134       0.123          0.304        0.099
Note: The p-values for differences in means and medians between early and late adopters are based on two-tailed tests. Variable definitions are in Appendix 17B.
close to zero. RPCERTAINTY ranges from 0 to 1 and has a mean of 0.412, which shows that, on average, the sample banks are not highly certain of receiving a positive response following the disclosure of their performance (e.g., 10-Q/10-K filings). The statistics for AFSOFINV show that more than half of the BHC-standard observations hold more of their total investment portfolios in available-for-sale securities than their peer banks. EXPOSUREDER ranges from −2.5 to 42.6, which indicates diverse bank derivative activities even after adjusting for the derivative activities of peer banks. The results show that 40% of the banks hold more OTC than exchange-traded derivative contracts. Approximately 54% of total sample observations experience low performance relative to peer banks, and 39% of the BHC-standard observations are involved with interest-rate-related derivative contracts. The mean of LEVERAGE is 0.908, which is expected for the banking industry, whereas the 25th percentile of T1LEV suggests that at least 75% of the banks’ Tier 1 leverage capital ratios are above 5% (i.e., one of the necessary conditions for banks to be classified as well capitalized). The mean of DROE is approximately 0.5 by construction. Finally, MKTHERF ranges from 0.023 to 0.363, which suggests widely varied market competition in which banks operate (from unconcentrated to highly concentrated).

Table 17.2, Panel B compares the variables used in the analyses between early and late bank adopters using t-tests and Wilcoxon tests for differences in means and medians. Most differences are significant and consistent with predictions. For example, consistent with predictions, ROA is significantly smaller for early bank adopters than for late bank adopters in both mean and median at the 5% level (one-tailed test). In addition, the means of AFSOFINV, EXPOSUREDER, CREDITRISKDER, and NONINTCHG are significantly greater for early adopters than for late adopters at the 1% level (one-tailed test).
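The Panel B comparisons are two-sample t-tests and Wilcoxon rank-sum tests. A minimal pure-NumPy sketch on simulated data (not the chapter’s sample), using a large-sample normal approximation for the p-values instead of exact t and Wilcoxon distributions:

```python
import numpy as np
from math import erf, sqrt

def normal_two_sided_p(z):
    """Two-sided p-value from a standard normal approximation."""
    return 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))

def welch_t(a, b):
    """Welch t-statistic for a difference in means (unequal variances)."""
    va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
    return (a.mean() - b.mean()) / np.sqrt(va + vb)

def rank_sum_z(a, b):
    """Wilcoxon rank-sum statistic for sample a, standardized under H0
    (no tie correction; fine for continuous data)."""
    n1, n2 = len(a), len(b)
    ranks = np.argsort(np.argsort(np.concatenate([a, b]))) + 1
    w = ranks[:n1].sum()                          # rank sum of the first sample
    mu = n1 * (n1 + n2 + 1) / 2.0                 # mean of w under H0
    sigma = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (w - mu) / sigma

# Simulated "early" (N=160) vs. "late" (N=3,480) groups with a small mean shift,
# loosely calibrated to the ROA row of Panel B.
rng = np.random.default_rng(1)
early_roa = rng.normal(-0.0007, 0.003, 160)
late_roa = rng.normal(-0.0003, 0.003, 3480)
t = welch_t(early_roa, late_roa)
z = rank_sum_z(early_roa, late_roa)
print(normal_two_sided_p(t), normal_two_sided_p(z))
```

With group sizes this large, the normal approximation is close to the exact t and Wilcoxon p-values reported in the table.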
17.5.2 Findings related to the likelihood of early adopting accounting standards

Table 17.3 shows the results from estimating equation (17.1). Relating to Hypothesis 1, ROA is significantly and negatively related to the likelihood of early adoption. This result indicates that more profitable banks, with fewer concerns about attracting funds from capital providers, are less likely to early adopt an accounting standard. A significant and positive coefficient on ROA RPCERTAINTY suggests that the level of response certainty moderates the negative relationship between early adoption and bank profitability. The standardized coefficient on ROA shows that a one standard-deviation
Table 17.3: Early-adoption decision model — All standards (N = 3,640).

Dependent variable: EARLY

                     Prediction     Coeff.      %StdX     z-stat.
Hypothesis 1:
ROA                      −        −67.710     −17.60     −1.95**
RPCERTAINTY              +          0.052       1.40      0.12
ROA RPCERTAINTY          +        191.500      29.70      2.04**
Hypothesis 2:
AFSOFINV                 +          0.900      22.50      2.34***
LOANRISK                 +         74.670      42.40      2.74***
EXPOSUREDER              +          0.085      18.30      4.16***
CREDITRISKDER            +          0.370      19.80      1.37*
NONINTCHG                +          0.257      11.90      4.02***
Control variables:
BADNEWS                  ±          0.301      16.20      1.18
CONTRACT                 ±          0.496      27.30      1.98**
SIZE                     ±          0.136      25.80      1.47
LEVERAGE                 ±         −9.184     −19.80     −1.44
T1LEV                    −         −7.464     −19.10     −1.37*
DROE                     ±          0.069       3.50      0.34
NIEFF                    +         −0.135      −7.60     −0.19
DROE NIEFF               −         −0.643     −23.90     −1.75**
MKTHERF                  ±          1.611      15.10      1.05
DISCLOSURE               ±          0.242      10.90      0.89
REG POS                  +          4.412     128.90      4.33***
REG UN                   −         −1.991     −48.40     −2.90***
PAGE                     −         −0.007     −29.90     −1.15
Year Fixed Effect                   Yes
# of clusters                       462
Pseudo R2                           0.24

Note: ***, **, * denote significance at 1%, 5%, and 10% levels, respectively (two-tailed tests unless a prediction is made). The table reports the results of estimating the following model using logit regression with standard errors clustered at the bank level. Variable definitions are in Appendix 17B.
EARLY_ij = β_0 + β_1 ROA_ij + β_2 RPCERTAINTY_ij + β_3 ROA RPCERTAINTY_ij + β_4 AFSOFINV_ij + β_5 LOANRISK_ij + β_6 EXPOSUREDER_ij + β_7 CREDITRISKDER_ij + β_8 NONINTCHG_ij + β_9 BADNEWS_ij + β_10 CONTRACT_ij + β_11 SIZE_ij + β_12 LEVERAGE_ij + β_13 T1LEV_ij + β_14 DROE_ij + β_15 NIEFF_j + β_16 DROE NIEFF_ij + β_17 MKTHERF_ij + β_18 DISCLOSURE_j + β_19 REG POS_j + β_20 REG UN_j + β_21 PAGE_j + Σ_N β_N YEAR_N + ε_ij.
increase in ROA, holding RPCERTAINTY at zero, decreases the odds ratio (early adoption to late adoption) by 17.6%. In sum, Hypothesis 1 is supported by the results.

Relating to Hypothesis 2, all measures of bank risk profiles except CREDITRISKDER are significant with the predicted signs at the 1% level, whereas CREDITRISKDER is significant at the 10% level. The results indicate that banks with higher interest rate risk on investment portfolios, higher credit risk on loan portfolios, greater derivative exposures and credit risk, and higher operational risk are more likely to early adopt. The standardized coefficients on AFSOFINV, LOANRISK, EXPOSUREDER, and NONINTCHG suggest that a one standard-deviation increase in each variable increases the odds ratio by 22.5%, 42.4%, 18.3%, and 11.9%, respectively. Therefore, Hypothesis 2 is supported by the results.

Relating to accounting standard characteristics that encourage or discourage early adoption, as predicted, a positive impact on the calculation of regulatory capital (REG POS) and an undetermined impact on the calculation of regulatory capital (REG UN) are significant with the predicted signs at the 1% level. Therefore, banks are more likely to early adopt a standard that provides an opportunity for them to increase regulatory capital to an optimal level than a standard that does not, but less likely to early adopt a standard that can move banks’ capital either upward or downward. No support is found for PAGE.

Relating to the control variables, T1LEV is negatively significant at the 10% level. This finding suggests that banks are less likely to choose early adoption to manage regulatory capital if their regulatory capital ratios are higher. DROE NIEFF is significantly and negatively related to the likelihood of early adoption at the 5% level.
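The %StdX column can be reproduced from the reported coefficients: a one standard-deviation increase in a regressor multiplies the odds of early adoption by exp(β·σ), so %StdX = 100·(exp(β·σ) − 1). A quick check with the rounded values from Tables 17.2 and 17.3 (the small gaps versus the reported figures, e.g., about −18.4% versus the reported −17.60 for ROA, reflect rounding of the standard deviations):

```python
from math import exp

def pct_stdx(beta, sd):
    """Percent change in the odds ratio for a one-SD increase in a regressor."""
    return 100.0 * (exp(beta * sd) - 1.0)

# Rounded inputs: coefficients from Table 17.3, standard deviations from Table 17.2.
print(pct_stdx(-67.710, 0.003))  # ROA: about -18%, vs. -17.60 reported
print(pct_stdx(0.900, 0.225))    # AFSOFINV: about 22.4%, vs. 22.50 reported
print(pct_stdx(74.670, 0.005))   # LOANRISK: about 45%, vs. 42.40 reported
```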
This finding suggests that banks with ROE in the extreme quartiles are less likely to early adopt an income-increasing standard because the adoption is unlikely to make a difference in the bonus compensation calculation. Neither DROE nor NIEFF is significant on its own. The insignificant results, coupled with the significant DROE NIEFF, suggest that the effects of DROE and NIEFF on early-adoption decisions are jointly captured by their interaction term. Lastly, market concentration (MKTHERF) is not statistically significant.

Table 17.4 shows the results from estimating equation (17.2) given different income effects of accounting standards. Relating to bank profitability, the likelihood of early adoption is significantly and negatively related to ROA under one scenario: when accounting standards only require disclosure. The negative relationship is moderated by RPCERTAINTY and significant at the
Table 17.4: Early-adoption decision model by the income effects of accounting standards (Hypothesis 3).

                    Income-decreasing effects1        Income-increasing effects2
                    Pred.     Coeff.     z-stat.      Pred.     Coeff.     z-stat.
Hypothesis 1
ROA                   −      −77.280     −0.36          −      517.400      0.77
RPCERTAINTY           +        1.149      0.75          +        0.417      0.34
ROA RPCERTAINTY       +      658.700      1.54*         +     −270.868     −0.20
Hypothesis 2
AFSOFINV              +        0.664      0.79          +       −0.527     −0.54
LOANRISK              +      292.000      2.18**        +       74.640      1.38*
EXPOSUREDER           +       −0.250     −0.43          +        0.062      0.80
CREDITRISKDER         +        2.087      2.76***       +        1.814      3.28***
NONINTCHG             +      −10.790     −4.17          +        0.589      0.23
Control variables
BADNEWS               ±       −0.523     −0.66          ±        0.558      0.81
CONTRACT              ±       −1.135     −1.44          ±       −0.085     −0.12
SIZE                  ±       −0.696     −1.69*         ±        0.454      2.26**
LEVERAGE              ±       −3.312     −0.10          ±      −35.730     −0.90
T1LEV                 −      −24.100     −0.84          −      −16.480     −0.46
DROE                  +       −0.059     −0.06          −       −0.387     −0.62
MKTHERF               ±       −8.703     −1.08          ±       −1.670     −0.22
Year fixed effect              Yes                               Yes
N                            1,015                               348
Pseudo R2                     0.26                               0.43

(Continued)
Table 17.4: (Continued)

                    Ex ante undetermined income effects3    Only disclosure requirements4
                    Pred.     Coeff.     z-stat.            Pred.     Coeff.     z-stat.
Hypothesis 1
ROA                   −       37.410      0.32                −      −65.580     −1.67**
RPCERTAINTY           +        0.178      0.33                +       −1.853     −1.75
ROA RPCERTAINTY       +       41.730      0.18                +      164.700      1.59*
Hypothesis 2
AFSOFINV              +        1.480      3.73***             +        1.100      1.15
LOANRISK              +       59.450      1.53*               +       89.950      1.74**
EXPOSUREDER           +        0.162      3.63***             +        0.072      2.08**
CREDITRISKDER         +       −0.265     −0.88                +       −0.044     −0.10
NONINTCHG             +       −0.999     −1.18                +        0.341      4.50***
Control variables
BADNEWS               ±        0.617      1.89*               ±        0.058      0.12
CONTRACT              ±        0.486      1.57                ±        1.940      3.69***
SIZE                  ±        0.045      0.42                ±        0.034      0.24
LEVERAGE              ±       −7.958     −1.51                ±      −14.650     −1.87*
T1LEV                 −       −4.047     −0.78                ±      −12.220     −1.67*
DROE                  ±        0.054      0.23                ±       −0.397     −0.95
MKTHERF               ±       −2.139     −1.50                ±        5.938      2.08**
Year fixed effect              Yes                                     Yes
N                            1,398                                     879
Pseudo R2                     0.14                                     0.27

Note: ***, **, * denote significance at 1%, 5%, and 10% levels, respectively (two-tailed tests unless a prediction is made). This table reports the results of estimating the following model using logit regression with heteroskedasticity-corrected robust standard errors. Variable definitions are in Appendix 17B.
EARLY_ij = β_0 + β_1 ROA_ij + β_2 RPCERTAINTY_ij + β_3 ROA RPCERTAINTY_ij + β_4 AFSOFINV_ij + β_5 LOANRISK_ij + β_6 EXPOSUREDER_ij + β_7 CREDITRISKDER_ij + β_8 NONINTCHG_ij + β_9 BADNEWS_ij + β_10 CONTRACT_ij + β_11 SIZE_ij + β_12 LEVERAGE_ij + β_13 T1LEV_ij + β_14 DROE_ij + β_15 MKTHERF_ij + Σ_N β_N YEAR_N + ε_ij.
1 Accounting standards with income-decreasing effects include SFAS No. 121, SFAS No. 123, SFAS No. 143, SFAS No. 144, and SFAS No. 123R.
2 Accounting standards with income-increasing effects include SFAS No. 122 and SFAS No. 146.
3 Accounting standards with ex ante undetermined income effects include SFAS No. 133, SFAS No. 142, SFAS No. 155, SFAS No. 156, and SFAS No. 159.
4 Accounting standards with only disclosure requirements include SFAS No. 132, SFAS No. 132R, SFAS No. 157, and SFAS No. 161.
10% level. ROA and RPCERTAINTY are not significant in explaining early-adoption decisions in the other three scenarios, whereas ROA RPCERTAINTY is marginally significant under the scenario of accounting standards with income-decreasing effects.

Relating to bank risk profiles, AFSOFINV is significant with the predicted sign only when used to explain early-adoption decisions on accounting standards with ex ante undetermined income effects. LOANRISK is significant with the predicted sign in all scenarios. EXPOSUREDER is significant with the predicted sign when used to explain accounting standards with ex ante undetermined income effects and with only disclosure requirements. CREDITRISKDER is significant when used to explain accounting standards with income-decreasing and income-increasing effects. NONINTCHG is significant only for early-adoption decisions on standards with only disclosure requirements.

Relating to the control variables, BADNEWS is significantly different from zero under the scenario of accounting standards with ex ante undetermined income effects, indicating that relatively bad performance weighs in banks’ early-adoption decisions in this scenario. SIZE is significantly related to the likelihood of early adoption when used to explain accounting standards with income-decreasing and income-increasing effects. LEVERAGE and MKTHERF are significant only in the scenario of accounting standards with only disclosure requirements. T1LEV is negatively related to the likelihood of early adoption but insignificant, except in the scenario of accounting standards with only disclosure requirements, where it is significantly different from zero. DROE is not significant in any case.

Taken together, a bank’s motive to better access external financing and to signal its quality through early adoption is most evident when accounting standards require only disclosure.
The results are consistent with the notion that high-quality banks choose to counter-signal their quality to external financing providers by not adopting disclosure rules early. In addition, bank risk profiles consistently explain banks’ early-adoption decisions regardless of the income effects of accounting standards. Hence, Hypothesis 3 is supported. The results also suggest that variables used in prior early-adoption studies based on the positive theory of accounting do not appear to consistently explain banks’ early-adoption decisions.

17.5.3 Comparisons of the financing activities between early and late adopters

Table 17.5 describes the percentage of early adoption cases given the income effects of accounting standards and the macroeconomic conditions. The table
Table 17.5: Percentage of early adoption cases conditional on the income effects of accounting standards and macroeconomic conditions.

Income effects of accounting standards      # of early adoption cases        Percentage of
                                            Early      Late      Total       early adoption (%)
Income-decreasing effects1
  During economic expansion                   5         631        636            0.8
  During economic contraction                 4         375        379            1.1
  Total                                       9       1,006      1,015            0.9
Income-increasing effects2
  During economic expansion                  27         106        133           20.3
  During economic contraction                 2         213        215            0.9
  Total                                      29         319        348            8.3
Ex ante undetermined income effects3
  During economic expansion                  91       1,007      1,098            8.3
  During economic contraction                 2         298        300            0.7
  Total                                      93       1,305      1,398            6.7
Only disclosure requirements4
  During economic expansion                  22         563        585            3.8
  During economic contraction                 7         287        294            2.4
  Total                                      29         850        879            3.3
Overall
  During economic expansion                 145       2,307      2,452            5.9
  During economic contraction                15       1,173      1,188            1.3
  Total                                     160       3,480      3,640            4.4

Notes: 1 Accounting standards with income-decreasing effect include SFAS No. 121, SFAS No. 123, SFAS No. 143, SFAS No. 144, and SFAS No. 123R. 2 Accounting standards with income-increasing effect include SFAS No. 122 and SFAS No. 146. 3 Accounting standards with ex ante undetermined income effect include SFAS No. 133, SFAS No. 142, SFAS No. 155, SFAS No. 156, and SFAS No. 159. 4 Accounting standards with disclosure requirements include SFAS No. 132, SFAS No. 132R, SFAS No. 157, and SFAS No. 161.
reveals that on average banks are more likely to choose early adoption during economic expansions than during contractions (5.9% and 1.3%, respectively). In addition, the data reveal that, except for accounting standards with income-decreasing effects, banks are more likely to choose early adoption during economic expansions than during contractions regardless of the income effects of accounting standards. Furthermore, the data show that banks are most likely to early adopt accounting standards with income-increasing effects during economic expansions (20.3%) and accounting standards with only disclosure requirements during contractions (2.4%). In general, the table is consistent with the assertion that banks are most motivated during economic expansions to disclose information voluntarily to better access external financing.

Figure 17.1 compares early adopters and matched late adopters in the growth of funds attributed to changes in insured deposits, changes in
(a) Banks’ financing activities during economic expansion
(b) Banks’ financing activities during economic contraction
Figure 17.1: Comparison between early and matched late bank adopters in the growth of funds. Notes: This figure displays the financing activities of early bank adopters and matched late adopters between the issued dates and the effective dates of accounting standards (i.e., the test periods). Banks’ financing activities include changes in insured deposits, liabilities other than insured deposits, and preferred and common stock. Growth of funds is defined as changes in the sum of insured deposits, liabilities other than insured deposits, and preferred and common stock, scaled by total assets during the test periods. For each early bank adopter of an accounting standard, a late bank adopter is matched correspondingly based on bank profitability (i.e., return on assets) and the risk factors examined in this chapter. Panel a (Panel b) exhibits the early and late bank adopters’ financing activities in the test periods when the economy was in expansion (contraction). Definitions of economic expansions and contractions are available on the NBER website (http://www.nber.org/ cycles/cyclesmain.html).
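The pairing described in the notes above (each early adopter matched to a late adopter on profitability and the risk measures) can be sketched as a nearest-neighbor search on standardized covariates. This is a simplified illustration with made-up numbers; the chapter does not specify the distance metric, so Euclidean distance with matching without replacement is an assumption:

```python
import numpy as np

def match_late_adopters(early_X, late_X):
    """Pair each early adopter with the closest unused late adopter,
    using Euclidean distance on covariates standardized by the late group."""
    mu, sd = late_X.mean(axis=0), late_X.std(axis=0)
    e = (early_X - mu) / sd
    l = (late_X - mu) / sd
    used = np.zeros(len(late_X), dtype=bool)
    matches = []
    for row in e:
        d = np.linalg.norm(l - row, axis=1)
        d[used] = np.inf                 # match without replacement
        j = int(np.argmin(d))
        used[j] = True
        matches.append(j)
    return matches

# Toy covariates: columns could stand for ROA and one risk measure.
early = np.array([[-0.001, 0.40], [0.002, 0.10]])
late = np.array([[0.000, 0.05], [-0.001, 0.38], [0.003, 0.12], [0.001, 0.20]])
print(match_late_adopters(early, late))  # [1, 2]
```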
liabilities other than insured deposits, and changes in preferred and common stock and related surplus during test periods.27 For better illustration, I separately show the fund growth of banks during test periods that take place in economic expansions, Panel (a), and in economic contractions, Panel (b). Panel (a) shows that early bank adopters experience higher growth of funds than their matched counterparts on average in most cases (five out of eight, all of which are accounting standards related to financial instruments). On the other hand, Panel (b) does not show any particular trend in the growth of funds between early adopters and matched late adopters, as expected.

Table 17.6 reports the means of the three major components of the total growth of funds under each standard (all scaled by total assets for ease of comparison): (1) changes in insured deposits (CHDEPOSIT), (2) changes in liabilities other than insured deposits (CHOTD), and (3) changes in preferred and common stock (CHEQUITY). It suggests that the mean differences in the growth of funds between early and matched late adopters are mainly accounted for by the mean differences in the growth of liabilities other than insured deposits during test periods that take place in economic expansions (i.e., six out of eight cases show positive mean differences).28 In terms of economic significance, the mean differences in the growth of liabilities other than insured deposits between early adopters and matched late adopters range from −2.9% to 6.3%. The differences are significant in the cases of early adoption of SFAS No. 122 and SFAS No. 155, at the 10% and 5% levels, respectively. As for equity capital (i.e., changes in preferred and common stock), there does not seem to be a particular trend in the growth of capital between early adopters and matched late adopters during test periods in either economic expansions or
27 Liabilities other than insured deposits include large deposits more than $100,000, federal funds purchased and securities sold under agreements to repurchase, trading liabilities, other borrowed money, subordinated notes and debentures, and other liabilities. Note that four accounting standards have zero early-adoption cases: SFAS No. 123, SFAS No. 132, SFAS No. 144, and SFAS No. 132R. Therefore, these four accounting standards are excluded from comparisons. 28 I also limit liabilities other than insured deposits to include only large deposits more than $100,000, other borrowed money, and subordinated notes and debentures. The results indicate that the mean differences in the growth of liabilities other than insured deposits during test periods are positive in four out of eight cases in economic expansions (i.e., SFAS No. 122, SFAS No. 123R, SFAS No. 155, and SFAS No. 159) and in two out of four cases in economic contractions (i.e., SFAS No. 142 and SFAS No. 161).
Table 17.6: Comparison between early and matched late bank adopters in the growth of funds and the components of growth (Hypothesis 4).

                 CHTOTAL                   CHDEPOSIT                 CHOTD                     CHEQUITY
Variable   n   Early   Late    Diff.     Early   Late    Diff.     Early   Late    Diff.     Early   Late    Diff.
Accounting standards issued during economic expansion
FAS 121    3   0.097   0.205  −0.108     0.042   0.111  −0.069     0.056   0.084  −0.029     0.000   0.010  −0.010
FAS 122   27   0.094   0.074   0.020     0.049   0.052  −0.003     0.042   0.020   0.022*    0.004   0.003   0.001
FAS 123R   2   0.040   0.094  −0.054     0.017   0.080  −0.063     0.023   0.013   0.010     0.000   0.001  −0.001
FAS 133   41   0.478   0.469   0.009     0.265   0.258   0.007     0.196   0.195   0.001     0.016   0.017   0.000
FAS 155    6   0.096   0.036   0.061*    0.033   0.036  −0.003     0.060  −0.003   0.063**   0.003   0.004  −0.001
FAS 156   22   0.099   0.090   0.009     0.053   0.056  −0.003     0.036   0.029   0.007     0.010   0.006   0.004
FAS 157   22   0.095   0.141  −0.046     0.070   0.090  −0.020     0.020   0.043  −0.023     0.005   0.008  −0.003
FAS 159   22   0.067   0.059   0.008     0.051   0.045   0.006     0.012   0.006   0.006     0.003   0.008  −0.005
Accounting standards issued during economic contraction
FAS 142    2   0.052   0.202  −0.151     0.011   0.123  −0.112     0.041   0.056  −0.015     0.000   0.024  −0.024
FAS 143    4   0.128   0.108   0.020     0.151   0.084   0.067    −0.021   0.018  −0.039    −0.002   0.006  −0.008
FAS 146    2  −0.017   0.161  −0.178     0.030   0.079  −0.049    −0.047   0.070  −0.117     0.000   0.012  −0.012
FAS 161    7   0.199   0.060   0.139     0.089   0.127  −0.038     0.088  −0.072   0.159*    0.023   0.005   0.018**

Notes: CHTOTAL is measured as a bank’s changes in the sum of insured deposits, liabilities other than insured deposits, and preferred and common stock, scaled by total assets during the test periods. CHDEPOSIT (CHOTD, CHEQUITY) is measured as a bank’s changes in insured deposits (changes in liabilities other than insured deposits, changes in preferred and common stock) scaled by total assets during the test period. ** Difference significant at the 5% level in a one-tailed test (i.e., Diff. = (Early − Late) is predicted to be positive). * Difference significant at the 10% level in a one-tailed test (i.e., Diff. = (Early − Late) is predicted to be positive).
External Financing Needs and Early Adoption of Accounting Standards
S. I-L. Wang
economic contractions.29 This result is consistent with the extant banking literature, which finds that raising equity capital is relatively costly for banks and therefore does not serve as the main driver of banks' total fund growth (e.g., Zimmer and McCauley, 1991; Bolton and Freixas, 2006; Admati et al., 2011; Miles et al., 2013). In addition, the result is consistent with Gropp and Heider (2010), who find that banks have shifted towards uninsured debt to fund their balance-sheet growth. In summary, early-adopting banks appear to experience higher growth of funds than matched late-adopting banks during test periods that take place in economic expansions. Hence, Hypothesis 4 is supported by the results.

17.5.4 Additional analyses

One concern that may arise from the previous analyses is that banks may not be the ideal group in which to examine early-adoption decisions on accounting standards unrelated to financial instruments. Hence, I estimate the early-adoption decision model separately for (1) accounting standards related to financial instruments (i.e., SFAS No. 122, SFAS No. 133, SFAS No. 155, SFAS No. 156, SFAS No. 157, SFAS No. 159, and SFAS No. 161) and (2) all other accounting standards examined in this study. Table 17.7 reports the findings. The results show that a bank's incentive to better access external financing predominantly explains early-adoption decisions on accounting standards related to financial instruments, because neither Hypothesis 1 nor Hypothesis 2 is supported in the sub-sample of accounting standards unrelated to financial instruments. The findings are consistent with the observation in Section 17.5.3 about early adopters' financing activities: early adopters experience higher fund growth than matched late adopters particularly in the cases of accounting standards related to financial instruments (i.e., SFAS No. 122, SFAS No. 133, SFAS No. 155, SFAS No. 159, and SFAS No. 161).
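The matched-pair comparisons behind these results can be illustrated with a minimal sketch: compute Diff. = (Early − Late) for each matched pair and run a one-tailed test of whether the mean difference is positive, the direction predicted by Hypothesis 4. The growth figures below are illustrative placeholders, not the study's data.

```python
# One-tailed matched-pairs t-test of Diff. = (Early - Late), in the spirit of
# Table 17.6. The fund-growth values are illustrative, not the study's data.
from scipy import stats

early = [0.094, 0.478, 0.096, 0.099, 0.067]  # fund growth, early adopters
late = [0.074, 0.469, 0.036, 0.090, 0.059]   # fund growth, matched late adopters

# H0: mean(Early - Late) <= 0  vs.  H1: mean(Early - Late) > 0
t_stat, p_value = stats.ttest_rel(early, late, alternative="greater")
print(round(t_stat, 2), round(p_value, 3))
```

The `alternative="greater"` argument makes the test one-tailed, matching the directional prediction noted under Table 17.6.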
Since the motive to better access external financing is not evident in the cases of all other accounting standards unrelated to financial instruments, it follows naturally that early adopters do not necessarily experience higher growth of funds than matched late adopters. This inference is observed in Figure 17.1.

I also consider the potential interaction effects between bank profitability and risk profiles on the likelihood of early adoption. Ex ante, high profitability is expected of banks with high risk profiles. However, banks with high

29. The mean differences between early adopters and their matched counterparts are not statistically significant in any case of early adoption except for SFAS No. 161.
Table 17.7: Early-adoption decision model by types of accounting standards.

                              Financial-related standards1     Nonfinancial-related standards2
                   Pred.      Coeff.        z-stat.            Coeff.        z-stat.

Hypothesis 1
ROA                 −        −42.610       −1.32∗             −74.670       −0.90
RPCERTAINTY         +          0.264        0.58                0.059        0.05
ROA_RPCERTAINTY     +        152.300        1.75∗∗            519.200        2.10∗∗

Hypothesis 2
AFSOFINV            +          1.270        3.17∗∗∗             0.181        0.26
LOANRISK            +         81.960        2.63∗∗∗            91.720        0.99
EXPOSUREDER         +          0.110        3.61∗∗∗             0.005        0.04
CREDITRISKDER       +          0.444        1.57∗               0.383        0.46
NONINTCHG           +          0.246        3.79∗∗∗            −4.430       −1.90

Control variables
BADNEWS             ±          0.497        1.83∗              −0.260       −0.44
CONTRACT            ±          0.241        0.85                0.690        0.91
SIZE                ±          0.052        0.47               −0.354       −1.31
LEVERAGE            ±        −11.500       −1.83∗              33.240        1.32
T1LEV               −         −8.086       −1.41∗              19.030        1.12
DROE                ±         −0.022       −0.10               −0.436       −0.50
NIEFF               +          1.294        3.40∗∗∗             0.030        0.04
DROE_NIEFF          −         −0.309       −0.68               −1.332       −1.65∗∗
MKTHERF             ±         −0.784       −0.51               −6.927       −1.05
DISCLOSURE3         ±         −0.696       −3.92∗∗∗

N                              1,884                            1,756
# of clusters                    398                              435
Pseudo R2                       0.11                             0.14

Notes: ∗∗∗, ∗∗, ∗ denote significance at the 1%, 5%, and 10% levels, respectively (two-tailed tests unless a prediction is made). This table reports the results of estimating the following model using logit regression with heteroskedasticity-corrected robust standard errors. Variable definitions are in Appendix 17B.

EARLY_ij = β0 + β1 ROA_ij + β2 RPCERTAINTY_ij + β3 ROA_RPCERTAINTY_ij + β4 AFSOFINV_ij + β5 LOANRISK_ij + β6 EXPOSUREDER_ij + β7 CREDITRISKDER_ij + β8 NONINTCHG_ij + β9 BADNEWS_ij + β10 CONTRACT_ij + β11 SIZE_ij + β12 LEVERAGE_ij + β13 T1LEV_ij + β14 DROE_ij + β15 NIEFF_j + β16 DROE_NIEFF_ij + β17 MKTHERF_ij + β18 DISCLOSURE_j + ε_ij.

1 Accounting standards related to financial instruments include SFAS No. 122, SFAS No. 133, SFAS No. 155, SFAS No. 156, SFAS No. 157, SFAS No. 159, and SFAS No. 161.
2 Accounting standards unrelated to financial instruments include SFAS No. 121, SFAS No. 123, SFAS No. 132, SFAS No. 142, SFAS No. 143, SFAS No. 144, SFAS No. 146, SFAS No. 132R, and SFAS No. 123R.
3 DISCLOSURE is excluded from the model in the case of nonfinancial-related accounting standards because it perfectly predicts late adoption.
risk profiles do not necessarily enjoy high profitability ex post. To examine whether and the extent to which bank profitability moderates the positive effects of bank risk profiles on banks’ early-adoption decisions, I estimate equation (17.1) with additional interaction terms. Specifically, I add to equation (17.1) interaction terms between bank profitability (ROA) and risk profiles (AFSOFINV, LOANRISK, EXPOSUREDER, CREDITRISKDER, and NONINTCHG). In the untabulated results, I find that the marginal effect of each risk factor on the likelihood of early adoption decreases with bank profitability except for AFSOFINV. The results generally indicate that the marginal effects of bank risk profiles on the likelihood of early adoption are positive and higher given lower bank profitability.
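The dependence of a risk factor's marginal effect on profitability in a logit with interaction terms can be seen in a minimal numerical sketch. The coefficients and variable values below are hypothetical, chosen only so that the interaction coefficient is negative as in the untabulated results; they are not estimates from the chapter.

```python
# Sketch (hypothetical coefficients, not the chapter's estimates): in a logit
# with an ROA x LOANRISK interaction, the marginal effect of LOANRISK on the
# early-adoption probability depends on ROA.
import numpy as np

def logit_prob(x):
    return 1.0 / (1.0 + np.exp(-x))

# hypothetical coefficients: intercept, ROA, LOANRISK, ROA*LOANRISK
b0, b_roa, b_risk, b_int = -1.0, -20.0, 50.0, -400.0

def marginal_effect_risk(roa, risk):
    """dP(EARLY=1)/dLOANRISK = p * (1 - p) * (b_risk + b_int * ROA)."""
    p = logit_prob(b0 + b_roa * roa + b_risk * risk + b_int * roa * risk)
    return p * (1 - p) * (b_risk + b_int * roa)

me_low = marginal_effect_risk(0.005, 0.01)   # low-profitability bank
me_high = marginal_effect_risk(0.02, 0.01)   # high-profitability bank
print(me_low > me_high)  # prints True: the effect shrinks as ROA rises
```

With a negative interaction coefficient, the marginal effect of risk on the likelihood of early adoption is larger at lower profitability, mirroring the pattern described above.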
17.6 Conclusion

I investigate whether US BHCs disclose information and signal their quality through early adoption to better access capital markets. I find that the likelihood of early adoption is negatively related to bank profitability and positively related to bank risk profiles. Conditional on the income effects of accounting standards, I find that the motive to better access external financing is most evident in explaining early-adoption decisions on accounting standards with only disclosure requirements. Finally, the results indicate that early adopters experience higher fund growth than matched late adopters during test periods, particularly in economic expansions. The findings reinforce the idea that banks voluntarily disclose information to better access external financing in the context of early adoption.

This study appeals to both academic and practical audiences. From an academic perspective, this research complements the existing early-adoption literature by providing an additional motivation for early adoption, and it adds to the line of research on managers' voluntary disclosure decisions for capital-market reasons. My research may also interest the banking industry and bank regulators. Given the counter-signaling role of early-adoption decisions, banks may think twice when evaluating early adoption of accounting standards. As banks are closely scrutinized by bank regulators, maintaining growth in profitability and practicing sound risk management are of utmost importance to banks' operations. Hence, future research may gather implications about a bank's future financial performance by analyzing a bank's early-adoption decisions or other types of voluntary disclosure over time. Future research may also examine whether the motive for early adoption to provide new information and thus better access capital markets applies in other industries.
Bibliography

Aboody, D., Barth, M. and Kasznik, R. (2004). Firms' Voluntary Recognition of Stock-based Compensation Expense. Journal of Accounting Research, 42(2), 123–150.
Admati, A., DeMarzo, P., Hellwig, M. and Pfleiderer, P. (2011). Fallacies, Irrelevant Facts, and Myths in the Discussion of Capital Regulation: Why Bank Equity is Not Expensive. Rock Center for Corporate Governance at Stanford University Working Paper No. 86, Stanford Graduate School of Business Research Paper No. 2065.
Ali, A. and Kumar, K. (1994). The Magnitudes of Financial Statement Effects and Accounting Choice: The Case of the Adoption of SFAS 87. Journal of Accounting and Economics, 18, 89–114.
Amir, E. and Livnat, J. (1996). Multiperiod Analysis of Adoption Motives: The Case of SFAS No. 106. The Accounting Review, 71(4), 539–553.
Amir, E. and Ziv, A. (1997a). Recognition, Disclosure or Delay: Timing the Adoption of SFAS 106. Journal of Accounting Research, 35(1), 61–81.
Amir, E. and Ziv, A. (1997b). Economic Consequences of Alternative Adoption Rules for New Accounting Standards. Contemporary Accounting Research, 14(3), 543–568.
Ashcraft, A. (2008). Does the Market Discipline Banks? New Evidence from Regulatory Capital Mix. Journal of Financial Intermediation, 17, 543–561.
Bagntasarian, A. and Mamatzakis, E. (2018). Testing for the Underlying Dynamics of Bank Capital Buffer and Performance Nexus. Review of Quantitative Finance and Accounting, Forthcoming (https://doi.org/10.1007/s11156-018-0712-y).
Bartolucci, F. and Farcomeni, A. (2009). A Multivariate Extension of the Dynamic Logit Model for Longitudinal Data Based on a Latent Markov Heterogeneity Structure. Journal of the American Statistical Association, 104, 816–831.
Bartolucci, F. and Nigro, V. (2012). Pseudo Conditional Maximum Likelihood Estimation of the Dynamic Logit Model for Binary Panel Data. Journal of Econometrics, 170(1), 102–116.
Barth, J., Bertus, M., Hai, J. and Phumiwasana, T. (2008). A Cross-country Assessment of Bank Risk-shifting Behavior. Review of Pacific Basin Financial Markets and Policies, 11(1), 1–34.
Beatty, A. (1995). The Effects of Fair Value Accounting on Investment Portfolio Management: How Fair is it? Federal Reserve Bank of St. Louis Review, 77 (January/February), 25–38.
Beatty, A., Chamberlain, S. and Magliolo, J. (1996). An Empirical Analysis of the Economic Implications of Fair Value Accounting for Investment Securities. Journal of Accounting and Economics, 22, 43–77.
Beatty, A. and Liao, S. (2011). Do Delays in Expected Loss Recognition Affect Banks' Willingness to Lend? Journal of Accounting and Economics, 52, 1–20.
Berger, A., DeYoung, R., Flannery, M., Lee, D. and Öztekin, Ö. (2008). How Do Large Banking Organizations Manage Their Capital Ratios? Journal of Financial Services Research, 34, 123–149.
Berger, A., Herring, R. and Szego, G. (1995). The Role of Capital in Financial Institutions. Journal of Banking & Finance, 19, 393–430.
Berger, P. (2011). Challenges and Opportunities in Disclosure Research — A Discussion of 'The Financial Reporting Environment: Review of the Recent Literature'. Journal of Accounting and Economics, 51, 204–218.
Berlin, M. (2011). Can We Explain Banks' Capital Structures? Federal Reserve Bank of Philadelphia Business Review, Q2 2011, 1–11.
Beyer, A., Cohen, D., Lys, T. and Walther, B. (2010). The Financial Reporting Environment: Review of the Recent Literature. Journal of Accounting and Economics, 50, 296–343.
Bolton, P. and Freixas, X. (2006). Corporate Finance and the Monetary Transmission Mechanism. Review of Financial Studies, 19(3), 829–870.
Botosan, C. (1997). Disclosure Level and the Cost of Equity Capital. The Accounting Review, 72(3), 323–349.
Botosan, C. and Harris, M. (2000). Motivations for a Change in Disclosure Frequency and its Consequences: An Examination of Voluntary Quarterly Segment Disclosures. Journal of Accounting Research, 38(2), 329–353.
Botosan, C. and Plumlee, M. (2002). A Re-examination of Disclosure Level and the Expected Cost of Equity Capital. Journal of Accounting Research, 40, 21–40.
Bushman, R. and Williams, C. (2012). Accounting Discretion, Loan Loss Provisioning, and Discipline of Banks' Risk-taking. Journal of Accounting and Economics, 54, 1–18.
Cao, Z. and Narayanamoorthy, G. (2011). The Effect of Litigation Risk on Management Earnings Forecasts. Contemporary Accounting Research, 28(1), 125–173.
Chang, C., Ho, H. and Hsiao, Y. (2018). Derivatives Usage for Banking Industry: Evidence from the European Markets. Review of Quantitative Finance and Accounting, Forthcoming (https://doi.org/10.1007/s11156-017-0692-3).
Chen, W., Liu, C. and Ryan, S. (2008). Characteristics of Securitizations that Determine Issuers' Retention of the Risks of the Securitized Assets. The Accounting Review, 83(5), 1181–1215.
Cheng, X. and Smith, D. (2013). Disclosure Versus Recognition: The Case of Expensing Stock Options. Review of Quantitative Finance and Accounting, 40, 591–621.
Choi, Y. and McCarthy, I. (2003). FASB Proposes Principles-based Approach to US Standards Setting. Bank Accounting & Finance, 16(2), 5–11.
Choudhary, P. (2011). Evidence on Differences between Recognition and Disclosure: A Comparison of Inputs to Estimate Fair Values of Employee Stock Options. Journal of Accounting and Economics, 51, 77–94.
Chunhachinda, P. and Li, L. (2014). Income Structure, Competitiveness, Profitability, and Risk: Evidence from Asian Banks. Review of Pacific Basin Financial Markets and Policies, 17(3), 1450015 (23 pages) (http://dx.doi.org/10.1142/S0219091514500155).
Demirgüç-Kunt, A. and Huizinga, H. (2010). Bank Activity and Funding Strategies: The Impact on Risk and Returns. Journal of Financial Economics, 98, 626–650.
Docking, D., Hirschey, M. and Jones, E. (2000). Reaction of Bank Stock Prices to Loan-loss Reserve Announcements. Review of Quantitative Finance and Accounting, 15(3), 277–297.
Dye, R. (1986). Proprietary and Nonproprietary Disclosures. The Journal of Business, 59(2), 331–366.
Eakin, C. and Gramlich, J. (2000). Insider Trading and the Early Adoption of SFAS 96: A Test of the Signaling Hypothesis. Advances in Accounting, 17, 111–133.
Eccher, E., Ramesh, K. and Thiagarajan, S. (1996). Fair Value Disclosures by Bank Holding Companies. Journal of Accounting and Economics, 22, 79–117.
Espahbodi, H., Espahbodi, P., Rezaee, Z. and Tehranian, H. (2002). Stock Price Reaction and Value Relevance of Recognition versus Disclosure: The Case of Stock-based Compensation. Journal of Accounting and Economics, 33, 343–373.
Estrella, A. (2004). Bank Capital and Risk: Is Voluntary Disclosure Enough? Journal of Financial Services Research, 26(2), 145–160.
Federal Reserve Board (FRB) (2004). Bank Holding Company Rating System. Supervisory Letter SR 04-18.
Feltovich, N., Harbaugh, R. and To, T. (2002). Too Cool for School? Signalling and Countersignalling. The RAND Journal of Economics, 33(4), 630–649.
Financial Accounting Standards Board (FASB) (1990). Employers' Accounting for Postretirement Benefits Other than Pensions. SFAS No. 106. Norwalk, CT: FASB.
FASB (1993). Accounting for Certain Investments in Debt and Equity Securities. SFAS No. 115. Norwalk, CT: FASB.
FASB (1995). Accounting for the Impairment of Long-Lived Assets and for Long-Lived Assets to Be Disposed Of. SFAS No. 121. Norwalk, CT: FASB.
FASB (1995). Accounting for Mortgage Servicing Rights — An Amendment of FASB Statement No. 65. SFAS No. 122. Norwalk, CT: FASB.
FASB (1995). Accounting for Stock-Based Compensation. SFAS No. 123. Norwalk, CT: FASB.
FASB (1998). Employers' Disclosures about Pensions and Other Postretirement Benefits — An Amendment of FASB Statements No. 87, 88, and 106. SFAS No. 132. Norwalk, CT: FASB.
FASB (1998). Accounting for Derivative Instruments and Hedging Activities. SFAS No. 133. Norwalk, CT: FASB.
FASB (2001). Goodwill and Other Intangible Assets. SFAS No. 142. Norwalk, CT: FASB.
FASB (2001). Accounting for Asset Retirement Obligations. SFAS No. 143. Norwalk, CT: FASB.
FASB (2001). Accounting for the Impairment or Disposal of Long-Lived Assets. SFAS No. 144. Norwalk, CT: FASB.
FASB (2002). Accounting for Costs Associated with Exit or Disposal Activities. SFAS No. 146. Norwalk, CT: FASB.
FASB (2003). Employers' Disclosures about Pensions and Other Postretirement Benefits — An Amendment of FASB Statements No. 87, 88, and 106. SFAS No. 132R. Norwalk, CT: FASB.
FASB (2004). Share-Based Payment. SFAS No. 123R. Norwalk, CT: FASB.
FASB (2006). Accounting for Certain Hybrid Financial Instruments — An Amendment of FASB Statements No. 133 and 140. SFAS No. 155. Norwalk, CT: FASB.
FASB (2006). Accounting for Servicing of Financial Assets — An Amendment of FASB Statement No. 140. SFAS No. 156. Norwalk, CT: FASB.
FASB (2006). Fair Value Measurements. SFAS No. 157. Norwalk, CT: FASB.
FASB (2007). The Fair Value Option for Financial Assets and Financial Liabilities — Including an Amendment of FASB Statement No. 115. SFAS No. 159. Norwalk, CT: FASB.
FASB (2008). Disclosures about Derivative Instruments and Hedging Activities — An Amendment of FASB Statement No. 133. SFAS No. 161. Norwalk, CT: FASB.
Flannery, M. and Rangan, K. (2008). What Caused the Bank Capital Buildup of the 1990s? Review of Finance, 12, 391–429.
Francis, J., Khurana, I. and Pereira, R. (2005). Disclosure Incentives and Effects of Capital around the World. The Accounting Review, 80(4), 1125–1162.
Frank, M. and Goyal, V. (2008). Trade-off and Pecking Order Theories of Debt. In Handbook of Corporate Finance: Empirical Corporate Finance, Vol. 2, Edited by E. Eckbo. Amsterdam: Elsevier, pp. 135–202.
Frankel, R., McNichols, M. and Wilson, G. (1995). Discretionary Disclosure and External Financing. The Accounting Review, 70(1), 135–150.
Furlong, F. and Kwan, S. (2007). Safe and Sound Banking Twenty Years Later: What Was Proposed and What Has Been Adopted. Federal Reserve Bank of Atlanta Economic Review, 92(1&2), 1–23.
Gorton, G. (2010). Slapped by the Invisible Hand: The Panic of 2007. Oxford University Press.
Graham, J., Harvey, C. and Rajgopal, S. (2005). The Economic Implications of Corporate Financial Reporting. Journal of Accounting and Economics, 40, 3–73.
Greenlaw, D., Hatzius, J., Kashyap, A. and Shin, H. (2008). Leveraged Losses: Lessons from the Mortgage Market Meltdown. US Monetary Policy Forum Report No. 2.
Gropp, R. and Heider, F. (2010). The Determinants of Bank Capital Structure. Review of Finance, 14, 587–622.
Gujarathi, M. and Hoskin, R. (2003). Modeling the Early Adoption Decision: The Case of SFAS 96. Review of Accounting & Finance, 2(4), 63–86.
Halov, N. and Heider, F. (2011). Capital Structure, Risk, and Asymmetric Information. Quarterly Journal of Finance, 1(4), 767–809.
Heckman, J. (1981a). Statistical Models for Discrete Panel Data. In Structural Analysis of Discrete Data with Econometric Applications, Edited by C. F. Manski and D. L. McFadden. Cambridge, MA: MIT Press, pp. 114–178.
Heckman, J. (1981b). Heterogeneity and State Dependence. In Studies in Labor Markets, Edited by S. Rosen. Chicago: University of Chicago Press, pp. 91–140.
Heckman, J. (1981c). The Incidental Parameters Problem and the Problem of Initial Conditions in Estimating a Discrete Time-discrete Data Stochastic Process. In Structural Analysis of Discrete Data with Econometric Applications, Edited by C. F. Manski and D. L. McFadden. Cambridge, MA: MIT Press, pp. 179–195.
Hirtle, B. and Lopez, J. (1999). Frequency of Bank Examinations. FRBNY Economic Policy Review, April, 1–19.
Hodder, L., Kohlbeck, M. and McAnally, M. (2002). Accounting Choices and Risk Management: SFAS No. 115 and US Bank Holding Companies. Contemporary Accounting Research, 19(2), 225–270.
Kanagaretnam, K., Lobo, G. and Mathieu, R. (2003). Managerial Incentives for Income Smoothing through Bank Loan Loss Provisions. Review of Quantitative Finance and Accounting, 20, 63–80.
Lang, M. and Lundholm, R. (1993). Cross-sectional Determinants of Analyst Ratings of Corporate Disclosures. Journal of Accounting Research, 31(2), 246–271.
Lang, M. and Lundholm, R. (2000). Voluntary Disclosure and Equity Offerings: Reducing Information Asymmetry or Hyping the Stock? Contemporary Accounting Research, 17(4), 623–662.
Langer, R. and Lev, B. (1993). The FASB's Policy of Extended Adoption for New Standards: An Examination of FAS No. 87. The Accounting Review, 68(3), 515–533.
Lee, I. and Stiner, Jr., F. (1993). Stock Market Reactions to SFAS No. 96: Evidence from Early Bank Adopters. The Financial Review, 28(4), 469–491.
Leland, H. and Pyle, D. (1977). Informational Asymmetries, Financial Structure, and Financial Intermediation. Journal of Finance, 32, 371–387.
Lucas, D. and McDonald, R. (1992). Bank Financing and Investment Decisions with Asymmetric Information about Loan Quality. The RAND Journal of Economics, 23(1), 86–105.
Miles, D., Yang, J. and Marcheggiano, G. (2013). Optimal Bank Capital. The Economic Journal, 123(567), 1–37.
Mitra, S. and Hossain, M. (2009). Value-relevance of Pension Transition Adjustments and Other Comprehensive Income Components in the Adoption Year of SFAS No. 158. Review of Quantitative Finance and Accounting, 33, 279–301.
Moyer, S. (1990). Capital Adequacy Ratio Regulations and Accounting Choices in Commercial Banks. Journal of Accounting and Economics, 13, 123–154.
Myers, S. and Majluf, N. (1984). Corporate Financing and Investment Decisions When Firms Have Information That Investors Do Not Have. Journal of Financial Economics, 13, 187–221.
Nissim, D. and Penman, S. (2007). Fair Value Accounting in the Banking Industry. Columbia Business School, Center for Excellence in Accounting and Security Analysis, Occasional Paper Series.
Özyıldırım, S. (2010). Intermediation Spread, Bank Supervision, and Financial Stability. Review of Pacific Basin Financial Markets and Policies, 13(4), 517–537.
Papiernik, J., Meier, H. and Rozen, E. (2004). Effects of Fair-value Accounting on Securities Portfolio Restructuring. Bank Accounting & Finance, 18, 19–25.
Pasiouras, F., Gaganis, C. and Zopounidis, C. (2006). The Impact of Bank Regulations, Supervision, Market Structure, and Bank Characteristics on Individual Bank Ratings: A Cross-country Analysis. Review of Quantitative Finance and Accounting, 27, 403–438.
Ramesh, K. and Revsine, L. (2001). The Effects of Regulatory and Contracting Costs on Banks' Choice of Accounting Method for Other Postretirement Employee Benefits. Journal of Accounting and Economics, 30, 159–186.
Sami, H. and Welsh, M. (1992). Characteristics of Early and Late Adopters of Pension Accounting Standard SFAS No. 87. Contemporary Accounting Research, 9(1), 212–236.
Schipper, K. (2003). Principles-based Accounting Standards. Accounting Horizons, 17(1), 61–72.
Scott, T. (1991). Pension Disclosures under SFAS No. 87: Theory and Evidence. Contemporary Accounting Research, 8(1), 62–81.
Sengupta, P. (1998). Corporate Disclosure Quality and the Cost of Debt. The Accounting Review, 73, 459–474.
Sengupta, P. (2004). Disclosure Timing: Determinants of Quarterly Earnings Release Dates. Journal of Accounting and Public Policy, 23, 457–482.
Siregar, D., Anandarajan, A. and Hasan, I. (2013). Commercial Banks and Value Relevance of Derivative Disclosures after SFAS 133: Evidence from the USA. Review of Pacific Basin Financial Markets and Policies, 16(1), 1350004 (28 pages) (http://dx.doi.org/10.1142/S0219091513500045).
Skinner, D. (1994). Why Firms Voluntarily Disclose Bad News. Journal of Accounting Research, 32(1), 38–60.
Spence, M. (1973). Job Market Signaling. Quarterly Journal of Economics, 87(3), 355–374.
Stiroh, K. (2004). Diversification in Banking: Is Noninterest Income the Answer? Journal of Money, Credit, and Banking, 36(5), 853–882.
Suijs, J. (2007). Voluntary Disclosure of Information When Firms Are Uncertain of Investor Response. Journal of Accounting and Economics, 43, 391–410.
Watts, R. and Zimmerman, J. (1986). Positive Accounting Theory. Englewood Cliffs, NJ: Prentice-Hall.
Wooldridge, J. (2000). A Framework for Estimating Dynamic, Unobserved Effects Panel Data Models with Possible Feedback to Future Explanatory Variables. Economics Letters, 68, 245–250.
Yang, D. (2009). Signaling Through Accounting Accruals vs. Financial Policy: Evidence from Bank Loan Loss Provisions and Dividend Changes. Review of Pacific Basin Financial Markets and Policies, 12(3), 377–402.
Zhao, R. and He, Y. (2014). The Accounting Implication of Banking Deregulation: An Event Study of Gramm-Leach-Bliley Act (1999). Review of Quantitative Finance and Accounting, 42, 449–468.
Zimmer, S. and McCauley, R. (1991). Bank Cost of Capital and International Competition. Federal Reserve Bank of New York Quarterly Review, 15 (Winter), 33–59.
Appendix 17A: Accounting Standards from January 1995 to March 2008

For each standard: issued date (or the earliest adoption date); effective date (for fiscal years after the date); potential income effects of the adoption; changes in calculation of regulatory core capital; disclosures only; number of pages.

No. 121 Accounting for the Impairment of Long-Lived Assets and for Long-Lived Assets to Be Disposed Of
    Issued: March 1995. Effective: December 15, 1995. Income effect: −. Pages: 47.
No. 122 Accounting for Mortgage Servicing Rights — an amendment of FASB Statement No. 65
    Issued: May 1995. Effective: December 15, 1995. Income effect: +. Changes calculation of regulatory core capital: Yes. Pages: 35.
No. 123 Accounting for Stock-Based Compensation
    Issued: October 1995. Effective: December 15, 1995. Income effect: −. Pages: 89.
No. 132 Employers' Disclosures about Pensions and Other Postretirement Benefits — an amendment of FASB Statements No. 87, 88, and 106
    Issued: February 1998. Effective: December 15, 1998. Income effect: N/A. Disclosures only: Yes. Pages: 29.
No. 133 Accounting for Derivative Instruments and Hedging Activities
    Issued: June 1998. Effective: June 15, 2000. Income effect: ?. Changes calculation of regulatory core capital: Yes. Pages: 176.
No. 142 Goodwill and Other Intangible Assets
    Earliest adoption: March 15, 2001. Effective: December 15, 2001. Income effect: ?. Pages: 75.
No. 143 Accounting for Asset Retirement Obligations
    Issued: June 2001. Effective: June 15, 2002. Income effect: −. Pages: 49.
No. 144 Accounting for the Impairment or Disposal of Long-Lived Assets
    Issued: August 2001. Effective: December 15, 2001. Income effect: −. Pages: 65.
No. 146 Accounting for Costs Associated with Exit or Disposal Activities
    Issued: June 2002. Effective: December 31, 2002. Income effect: +. Pages: 25.
No. 132R Employers' Disclosures about Pensions and Other Postretirement Benefits — an amendment of FASB Statements No. 87, 88, and 106
    Issued: December 2003. Effective: December 15, 2003. Income effect: N/A. Disclosures only: Yes. Pages: 40.
No. 123R Share-Based Payment
    Issued: December 2004. Effective: June 15, 2005. Income effect: −. Pages: 171.
No. 155 Accounting for Certain Hybrid Financial Instruments — an amendment of FASB Statements No. 133 and 140
    Issued: February 2006. Effective: September 15, 2006. Income effect: ?. Changes calculation of regulatory core capital: Yes. Pages: 18.
No. 156 Accounting for Servicing of Financial Assets — an amendment of FASB Statement No. 140
    Issued: March 2006. Effective: September 15, 2006. Income effect: ?. Pages: 114.
No. 157 Fair Value Measurements
    Issued: September 2006. Effective: November 15, 2007. Income effect: N/A. Disclosures only: Yes. Pages: 86.
No. 159 The Fair Value Option for Financial Assets and Financial Liabilities — Including an amendment of FASB Statement No. 115
    Issued: February 2007. Effective: November 15, 2007. Income effect: ?. Pages: 36.
No. 161 Disclosures about Derivative Instruments and Hedging Activities — an amendment of FASB Statement No. 133
    Issued: March 2008. Effective: November 15, 2008. Income effect: N/A. Disclosures only: Yes. Pages: 32.
Appendix 17B: Logit Model for Early Adoption Decisions Variable name Dependent variable EARLY
Variable definition
An indicator variable that equals 1 if BHCs early adopted accounting standards and 0 otherwise. Early adoption information is obtained from 10Q/Ks.
Test variables Hypothesis 1 — Profitability: ROA
ROA is measured by the income (loss) before extraordinary items and other adjustments (BHCK4300) divided by total assets (BHCK2170) (return on assets) at the first quarter-end after the adoption of a new accounting standard and adjusted for the average return on assets of US commercial banks with similar bank asset size (measured by RIAD4300 ÷ RCFD2170). RPCERTAINTY The percentage of times that a bank’s return on asset is greater than the average return on asset of peer commercial banks during the past 12 quarters before the announcement date of an accounting standard. ROA RPCERTAINTY An interaction term between ROA and RPCERTAINTY. Hypothesis 2 — Risk profiles: (a) Interest rate risk AFSOFINV
(b) Credit risk LOANRISK
The proportion of available-for-sale securities (excluding equity securities) (BHCK1773 – BHCKA511) out of total investment securities (BHCK1773 + BHCK1754), adjusted for the average AFSOFINV of peer commercial banks (measured by [RCFD1773 – RCFDA511] ÷ [RCFD1773 + RCFD1754]). The measure is then multiplied by (−1) to show that the higher the proportion, the higher the interest rate risk. The ratio of the allowance for loan and lease losses at the quarter-end (BHCK3123) to average quarterly loans and leases (BHCK3516), adjusted for the average LOANRISK of peer commercial banks (measured by RCFD3123 ÷ RCFD3516). The measure is then multiplied by (−1) to show that the higher the ratio, the higher the credit risk of the loan portfolio.
(c) Exposures and credit risk of derivatives EXPOSUREDER The level of exposure to derivative contracts (EXPOSUREDER) is measured by the ratio of the gross notional amount of derivative contracts other than purchased options (i.e., futures contracts, forward contracts, written options, and swaps) to total assets (BHCK2170), adjusted for the average EXPOSUREDER of peer commercial banks.
page 671
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
672
9.61in x 6.69in
b3568-v1-ch17
S. I-L. Wang Appendix 17B: (Continued )
Variable name CREDITRISKDER
(d) Operational risk NONINTCHG
Variable definition An indicator variable that equals 1 if the gross notional amount of over-the-counter derivative contracts (i.e., forwards, over-the-counter options, and swaps) is greater than that of exchange-traded derivative contracts (i.e., futures contracts and exchange-traded options) and 0 otherwise. Average quarterly growth in noninterest income (BHCK4079) over the past six quarters, adjusted for the average NONINTCHG of peer commercial banks (RIAD4079) during the same period.
Control variables BADNEWS CONTRACT SIZE
LEVERAGE T1LEV
DROE
NIEFF
DROE NIEFF MKTHERF
DISCLOSURE
An indicator variable equal to 1 if ROA is less than zero and 0 otherwise. An indicator variable equal to 1 if a BHC holds interest rate derivative contracts and 0 otherwise. The logarithm of total revenues. Total revenues are defined as the sum of total interest income (BHCK4107) and total noninterest income (BHCK4079). Total liabilities (BHCK2948) divided by total assets (BHCK2170). Tier 1-leverage ratio is calculated as Tier-1 capital (BHCK8274) divided by average total assets for leverage capital purposes (BHCKA224). DROE is an indicator variable that equals 1 if a BHC’s ROE falls in the highest or lowest quartiles of the ROE distribution for the sample BHCs during a benchmark quarter and 0 otherwise. ROE is calculated as income (loss) before extraordinary items and other adjustments (BHCK4300) divided by the last quarter-end total equity capital (BHCK3210). The variable is coded as 1 if adoption of a sample accounting standard is expected to have a positive effect on income, −1 if a negative effect on income is expected, and 0 if the income effect cannot be determined ex ante or an accounting standard has only disclosure requirements. An interaction term between DROE and NIEFF. MKTHERF is measured by the sum of the squared market shares (based on total interest income, RIAD4107, and total noninterest income, RIAD4079) of commercial banks operating in the same geographic region. Following Moyer (1990), I consider five geographic regions are identified: Eastern, Southeast, Midwest, Southwest, and West. An indicator variable that equals 1 if an accounting standard only requires disclosures and 0 otherwise. (Continued )
page 672
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch17
External Financing Needs and Early Adoption of Accounting Standards
Appendix 17B: (Continued)

REG POS — An indicator variable that equals 1 if the pronouncement of an accounting standard could lead to changes in the rule of Tier-1 capital calculation and a positive effect on Tier-1 capital (i.e., SFAS No. 122), and 0 otherwise.

REG UN — An indicator variable that equals 1 if the pronouncement of an accounting standard could lead to changes in the rule of Tier-1 capital calculation and an undetermined effect on Tier-1 capital (i.e., SFAS No. 133 and SFAS No. 155), and 0 otherwise.

PAGE — The number of pages of a sample accounting standard as issued.
Note: All variables are measured at the latest quarter-end or for the latest quarter (the benchmark quarter) before standard announcement dates, unless otherwise mentioned. For example, all variables are measured for the fourth quarter in 2005 or at the end of the fourth quarter in 2005 for FAS 156, which was announced in March 2006.
Appendix 17C: Summary of Research Design

This study examines 462 BHCs' decisions on whether to early adopt 16 accounting standards announced between January 1, 1995 and March 31, 2008 (3,640 BHC-standard observations). In particular, it investigates whether BHCs choose to early adopt accounting standards to disclose information/signal their quality so as to better access external financing. To proxy for a BHC's motivation for information disclosure/signaling to better access external financing, I use bank profitability and risk profiles as suggested in prior literature. I estimate the following logit model (unbalanced static binary panel data analysis) using maximum likelihood estimation with standard errors clustered at the bank level to test Hypotheses 1 and 2:

EARLY_ij = β0 + β1 ROA_ij + β2 RPCERTAINTY_ij + β3 ROA_RPCERTAINTY_ij + β4 AFSOFINV_ij + β5 LOANRISK_ij + β6 EXPOSUREDER_ij + β7 CREDITRISKDER_ij + β8 NONINTCHG_ij + β9 BADNEWS_ij + β10 CONTRACT_ij + β11 SIZE_ij + β12 LEVERAGE_ij + β13 T1LEV_ij + β14 DROE_ij + β15 NIEFF_j + β16 DROE_NIEFF_ij + β17 MKTHERF_ij + β18 DISCLOSURE_j + β19 REG_POS_j + β20 REG_UN_j + β21 PAGE_j + Σ_N β_N YEAR_N + ε_ij.
(17C.1)
S. I-L. Wang
The dependent variable, EARLY_ij, is equal to 1 if BHC i early adopted accounting standard j and 0 otherwise. YEAR_N is an indicator variable equal to 1 if the accounting standard was issued in year N and 0 otherwise. All other variables are defined in Appendix 17B. As an examination of Hypothesis 1, the test variable of interest is ROA; β1 is expected to be negative. To test Hypothesis 2, the test variables are AFSOFINV, LOANRISK, EXPOSUREDER, CREDITRISKDER, and NONINTCHG; the respective coefficients are expected to be positive. The static logit model assumes independence between the response variables (i.e., early adoption decisions) given the covariates and does not include the lagged response variable in the regressors. To test whether the motivation for information disclosure/signaling to better access external financing varies with the potential income effect upon adoption of a standard (i.e., Hypothesis 3), the following logit model (unbalanced static binary panel data analysis) is estimated with heteroskedasticity-corrected standard errors, separately for accounting standards with (1) income-decreasing effects, (2) income-increasing effects, (3) ex ante undetermined income effects, and (4) only disclosure requirements:

EARLY_ij = β0 + β1 ROA_ij + β2 RPCERTAINTY_ij + β3 ROA_RPCERTAINTY_ij + β4 AFSOFINV_ij + β5 LOANRISK_ij + β6 EXPOSUREDER_ij + β7 CREDITRISKDER_ij + β8 NONINTCHG_ij + β9 BADNEWS_ij + β10 CONTRACT_ij + β11 SIZE_ij + β12 LEVERAGE_ij + β13 T1LEV_ij + β14 DROE_ij + β15 MKTHERF_ij + Σ_N β_N YEAR_N + ε_ij.
(17C.2)
All variables are defined in Appendix 17B. To examine Hypothesis 3, it is expected that ROA, AFSOFINV, LOANRISK, EXPOSUREDER, CREDITRISKDER, and NONINTCHG are most consistent with the predictions in the case of accounting standards with only disclosure requirements. In contrast to a static logit model, a dynamic logit model includes a lagged response variable and fixed/random effects specific to each BHC among the covariates, and assumes the current response variable is conditionally independent of other prior lagged response variables. Additionally, the initial observation of the response variable must be known for a dynamic logit model with random effects (e.g., Heckman, 1981a, 1981b, 1981c; Wooldridge, 2000;
Bartolucci and Farcomeni, 2009; Bartolucci and Nigro, 2012). The advantage of a dynamic logit model is that it accounts for the dynamics in the response variables and addresses the potential endogeneity of the regressors (e.g., Bagntasarian and Mamatzakis, 2018). In the case of this study, because the accounting standards examined are of a different nature, it is less likely that the current early adoption decision is correlated with the previous early adoption decision. Future research may explore the level of consistency in the coefficient estimates using a dynamic logit model as opposed to a static logit model in the setting of early adoption decisions.
Chapter 18
Improving the Stock Market Prediction with Social Media via Broad Learning

Xi Zhang and Philip S. Yu
Contents

18.1 Introduction . . . 678
18.2 Exploiting Investors Social Network for Stock Prediction . . . 680
    18.2.1 Overview . . . 680
    18.2.2 Data description and characteristics . . . 682
    18.2.3 Prediction of stock price movement . . . 687
    18.2.4 Experiments . . . 692
18.3 Tensor-Based Information Fusion Methods for Stock Volatility Prediction . . . 696
    18.3.1 Overview . . . 696
    18.3.2 Preliminaries . . . 698
    18.3.3 The framework . . . 700
    18.3.4 Coupled matrix and tensor factorization . . . 706
    18.3.5 Experiments . . . 709
18.4 Multi-Source Multiple Instance Learning for Market Composite Index Movements . . . 717
    18.4.1 Overview . . . 717

Xi Zhang
Beijing University of Posts and Telecommunications
e-mail: [email protected]

Philip S. Yu
University of Illinois at Chicago
e-mail: [email protected]
    18.4.2 Preliminaries . . . 718
    18.4.3 Multi-source multiple instance model . . . 722
    18.4.4 Feature extraction . . . 726
    18.4.5 Experiments . . . 728
18.5 Summary . . . 733
Bibliography . . . 733
Abstract

This chapter discusses how to exploit various Web information to improve stock market prediction. We first discuss the impacts of investors' social network on the stock market, and then propose several information fusion methods, that is, a tensor-based model and a multiple-instance learning model, to integrate the Web information and the quantitative information to improve the prediction capability.

Keywords: Stock prediction • Event extraction • Information fusion • Social network • Sentiment analysis • Quantitative analysis • Stock correlation.
18.1 Introduction

Traditional stock market prediction approaches commonly utilize the historical price-related data of the stocks to forecast their future trends. With the prosperity of Web 2.0, more and more investors engage in Web activities to obtain and share stock-related information in real time. Meanwhile, the opinions posted by experts and influential people on stocks can influence other people's decisions due to the rapid propagation of influence through the Internet. The effects are twofold. On the one hand, the event information and the users' sentiments on the Web can largely influence the stock price. For example, a false rumor of an explosion at the White House caused stocks to briefly plunge (Johnson, 2013). On the other hand, drastic fluctuations in stock prices can lead to the generation and spreading of relevant information (e.g., viewpoints from authorities), which can in turn affect public opinion on future investment strategies. Therefore, this provides researchers an unprecedented opportunity to utilize Web information to facilitate stock analysis. Motivated by the rich data from the Web, several recent works have tried to explore news articles and social media to improve the prediction. Effective indicators, e.g., the events related to the stocks and the people's sentiments towards the market and stocks, have been proved to play important roles in the stocks' volatility, and are extracted to feed into the
prediction models for improving the prediction accuracy. However, a major limitation of previous methods is that the indicators are obtained either from a single source, whose reliability might be low, or from several data sources whose interactions and correlations are largely ignored. In this chapter, we propose to use the idea of broad learning to make stock market predictions. Broad learning means fusing information from different sources. To this end, we explore various Web information sources and extract effective indicators. Furthermore, instead of treating each data source separately, we investigate their joint impacts on the stock price movements. However, this problem is challenging due to the implicit relationships among the data sources and the sparsity of the Web data, e.g., the news events and user discussions. To address these issues, we propose a line of machine learning models to effectively fuse heterogeneous data and capture their intrinsic relations. Experiments on real-world stock market data demonstrate that our models can significantly improve the prediction capability compared to methods with single-source data. This chapter is organized as follows. Section 18.2 investigates whether the investor perceptions extracted from the investors social network are useful for stock market prediction. We use two data sources, the social media (i.e., Xueqiu¹) and the quantitative trading data, to make predictions for each stock in the market with off-the-shelf machine learning models. Section 18.3 introduces a novel tensor-based computational framework that can effectively predict stock price movements by fusing three sources of information, that is, the social media (i.e., Guba²), the news titles, and the quantitative trading data. Due to the sparsity of the news events, in contrast to Section 18.2, which works with all the stocks, we only choose stocks from the China Securities Index (CSI) 100.
Section 18.4 develops an extension of the Multiple Instance Learning (MIL) model that can effectively integrate features from multiple sources to make more accurate predictions. We also use three data sources, as in Section 18.3, but instead of extracting the news events and sentiments with regard to each specific stock, we analyze the macro news events and sentiments for the overall market, and make predictions on the market composite index movements. Table 18.1 summarizes the differences among the three sections.
1 http://xueqiu.com/.
2 http://guba.eastmoney.com/.
Table 18.1: Comparisons of the prediction methods in three sections.

              Data source                         Target                     Learning method
Section 18.2  Xueqiu, quantitative indices        Every stock in the market  SVM, MLP
Section 18.3  Guba, news, quantitative indices    Every stock in CSI 100     Extension of tensor factorization
Section 18.4  Xueqiu, news, quantitative indices  Market composite index     Extension of multiple instance learning
18.2 Exploiting Investors Social Network for Stock Prediction

18.2.1 Overview

Social networks such as Twitter, Weibo, Facebook, and LinkedIn have attracted millions of users to post and acquire information, and have been well studied by various works (Kwak et al., 2010; Su et al., 2016; Anderson et al., 2015; Viswanath et al., 2009). In addition to these general social networks, there is another breed of smaller, more focused sites that cater to niche audiences. Here we look at a social site designed for traders and investors: Xueqiu. Xueqiu is a specialized social network for Chinese investors in the stock market and has attracted millions of users. Xueqiu enables investors to share their opinions on a Twitter-like platform or post their portfolios, demonstrating their trading operations and returns. Different from those general social networks, almost all the information on Xueqiu is related to stocks, making it a natural data source for collecting investors' perceptions, which may be useful for stock market prediction in China.

The early literature on stock market prediction was based on the Efficient Market Hypothesis (EMH) and random walk theory (Fama, 1965). However, investors' reactions may not support a random walk model in reality. Behavioral economics has provided plenty of evidence that financial decisions are significantly driven by sentiment. The collective level of optimism or pessimism in society can affect investor decisions (Prechter, 1999; Nofsinger, 2005). Besides, investor perceptions of the relatedness of stocks can also be
Reprinted from Journal of Computational Science, Vol 28, Xi Zhang, Jiawei Shi, Di Wang, Binxing Fang, Exploiting investors social network for stock prediction in China’s market, 294–303, Copyright (2018), with permission from Elsevier.
Figure 18.1: Tweets number of Xueqiu vs. trading volume and turnover.
a potential predictor. Firms may be economically related to one another (King, 1966; Pindyck and Rotemberg, 1993). Therefore, there is a probability that one stock's price movement can influence its peers due to the investment reactions driven by investors' perceptions of such relatedness. Sentiment and perception are psychological constructs and thus difficult to measure in archival analyses. News articles have been used as a major source for textual content analysis. For example, news articles are employed to analyze public mood (Li et al., 2014), by which stock price movements can be predicted. However, this type of content has an obvious drawback: news articles directly reflect the sentiments of the authors instead of the investors. Online social platforms have provided us with more direct data and enabled opportunities for exploring users' sentiment and perception. In recent studies, it has been found (Bollen et al., 2011) that collective mood derived from Twitter feeds can be used to improve the prediction accuracy of the Dow Jones Industrial Average (DJIA). Facebook's Gross National Happiness (GNH) index is shown to have the ability to predict changes in both daily returns and trading volume in the US stock market (Karabulut, 2013). The predictability of StockTwits (a Twitter-like platform specialized in exchanging trading-related opinions) data with respect to stock price behavior is reported in Al Nasseri et al. (2015). Most of the existing studies have focused on the US stock market and lacked attention to certain emerging countries such as China, where the stock market is inefficient, exhibiting a considerable non-random-walk pattern (Darrat and Zhong, 2000). China's stock market (denoted as the A-share
market) differs remarkably from other major markets in the structure of investors. Specifically, unlike other major stock markets, which are dominated by institutional investors, retail investors account for a greater percentage in China's market. Importantly, retail investors are more likely to buy rather than sell stocks that catch their attention and thus tend to be influenced by news or other social media (Barber and Odean, 2008). Therefore, aiming to fill this gap in the literature, we study China's stock market based on a unique dataset from a popular Chinese Twitter-like social platform specialized for investors, namely Xueqiu (which means "snowball" in Chinese). To demonstrate how closely Xueqiu is related to the stock market of China, Figure 18.1(a) shows the daily published tweet volume of all stocks on Xueqiu and the daily trading volume of the A-share market from November 2014 to May 2015. It can be observed that the fluctuation trends of these two curves show great synchronicity, especially when high trading-volume volatility occurs. When we look at individual stocks, the synchronicity between the movement of daily tweet volume and the movement of daily turnover rate still holds, as displayed in Figure 18.1(b), where one of the most popular stocks on Xueqiu, CITIC Securities, is taken as an example. On the basis of the tweets from Xueqiu, we construct features with regard to collective sentiment and perception by extracting two types of networks from Xueqiu: one is the user network, and the other is the stock network perceived by users. We evaluate our proposal on all the active stocks (more than 2,000) in the A-share market, demonstrating that the approach is feasible.

18.2.2 Data description and characteristics

Xueqiu was established in 2010 and at first mainly focused on the US stock market. Since 2014, more and more attention has been paid to the A-share market. By the end of 2015, there had been millions of registered users.
Xueqiu enables investors to share their opinions on a Twitter-like platform; that is, a user can post, reply to, or repost others' posts. In addition, each user can follow or be followed by other users, and the number of followers demonstrates his/her authority to some degree. The administrators and official accounts of listed companies usually publish authoritative announcements on Xueqiu. In addition to the announcements and opinions, a number of investors post their portfolios, demonstrating their trading operations and returns. Different from general social networks such as Twitter or Weibo, almost all the information on Xueqiu is related to stocks, making it a natural data source for collecting investors' perceptions.
18.2.2.1 Dataset

We obtain a complete dataset of all users and tweets from December 2010 to May 2015, which consists of 18.39 million tweets on 2,780 stocks (2,780 stocks in total as of July 2015) and 2.77 million users. Then we restrict our analysis to the interval from November 2014 to May 2015 for two reasons. First, as some features of the data (e.g., the follower graph) keep evolving, we have to choose a relatively short interval with the assumption that such features are stable within this period. Second, the A-share market was very active in this period, resulting in large fluctuations in the market indicators and a lot of discussion tweets on Xueqiu. The dataset we analyze involves 6.48 million tweets from 284,000 active users, and the crawled information is listed as follows:

• Users. We crawled the user ID, the number of followers, the list of the followers, and the number of published tweets.

• Tweets. For tweets, we record not only the content but also the associated attributes, such as the tweet's ID, publishing time, and replying and retweeting times. We also record the retweeting behaviors, including the ID of the new tweet and the ID of the user who retweets it.

18.2.2.2 Characteristics of the data

We begin with the structural analysis of the dataset; the observed characteristics can help us better understand Xueqiu and facilitate our prediction task.

Distribution of Followers Counts: We first look at the distribution of the follower counts. As shown in Figure 18.2, the x-axis represents the number of followers of each user, and the y-axis shows the Complementary Cumulative Distribution Function (CCDF). The plus-symbol (+) line shows the results of our dataset, while the solid line shows a power law distribution with an exponent of −0.624 and R² = 0.982. It can be observed that the distribution curve fits the power law distribution well when x ≤ 10^4. The turning point appears at x = 10^4, after which the curve with symbol "+" drops quickly.
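The CCDF construction and power-law fit described here can be sketched as follows. The follower counts below are synthetic draws from a Pareto distribution, not the Xueqiu data, and the exponent is estimated by ordinary least squares on the log-log CCDF restricted to x ≤ 10^4, one common (if rough) estimation choice.

```python
# Sketch: empirical CCDF of follower counts and a log-log OLS fit of the
# power-law exponent over the head of the distribution.
import numpy as np

rng = np.random.default_rng(1)
# Stand-in follower counts with a heavy tail (classical Pareto, exponent 0.6)
followers = (rng.pareto(0.6, 50_000) + 1).astype(int)

x = np.unique(followers)
# CCDF: fraction of users with strictly more than x followers
ccdf = 1.0 - np.searchsorted(np.sort(followers), x, side="right") / len(followers)
mask = (x <= 10_000) & (ccdf > 0)
slope, intercept = np.polyfit(np.log10(x[mask]), np.log10(ccdf[mask]), 1)
print(f"estimated exponent: {slope:.3f}")  # slope is the power-law exponent
```

More robust estimators (e.g., maximum likelihood for the tail index) exist; the OLS fit is shown only because it mirrors the straight-line log-log plot in Figure 18.2.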
The reason is that only 0.44% of the 284,000 users have more than 10^4 followers, so the tail of the empirical curve cannot stay consistent with the fitted line. Followers vs. Retweeted Counts: When a tweet is retweeted, its influence spreads, and the retweet count indicates that influence. Generally, a tweet posted by a celebrity gets retweeted easily. We attempt to demonstrate
Figure 18.2: Distribution of followers counts.
the relations between the number of followers and the number of retweets. The scatter diagram is shown in Figure 18.3, where the x-axis represents the number of followers and the y-axis stands for the retweet number. It can be observed that when the number of followers x exceeds 1.2 × 10^4, the number of retweets is larger than 1 × 10^3, indicating that tweets published by a celebrity with a large enough number of followers (more than 10^4) get retweeted much more easily. Moreover, as the number of followers increases, the number of retweets grows linearly, especially when x > 10^4.

18.2.2.3 Sentiments vs. A-share indicators

In order to investigate the correlations between the sentiments of the tweets and the stock prices, we first extract the sentiments from the tweets, using the Naive Bayes algorithm to infer them. Tweets are classified into three categories: negative, positive, and neutral. The negative and positive tweets are applied to construct the sentiment index at day i, which is defined as

S_i = 0.5 − (n_i / Σ_{i=1}^{N} n_i) / (p_i / Σ_{i=1}^{N} p_i + n_i / Σ_{i=1}^{N} n_i),
Figure 18.3: Followers counts vs. retweeted counts.

Figure 18.4: Change rate of A-share index vs. sentiment index.
where N is the number of dates, and p_i and n_i are the numbers of positive and negative tweets at day i, respectively. In Figure 18.4, the solid curve represents the sentiment index from December 2014 to May 2015, and the dashed curve stands for the change rate of the A-share Index, i.e., the Shanghai Stock Exchange Composite Index, in the same time interval. If these two curves co-evolve, it indicates that the sentiments presented by tweets on Xueqiu are correlated with the A-share Index.
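A daily sentiment index of this kind can be sketched numerically. The sketch assumes the index normalizes each day's positive and negative counts by their sample totals and then centers the result at zero, so S_i lies in [−0.5, 0.5]; the daily counts below are invented.

```python
# Sketch: daily sentiment index from positive/negative tweet counts,
# normalized by sample totals and centered so that 0 is neutral.
import numpy as np

pos = np.array([120, 80, 200, 50], dtype=float)  # p_i: positive tweets per day
neg = np.array([30, 90, 40, 150], dtype=float)   # n_i: negative tweets per day

p_norm = pos / pos.sum()
n_norm = neg / neg.sum()
S = 0.5 - n_norm / (p_norm + n_norm)  # in [-0.5, 0.5]; > 0 means net positive

print(np.round(S, 3))
```

Days where positives dominate give S_i > 0, and strongly negative days (e.g., the January 19, 2015 drop described above) give S_i well below zero.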
It is reasonable that positive emotion usually goes with a rise in the stock price, and vice versa. It can be observed that in many time intervals, these two curves show similar fluctuation trends, especially at the peaks or bottoms of the A-share index curve. For example, on January 19, 2015, the A-share index dropped by 7.7%, and the sentiment index also dropped a lot, indicating the strong negative emotions of the investors.

18.2.2.4 User perceived stock relatedness

The common method to obtain stock correlations is to use a standard industry classification scheme or historical price series. However, in this work, we extract the user-perceived relatedness through Xueqiu. The advantage is that, in addition to the explicit and static relatedness, we can also obtain latent or instant correlations, e.g., correlated stocks that are driven by the same event but not affiliated with the same industry. To obtain such correlations, we collect all the pairwise stocks mentioned by the same tweet. For preprocessing, we removed tweets mentioning more than five consecutive stock tickers, as such tweets usually do not convey much meaning for our task. Table 18.2 shows the top five most frequent stocks jointly mentioned with China Merchants Bank, LeTV, ChinaNetCenter, and Kweichow Moutai, respectively. It can be observed that the top five stocks related to China Merchants Bank are all financial companies: the top three are banks of similar size to China Merchants Bank, and the fourth is Ping An, a comprehensive financial company involving the Ping An Bank. CITIC Securities, the largest securities company in China, takes the fifth place. For the stock LeTV, the correlated stocks are diverse. LeTV is a company whose major products are smart TVs and video services, while EastMoney is a website providing financial news and data.
Though they are not in the same industry, they are treated as representative companies in the China Growth Enterprise Board by investors, and thus co-occur frequently.

Table 18.2: Top 5 co-occurrence statistics.

Rank  China Merchants Bank  LeTV                LeTV (cont.)     ChinaNetCenter   Kweichow Moutai
1     Industrial Bank       EastMoney                            LeTV             China Merchants Bank
2     SPD Bank              Hithink RoyalFlush                   Ourpalm          Luzhou Laojiao
3     Minsheng Bank         CITIC Securities                     Kweichow Moutai  Ping An
4     Ping An               Siasun Robot                         Sinnet           Gree
5     CITIC Securities      Huayi Brothers                       SPD Bank         Siasun Robot
For ChinaNetCenter and Kweichow Moutai, their most correlated stocks are also not restricted to the same industry. Thus, it can be summarized that the user-perceived relatedness from Xueqiu can capture implicit correlations that are difficult to observe with previous methods. Correlations may result in co-evolving stock prices, and our work investigates whether such correlations extracted from Xueqiu are effective for our prediction task.

18.2.3 Prediction of stock price movement

We model the prediction of stock price movement as a binary classification problem. Then we discuss how to extract features from three different types of information sources. After that, we evaluate the classification model to verify the effectiveness of the information from Xueqiu.

18.2.3.1 Problem formulation

The movement of stock price only happens during trading days, so we define one single trading day as the time granularity of our prediction model. A trading day is defined from the close time (i.e., 3:00 pm) of the last day to the close time of today. We predict whether today's close price increases or decreases compared to the previous day's close price. Given a target stock s_i, a series of its consecutive valid trading days constitutes the trading day vector T_i = (t_1^i, t_2^i, t_3^i, ..., t_n^i), where n is the number of trading days in T_i, determined by the range of the dataset. Note that different stocks would have different trading day vectors. For some trading day t_j^i, we define the feature vector x_j^i, consisting of features extracted for stock s_i at trading day t_j^i. The feature vector is also the input of the prediction model.

Formally, given the stock s_i and its feature vector x_j^i, the stock price movement prediction problem is modeled as

y_j^i = f(x_j^i) = { 1, if the price of s_i increases on t_{j+1}^i; 0, otherwise },    (18.1)

where y_j^i is the result of the prediction function f(x_j^i), denoting the price movement direction of stock s_i at the next trading day t_{j+1}^i.

18.2.3.2 Feature extraction

Motivated by the data analysis in Section 18.2.2, we explore the rich knowledge from Xueqiu as well as the stock market to constitute the input feature
vector x_j^i, and categorize the features into three types: the stock-specific features, the sentiment feature, and the stock relatedness feature.

Stock Specific Features: The common information used for stock prediction is the firm-specific factors, as well as the historical and time-series prices used for technical analysis (Taylor and Xu, 1997; Taylor, 2007). We select some key characteristics of a stock that have exhibited predictive ability in the previous literature (Li et al., 2014; Fama and French, 1992): stock price, trading volume, turnover, and price-to-earnings (P/E) ratio. Note that the absolute values of the stock price and trading volume would differ hugely between stocks, so we use the change rate instead. Besides, not only the daily change but also the change of the 5-day moving average is involved.

Sentiment Features: In this study, we derive the sentiment index for each stock on each trading day. Firstly, for all the tweets in the dataset, we classify them into three categories: positive, neutral, and negative, and only the positive and negative tweets are used to derive the sentiment feature. Counting the number of tweets in the positive and negative categories is an intuitive way to measure how strong each sentiment is. However, this counting method implies that each tweet is treated equally and thus has the same weight. In fact, different tweets might have different influence due to the different authority of their users. Thus, it is reasonable to take the user's authority as the weight for each tweet. Given a user network extracted from Xueqiu, PageRank is a natural method to weigh each user. To derive the PageRank score, we first construct the user network from the dataset. Note that, different from the static friendship links in a social network, the user network constructed here is a dynamic forwarding network.
Specifically, as users publish tweets or forward others' tweets on Xueqiu, a user forwarding network can be constructed. Figure 18.5 shows a sample of the user network on May 29, 2015. In this network, each node stands for a user marked with its user ID, and each edge stands for the forwarding behavior between two users. There are 141 trading days of the A-share market in our dataset, so 141 user networks are constructed. For each network, we calculate the PageRank value of each vertex. As shown in Figure 18.5, the bigger a node is, the larger the user's PageRank is. For a user u_t in the directed user network, given a user set U_t with K_t users (denoted as nodes u_1, u_2, ..., u_{K_t}) that have forwarded u_t's tweet, the
Figure 18.5: A sample of user network.
PageRank value of u_t can be calculated as
$$PR(u_t) = \sum_{u_i \in U_t} \frac{PR(u_i)}{L(u_i)}, \tag{18.2}$$
where L(u_i) is the number of outbound links from u_i. After that, the weight of each tweet x is the PageRank of the user u(x) who published it, and the weighted counts are
$$PositiveCount = \sum_{x} PR(u(x)), \qquad NegativeCount = \sum_{y} PR(u(y)), \tag{18.3}$$
where x and y denote the positive and negative tweets, respectively. For a given stock S_i and trading day t_ij, we first calculate its positive count and negative count, and then combine them into one sentiment score, denoted as SC_j^i, that is,
$$SC_j^i = \frac{PositiveCount_j^i}{PositiveCount_j^i + NegativeCount_j^i}. \tag{18.4}$$
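The weighting pipeline of equations (18.2)–(18.4) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the forwarding edges and tweet authors below are made up, and a standard damping factor is added to the plain update of equation (18.2) so that the iteration converges on arbitrary graphs.

```python
from collections import defaultdict

def pagerank(edges, n_iter=50, d=0.85):
    """Iterative PageRank on the forwarding network.
    An edge (u_i, u_t) means user u_i forwarded a tweet of user u_t."""
    nodes = {u for e in edges for u in e}
    out_deg = defaultdict(int)      # L(u_i): number of outbound links
    in_links = defaultdict(set)     # U_t: users who forwarded u_t's tweets
    for ui, ut in edges:
        out_deg[ui] += 1
        in_links[ut].add(ui)
    pr = {u: 1.0 / len(nodes) for u in nodes}
    for _ in range(n_iter):
        pr = {ut: (1 - d) / len(nodes)
              + d * sum(pr[ui] / out_deg[ui] for ui in in_links[ut])
              for ut in nodes}
    return pr

def sentiment_score(pos_users, neg_users, pr):
    """SC = PositiveCount / (PositiveCount + NegativeCount), eq. (18.4);
    the counts are PageRank-weighted as in eq. (18.3)."""
    pos = sum(pr.get(u, 0.0) for u in pos_users)
    neg = sum(pr.get(u, 0.0) for u in neg_users)
    return pos / (pos + neg) if pos + neg else 0.5  # 0.5 = neutral fallback

# Hypothetical data: three users forward tweets of one widely forwarded user.
pr = pagerank([("a", "hub"), ("b", "hub"), ("c", "hub")])
sc = sentiment_score(pos_users=["hub"], neg_users=["a"], pr=pr)
```

Because the positive tweet here comes from the highly ranked user, the score lands above the neutral value 0.5, which is exactly the authority-weighting effect the text motivates.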
Obviously, SC_j^i ∈ [0, 1], and the larger SC_j^i is, the more positive the overall emotion is. SC_j^i is used as the sentiment feature for our prediction model.

Stock Relatedness Features: User-perceived relatedness among stocks is another type of knowledge that can be obtained from Xueqiu (Arai et al., 2015). The intuition is that stocks with strong correlations may demonstrate co-movements in prices. In our work, stocks are regarded as correlated stocks
if they are jointly mentioned by a tweet. Formally, we define the stock network as an undirected graph G = {V, E}. The node set V comprises the stocks, e_{u,v} ∈ E stands for the edge between stock nodes u and v, and the edge weight is the number of co-occurrences in the last 3 days. As this correlation is time-sensitive (Wichard et al., 2004), we construct 141 stock networks for the 141 trading days. Specifically, for a given stock S_i and trading day j, let r_j^{i,k} denote the weight of the edge between stocks S_i and S_k on day j. To make the correlation more specific and meaningful, we filter out the non-informative edges with r_j^{i,k} < 2 (except r_j^{i,i}; note that r_j^{i,i} = 1). For any two stocks (say S_m and S_n) that are not connected in the stock network, r_j^{m,n} = 0. Then, with r_j^{i,k} as the weight, we can combine it with a stock-specific feature f_k of the stock S_k to obtain the relatedness feature on day j, that is,
$$corr(f)_j^i = \frac{\sum_{k=1}^{N} r_j^{i,k} f_k}{\sum_{k=1}^{N} r_j^{i,k}}, \tag{18.5}$$
where N is the number of stocks in the dataset and f_k is a stock-specific feature of the stock S_k. Taking the turnover rate and the stock price change rate as examples, we obtain
$$corr(turnover)_j^i = \frac{\sum_{k=1}^{N} r_j^{i,k}\, turnover_k}{\sum_{k=1}^{N} r_j^{i,k}}, \qquad corr(price\_change)_j^i = \frac{\sum_{k=1}^{N} r_j^{i,k}\, price\_change_k}{\sum_{k=1}^{N} r_j^{i,k}}. \tag{18.6}$$

18.2.3.3 Prediction methods

Given the feature vector, we then apply statistical learning methods to obtain the prediction results. Specifically, we are given a training set of n points of the form (x_1, y_1), ..., (x_n, y_n), where y_i is either +1 or −1: Class +1 denotes that the stock price will increase, while Class −1 means the stock price will decrease. x_i is a vector for a specific stock on a certain day containing the features used to train the model. To obtain the prediction results, we consider both the Support Vector Machine (SVM) (Boser et al., 1992) and the Multilayer Perceptron (MLP) (Hinton, 1987; Rumelhart et al., 1988) algorithms. Most previous works use linear models to predict the stock market (Xie et al., 2013; Luss and d'Aspremont, 2015; Kogan et al., 2009). However, the relationship between the features and the stock price movements may be more complex than linear. Thus, we use the RBF kernel instead of the linear kernel in SVM, and the
results also show that using the RBF kernel is better than using the linear kernel. In addition, we also exploit the MLP model to learn the hidden and complex relationships. MLP is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. The model in our work uses one hidden layer with the sigmoid activation function. The standard backpropagation algorithm is used for the supervised training of the neural network. The process of feature extraction and prediction is shown in Algorithm 1.
Algorithm 1 Process of Feature Extraction and Prediction
Input: Users U and tweets X from Xueqiu; firm-specific factors F of stock S_i at trading day t_ij
Output: Stock movement y_j^i at the next trading day t_{i,j+1}

function SpecificFeature(F, s_i)
    Extract the firm-specific features: f_j^i ← F for stock s_i at t_ij
    return f_j^i
end function

function SentimentScore(X, U, s_i)
    Count the tweets in the positive category (i.e., x) and the negative category (i.e., y)
    Construct the user forwarding network
    PageRank value for user u_t: PR(u_t) ← Σ_{u_i ∈ U_t} PR(u_i)/L(u_i)
    Positive weighted count: PositiveCount ← Σ_x PR(u(x))
    Negative weighted count: NegativeCount ← Σ_y PR(u(y))
    Sentiment score: SC_j^i ← PositiveCount_j^i / (PositiveCount_j^i + NegativeCount_j^i)
    return SC_j^i
end function

function RelatednessFeature(f_j^i, X, s_i)
    Construct the stock network: G = {V, E}
    Correlation weight between stocks s_i and s_k: r_j^{i,k}
    Relatedness feature: corr(f)_j^i ← (Σ_{k=1}^N r_j^{i,k} f_k) / (Σ_{k=1}^N r_j^{i,k}), where f_k is a specific feature of stock s_k
    return corr(f)_j^i
end function

function Prediction(f_j^i, SC_j^i, corr(f)_j^i)
    Combine the features into a vector: x_j^i ← [f_j^i, SC_j^i, corr(f)_j^i]
    Predict the stock movement: y_j^i ← SVM(x_j^i) (or y_j^i ← MLP(x_j^i))
    return y_j^i
end function
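The last two steps of Algorithm 1 can be sketched with NumPy and scikit-learn. This is an illustrative sketch only: the feature vectors and labels below are synthetic, and the dimensions and hyperparameters are assumptions rather than the authors' settings.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

def relatedness_feature(r, f):
    """corr(f)_j^i of eq. (18.5): the co-occurrence-weighted average of a
    stock-specific feature f over all N stocks, with edge weights r."""
    r, f = np.asarray(r, float), np.asarray(f, float)
    return float(r @ f / r.sum())

# Synthetic feature vectors x_j^i and up/down labels, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)

# RBF-kernel SVM and a one-hidden-layer sigmoid MLP, as described in the text.
svm = SVC(kernel="rbf").fit(X[:150], y[:150])
mlp = MLPClassifier(hidden_layer_sizes=(16,), activation="logistic",
                    max_iter=3000, random_state=0).fit(X[:150], y[:150])
```

Both classifiers emit the ±1 movement labels directly, so the same evaluation code can score either model.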
18.2.4 Experiments

We conduct experiments to evaluate the effectiveness of the features obtained from Xueqiu for predicting stock price movements.

18.2.4.1 Experimental setup

We use the Xueqiu data from November 2014 to May 2015. The target stocks are selected from all the stocks in the A-share market satisfying two requirements: (1) there are more than 10 trading days for that stock during this period; (2) the number of tweets about that stock is more than 10 per day. Spam contagions in Xueqiu may introduce considerable noise into our analysis and prediction task. To detect the spams, we determined 10 features (including the percentage of digits in the contagion, the number of followers of the user, etc.) and use logistic regression to identify them. We then extract the Xueqiu-related features (the sentiment features and the stock relatedness features) for each stock. The stock-specific features are extracted from the historical information obtained through TuShare.4 The sentiment of each tweet from Xueqiu is classified by SnowNLP,5 an open-source Chinese text processing toolkit. Finally, we get about 35.7K valid test samples from our dataset. We use the SVM (with RBF kernel) and MLP as the prediction models. The samples in one month are used as the training set to predict the stock price movements for each trading day in the following month. For example, when the samples in November 2014 are used as the training set, the trading days in December 2014 form the corresponding testing set. The prediction is evaluated through two commonly used metrics: classification accuracy (ACC) and Area Under the ROC Curve (AUC). ACC is very sensitive to data skew: when one class has an overwhelmingly high frequency, the accuracy can be high even for a classifier that simply predicts the majority class. Although our data is not severely skewed, we therefore also use AUC for comparison.
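The two evaluation metrics can be computed without any library support; a small sketch (using the pairwise-ranking formulation of AUC), independent of the authors' actual evaluation code:

```python
def accuracy(y_true, y_pred):
    """ACC: the fraction of test samples whose movement is predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """AUC via pairwise ranking: the probability that a randomly chosen
    up-day (+1) receives a higher score than a randomly chosen down-day (-1);
    ties count one half."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A degenerate majority-class predictor can score a high ACC on skewed data while its AUC stays at 0.5, which is why both metrics are reported together.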
After conducting predictions on all these testing sets, we aggregate all the AUC and ACC results into an overall output.

4 http://tushare.org/.
5 https://github.com/isnowfy/snownlp.

18.2.4.2 Prediction results

According to previous studies (Li et al., 2014; Fama and French, 1992), the stock-specific features are vital for stock prediction, so using the prediction
Figure 18.6: Prediction results using SVM and MLP with only stock specific features (SSF) vs. with both SSF and Xueqiu features.
methods (i.e., SVM and MLP) with only the stock-specific features is adopted as our baseline. To verify whether the knowledge extracted from Xueqiu is effective for stock prediction, the prediction methods with the stock-specific features as well as the Xueqiu features (i.e., the sentiment features and the stock relatedness features) are evaluated against the baselines. The results are shown in Figure 18.6. It can be observed that, given the same prediction model (SVM or MLP), the method involving the Xueqiu features achieves consistently better performance than the one involving only the stock-specific features, on both the ACC and AUC metrics. This confirms that the investors' perceptions extracted from Xueqiu can assist in stock prediction. It also demonstrates that Chinese social media reflects the investors' opinions and behaviors in China's stock market. In addition, the MLP model achieves better performance than the SVM model, partly due to effectively learning the hidden relationships between the features and the price movements. Based on the above analysis, we can observe that both the features and the algorithms have impacts on performance.

18.2.4.3 Feature importance analysis

Feature importance analysis studies how important the various features are in the prediction task. From a macroscopic view, we first study the importance of the features derived from the different types of knowledge.
Figure 18.7: Prediction results trained on different combinations of features with SVM.
Figure 18.7 shows the prediction results with different groups of features. Not surprisingly, the stock-specific features are very useful for stock prediction. Using the stock-specific features alone can achieve 0.57 in ACC and 0.54 in AUC. Both the sentiment features and the stock relatedness features are helpful, and the sentiment features play a more important role than the stock relatedness features. When we put all the features together, the prediction result in terms of ACC remains the same as that with both the sentiment and stock-specific features, but the result in terms of AUC can be further improved. The reason is that the imbalance in the dataset is not considered in ACC. The improvement in AUC indicates that the addition of the stock relatedness features can improve the discriminative power of our model. We then study the importance of the features from a microscopic point of view. To evaluate how the features contribute to the prediction results, we use the random forests model to obtain a ranking of the importance of the features (Genuer et al., 2010), which is shown in Table 18.3. It is clear that the stock-specific features are the most influential feature type in the model, as the top 4 features all belong to it. When additional features are taken into consideration, the sentiment features are more important than the stock relatedness features, which is coherent with the feature analysis presented above in Figure 18.7. Among the sentiment features, the Sentiment Score is the most critical one, containing not only the contents of the tweets but also the
Table 18.3: Top 10 important features.

1. Stock Price Change Rate
2. Stock MA5 Value
3. Stock Turnover Change Rate
4. Stock Trading Volume Change Rate
5. Sentiment Score
6. Stock 5-Day Moving Average of Trading Volume
7. Sentiment Tweets Count (Positive)
8. Correlation Stocks Weighted Average (MA5)
9. Sentiment Tweets Count (Negative)
10. Sentiment Tweets Count (Neutral)
structure of the user network. Correlation Stocks Weighted Average (MA5) is in the top 10 features, indicating it is also useful for our task.
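The ranking of Table 18.3 comes from a random forests model's feature importances; a sketch with scikit-learn on synthetic data (the feature names and the data-generating process below are hypothetical, chosen only so that one feature dominates):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
names = ["price_change", "sentiment_score", "turnover_change", "corr_ma5"]
X = rng.normal(size=(600, 4))
# Labels driven mostly by the first feature and weakly by the second;
# the remaining two columns are pure noise.
y = (X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=600) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
# Impurity-based importances, sorted from most to least important.
ranking = sorted(zip(rf.feature_importances_, names), reverse=True)
```

The impurity-based importances sum to one, so the ranking is directly comparable across features, though on real data they share the usual caveat of being biased toward high-cardinality features.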
18.2.4.4 Summary

We summarize the experimental results with the following observations:

(1) The knowledge extracted from Xueqiu is useful for stock prediction. The results show that by exploiting the knowledge from Xueqiu, the prediction results can be consistently improved in terms of both ACC and AUC. Previous studies showed the effectiveness of sentiments for stock prediction in the US stock market. Our research confirms that the sentiments extracted from Xueqiu are also effective for China's stock market. In addition, we also observe the effectiveness of the user-perceived stock relatedness in stock prediction.

(2) The stock-specific features are crucial to the prediction task. Both the prediction results with only the stock-specific features and the feature importance analysis show that the stock-specific features are the most important ones for our task.

(3) The sentiment features are more useful than the stock relatedness features. One possible reason is that the user-perceived stock relatedness is sparser than the sentiments. Only a few stocks are mentioned jointly with other stocks in the tweets, especially considering that our stock network is time-sensitive, i.e., only the co-occurrences in the last 3 days are taken into account. Despite the sparseness, the prediction results are fairly good for almost the full set of stocks in the market.
18.3 Tensor-Based Information Fusion Methods for Stock Volatility Prediction6

18.3.1 Overview

Previous studies showed that news events, such as corporate acquisitions and earnings announcements, can have significant impacts on stock prices (Cutler et al., 1989; Tetlock et al., 2008; Luss and d'Aspremont, 2015; Xie et al., 2013; Wang and Hua, 2014; Peng and Jiang, 2015), and event-driven stock prediction techniques have been proposed to extract events from Web news to guide stock investment (Ding et al., 2014, 2015, 2016). However, their prediction capabilities are limited by the following challenges. First, the stock-related event information collected from the Web is very sparse. Although Web news is increasingly available, the amount of events that can be extracted from the Web news is still limited. In addition, the events usually reside in unstructured texts, which are difficult to extract. Besides, the same event can be described in different ways by different websites and is thus prone to be identified as different events, leading to increased sparsity. Second, an effective method to analyze the events and quantitatively measure their influence on stock prices is lacking. For example, for the event type acquisition, Microsoft acquiring LinkedIn led to a fall in Microsoft's stock price, whereas Intel acquiring Altera resulted in a rise in Intel's stock price. Thus, solely relying on the events to make predictions is not enough. In addition to the events, emotions also play a significant role in decision making. Previous studies in behavioral economics show that financial decisions are significantly driven by emotion and mood. For example, the collective level of optimism or pessimism in society can affect investor decisions (Prechter, 1999; Nofsinger, 2005).
Due to the recent advances in Natural Language Processing (NLP) techniques, sentiment-driven stock prediction techniques are also proposed by extracting indicators of public mood from social media (Bollen et al., 2011; Nguyen and Shirai, 2015; Feldman et al., 2011), where the positive mood for a stock will probably indicate a rising trend in the price and negative mood will more likely mean a decreasing
6 Reprinted from Knowledge-Based Systems, Vol. 143, Xi Zhang, Yunjia Zhang, Senzhang Wang, Yuntao Yao, Binxing Fang, Philip S. Yu, Improving stock market prediction via heterogeneous information fusion, 236–247, Copyright (2018), with permission from Elsevier.
trend. However, relying on the sentiments alone is not sufficient for prediction either. For example, during holidays people's mood tends to be positive, yet it may not really reflect their investment opinions. To address the above-mentioned challenges, we propose to combine both the stock-related events extracted from Web news and the users' sentiments on social media. In addition to the sentiments considered in Section 18.2, we involve the news events in this section, and thus the data sources are more heterogeneous. Effective information integration techniques are then required to jointly model their impacts. However, it is extremely difficult to integrate information from multiple sources that is heterogeneous and interrelated. Specifically, the information may come at different time scales (e.g., hours, days, months) and in different structures (e.g., events in news, sentiments on social media). A common integration strategy is to concatenate the features from multiple sources into one compound feature vector. However, such a linear predictive model assumes these features from different data sources are independent of each other. In reality, in addition to the linear effects, there are also coupling effects coming from the interactions among multiple sources. For example, a specific event (e.g., breaking a contract) usually results in a specific sentiment (e.g., negative emotion). In addition, even within a single data source, there can be interactions among different features. For instance, in the quantitative data, the price movements of two stocks and their corresponding industries can be highly correlated. It is obvious that stocks affiliated with the same industry tend to co-evolve more frequently than those from unrelated industries. In this section, we present a novel tensor-based computational framework that can effectively predict stock price movements by fusing various sources of information.
To this end, we extensively collect the stock-related information, which can roughly be categorized into three types: the quantitative information (e.g., historical stock prices) from the financial data providers, the event-specific information from the Web media, and the sentiment information from the social media. We propose a coupled matrix and tensor factorization scheme to integrate the quantitative stock price data, the sentiment-specific data as well as the event-specific data. With the collaboratively factorized low-rank matrices, we can effectively predict the stock movements by completing the missing values in the sparse tensor. This scheme considers the correlations among the stocks and provides a powerful tool to co-learn all tasks simultaneously through the implicit shared knowledge and the explicit side information, and thus achieves better predictive performance.
18.3.2 Preliminaries

18.3.2.1 Tensor decomposition and reconstruction

In this part, we briefly introduce the mathematical notations and the tensor operations. Tensors are high-order arrays that generalize the notions of vectors and matrices. In this work, we use a 3rd-order tensor, representing a 3-dimensional array. Scalars are 0th-order tensors and denoted by lowercase letters, e.g., a. Vectors are 1st-order tensors and denoted by boldface lowercase letters, e.g., a. Matrices are 2nd-order tensors and denoted by boldface capital letters, e.g., X, and 3rd-order tensors are denoted by calligraphic letters, e.g., A. The ith entry of a vector a is denoted by a_i, element (i, j) of a matrix X is denoted by x_{ij}, and element (i, j, k) of a 3rd-order tensor A is denoted by a_{ijk}. The ith row and the jth column of a matrix X are denoted by x_{i:} and x_{:j}, respectively. Alternatively, the ith row of a matrix can also be denoted as a_i. The norm of a tensor A ∈ R^{N×M×L} is defined as
$$\|\mathcal{A}\| = \sqrt{\sum_{i=1}^{N} \sum_{j=1}^{M} \sum_{k=1}^{L} a_{i,j,k}^2}.$$
This is analogous to the matrix Frobenius norm, which is denoted as ||X|| for a matrix X. The n-mode product of a tensor X ∈ R^{I_1×I_2×I_3} with a matrix U ∈ R^{I_n×J}, denoted by X ×_n U, is a tensor of size I_1 × ... × I_{n-1} × J × I_{n+1} × ... × I_N with the elements
$$(\mathcal{X} \times_n U)_{i_1 \dots i_{n-1}\, j\, i_{n+1} \dots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2 \dots i_N}\, u_{i_n j}.$$
The Tucker factorization of a tensor A ∈ R^{N×M×L} is defined as
$$\mathcal{A} = \mathcal{X} \times_1 U \times_2 V \times_3 W.$$
Here, U ∈ R^{N×R}, V ∈ R^{M×S} and W ∈ R^{L×T} are the factor matrices and can be thought of as the principal components in each mode. The tensor X ∈ R^{R×S×T} is the core tensor and its entries show the level of interaction between the different components. The reconstructed tensor is derived by multiplying the core tensor by the three factor matrices. It can be observed that tensor decomposition and reconstruction updates the value of each existing entry, indicating its importance, and fills in some new entries revealing latent relationships. Generally speaking, tensor factorization can be regarded as an extension of matrix decomposition. During the decomposition, the data are projected into subspaces that capture latent significance.
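The n-mode product and the Tucker reconstruction above can be sketched with NumPy. This is a from-scratch illustration with arbitrary dimensions, following the common convention in which the factor matrix's second (rank) dimension is contracted against the core, matching the stated factor shapes U ∈ R^{N×R}, V ∈ R^{M×S}, W ∈ R^{L×T}:

```python
import numpy as np

def mode_n_product(X, U, n):
    """n-mode product X x_n U: multiplies tensor X along mode n by U,
    contracting U's columns with mode n of X (U has shape J x I_n),
    so mode n of the result has size J."""
    return np.moveaxis(np.tensordot(X, U, axes=(n, 1)), -1, n)

# Tucker reconstruction A = core x_1 U x_2 V x_3 W, with illustrative sizes
# N, M, L = 4, 3, 2 and core ranks R, S, T = 2, 2, 2.
rng = np.random.default_rng(0)
core = rng.normal(size=(2, 2, 2))
U = rng.normal(size=(4, 2))
V = rng.normal(size=(3, 2))
W = rng.normal(size=(2, 2))
A = mode_n_product(mode_n_product(mode_n_product(core, U, 0), V, 1), W, 2)

# Tensor norm as defined above, i.e. the Frobenius norm of the entries.
frob = np.sqrt((A ** 2).sum())
```

In a completion setting, the factor matrices and core are fitted to the observed entries, and the product above fills in the missing ones.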
18.3.2.2 Coupled attribute value similarity

To exploit the stock correlations to facilitate the prediction, similarity measures are required to obtain such correlations. Traditional similarity measures usually assume that an object's attributes are independent of each other, and the interactions among the attributes are not considered when calculating the similarity. However, coupling effects among different attributes exist in a wide range of applications. We take two movie-related attributes, the actor and the genre, as an example. The actor Jackie Chan appears more frequently in action movies than in other genres, while the actor Jim Carrey is more likely to play roles in comedy movies. Thus, in addition to considering the intra-coupled similarity within an attribute, the inter-coupled similarity among different attributes should also be involved. Formally, a large number of data objects with the same features can be organized in an information table S = ⟨U, A, V, f⟩, where U = {u_1, u_2, ..., u_m} is a set of instances and A = {a_1, a_2, ..., a_n} is an n-attribute set for every instance. V_j is the set of all the values of feature a_j, and f_j : U → V_j is a map function that returns the attribute a_j's value of an instance. We next introduce an approach to calculate the intra-coupled and inter-coupled similarities.

Intra-coupled attribute value similarity (IaAVS). According to Wang et al. (2011), the frequency distribution of the attribute values can reveal the value similarity. The intra-coupled similarity δ_j^{Ia}(x, y) between values x and y of attribute a_j is defined as
$$\delta_j^{Ia}(x, y) = \frac{|g_j(x)| \cdot |g_j(y)|}{|g_j(x)| + |g_j(y)| + |g_j(x)| \cdot |g_j(y)|},$$
where g_j : V_j → 2^U is a map function that returns the set of instances whose value of attribute a_j is x. Thus, g_j(x) is defined as
$$g_j(x) = \{u_i \mid f_j(u_i) = x,\ 1 \le j \le n,\ 1 \le i \le m\}.$$

Inter-coupled attribute value similarity (IeAVS). The inter-coupled attribute value similarity δ_j^{Ie}(x, y) between values x and y of attribute a_j is the aggregation of the relative similarities δ_{j|k}(x, y) (which will be given later) over all the other attributes:
$$\delta_j^{Ie}(x, y) = \sum_{k=1, k \ne j}^{n} \alpha_k\, \delta_{j|k}(x, y),$$
where α_k is the weight parameter for attribute a_k, Σ_{k=1}^{n} α_k = 1, α_k ∈ [0, 1]. The relative similarity δ_{j|k}(x, y) represents the similarity between values x
and y of attribute a_j, based on the other attribute a_k. Thus, we have
$$\delta_{j|k}(x, y) = \sum_{w \in \cap} \min\{P_{k|j}(\{w\} \mid x),\ P_{k|j}(\{w\} \mid y)\},$$
where w ∈ W, and W is the kth attribute's value subset, i.e., W ⊆ V_k. Here, w ∈ ∩ denotes w ∈ ϕ_{j→k}(x) ∩ ϕ_{j→k}(y), where ϕ_{j→k} : V_j → 2^{V_k} is a map function that returns the attribute a_k's value subset of the instances whose attribute a_j's value is x, that is, ϕ_{j→k}(x) = f_k^*(g_j(x)), where f^*(·) differs from f(·) in that the input of f^*(·) is a set of instances instead of an individual instance. P_{k|j}({w} | x) is the information conditional probability of {w} with respect to x. P_{k|j}(W | x) can be obtained through
$$P_{k|j}(W \mid x) = \frac{|g_k^*(W) \cap g_j(x)|}{|g_j(x)|}.$$
Here, g_k^*(W) is the variation of g_k(x) with the set W as input. Specifically, g_k^*(W) maps a set W of attribute a_k values to a set of instances, that is,
$$g_k^*(W) = \{u_i \mid f_k(u_i) \in W,\ 1 \le k \le n,\ 1 \le i \le m\}.$$
For a more detailed introduction to the coupled attribute value similarity, one can refer to Wang et al. (2011) and Li et al. (2015a).

18.3.3 The framework

We employ the historical stock quantitative data, Web news articles and social media data to construct a 3rd-order tensor as well as two auxiliary matrices to model their joint effects on stock price movements. The overall system framework is shown in Figure 18.8 and comprises four major parts: (1) construction of the stock quantitative feature matrix from the stock quantitative data; (2) construction of the stock correlation matrix from the multi-sourced data; (3) extraction of events and sentiments from news articles and social media to build the stock movement tensor; (4) coupled matrix and tensor factorization for stock price movement prediction. The reconstructed tensor is the output of the system and is used to make stock predictions. We will describe the first three parts in this section, and the last part will be introduced in the following subsection.
Figure 18.8: The system framework of our stock prediction model.
18.3.3.1 Building the stock quantitative feature matrix

The first step is to build the stock quantitative feature matrix, whose two dimensions are the stock IDs and the quantitative features, respectively. For every stock i, the quantitative features are denoted as a vector x_i = (x_{i1}, x_{i2}, ..., x_{ik}, ..., x_{iK}), where K is the number of features and x_{ik} is the value of the kth feature. Then we normalize and gather all the features to form the stock quantitative matrix X ∈ R^{N×K}, where N is the number of stocks. Based on the previous studies (Fama and French, 1993; Li et al., 2015b), we choose four common quantitative features, that is, the share turnover, the price-to-earnings (P/E) ratio, the price-to-book (P/B) ratio and the price-to-cash-flow (PCF) ratio. Each feature reflects the status and valuation of a company from one aspect, and their combination depicts a company's valuation from a comprehensive perspective. That is why we construct the matrix with various quantitative features. After factorizing this matrix, each stock will be represented by a vector (embedding), and it can be expected that similar stocks will have close quantitative feature vectors.

18.3.3.2 Building the stock correlation matrix

Stocks can be correlated from various perspectives, e.g., they may belong to the same industry, or be involved in the same topic (e.g., benefiting from reduced interest rates), or simply co-evolve in price historically without explicit relationships. The correlation between two stocks can be depicted by their similarity. Traditional similarity metrics usually require that the
objects are described by numerical features, and measure their similarity by geometric analogies that reflect the relationship of the data values. However, the underlying assumption of these metrics is that the features are independent and identically distributed (iid), indicating that they only consider the intra-coupled similarity within a feature but ignore the dependency relationships among features (Wang et al., 2011). Here, we extract the coupled correlations between stocks by considering the coupled effects among features, which is denoted as the coupled stock correlation. For the purpose of comparison, we also develop three other methods to calculate the correlations, which are described in detail as follows.

Coupled stock correlation. In complex applications such as stock analysis, there usually exist coupling effects between different features. To capture such effects, inspired by the previous works (Wang et al., 2011; Li et al., 2015a), we apply a Coupled Stock Similarity (CSS) measure by taking into account both the intra-interaction between values within an attribute and the inter-interaction between attributes to calculate the correlation between stocks. Note that this is a different way of obtaining stock correlations compared with the previous section, which uses user perceptions from social media. The idea of the coupled stock correlation is depicted in Figure 18.9. Given the stock attribute space S_a = ⟨S, A, V, f⟩, where S = {s_1, s_2, ..., s_n} is a set of stocks, A = {a_1, a_2, ..., a_m} is the attribute set of the stocks, V_k is the set of all the values of feature a_k, V_{ik} is the value of feature a_k for stock s_i, and f_k : S → V_k is a map function that returns the attribute a_k's
Figure 18.9: Coupled stock similarity.
value of the stock. Then CSS between two stocks s_i and s_j can be defined as
$$CSS(i, j) = \sum_{k} \delta_k^{Ia}(V_{ik}, V_{jk}) \cdot \delta_k^{Ie}(V_{ik}, V_{jk}), \tag{18.7}$$
where V_{ik} and V_{jk} are the values of feature k for stocks s_i and s_j, respectively, δ_k^{Ia}(V_{ik}, V_{jk}) is the intra-coupled attribute value similarity of attribute a_k, and δ_k^{Ie}(V_{ik}, V_{jk}) is the inter-coupled attribute value similarity, which can be calculated based on the other coupled attributes. The details of the theoretical analysis and calculation method of equation (18.7) can be found in Section 18.3.2.2 and the literature (Wang et al., 2011). To derive CSS(i, j), the attribute set used can be represented as a tuple s_{it} = (p_{it}, c_{it}), where s_{it} is the status of stock i on day t. s_{it} has two attributes: (1) p_{it}, the price movement direction of stock i on day t, where 1 means going up and −1 means going down; (2) c_{it}, the changing direction of the index of the industry that stock i belongs to on day t. Analogously, the value of c_{it} is 1 for going up and −1 for going down. Then we calculate CSS(i, j) for each pair of stocks s_i, s_j on each day, and average all the CSS(i, j) values over the training period (9 months in our case).

Co-evolving direction correlation. We also attempt to learn the correlation between two stocks by simply counting the number of days on which their prices co-evolve. In particular, for each pair of stocks, if their closing prices both go up (or both go down) compared to their closing prices on the previous trading day, they are referred to as co-evolving on that day. Given a time period (e.g., a year), the more days the two stocks co-evolve, the more closely they are correlated. Formally, supposing the number of days on which the two stocks s_i and s_j co-evolve is N, and the total number of trading days is M, the correlation coefficient between them is N/M, which is also the value of the entry z_ij in the stock correlation matrix Z.

Co-evolving p-change correlation. The p-change value of a stock on the trading day i is defined as the change rate between its closing prices on day i and on the previous day (i−1).
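The coupled stock similarity over the two daily attributes (price direction, industry-index direction) can be sketched as follows. This is an illustrative single-day implementation under simplifying assumptions, not the authors' code: the inter-coupling weights α_k are taken as equal, and the input is the ⟨p, c⟩ attribute tuple described above.

```python
from collections import defaultdict

def css(stocks):
    """Coupled Stock Similarity for one day, eq. (18.7).
    stocks: {stock_id: (price_dir, industry_dir)} with values in {+1, -1}.
    Returns {(i, j): CSS(i, j)} for every unordered pair of stocks."""
    ids = sorted(stocks)
    n_attr = len(next(iter(stocks.values())))
    # g[j][x]: set of stocks whose attribute j takes value x
    g = [defaultdict(set) for _ in range(n_attr)]
    for s, vals in stocks.items():
        for j, v in enumerate(vals):
            g[j][v].add(s)

    def intra(j, x, y):
        # IaAVS from the occurrence frequencies of the two values
        gx, gy = len(g[j][x]), len(g[j][y])
        return gx * gy / (gx + gy + gx * gy)

    def rel(j, k, x, y):
        # delta_{j|k}: overlap of the attribute-k value distributions
        def p(w, v):  # P_{k|j}({w} | v)
            return len(g[k][w] & g[j][v]) / len(g[j][v])
        common = ({stocks[s][k] for s in g[j][x]}
                  & {stocks[s][k] for s in g[j][y]})
        return sum(min(p(w, x), p(w, y)) for w in common)

    def inter(j, x, y):
        # IeAVS with equal weights alpha_k over the other attributes
        others = [k for k in range(n_attr) if k != j]
        return sum(rel(j, k, x, y) for k in others) / len(others)

    return {(si, sj): sum(intra(j, stocks[si][j], stocks[sj][j])
                          * inter(j, stocks[si][j], stocks[sj][j])
                          for j in range(n_attr))
            for a, si in enumerate(ids) for sj in ids[a + 1:]}
```

Averaging the per-day outputs over the training period then yields the entries of the correlation matrix Z, as described in the text.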
By combining each stock's p-change values for all the trading days in a time period, we can obtain a p-change curve for each stock. The Pearson correlation coefficient is then applied to measure the co-evolving p-change correlation of each pair of stocks based on their p-change curves. This measurement considers both the fluctuation direction and the fluctuation range to reflect the co-evolving movements.

User-perceived correlation. In addition to obtaining correlations from the stock quantitative data, we can also extract the user-perceived
page 703
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
704
9.61in x 6.69in
b3568-v1-ch18
X. Zhang & P. S. Yu
relatedness through Xueqiu, a Twitter-like investor social network.7 To obtain such correlations, we collect all pairs of stocks mentioned in the same tweet, after removing the tweets that mention more than five consecutive stock tickers, as such tweets usually do not convey useful meaning for our task. With the above methods, the correlation coefficients for each pair of stocks are obtained. They are then normalized and filled into the stock correlation matrix Z ∈ R^{N×N}, whose entry z_{ij} is the normalized correlation value between stocks i and j. A larger z_{ij} means a higher correlation between stocks i and j. Please note that each individual method described in this section can provide a correlation matrix as the input of our coupled matrix and tensor factorization model, and we will empirically choose the best one according to the evaluation results.

18.3.3.3 Building the stock movement tensor

We use a tensor to represent the stock IDs, events and sentiments collected from multiple data sources. The reason is that the consequence of an event is usually complex, so relying on events alone is not sufficient to make good predictions. For example, the announcement of an acquisition can be either positive or negative news under different circumstances. To address this issue, the sentiments extracted from social media, which represent people's judgments on the events, can be an effective complement. Specifically, positive sentiment usually indicates the event is good for the stock, while negative sentiment means the opposite. Formally, we build a 3rd-order tensor A ∈ R^{N×M×L}, whose three dimensions are stock ID, event category and social sentiment polarity. Note that the absence of any event is also regarded as a type of event. We next show how to extract the events and sentiments from Web news and social media, respectively.

Event extraction.
Previous work (Ding et al., 2014) shows that news titles are more useful than news contents for extracting events. Thus, we extract events from the news titles. Within a title, we use the verb or gerund to represent an event, since these words are quite informative. For example, in the news title "Microsoft to acquire LinkedIn", the verb "acquire" denotes the event quite well. Note that we do not consider the subject or object in the title because, in our Web news data source, the news articles have
7 https://xueqiu.com/.
been assigned to the related stocks, which will be the subject or object of the event. The data source will be described in Section 18.3.5.1. To extract the events, we first segment the news titles into words with Jieba,8 an open-source Python component for text segmentation. With part-of-speech tagging, we next extract the verbs and gerunds in the news titles. If we directly used the extracted verbs and gerunds (more than 6,000 in our case) to construct the tensor, the tensor would be extremely sparse, preventing it from achieving good prediction accuracy. We observed that many titles actually refer to the same type of event but fall into different event categories. For example, the two titles "Microsoft to acquire LinkedIn" and "Microsoft to buy LinkedIn" can be treated as the same event, but are presented by different verbs. To address this issue, we examine the corpus of the extracted verbs and gerunds, and cluster the synonyms with a linguistic knowledge base, HowNet (Dong, 2011). To further reduce the dimensionality of the event categories, we next cluster the event categories based on their word embeddings. Specifically, we train domain-specific word embeddings using word2vec (Mikolov et al., 2013b) with the Chinese finance news corpus (Zhang et al., 2016), and the number of dimensions is set to 100. We then apply the k-means method to cluster the embeddings and obtain 500 clusters, a number set empirically: too many clusters may lead to over-sparsity of the tensor, while too few clusters may not be sufficient to separate different types of events into different categories. Please note that the word embeddings cannot distinguish antonyms, i.e., antonyms may fall into the same cluster. For example, in the context of the stock market, the embedding vectors of "rise" and "fall" may be close to each other (and thus in the same cluster), but they indicate totally different meanings for stock movements.
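To make the clustering step concrete, here is a minimal k-means sketch over word-embedding vectors (our own illustration in numpy, not the chapter's exact implementation; in the chapter the embeddings come from word2vec and k is set to 500):

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal k-means: group event-word embeddings (rows of X) into k
    event-category clusters."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # assign every embedding to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its assigned embeddings
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers
```

Each resulting cluster plays the role of one event category; as noted next, antonyms landing in the same cluster still require manual separation.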
Thus, manual correction is required to separate the antonym words within the same cluster into different clusters. As there are 97 antonym words in the same clusters, we create 97 new clusters, and eventually the news titles are classified into 597 event categories.

Sentiment extraction. For every stock, we analyze its public sentiment polarity (i.e., positive or negative) on each day by extracting users' postings from the investor social media. We apply the method proposed in Li et al. (2015b) to calculate the public sentiment; this method mainly utilizes the following information: each posting's publication time, title, number of clicks and number of comments. Different from Li et al. (2015b), we develop a specialized sentiment dictionary focusing on financial
8 https://github.com/fxsjy/jieba.
social media based on NTUSD (Ku and Chen, 2007). The new sentiment dictionary contains plenty of words with sentiment polarity in the domain of finance, e.g., rise, fall, up, and down. To obtain the sentiment polarity of a posting, we first use Jieba to segment the postings, and extract the sentiment words by using the sentiment dictionary. We then calculate the positive and negative sentiment values for each stock on each day. The positive sentiment value is calculated by

S_{it}^{+} = \sum_{j=0}^{K} \frac{P_{jt}}{L_{jt}} \times W_{jt},
where S_{it}^{+} is the positive sentiment value of stock i on day t, P_{jt} is the number of positive sentiment words in posting j published on day t for stock i, L_{jt} is the total number of sentiment words in posting j on day t, and W_{jt} is the weight of posting j, which indicates its degree of impact on social media and can be calculated from the numbers of clicks and comments. The detailed calculation method can be found in Li et al. (2015b). Finally, the sentiment polarity of a stock on a day can be obtained by comparing the difference between its positive and negative values with a predefined threshold.

After extracting the events and sentiments for each stock on each day, we can construct a stock movement tensor for all the stocks on each trading day. A positive (negative) value of a tensor entry (a_{nml} = +1 or -1) denotes that the price of stock n goes up (down) when event m happens and the public sentiment is simultaneously l. However, due to the sparsity of the events for any one stock, the tensor is overly sparse. Thus, we aggregate the tensors over a long past period into a denser historical tensor. Specifically, the corresponding entry values in each day's tensor are aggregated to form an upward probability, indicating the probability that a stock price will go up when the stock meets a specific event category and a specific sentiment polarity. For example, given 10 tensors from the past 10 trading days, if the entry a_{nml} has six "+1" and four "-1" values across the 10 tensors, we say its upward probability is 0.6. After aggregation, as the stock movement tensor is still too sparse for an accurate decomposition, we then apply the stock correlation matrix and the stock quantitative feature matrix to assist its decomposition, which will be described in the following section.

18.3.4 Coupled matrix and tensor factorization

In Section 18.3.3.3, we have shown how to build the stock movement tensor.
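As a brief recap of that construction, the aggregation of daily ±1 entries into upward probabilities can be sketched as follows (a hypothetical numpy layout of our own: a stacked array of daily tensors, with 0 marking unobserved entries):

```python
import numpy as np

def upward_probability(daily_tensors):
    """Aggregate daily stock-movement tensors into a historical tensor of
    upward probabilities: ups / (ups + downs) per entry (n, m, l)."""
    T = np.asarray(daily_tensors)             # shape (days, N, M, L); +1 up, -1 down, 0 unseen
    ups = (T == 1).sum(axis=0).astype(float)
    downs = (T == -1).sum(axis=0).astype(float)
    total = ups + downs
    # NaN marks entries never observed in the period
    return np.divide(ups, total, out=np.full(ups.shape, np.nan), where=total > 0)
```

For instance, six "+1" and four "-1" entries over 10 days yield an upward probability of 0.6, matching the example in the text.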
Although several techniques have been applied to reduce the dimensionality of the event categories, the tensor is still very sparse, as the number of events related
to each stock is quite limited. Consequently, solely decomposing the tensor does not work very well for making accurate predictions. To address this issue, auxiliary information from other data sources can be incorporated to assist. In this work, the additional information includes the stock correlations and the stock quantitative features, which reside in two matrices: the stock quantitative feature matrix X and the stock correlation matrix Z. The main idea of this coupled model is to propagate information among X, A, and Z by requiring them to share the low-rank matrices in a collective matrix and tensor factorization model. We can also view this model from a multi-task learning perspective: instead of conducting each stock's prediction task independently, we co-learn multiple tasks simultaneously through their commonalities and shared knowledge. In our work, the multiple tasks are connected by the stock correlations and their quantitative features. The intuition behind this is that if two stocks are highly correlated, the events occurring on one stock are likely to have similar effects on the other stock. We next describe how to collaboratively factorize the matrices and tensor. Specifically, given a very sparse stock movement tensor A, we try to complete A by decomposing it collaboratively with the stock quantitative feature matrix X and the stock correlation matrix Z. As shown in Figure 18.10,
Figure 18.10: Coupled matrix and tensor factorization.
X ∈ R^{N×K} is the stock quantitative feature matrix and Z ∈ R^{N×N} is the stock correlation matrix, where N is the number of stocks, M is the number of event types (categories), L is the number of sentiment polarities (i.e., positive and negative), and K is the number of quantitative features. The tensor A can be decomposed as C ×_1 U ×_2 V ×_3 W, where the core tensor is C ∈ R^{R_1×R_2×R_3} and the three factorized low-rank matrices are U ∈ R^{N×R_1}, V ∈ R^{M×R_2}, and W ∈ R^{L×R_3}, denoting the low-rank latent factors for stocks, events and sentiments, respectively. X can be factorized as X = UF, where F ∈ R^{R_1×K} is the low-rank factor matrix for the quantitative features. As our model applies coupled matrix and tensor factorization to replenish the tensor, the entries obtained after reconstruction are required to be close to their real values. To achieve this goal, we define the following objective function to minimize the factorization errors:

\Lambda(U, V, W, C, F) = \frac{1}{2}\|A - C \times_1 U \times_2 V \times_3 W\|^2 + \frac{\lambda_1}{2}\|X - UF\|^2 + \frac{\lambda_2}{2}\,\mathrm{tr}(U^T L_Z U) + \frac{\lambda_3}{2}\left(\|U\|^2 + \|V\|^2 + \|W\|^2 + \|C\|^2 + \|F\|^2\right),    (18.8)
where \|A - C \times_1 U \times_2 V \times_3 W\|^2 controls the decomposition error of the tensor A, \|X - UF\|^2 controls the factorization error of X, tr(·) denotes the matrix trace, and \|U\|^2 + \|V\|^2 + \|W\|^2 + \|C\|^2 + \|F\|^2 is the regularization penalty to avoid overfitting. L_Z = D - Z is the Laplacian matrix of the stock correlation graph, in which D is a diagonal matrix with diagonal entries d_{ii} = \sum_j z_{ij}. The term tr(U^T L_Z U) can be obtained through equation (18.9); it enforces that two stocks s_i and s_j with a higher correlation (i.e., a larger z_{ij}) also have a closer distance between the vectors u_i and u_j in the matrix U:

\frac{1}{2}\sum_{i,j}\|u_i - u_j\|_2^2\, z_{ij} = \sum_{i,j} u_i z_{ij} u_i^T - \sum_{i,j} u_i z_{ij} u_j^T = \sum_i u_i d_{ii} u_i^T - \sum_{i,j} u_i z_{ij} u_j^T = \mathrm{tr}(U^T (D - Z) U) = \mathrm{tr}(U^T L_Z U).    (18.9)
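A quick numerical check of the identity in equation (18.9), using a random symmetric matrix in place of the correlation matrix (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
N, R1 = 5, 3
U = rng.standard_normal((N, R1))
Z = rng.random((N, N))
Z = (Z + Z.T) / 2                      # symmetric correlation matrix
D = np.diag(Z.sum(axis=1))             # d_ii = sum_j z_ij
LZ = D - Z                             # graph Laplacian
lhs = 0.5 * sum(Z[i, j] * np.sum((U[i] - U[j]) ** 2)
                for i in range(N) for j in range(N))
rhs = np.trace(U.T @ LZ @ U)
assert np.isclose(lhs, rhs)            # (1/2) sum_ij z_ij ||u_i - u_j||^2 = tr(U^T L_Z U)
```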
The objective function is not jointly convex in all the variables U, V, W, C, and F. Thus, we use an element-wise optimization algorithm to iteratively update each entry in the matrices and tensor independently by gradient descent (Karatzoglou et al., 2010; Wang et al., 2015). The gradient for each
variable is derived as follows:

\nabla_{u_{i:}}\Lambda = (C \times_1 u_{i:}^T \times_2 v_{j:}^T \times_3 w_{k:}^T - a_{ijk})\, C \times_2 v_{j:}^T \times_3 w_{k:}^T + \lambda_1 (u_{i:}F - x_{i:})F^T + \lambda_2 (L_Z U)_{i:} + \lambda_3 u_{i:},
\nabla_{v_{j:}}\Lambda = (C \times_1 u_{i:}^T \times_2 v_{j:}^T \times_3 w_{k:}^T - a_{ijk})\, C \times_1 u_{i:}^T \times_3 w_{k:}^T + \lambda_3 v_{j:},
\nabla_{w_{k:}}\Lambda = (C \times_1 u_{i:}^T \times_2 v_{j:}^T \times_3 w_{k:}^T - a_{ijk})\, C \times_1 u_{i:}^T \times_2 v_{j:}^T + \lambda_3 w_{k:},
\nabla_{C}\Lambda = (C \times_1 u_{i:}^T \times_2 v_{j:}^T \times_3 w_{k:}^T - a_{ijk})\, u_{i:} \circ v_{j:} \circ w_{k:} + \lambda_3 C,
\nabla_{F}\Lambda = \lambda_1 u_{i:}^T (u_{i:}F - x_{i:}) + \lambda_3 F.

The detailed learning process is shown in Algorithm 2.

Algorithm 2 Coupled Matrix and Tensor Factorization
Input: tensor A, matrices X and Z, an error threshold ε
Output: low-rank matrices U, V, W, F and core tensor C
1: Set η as the step size of gradient descent; t = 0
2: Initialize U ∈ R^{N×R1}, V ∈ R^{M×R2}, W ∈ R^{L×R3}, F ∈ R^{R1×K} and the core tensor C ∈ R^{R1×R2×R3} with small random values
3: d_ii = Σ_j z_ij
4: L_Z = D − Z
5: for each a_ijk ≠ 0 do
6:    Get ∇_{u_i:}Λ, ∇_{v_j:}Λ, ∇_{w_k:}Λ, ∇_C Λ, ∇_F Λ
7:    u_{i:}^{t+1} = u_{i:}^{t} − η ∇_{u_{i:}^{t}}Λ
8:    v_{j:}^{t+1} = v_{j:}^{t} − η ∇_{v_{j:}^{t}}Λ
9:    w_{k:}^{t+1} = w_{k:}^{t} − η ∇_{w_{k:}^{t}}Λ
10:   C^{t+1} = C^{t} − η ∇_{C^{t}}Λ
11:   F^{t+1} = F^{t} − η ∇_{F^{t}}Λ
12: end for
13: while |Loss_{t−1} − Loss_t| > ε do
14:   Repeat steps 5–12
15:   t = t + 1
16: end while
17: return U, V, W and F
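Algorithm 2 can be sketched in numpy as follows. This is a simplified sketch under conventions of our own, not the authors' implementation: NaN marks unobserved tensor entries, and the convergence test of steps 13–16 is replaced by a fixed iteration count.

```python
import numpy as np

def cmt_factorize(A, X, Z, R=(4, 4, 2), lam=(0.5, 0.5, 0.1), eta=0.01,
                  n_iter=200, seed=0):
    """Element-wise gradient descent for the coupled factorization (eq. 18.8).
    A: (N, M, L) tensor, NaN = missing; X: (N, K) features; Z: (N, N) correlations."""
    rng = np.random.default_rng(seed)
    N, M, L = A.shape
    K = X.shape[1]
    R1, R2, R3 = R
    U = 0.1 * rng.standard_normal((N, R1))
    V = 0.1 * rng.standard_normal((M, R2))
    W = 0.1 * rng.standard_normal((L, R3))
    C = 0.1 * rng.standard_normal((R1, R2, R3))
    F = 0.1 * rng.standard_normal((R1, K))
    LZ = np.diag(Z.sum(axis=1)) - Z          # graph Laplacian L_Z = D - Z
    obs = np.argwhere(~np.isnan(A))          # observed entries a_ijk
    l1, l2, l3 = lam
    for _ in range(n_iter):
        for i, j, k in obs:
            # reconstruction error for entry (i, j, k)
            Cv = np.einsum('pqr,q->pr', C, V[j])     # contract event mode
            Cvw = np.einsum('pr,r->p', Cv, W[k])     # contract sentiment mode
            e = U[i] @ Cvw - A[i, j, k]
            gU = e * Cvw + l1 * (U[i] @ F - X[i]) @ F.T + l2 * (LZ @ U)[i] + l3 * U[i]
            gV = e * np.einsum('pqr,p,r->q', C, U[i], W[k]) + l3 * V[j]
            gW = e * np.einsum('pqr,p,q->r', C, U[i], V[j]) + l3 * W[k]
            gC = e * np.einsum('p,q,r->pqr', U[i], V[j], W[k]) + l3 * C
            gF = l1 * np.outer(U[i], U[i] @ F - X[i]) + l3 * F
            U[i] -= eta * gU; V[j] -= eta * gV; W[k] -= eta * gW
            C -= eta * gC; F -= eta * gF
    return U, V, W, C, F
```

Each observed entry contributes one stochastic-gradient step per sweep, mirroring steps 5–12 of Algorithm 2.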
18.3.5 Experiments

18.3.5.1 Data collection and description

We evaluate our proposed method on two datasets, the China A-share stock market data and the HK stock market data, from January 1, 2015 to December 31, 2015. For the A-share market, we choose 78 stocks from the Chinese Stock Index (CSI) 100, and collect the corresponding event information and sentiment polarity from Web media and social media.
The remaining 22 stocks are not included in the experiment due to the very limited information about them on the Web during this period. For the HK market, since there are far fewer retail investors than in the A-share market, the number of tweets in the social network related to HK stocks is not as large as that for A-share stocks. Thus, we only choose 13 hot stocks with relatively large numbers of tweets for the experiments.9 Next, we introduce how the data are collected:

Quantitative data: The quantitative data of the stocks in the two datasets are both collected from Wind, a widely-used financial information service provider in China. The indices we select are the stock turnovers, P/E ratios, P/B ratios and PCF ratios, which are commonly used indices for stock trading and valuation and which comprise the stock quantitative feature matrix. In addition, we also collect the closing prices and industry indices to calculate the coupled stock correlation.

Web news data: We collect 76,445 and 7,284 news articles for A-share stocks and HK stocks, respectively, from Wind, including the titles and the publication times in 2015. Each article is assigned to the corresponding stock. These Web news are originally aggregated by Wind from major financial news websites in China, such as http://finance.sina.com.cn and http://www.hexun.com. The titles are then processed to extract events as described in Section 18.3.3.3. The data is publicly available (Zhang and Yunjia, 2017).

Social media data: The sentiments for A-share stocks are extracted from Guba, an active financial social media platform where investors can post their opinions and each stock has its own discussion site. In total, we collect 6,163,056 postings from January 1, 2015 to December 31, 2015 for the 78 stocks. For each posting, the content, user ID, title, numbers of comments and clicks, and publication time are extracted. We have also made this dataset publicly available (Zhang and Yao, 2011).
In addition, we crawled 3,191 tweets on the 13 HK stocks in 2015 from Xueqiu. As there are more discussions on HK stocks in Xueqiu than in Guba, we collect the sentiment information related to HK stocks from Xueqiu. In the experiments, we use the data from the first 9 months as the training set and the last 3 months as the testing set. For the A-share dataset, 47.1% of all the samples present an upward trend, while 52.3% present a downward trend
9 The codes of the 13 HK stocks are 0175, 0388, 0390, 0400, 0656, 0700, 1030, 1766, 1918, 2318, 2333, 3333 and 3968.
and 0.6% keep still. For the HK stock market data, 54.9% of all the samples present an upward trend, while 42.3% present a downward trend and 2.8% keep still. To remove ambiguous samples and obtain deterministic price movement trends, we set the movement scope threshold at 2%. In particular, when the price change ratio of a stock is larger than 2% (or smaller than −2%) on a trading day, its price movement direction is considered up (or down). Otherwise, the sample is considered to show little fluctuation (or to keep still) and is excluded from our experiments. Therefore, our prediction task can be considered a binary classification task, and can be addressed by a binary classifier.

18.3.5.2 Comparison methods and metrics

The following baselines and variations of our proposed model are implemented for comparison.

• SVM: We directly concatenate the stock quantitative features, event-mode features and sentiment-mode features into a linear vector, and use it as the input of an SVM for prediction.

• PCA+SVM: Principal Component Analysis (PCA) is applied to reduce the dimensionality of the original concatenated vector, and the new vector is then used as the input of the SVM.

• TeSIA: The tensor-based learning approach proposed in Li et al. (2015b) is the state-of-the-art baseline that utilizes multi-source information. Specifically, it uses a 3rd-order tensor to model the firm-mode, event-mode and sentiment-mode data. Note that it constructs an independent tensor for every stock on each trading day, without considering the correlations between stocks.

• CMT: CMT denotes our proposed model, i.e., two auxiliary matrices and a tensor are factorized together. Note that the stock correlation matrix can be obtained by four methods; the default CMT uses the coupled stock correlation method. We will also evaluate CMT with the other stock correlation matrices, including the co-evolving direction correlation, co-evolving p-change correlation and user-perceived correlation.
• CMT-Z: To study the effectiveness of the stock correlation matrix Z, we use the CMT model without Z as a baseline.

• CMT-Z-X: To study the effectiveness of the two auxiliary matrices X and Z, we use only the tensor decomposition as a baseline.

Following the previous studies (Ding et al., 2015; Xie et al., 2013), the standard measures of accuracy (ACC) and the Matthews Correlation Coefficient
(MCC) are used as evaluation metrics. Larger values of the two metrics mean better performance. ACC is one of the most useful metrics for stock price movement prediction; however, it may fall short when the two classes are of very different sizes. MCC is more useful than ACC when the class distribution is uneven. The MCC is in essence a correlation coefficient value between −1 and +1: a coefficient of +1 represents a perfect prediction, 0 is no better than random prediction, and −1 indicates total disagreement between prediction and observation. MCC is defined as

MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},    (18.10)
where TP is the number of true positives, TN the number of true negatives, FP the number of false positives and FN the number of false negatives in the confusion matrix.

18.3.5.3 Prediction results

Table 18.4: Prediction results of movement direction.

                     China A-share            HK
Method               ACC       MCC       ACC       MCC
SVM                  55.37%    0.014     55.13%    0.08
PCA+SVM              57.50%    0.104     56.07%    0.092
TeSIA                60.63%    0.190     60.38%    0.205
CMT-Z-X              59.03%    0.162     59.36%    0.137
CMT-Z                60.25%    0.306     60.29%    0.252
CMT                  62.50%    0.409     61.73%    0.331

Table 18.4 shows the average stock movement prediction accuracy for all the 78 A-share stocks and the 13 HK market stocks during the testing period. For the China stock market, it can be observed that our proposed CMT achieves the best performance, with 62.5% in ACC and 0.409 in MCC. The SVM method shows the worst performance, indicating that a simple linear combination of the features cannot capture the coupling effects, which are crucial for improving performance. PCA+SVM performs better than SVM, mainly because some noise can be removed by PCA. CMT outperforms CMT-Z, indicating that the stock correlation information plays an important role. Analogously, CMT-Z achieves better performance than CMT-Z-X, validating the effectiveness of the stock quantitative feature matrix. Therefore, both of the
two auxiliary matrices provide effective knowledge to assist stock prediction. It can also be observed that, compared to TeSIA, CMT improves accuracy by 3% and improves MCC substantially, by 115%. We also evaluate our proposal on the HK stock market dataset, and the prediction results are similar to those on the A-share dataset. This demonstrates that the multi-task learning idea can be successfully applied to the stock prediction problem, resulting in a significant improvement in prediction accuracy. In addition to its superior prediction accuracy over TeSIA, our method also requires far fewer parameters to tune. Note that TeSIA ignores the relations among the stocks and treats each stock's prediction as a single-task regression learning problem. Due to the different characteristics of different stocks, learning a common regression function for all the stocks may not achieve the best performance, so TeSIA has to learn a different regression function for each stock. Consequently, the number of parameters to tune in TeSIA grows linearly with the number of stocks, which is prohibitive given the large volume of stocks in the market.

18.3.5.4 Results with different stock correlation methods

As the stock correlation matrix can be obtained with different methods, we compare their performance in our framework, and the results are shown in Table 18.5. It can be observed that for both datasets, the coupled stock correlation method outperforms all the other methods in terms of both ACC and MCC. This implies that considering the coupled effects between attributes does help in obtaining more realistic correlations between stocks, and eventually results in better prediction capability.
Table 18.5: Results with different stock correlation methods.

                                     China A-share            HK
Method                               ACC       MCC       ACC       MCC
Without stock correlation matrix     60.25%    0.306     60.29%    0.252
Co-evolving direction correlation    62.06%    0.382     61.39%    0.290
Co-evolving p-change correlation     60.91%    0.300     61.22%    0.205
User-perceived correlation           61.64%    0.402     60.77%    0.263
Coupled stock correlation (CMT)      62.50%    0.409     61.73%    0.331

As a comparison, for the China A-share stocks, the performance with the user-perceived correlation takes second place in terms of MCC and third place in terms of ACC, indicating that such information obtained from the social network can also capture effective correlations. This makes sense, as the co-occurrence relationships between stocks in users' postings are actually the
results of the fundamental analysis to some extent, and they represent the collective intelligence of the retail investors in the stock market. Moreover, this correlation, with its contexts in social media, is more interpretable than the other quantitative methods. However, using the user-perceived correlations with the HK stock data is not as effective as with the China A-share data. The possible reason is that the number of HK stocks we choose and the amount of social media data related to HK stocks are smaller than those for the China A-share stocks, and thus the obtained correlations for the HK stocks may not be as accurate. The co-evolving direction correlation performs better than the co-evolving p-change correlation for both datasets. The difference between them is that the p-change correlation also takes the fluctuation scope of the price into account. This indicates that the movement direction is more important than the movement scope for effectively mining the correlations. Compared to the approach without the stock correlation matrix as an input, almost all the methods with stock correlations show better performance, validating the effectiveness of this knowledge in stock prediction.

18.3.5.5 Parameter sensitivity

There are three main parameters, λ1, λ2 and λ3, in the proposed model (equation (18.8)). Next, we conduct an experiment to study the effects of these parameters on the model performance. To find the best setting of the parameters, we first fix the values of two parameters and then tune the value of the third. With the discovered best setting of one parameter, we next tune another parameter in the same way. After several iterations, we can find the best parameter settings. Figures 18.11(a)–18.11(c) show the effects of the three parameters on the model performance, respectively. For each figure, we fix the best settings of two parameters, and tune the value of the third one.
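The alternating, one-parameter-at-a-time tuning procedure just described can be sketched as follows (evaluate is a hypothetical function returning validation accuracy for a given parameter setting; the grids are illustrative):

```python
def coordinate_search(evaluate, grids, n_rounds=3):
    """Tune one parameter at a time while fixing the others at their current
    best values; repeat for a few rounds until the settings stabilize."""
    best = {name: values[0] for name, values in grids.items()}
    for _ in range(n_rounds):
        for name, values in grids.items():
            scores = {}
            for v in values:
                trial = dict(best)   # hold the other parameters fixed
                trial[name] = v
                scores[v] = evaluate(trial)
            best[name] = max(scores, key=scores.get)
    return best
```

A few rounds of this coordinate-wise sweep correspond to the "several iterations" mentioned above.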
It can be observed that the accuracy is relatively stable for both datasets when λ1 is within [0.21, 0.71], λ2 is within [0.21, 0.71] and λ3 is within [0.01, 0.4].

18.3.5.6 Case study on the stock correlation matrix

To demonstrate the effectiveness and interpretability of the stock correlation results, we select two stocks, Sany Heavy Industry (stock code 600031) and China Railway Construction (stock code 601186), from the A-share stocks as two examples to show their correlations with other stocks. Specifically, using the coupled stock similarity method, the top 10 most correlated stocks with Sany Heavy Industry are listed in Table 18.6. It can be observed
Figure 18.11: Sensitivity analysis on λ1, λ2 and λ3. The x-axis represents the various values for the parameter and the y-axis denotes the test accuracy.

Table 18.6: The stocks with the top 10 highest correlation values of Sany Heavy Industry.

Rank  Code       Name                                            Industry      Correlation
1     600150.SH  China CSSC Co., Ltd.                            Machinery     0.806
2     601989.SH  China Shipbuilding Industry Co., Ltd.           Machinery     0.800
3     601766.SH  CRRC Co., Ltd.                                  Machinery     0.773
4     601668.SH  China State Construction Engineering Co., Ltd.  Construction  0.659
5     601618.SH  Metallurgical Corporation of China Ltd.         Construction  0.616
6     601600.SH  Aluminum Corporation of China Limited           Material      0.615
7     600010.SH  Inner Mongolia Baotou Steel Union Co., Ltd.     Material      0.613
8     601088.SH  China Shenhua Energy Co., Ltd.                  Energy        0.612
9     601699.SH  Power Construction Co., Ltd.                    Construction  0.608
10    601186.SH  China Railway Construction Co., Ltd.            Construction  0.607
Table 18.7: The stocks with the top five lowest correlation values of Sany Heavy Industry.

Rank  Code       Name                                            Industry            Correlation
1     600518.SH  Kangmei Pharmaceutical Co., Ltd.                Medical             0.183
2     300104.SZ  Leshi Internet Information & Technology Corp.,  Internet            0.200
                 Beijing
3     600637.SH  Shanghai Oriental Pearl Media Co., Ltd.         Media               0.202
4     601727.SH  Shanghai Electric Group Co., Ltd.               Appliance           0.227
5     600050.SH  China United Network Communications Co., Ltd.   Telecommunications  0.233
Table 18.8: The stocks with the top 10 highest correlation values of China Railway Construction.

Rank  Code       Name                                            Industry      Correlation
1     601390.SH  China Railway Group Limited                     Construction  0.894
2     601800.SH  China Communications Construction Co., Ltd.     Construction  0.869
3     601668.SH  China State Construction Engineering Co., Ltd.  Construction  0.862
4     601699.SH  Power Construction Co., Ltd.                    Construction  0.852
5     601618.SH  Metallurgical Corporation of China Ltd.         Construction  0.831
6     601818.SH  China Everbright Bank Co., Ltd.                 Bank          0.602
7     000001.SZ  Ping An Bank Co., Ltd.                          Bank          0.583
8     601766.SH  CRRC Co., Ltd.                                  Machinery     0.569
9     600015.SH  Hua Xia Bank Co., Ltd.                          Bank          0.563
10    600031.SH  Sany Heavy Industry Co., Ltd.                   Machinery     0.554
that the most relevant stocks (top 3) belong to the same industry category (i.e., Machinery) as Sany, with correlation values larger than 0.7. The other correlated stocks do not belong to the same industry, but they are related to the upstream or downstream of the machinery industry, with correlation values ranging from 0.6 to 0.7. Table 18.7 shows the top five lowest correlated stocks with Sany, which are all unrelated stocks with correlation values of around 0.2 or lower. Similar observations can be obtained for the correlation results for China Railway Construction, shown in Tables 18.8 and 18.9. Hence, the correlation values exhibit good interpretability, and do play significant roles in our framework.
Table 18.9: The stocks with the top five lowest correlation values of China Railway Construction.

Rank  Code       Name                                            Industry     Correlation
1     600518.SH  Kangmei Pharmaceutical Co., Ltd.                Medical      0.112
2     600893.SH  AVIC Aviation Engine Corporation PLC.           Aerospace    0.144
3     600637.SH  Shanghai Oriental Pearl Media Co., Ltd.         Media        0.146
4     002024.SZ  Suning Commerce Group Co., Ltd.                 Elec retail  0.160
5     601727.SH  Shanghai Electric Group Co., Ltd.               Appliance    0.189
18.4 Multi-Source Multiple Instance Learning for Market Composite Index Movements10 18.4.1 Overview In this section, we aim to learn a predictive model for describing the fluctuations of the stock market index, e.g., Dow Jones Index or SSE (Shanghai Stock Exchange) Composite Index. Please note that this goal is different from the studies in Sections 18.2 and 18.3 that aim to predict the movements for each individual stock. We still perform broad learning by utilizing various sources of data, that is, the historical quantitative data, the social media and Web news. The essential features we extract include the event representations from news and the sentiments from social media, and these features are for the overall market rather than for individual stocks. To conduct predictions, firstly, we propose a novel method to capture the events from short texts. Different from the event extraction method adopted in Section 18.3 that only considers verb or gerund in the sentence, here we model events by considering the structure information of the whole sentence, aiming to capture more effective event information. Specifically, structured events are extracted and then used as the inputs for Restricted Boltzmann Machines (RBM) to do the pre-training. After that, the output vectors from RBMs are used as the inputs to a recently proposed sentence2vec framework (Le and Mikolov, 2014), in order to achieve the event embeddings as features. Secondly, we propose an extension of the Multiple Instance Learning (MIL) model which can effectively integrate the event embeddings, sentiments and the historical quantitative data for accurate predictions. Compared to the 10
10 The essential part of this work has been published in IEEE Access, Vol. 6, Xi Zhang, Siyu Qu, Jieyun Huang, Binxing Fang and Philip S. Yu, 50720–50728, 2018. © 2018 IEEE.
page 717
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
718
9.61in x 6.69in
b3568-v1-ch18
X. Zhang & P. S. Yu
Figure 18.12: An example of the news events identified that are responsible for the Shanghai Composite Index change on January 26, 2016. The x-axis is the timeline. The left y-axis is the probability of each event leading to the target change. The right y-axis is the composite index in Shanghai Stock Exchange.
tensor-based model proposed in Section 18.3, we do not consider the stock correlations here, as we perform predictions for the market index: rich data can be provided from various data sources for the overall market, without the sparsity problem that exists for individual stocks. Furthermore, one benefit of the MIL model is that it can identify the specific factors (i.e., precursors) that incur the changes in the composite index, making the prediction more interpretable. Figure 18.12 shows an example of the precursors identified by our model; the dots with numbers denote the probabilistic estimates for the events leading to the index change on January 26, 2016. We evaluate our framework on two one-year datasets, and the results show that our proposal can outperform the state-of-the-art baselines.

18.4.2 Preliminaries

We introduce multiple instance learning (MIL), the restricted Boltzmann machine (RBM) (Smolensky, 1986) and sentence2vec (Le and Mikolov, 2014).

18.4.2.1 Multiple instance learning

The multiple instance learning (MIL) paradigm is a form of weakly supervised learning. Training instances are arranged in sets, generally called bags or groups, and a label is provided for an entire group instead of for individual instances.
Improving the Stock Market Prediction with Social Media via Broad Learning
Negative bags do not contain any positive instances, whereas positive bags contain at least one positive instance (Dietterich et al., 1997). The standard MIL assumptions can be relaxed to fit a specific application, e.g., positive and negative bags may differ only in their instance distributions. Various applications and comparisons of different MIL methods are given in Amores (2013). The common MIL approach predicts the group-level label; Liu et al. (2012), however, proposed an approach based on K nearest neighbors (K-NN) to identify instance-level labels, especially the labels of key instances in groups. Kotzias et al. (2015) predicted labels for sentences given labels for reviews, which can be used to detect sentiments. The work most related to ours is Ning et al. (2016), which proposed an event forecasting framework via nested multiple instance learning. However, it uses only one data source and simple event features, which may not be sufficient for stock market prediction.

18.4.2.2 RBM

The restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network, and has been applied in various applications such as dimensionality reduction (Hinton and Salakhutdinov, 2006) and classification (Larochelle and Bengio, 2008). Given a set of input data, an RBM can estimate the probability distribution over this set of inputs. Figure 18.13 shows the structure of an RBM, which is a two-layer neural net. The first layer is called the visible (or input) layer, and the second layer is the hidden layer. The network has m nodes in the visible layer and n nodes in the hidden layer. The m visible nodes are independent of each other and each of them is connected only to the n hidden nodes; similarly, each of the hidden nodes is connected only to the visible nodes.
Figure 18.13: A graphical depiction of an RBM, with visible nodes v1, . . . , vm (biases b1, . . . , bm) and hidden nodes h1, . . . , hn (biases c1, . . . , cn).
We denote by w_{n×m} the weight matrix of the edges between the visible nodes and the hidden nodes; b and c are the bias vectors of the visible nodes and hidden nodes, respectively. In our model, each event is represented as an m-dimensional one-hot vector, which forms the visible layer of the RBM. Our target is to estimate the n-dimensional hidden layer that approximates the input layer as closely as possible. The hidden layer is then used as the initial vector in sentence2vec. For each visible layer, an energy model is applied to learn the distribution of the corresponding hidden layer. Specifically, each connection between a visible node and a hidden node carries energy, and the RBM energy function is defined as

E(v, h) = -\sum_{i=1}^{n}\sum_{j=1}^{m} w_{ij} v_j h_i - \sum_{j=1}^{m} b_j v_j - \sum_{i=1}^{n} c_i h_i.   (18.11)
Please note that {w, b, c} are the parameters of the RBM. We can then obtain the joint probability of v and h as

P(v, h) = \frac{e^{-E(v,h)}}{\sum_{v,h} e^{-E(v,h)}}.   (18.12)
With the joint probability, we can estimate P(v) and P(h|v) by computing the marginal distributions of P(v, h):

P(v) = \frac{\sum_{h} e^{-E(v,h)}}{\sum_{v,h} e^{-E(v,h)}},   (18.13)

P(h \mid v) = \frac{e^{-E(v,h)}}{\sum_{h} e^{-E(v,h)}}.   (18.14)
In order to fit the input data as closely as possible, we estimate the parameters of the RBM by maximizing log P(v). Letting θ denote the parameters {w, b, c}, we can obtain the derivative of log P(v) as

\frac{\partial \log P(v)}{\partial \theta} = E_{P(h|v)}\!\left[-\frac{\partial E(v,h)}{\partial \theta}\right] - E_{P(h,v)}\!\left[-\frac{\partial E(v,h)}{\partial \theta}\right].   (18.15)
As a probabilistic graphical model, the RBM can be trained with sampling algorithms such as Gibbs sampling or contrastive divergence (CD). Finally, we obtain the distribution of the hidden nodes, which is used as the initial vector for sentence2vec.
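The CD-based training just described can be sketched in a few lines of NumPy. This is an illustrative toy, not the implementation used in the chapter: a Bernoulli RBM trained with CD-1 updates on synthetic one-hot "events" (all dimensions, the learning rate and the data are arbitrary choices for the example).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RBM:
    """Bernoulli RBM trained with one step of contrastive divergence (CD-1)."""
    def __init__(self, m_visible, n_hidden, lr=0.1):
        self.w = rng.normal(0, 0.01, size=(n_hidden, m_visible))  # w_{ij}
        self.b = np.zeros(m_visible)  # visible biases b_j
        self.c = np.zeros(n_hidden)   # hidden biases c_i
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.w.T + self.c)   # P(h_i = 1 | v)

    def visible_probs(self, h):
        return sigmoid(h @ self.w + self.b)     # P(v_j = 1 | h)

    def cd1_update(self, v0):
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hidden layer
        pv1 = self.visible_probs(h0)                      # one-step reconstruction
        ph1 = self.hidden_probs(pv1)
        # CD-1 approximation of d log P(v)/d theta: data term minus model term
        self.w += self.lr * (ph0.T @ v0 - ph1.T @ pv1) / len(v0)
        self.b += self.lr * (v0 - pv1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)
        return np.mean((v0 - pv1) ** 2)  # reconstruction error

# Toy one-hot "events": 20-dimensional visible layer, 8 hidden nodes.
data = np.eye(20)[rng.integers(0, 5, size=200)]  # only 5 distinct patterns
rbm = RBM(m_visible=20, n_hidden=8)
errs = [rbm.cd1_update(data) for _ in range(200)]
features = rbm.hidden_probs(data)  # hidden activations passed downstream
```

After training, `features` plays the role of the pre-trained vectors that are fed into sentence2vec in Section 18.4.4.1.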
18.4.2.3 Sentence2vec

The word2vec model (Mikolov et al., 2013a) is used for learning vector representations of words, called "word embeddings". One of the word2vec models is the continuous bag-of-words (CBOW) model, which predicts a target word (e.g., "mat") from source context words ("the cat sits on the"). Inspired by this idea, sentence2vec was proposed to learn a representation for a whole sentence. Compared to the word2vec CBOW model, the differences are as follows. (1) A sentence id is added during the training process so that each sentence in the training corpus has a unique id. The sentence id is also mapped into a vector, called the sentence vector, which is the final vector we want; within each sentence, the dimension of the sentence vector is the same as that of the word vectors. (2) In the training process, the sentence vector and the word vectors of the context are concatenated as the input to the softmax. We show the structure of sentence2vec in Figure 18.14. The sentence vector represents the information missing from the current context and can act as a memory of the topic of the sentence. Therefore, every time we predict the probability of a word (i.e., w(t) in Figure 18.14), we use the semantic information of the entire sentence. (3) After training, the sentence vector is obtained and used as a feature for the proposed model.
Figure 18.14: The framework of sentence2vec.
Table 18.10: Notations in our model. (The symbol Σ for the set of multi-source super groups is reconstructed from its use in equation (18.23) and Section 18.4.3.3.)

Notation | Definition
Σ = {Γ} | A set of n Γ
S = {G} | A set of n G
Γ = {Ci}, i ∈ {1, . . . , t} | Multi-source super group
G = {Ξi}, i ∈ {1, . . . , t} | News super group: a set of t "groups"
Ci = {Xi, di, si} | An element of Γ
Ξi = {xij}, j ∈ {1, . . . , ni} | An element of G; ni is the number of instances in group Ξi
xij ∈ R^{V×1} | A V-dimensional vector of news, the jth instance in Ξi
pij ∈ [0, 1] | The probability of xij in group Ξi of the news super group being positive
pm−i ∈ [0, 1] | The probability of news group i in the news super group being positive
di ∈ R^{3×1} | A 3-dimensional vector of stock market data on day i (average price, market index change and turnover rate)
si ∈ R^{2×1} | A 2-dimensional vector of sentiment on day i (positive and negative)
pd−i ∈ [0, 1] | The probability of the stock market data on day i indicating a positive multi-source super group label
ps−i ∈ [0, 1] | The probability of the sentiment on day i indicating a positive multi-source super group label
Pi ∈ [0, 1] | The probability of the multi-source information (i.e., news, stock quantitative data and sentiment data) on day i being positive
Π ∈ [0, 1] | The estimated probability for a multi-source super group
Y ∈ {−1, +1} | Label of a multi-source super group
18.4.3 Multi-source multiple instance model11

We first state and formulate the problem, and then propose the multi-source multiple instance (M-MI) framework and the corresponding learning process. Before going into the details of our framework, we define some important notations, as shown in Table 18.10.

18.4.3.1 Problem statement

Stock markets are impacted by various factors, such as the trading volume, news events and investors' emotions. Thus, relying on a single data source (e.g., historical trading data) is not sufficient to make accurate predictions. The objective of our study is to develop a multi-source data integration approach to predict the stock market index movement. Specifically, given a collection of economic news, social posts and historical trading data,
11 The essential part of this work has been published in IEEE Access, Vol. 6, Xi Zhang, Siyu Qu, Jieyun Huang, Binxing Fang and Philip S. Yu, 50720–50728, 2018.
Figure 18.15: The system framework of our proposed model. Stock quantitative data, sentiments extracted from social media by a sentiment analyzer, and events extracted from Web news feed the multi-source multiple instance (M-MI) model.
we focus on forecasting the movements of the Shanghai Securities Composite Index in China. The framework is shown in Figure 18.15. Moreover, we aim to identify the key factors that have decisive influences on the final index movement, which may be influential news, collective sentiments or an important quantitative index in the historical trading data. These key factors are supporting evidence for further analysis and are referred to as precursors of the index movement. Formally, in order to predict the stock market price movement on day t + k, we assume that there is a group of news articles for each day i (i < t), denoted as Ξi. The groups of news articles on t consecutive days are organized into a super group G = {Ξ1, . . . , Ξt}. The change in the stock market movement on day t + k is denoted as y_{t+k} ∈ {−1, +1}, where +1 denotes an index rise and −1 an index decline. In addition to the news articles, the sentiment and quantitative indices on day i (denoted as si and di, respectively) are also taken into consideration. The forecasting problem can then be modeled as a mathematical function f(G, s_{1:t}, d_{1:t}) → y_{t+k}, indicating that we map the multi-source information to an indicator (i.e., label) k days in the future from day t, where k is the number of lead days that we aim to forecast. In the learning process, a weight is obtained for each piece of input information, which reveals the probability of that information signifying the rise or decline of the market index on the target day. In this way, we are able to identify the precursor set as a set of news articles, sentiments or quantitative signals whose probability values are above a given threshold τ.

18.4.3.2 The proposed approach

Our predictive approach is based on the multiple instance learning algorithm, that is, a group of instances is given a group label, which is assumed to be an association function (e.g., OR, average) of the instance-level labels.
Our work further distinguishes the instance-level labels,
group-level labels, news super group labels and multi-source super group labels. One aim of our work is to predict the multi-source super group label (i.e., the target label in our work) that indicates the rise or decline of the stock market index; meanwhile, we can also estimate the instance-level probabilities indicating how related a specific instance is to the index movement (i.e., the target label), as well as the source-specific probability that reveals how related a specific source is to the index movement. For a given day, we first model the instance-level probability p_{ij} relating news article j on day i to the target label with a logistic function, that is,

p_{ij} = \sigma(w_m^T x_{ij}) = \frac{1}{1 + e^{-w_m^T x_{ij}}},   (18.16)

where w_m denotes the weight vector for the news articles. The higher the probability p_{ij}, the more related article j is to the target label. If the probability is larger than a predetermined threshold τ, article j is identified as a precursor. The probability for all the news articles on a given day i is computed as the average of the probabilities of the individual news articles, that is,

p_{m-i} = \frac{1}{n_i} \sum_{j=1}^{n_i} p_{ij}.   (18.17)
In addition to the news articles, we also model the probability p_{d-i} for the stock quantitative data and the probability p_{s-i} for the sentiments on day i, respectively:

p_{d-i} = \sigma(w_d^T d_i) = \frac{1}{1 + e^{-w_d^T d_i}},   (18.18)

p_{s-i} = \sigma(w_s^T s_i) = \frac{1}{1 + e^{-w_s^T s_i}},   (18.19)

where w_d denotes the weight vector of d_i and w_s denotes the weight vector of s_i. We then model the probability P_i for the multi-source information on day i as

P_i = \theta_0 p_{m-i} + \theta_1 p_{d-i} + \theta_2 p_{s-i} = \theta (p_{m-i}, p_{d-i}, p_{s-i})^T,   (18.20)

where θ_0, θ_1 and θ_2 denote the weights of p_{m-i}, p_{d-i} and p_{s-i}, respectively, and θ denotes the source weight vector. We then model the probability of the multi-source super group as the average of the probabilities over t days, that is,

\Pi = \frac{1}{t} \sum_{i=1}^{t} P_i.   (18.21)
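Equations (18.16)–(18.21) amount to a small forward computation. The following NumPy sketch (random inputs and weights, purely illustrative) shows how the instance, source and super-group probabilities fit together, including the threshold-based precursor identification described in Section 18.4.3.1.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes: V-dim event embeddings, 3-dim quant data, 2-dim sentiment.
t, V = 5, 16                       # history days, event-embedding size
news = [rng.normal(size=(rng.integers(3, 6), V)) for _ in range(t)]  # x_ij per day
d = rng.normal(size=(t, 3))        # (average price, index change, turnover rate)
s = rng.normal(size=(t, 2))        # (positive, negative) sentiment
w_m = rng.normal(size=V)           # weight vectors (would be learned in practice)
w_d, w_s = rng.normal(size=3), rng.normal(size=2)
theta = np.array([0.5, 0.3, 0.2])  # source weights theta_0, theta_1, theta_2

p_m = np.array([sigmoid(x @ w_m).mean() for x in news])  # eqs (18.16)-(18.17)
p_d = sigmoid(d @ w_d)                                   # eq (18.18)
p_s = sigmoid(s @ w_s)                                   # eq (18.19)
P = theta[0] * p_m + theta[1] * p_d + theta[2] * p_s     # eq (18.20)
Pi = P.mean()                                            # eq (18.21)
label = 1 if Pi >= 0.5 else -1                           # predicted super-group label

# Precursor identification: instances whose probability exceeds threshold tau.
tau = 0.8
precursors = [(i, j) for i, x in enumerate(news)
              for j, p in enumerate(sigmoid(x @ w_m)) if p > tau]
```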
As the influences of the news articles usually last for a number of days, we assume the probabilities on two consecutive days are essentially similar,
which can be represented by minimizing the following cost:

g(C_i, C_{i-1}) = (P_i - P_{i-1})^2.   (18.22)
Given a set of true labels Y, we then define the following objective function to aggregate all the costs:

J(w_m, w_d, w_s, \theta) = \frac{\beta}{n} \sum_{\Gamma \in \Sigma} f(\Gamma, Y, w_m, w_d, w_s, \theta)
+ \frac{1}{n} \sum_{C_i, C_{i-1} \in \Gamma;\, \Gamma \in \Sigma} \frac{1}{t} \sum_{i=1}^{t} g(C_i, C_{i-1}, w_m, w_d, w_s, \theta)
+ \frac{1}{n} \sum_{x_{ij} \in \Xi_i;\, \Xi_i \in G;\, G \in S} \frac{1}{t} \sum_{i=1}^{t} \frac{1}{n_i} \sum_{j=1}^{n_i} h(x_{ij}, w_m)
+ \lambda_m R(w_m) + \lambda_d R(w_d) + \lambda_s R(w_s) + \lambda_\theta R(\theta),   (18.23)

where f(Γ, Y, w_m, w_d, w_s, θ) = −I(Y = 1) log Π − I(Y = −1) log(1 − Π) is the log-likelihood function representing the difference between the predicted label and the true label, and I(·) is the indicator function. h(x_{ij}, w_m) = max(0, m_0 − sgn(p_{ij} − p_0) w_m^T x_{ij}) represents the instance-level cost, where sgn is the sign function, m_0 is a crucial margin parameter used to separate the positive and negative instances from the hyperplane in the feature space, and p_0 is a threshold parameter that determines the positiveness of an instance. R(w_m), R(w_d), R(w_s) and R(θ) are regularization terms, and β and the λ's are constants that control the function.

18.4.3.3 Model learning

The goal of model learning is to estimate the parameters w_m, w_d, w_s and θ that minimize the cost shown in equation (18.23). We randomly choose a set (Γ, Y) from Σ, and use online stochastic gradient descent optimization. The gradient of J(w_m) with respect to w_m can be calculated as

\nabla J(w_m) = \frac{\partial J(w_m)}{\partial w_m}
= -\frac{Y - \Pi}{\Pi(1 - \Pi)} \frac{\beta}{t} \sum_{i=1}^{t} \frac{\theta_0}{n_i} \sum_{j=1}^{n_i} p_{ij}(1 - p_{ij}) x_{ij}
+ \frac{1}{t} \sum_{i=1}^{t} 2(P_i - P_{i-1}) \frac{\theta_0}{n_i} \sum_{j=1}^{n_i} p_{ij}(1 - p_{ij}) x_{ij}
- \frac{1}{t} \sum_{i=1}^{t} 2(P_i - P_{i-1}) \frac{\theta_0}{n_{i-1}} \sum_{j=1}^{n_{i-1}} p_{(i-1)j}(1 - p_{(i-1)j}) x_{(i-1)j}
- \frac{1}{t} \sum_{i=1}^{t} \frac{1}{n_i} \sum_{j=1}^{n_i} \operatorname{sgn}(p_{ij} - p_0)\, x_{ij}\, (o_{ij}) + \lambda_m w_m.   (18.24)
The gradient of J(wd ) with respect to wd is ∇J(wd ) =
t ∂J(wd ) Y − Π βθ1 =− pd−i (1 − pd−i )di ∂wd Π(1 − Π) t i=1
t θ1 + 2(Pi − Pi−1 )pd−i (1 − pd−i )di t i=1
t θ1 + 2(Pi − Pi−1 )pd−(i−1) (1 − pd−(i−1) )d(i−1) + λd wd . t i=1
(18.25) We omit the result of ∇J(ws ) as it can be calculated in the similar way as in equation (18.25). The gradient of J(θ) with respect to θ is ∇J(θ) =
t Y −Π β ∂J(θ) =− (pm−i , pd−i , ps−i )T ∂θ Π(1 − Π) t i=1
+
1 t
t
2(Pi − Pi−1 )(pm−i , pd−i , ps−i )T
i=1 t
1 − 2(Pi − Pi−1 )(pm−(i−1) , pd−(i−1) , ps−(i−1) )T + λθ θ. t i=1
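As a sanity check on gradients of this form, the snippet below implements a simplified, single-super-group version of the objective (the log-likelihood term plus the smoothness cost of eq. (18.22) and an assumed L2 regularizer R(θ) = ‖θ‖², with the per-day source probabilities held fixed) and verifies an analytic θ-gradient, derived directly by the chain rule, against central finite differences. It is a check of the derivation pattern, not the chapter's full learning procedure.

```python
import numpy as np

rng = np.random.default_rng(3)
t = 6
q = rng.uniform(0.05, 0.95, size=(t, 3))  # per-day (p_m, p_d, p_s), held fixed
Y, beta, lam = 1, 1.0, 0.01               # label, beta, and lambda_theta

def J(theta):
    P = q @ theta                                     # eq (18.20)
    Pi = P.mean()                                     # eq (18.21)
    f = -np.log(Pi) if Y == 1 else -np.log(1 - Pi)    # log-likelihood term
    smooth = np.sum((P[1:] - P[:-1]) ** 2) / t        # eq (18.22), aggregated
    return beta * f + smooth + lam * theta @ theta    # R(theta) = ||theta||^2 assumed

def grad_J(theta):
    P = q @ theta
    Pi = P.mean()
    dfdPi = -1.0 / Pi if Y == 1 else 1.0 / (1 - Pi)   # chain rule on the likelihood
    g = beta * dfdPi * q.mean(axis=0)                 # d(Pi)/d(theta) = mean of q rows
    g = g + (2.0 / t) * ((P[1:] - P[:-1])[:, None] * (q[1:] - q[:-1])).sum(axis=0)
    return g + 2 * lam * theta

theta = np.array([0.5, 0.3, 0.2])
eps = 1e-6
# Central finite differences, one coordinate at a time.
num = np.array([(J(theta + eps * e) - J(theta - eps * e)) / (2 * eps)
                for e in np.eye(3)])
```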
18.4.4 Feature extraction

Here, we introduce how to extract structured features from the raw texts, more specifically, extracting event representations from news articles and extracting sentiments from posts in social media; these are used as the inputs to our M-MI framework.

18.4.4.1 Event feature extraction

Recent advances in NLP techniques enable more accurate models of events with structures. We first use a syntactic analysis method to extract the
main structural information of the sentences, which is then used as the input to an RBM. The output of the RBM is a pre-trained vector that is fed into sentence2vec, which produces the event representations. The process is described in Figure 18.16. (1) Structured event extraction. With the text parser HanLP,12 we can capture the syntactic structure of a Chinese sentence, which is depicted as a three-level tree in Figure 18.16. The root node denotes the core verb, and the nodes of the second layer are the subject and the object of the verb, respectively. The child of the subject is the modifier nearest to the subject in the sentence, and likewise for the child of the object. We then connect these core words together as the structural information representing the event. (2) Training with RBM. We then map the structured event into a vector. To make the vectors better reconstruct the original events, we use an RBM as a pre-training module; directly training the event representations with sentence2vec on the raw structured events may fall into a local minimum. (3) Training with sentence2vec. Finally, we use sentence2vec, a neural probabilistic language model, to obtain the event representations.
Figure 18.16: Structured event extraction from a piece of macro news.

12 https://github.com/hankcs/HanLP.
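Step (1) above can be illustrated without a real parser. The sketch below assumes a HanLP-style dependency parse is already available as (index, word, head, relation) tuples — the sentence, tokens and relation names are all hypothetical — and assembles the three-level structure: core verb, subject and object, each paired with its nearest modifier.

```python
# Toy stand-in for a dependency parse: (index, word, head_index, relation).
# head_index == -1 marks the root (the core verb).
parsed = [
    (0, "central",   1, "amod"),   # modifier of the subject
    (1, "bank",      2, "nsubj"),  # subject
    (2, "cuts",     -1, "root"),   # core verb
    (3, "benchmark", 4, "amod"),   # modifier of the object
    (4, "rate",      2, "dobj"),   # object
]

def extract_structured_event(tokens):
    """Connect core verb, subject, object and their nearest modifiers."""
    root = next(i for i, _, h, _ in tokens if h == -1)

    def child(head, rels):
        # Nearest dependent of `head` whose relation is in `rels`, if any.
        cands = [i for i, _, h, r in tokens if h == head and r in rels]
        return min(cands, key=lambda i: abs(i - head)) if cands else None

    subj = child(root, {"nsubj"})
    obj = child(root, {"dobj", "obj"})
    words = {i: w for i, w, _, _ in tokens}
    parts = []
    for node in (subj, root, obj):
        if node is None:
            continue
        mod = child(node, {"amod", "nmod", "compound"})
        parts += ([words[mod]] if mod is not None else []) + [words[node]]
    return " ".join(parts)

event = extract_structured_event(parsed)  # "central bank cuts benchmark rate"
```

The resulting string is the kind of structured event that, after one-hot encoding, would be pre-trained by the RBM in step (2).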
18.4.4.2 Sentiment extraction

To extract sentiments from the posts in the social network, we use LDA-S, an extension of the Latent Dirichlet Allocation (LDA) model that we previously proposed to obtain topic-specific sentiments for short texts. The intuition is that extracting sentiments while discarding topics may not be sufficient, as sentiment polarities usually depend on topics or domains (Zhao et al., 2010). In other words, the same word may express different sentiment polarities under different topics. For example, the opinion word "low" in the phrase "low speed" may have a negative orientation in a traffic-related topic, whereas "low" in the phrase "low fat" in a food-related topic usually carries a positive sentiment polarity. Therefore, extracting the sentiments corresponding to different topics can potentially improve the sentiment classification accuracy. The proposed model LDA-S can infer the sentiment distribution and the topic distribution simultaneously for short texts. LDA-S consists of two steps. The first step obtains the topic distribution of each post and sets the topic to the one with the largest probability. The second step obtains the sentiment distribution of each post. The details of LDA-S are given in Zhang et al. (2017b). In this work, we adopt a sentiment word list called NTUSD (Ku and Chen, 2007), which contains 4,370 negative words and 4,566 positive words. If a word is an adjective but not in the sentiment word list, its sentiment label is set to neutral. If a word is a noun, it is considered a topic word. Otherwise, it is considered a background word. For each topic, we distinguish opinion word distributions for the two sentiment polarities, positive and negative.
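The word-typing rules used before running LDA-S can be stated directly in code. The word lists below are tiny hypothetical stand-ins for NTUSD, and the POS tags are assumed given; note that list lookup alone labels "low" as negative, which is exactly the topic-dependence ("low fat" vs. "low speed") that LDA-S is designed to correct.

```python
# Miniature stand-ins for the NTUSD positive/negative lists (hypothetical).
POSITIVE = {"strong", "gain", "healthy"}
NEGATIVE = {"weak", "loss", "low"}

def classify_word(word, pos_tag):
    """Word-typing rules: adjectives are sentiment words (list lookup,
    else neutral), nouns are topic words, everything else is background."""
    if pos_tag == "ADJ":
        if word in POSITIVE:
            return "positive"
        if word in NEGATIVE:
            return "negative"
        return "neutral"
    if pos_tag == "NOUN":
        return "topic"
    return "background"

labels = [classify_word(w, t) for w, t in
          [("low", "ADJ"), ("fat", "NOUN"), ("quarterly", "ADJ"), ("rises", "VERB")]]
# labels == ["negative", "topic", "neutral", "background"]
```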
18.4.5 Experiments

We extensively evaluate the performance of the proposed model against a number of baselines.
18.4.5.1 Data collection and description

We collect stock market-related information from January 1, 2015 to December 31, 2016, and separate it into two datasets, one for 2015 and the other for 2016. The datasets consist of three sources: the
historical quantitative data, the news and the posts in the social network, which are introduced in detail as follows:

• Quantitative data: The source of quantitative data is Wind,13 a widely used financial information service provider in China. The data we collect are the average price, market index change and turnover rate of the Shanghai Composite Index on each trading day.

• News data: We collect the news articles, including titles and release times, through Wind, obtaining 38,727 and 39,465 news articles in 2015 and 2016, respectively. The news articles are aggregated by Wind from major financial news websites in China, such as http://finance.sina.com.cn and http://www.hexun.com. We process the news titles rather than the whole articles to extract the events, as the main topic of a news article is often summed up in its title.

• Social media data: The sentiments are extracted from posts crawled from Xueqiu. We collect a total of 6,163,056 posts from January 1, 2015 to December 31, 2016. For each post, we record the posting time stamp and the content.

For each trading day, if the stock market index rises, it is a positive instance; otherwise it is a negative instance. We randomly use 90% of the instances as the training set and the remaining 10% as the testing set. We evaluate the performance of our model with varying lead days and varying history days. Lead days refers to the number of days in advance the model makes predictions, and history days indicates the number of days over which the multi-source information is utilized. The evaluation metrics we use are the F1-score and accuracy (ACC).

18.4.5.2 Comparison methods

The following baselines and variations of our proposed model are implemented for comparison. The full implementation of our framework is named the multi-source multiple instance (M-MI) model.

• SVM: The standard support vector machine is used as a basic prediction method.
During the training process, the label assigned to each instance and each group is the same as its multi-source super group label. During
13 http://www.wind.com.cn/.
the prediction phase, we obtain the predicted label for each piece of macro news, and then average the labels as the final label of the super group.

• nMIL: The Nested Multi-Instance Learning (nMIL) model (Ning et al., 2016) is the state-of-the-art baseline. This model uses only one data source, the news articles, for prediction, ignoring the impacts of the sentiments and the historical quantitative indices.

• O-MI: The Open IE Multiple Instance (O-MI) learning model differs from M-MI in the event extraction module. It adopts a previously proposed event extraction method (Ding et al., 2014) and uses Open IE (e.g., Banko et al., 2007; Etzioni et al., 2011; Fader et al., 2011) to extract event tuples from sentences. An event tuple (O1, P, O2, T) can be represented by the combination of its elements (except T): (O1, P, O2, O1 + P, P + O2, O1 + P + O2). P denotes the predicate verb of a sentence, O1 (i.e., the subject) is the nearest noun phrase to the left of P, O2 (i.e., the object) is the nearest noun phrase to the right of P, and T is the time stamp. The structured event tuples are then processed by sentence2vec to obtain event representations. Please note that the sentiment data and quantitative data are also used in this model.

• WoR-MI: The Without-RBM Multiple Instance (WoR-MI) learning model is also a variant of the M-MI framework. It differs from M-MI in that it works without the RBM module; the sentence2vec module is therefore fed with the original structured events instead of pre-trained vectors.

To make a fair comparison, we use the same set of instances and the same parameter settings to evaluate the different methods. In our proposal and the baselines, we set the predicted label to −1 if the estimated probability for a multi-source super group is less than 0.5; otherwise, we set the predicted label to 1. All experiments are performed on a dual-core Xeon E5-2690 v2 processor.
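For reference, the two evaluation metrics from Section 18.4.5.1 can be computed as follows; the label vectors are made-up examples using the chapter's {−1, +1} convention, with +1 as the positive class.

```python
def accuracy(y_true, y_pred):
    # Fraction of days whose movement direction was predicted correctly.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    # Harmonic mean of precision and recall on the positive (rise) class.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_true = [1, 1, -1, 1, -1, -1, 1, -1]   # hypothetical true index movements
y_pred = [1, -1, -1, 1, 1, -1, 1, -1]   # hypothetical predictions
acc = accuracy(y_true, y_pred)   # 6/8 = 0.75
f1 = f1_score(y_true, y_pred)    # precision = recall = 3/4, so F1 = 0.75
```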
18.4.5.3 Prediction results

Table 18.11 shows the performance of our proposal and the baselines, with both the history day and the lead day set to 1. We observe that M-MI outperforms all the baselines, while the SVM method shows the worst performance, indicating that simply tagging each news article with the label of its super group is not effective. It can also be observed that M-MI, O-MI and WoR-MI outperform nMIL, indicating that relying only on a single news data source is not sufficient for prediction. M-MI and WoR-MI perform better than O-MI, indicating that both the structured event extraction module
Table 18.11: Prediction results (history day = 1, lead day = 1).

Method | 2015 F-1 | 2015 ACC | 2016 F-1 | 2016 ACC
SVM    | 0.568 | 0.547 | 0.550 | 0.534
nMIL   | 0.578 | 0.559 | 0.554 | 0.539
O-MI   | 0.589 | 0.567 | 0.563 | 0.555
WoR-MI | 0.595 | 0.581 | 0.592 | 0.568
M-MI   | 0.618 | 0.594 | 0.598 | 0.583
Figure 18.17: F-1 scores with varying history days in 2015 (x-axis: number of history days).
and the RBM pre-training module used in our framework are effective. The results also show that, compared to nMIL, M-MI improves the F-1 by 6.9% in 2015 and 7.9% in 2016, while it improves accuracy by 4.9% and 5.2% in 2015 and 2016, respectively. Figures 18.17 and 18.18 show the F-1 scores of all the comparative models with varying history days in training (the lead day remains 1). The history day is varied from 1 to 5, and the results show that M-MI consistently performs better than the others. We can also observe that as the number of history days keeps increasing, the F-1 first goes up and then goes down. The possible reason is that the impact of the news, sentiments and quantitative indices released on a given day quickly decays after a period of time (2 or 3 days). Thus, out-of-date information should be assigned small weights or even discarded. Fortunately, our learning process can automatically assign small weights to information with weak impacts, alleviating the impact-decay problem.
Figure 18.18: F-1 scores with varying history days in 2016 (x-axis: number of history days).

Table 18.12: F-1 scores for M-MI in 2015 with varying lead days.

History day | Lead day = 1 | Lead day = 2 | Lead day = 3
1 | 0.618 | 0.586 | 0.568
2 | 0.629 | 0.589 | 0.573
3 | 0.620 | 0.577 | 0.558
4 | 0.614 | 0.557 | 0.548
5 | 0.600 | 0.552 | 0.548

Table 18.13: F-1 scores for M-MI in 2016 with varying lead days.

History day | Lead day = 1 | Lead day = 2 | Lead day = 3
1 | 0.598 | 0.577 | 0.553
2 | 0.601 | 0.580 | 0.560
3 | 0.603 | 0.563 | 0.548
4 | 0.592 | 0.551 | 0.544
5 | 0.588 | 0.545 | 0.539
In order to know how early we can predict the index movement, Tables 18.12 and 18.13 show the F-1 scores of the M-MI model with lead days varying from 1 to 3 and history days from 1 to 5, for 2015 and 2016, respectively. We observe that as the number of lead days increases, the predictive performance of our model decreases. This can be explained by the fact that the
stock market commonly reflects the available information in a timely manner. In other words, up-to-date information is immediately reflected in the index change and makes the prediction more accurate.

18.5 Summary

In this chapter, the usefulness of social media in stock market prediction is conceptually and analytically evaluated. Statistical analyses are conducted on various sources of information, and a set of broad learning models is proposed to fuse the heterogeneous data, that is, the social media data, news articles and quantitative trading data, to make accurate predictions. Effective indicators are extracted from the various data sources and shown to play important roles in the stocks' volatility. Evaluations on real-world data demonstrate the effectiveness of these methods. This chapter gives a new paradigm and learning system that can utilize Web information for stock market analysis, as required for financial investment in the Big Data era.

Bibliography

Al Nasseri, A., Tucker, A. and de Cesare, S. (2015). Big Data Analysis of Stocktwits to Predict Sentiments in the Stock Market. In Discovery Science, Springer, 13–24.

Amores, J. (2013). Multiple Instance Classification: Review, Taxonomy and Comparative Study. Artificial Intelligence, 201, 81–105.

Anderson, A., Huttenlocher, D., Kleinberg, J., Leskovec, J. and Tiwari, M. (2015). Global Diffusion via Cascading Invitations: Structure, Growth, and Homophily. In Proceedings of the 24th International Conference on World Wide Web, International World Wide Web Conferences Steering Committee, 66–76.

Arai, Y., Yoshikawa, T. and Iyetomi, H. (2015). Dynamic Stock Correlation Network. Procedia Computer Science, 60, 1826–1835.

Banko, M., Cafarella, M. J., Soderland, S., Broadhead, M. and Etzioni, O. (2007). Open Information Extraction from the Web. In IJCAI, 7, 2670–2676.

Barber, B. M. and Odean, T. (2008).
All that Glitters: The Effect of Attention and News on the Buying Behavior of Individual and Institutional Investors. Review of Financial Studies, 21(2), 785–818.

Bollen, J., Mao, H. and Zeng, X. (2011). Twitter Mood Predicts the Stock Market. Journal of Computational Science, 2(1), 1–8.

Boser, B. E., Guyon, I. M. and Vapnik, V. N. (1992). A Training Algorithm for Optimal Margin Classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, ACM, 144–152.

Cutler, D. M., Poterba, J. M. and Summers, L. H. (1989). What Moves Stock Prices. Journal of Portfolio Management, 15, 4–12.

Darrat, A. F. and Zhong, M. (2000). On Testing the Random Walk Hypothesis: A Model Comparison Approach. Financial Review, 35(3), 105–124.
Dietterich, T. G., Lathrop, R. H. and Lozano-Pérez, T. (1997). Solving the Multiple Instance Problem with Axis-Parallel Rectangles. Artificial Intelligence, 89(1), 31–71.

Ding, X., Zhang, Y., Liu, T. and Duan, J. (2014). Using Structured Events to Predict Stock Price Movement: An Empirical Investigation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-14), 1415–1425.

Ding, X., Zhang, Y., Liu, T. and Duan, J. (2015). Deep Learning for Event-Driven Stock Prediction. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI-15), 2327–2333.

Ding, X., Zhang, Y., Liu, T. and Duan, J. (2016). Knowledge-Driven Event Embedding for Stock Prediction. In Proceedings of the 26th International Conference on Computational Linguistics (COLING-16), 2133–2142.

Dong, Z. (2011). Hownet Knowledge Database, http://www.keenage.com/ (accessed 17.01.15).

Etzioni, O., Fader, A., Christensen, J., Soderland, S. and Mausam, M. (2011). Open Information Extraction: The Second Generation. In IJCAI, 11, 3–10.

Fader, A., Soderland, S. and Etzioni, O. (2011). Identifying Relations for Open Information Extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 1535–1545.

Fama, E. F. (1965). The Behavior of Stock-Market Prices. The Journal of Business, 38(1), 34–105.

Fama, E. F. and French, K. R. (1992). The Cross-Section of Expected Stock Returns. The Journal of Finance, 47(2), 427–465.

Fama, E. F. and French, K. R. (1993). Common Risk Factors in the Returns on Stocks and Bonds. Journal of Financial Economics, 33(1), 3–56.

Feldman, R., Rosenfeld, B., Bar-Haim, R. and Fresko, M. (2011). The Stock Sonar: Sentiment Analysis of Stocks Based on a Hybrid Approach. In Twenty-Third IAAI Conference.

Genuer, R., Poggi, J.-M. and Tuleau-Malot, C. (2010). Variable Selection Using Random Forests. Pattern Recognition Letters, 31(14), 2225–2236.

Hinton, G. E. (1987).
Learning Translation Invariant Recognition in a Massively Parallel Networks. In International Conference on Parallel Architectures and Languages Europe, Springer, 1–13. Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786), 504–507. Johnson, S. C. (2013). Analysis: False White House Tweet Exposes Instant Trading Dangers, http://www.reuters.com/article/us-usa-markets-tweet-idUSBRE93M1FD20 130423, (accessed 17.01.14). Karabulut, Y. (2013). Can Facebook Predict Stock Market Activity? In AFA 2013 San Diego Meetings Paper. Karatzoglou, A., Amatriain, X., Baltrunas, L. and Oliver, N. (2010). Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering. In Proceedings of the Fourth ACM Conference on Recommender Systems, ACM, 79–86. King, B. F. (1966). Market and Industry Factors in Stock Price Behavior. The Journal of Business, 39(1), 139–190. Kogan, S., Levin, D., Routledge, B. R., Sagi, J. S. and Smith, N. A. (2009). Predicting Risk from Financial Reports with Regression. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, 272–280.
page 734
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch18
Improving the Stock Market Prediction with Social Media via Broad Learning
735
Kotzias, D., Denil, M., De Freitas, N. and Smyth, P. (2015). From Group to Individual Labels Using Deep Features. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 597–606. Ku, L.-W. and Chen, H.-H. (2007). Mining Opinions from The Web: Beyond Relevance Retrieval. J. American Soc. Inf. Science Tech. 58(12), 1838–1850. Kwak, H., Lee, C., Park, H. and Moon, S. (2010). What is Twitter, a Social Network or a News Media? In Proceedings of the 19th international conference on World wide web ACM, 591–600. Larochelle, H. and Bengio, Y. (2008). Classification Using Discriminative Restricted Boltzmann Machines. In Proceedings of the 25th international conference on Machine learning ACM, 536–543. Le, Q. V. and Mikolov, T. (2014). Distributed Representations of Sentences and Documents. In ICML, 14, 1188–1196. Li, F., Xu, G. and Cao, L. (2015a). Coupled Matrix Factorization within Non-iid Context. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, 707–719. Li, Q., Jiang, L., Li, P. and Chen, H. (2015b). Tensor-based Learning for Predicting Stock Movements. AAAI, 1784–1790. Li, Q., Wang, T., Li, P., Liu, L., Gong, Q. and Chen, Y. (2014). The Effect of News and Public Mood on Stock Movements. Information Sciences, 278, 826–840. Liu, G., Wu, J. and Zhou, Z.-H. (2012). Key Instance Detection in Multi-instance Learning. In Asian Conference on Machine Learning, 253–268. Luss, R. and DAspremont, A. (2015). Predicting Abnormal Returns from News Using Text Classification. Quantitative Finance, 15(6), 999–1012. Mikolov, T., Chen, K., Corrado, G. and Dean, J. (2013a). Efficient Estimation of Word Representations in Vector Space, arXiv preprint arXiv:1301.3781. Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S. and Dean, J. (2013b). Distributed Representations of Words and Phrases and their Compositionality. In Advances in Neural Information Processing Systems, 3111–3119. Nguyen, T. H. and Shirai, K. 
(2015). Topic Modeling based Sentiment Annual Analysis on Social Media for Stock Market Prediction. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. Ning, Y., Muthiah, S., Rangwala, H. and Ramakrishnan, N. (2016). Modeling Precursors for Event Forecasting via Nested Multi-instance Learning. In ACM SIGKDD Conferences on Knowledge Discovery and Data Mining, 1095–1104. Nofsinger, J. R. (2005). Social Mood and Financial Economics. The Journal of Behavioral Finance, 6(3), 144–160. Peng, Y. and Jiang, H. (2015). Leverage Financial News to Predict Stock Price Movements Using Word Embeddings and Deep Neural Networks, arXiv preprint arXiv:1506.07220 Pindyck, R. S. and Rotemberg, J. J. (1993). The Comovement of Stock Prices. The Quarterly Journal of Economics, 1073–1104. Prechter, R. R. (1999). The Wave Principle of Human Social Behavior and the New Science of Socionomics, Vol. 1 (New Classics Library). Rumelhart, D. E., McClelland, J. L., Group, P. R. et al. (1988). Parallel Distributed Processing, Vol. 1 (IEEE). Smolensky, P. (1986). Parallel Distributed Processing: Vol. 1: Foundations, edited by D. E. Rumelhart, J. L. McClelland, MIT Press, Cambridge, 194–281. Su, Y., Zhang, X., Philip, S. Y., Hua, W., Zhou, X. and Fang, B. (2016). Understanding Information Diffusion Under Interactions. IJCAI, 3875–3881.
page 735
July 6, 2020
11:59
736
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch18
X. Zhang & P. S. Yu
Taylor, S. J. (2007). Modelling Financial Time Series (Stephen J. Taylor, Modelling Financial Time Series (Second Edition), World Scientific Publishing). Taylor, S. J. and Xu, X. (1997). The Incremental Volatility Information in One Million Foreign Exchange Quotations. Journal of Empirical Finance, 4(4), 317–340. Tetlock, P. C., SAAR-TSECHANSKY, M. and Macskassy, S. (2008). More than Words: Quantifying Language to Measure Firms’ Fundamentals. J. Finance 63, 3, pp. 1437–1467. Viswanath, B., Mislove, A., Cha, M. and Gummadi, K. P. (2009). On the Evolution of User Interaction in Facebook. In Proceedings of the 2nd ACM workshop on Online social networks (ACM), 37–42. Wang, C., Cao, L., Wang, M., Li, J., Wei, W. and Ou, Y. (2011). Coupled Nominal Similarity in Unsupervised Learning. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM-11), ACM, 973–978. Wang, S., He, L., Stenneth, L., Yu, P. S. and Li, Z. (2015). Citywide Traffic Congestion Estimation with Social Media. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 34. Wang, W. Y. and Hua, Z. (2014). A Semiparametric Gaussian Copula Regression Model for Predicting Financial Risks from Earnings Calls. In The 51st Annual Meeting of the Association for Computational Linguistics (ACL-14), 1155–1165. Wichard, J. D., Merkwirth, C. and Ogorza lek, M. (2004). Detecting Correlation in Stock Market. Physica A: Statistical Mechanics and its Applications, 344(1), 308–311. Xie, B., Passonneau, R. J., Wu, L. and Creamer, G. G. (2013). Semantic Frames to Predict Stock Price Movement. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 873–883. Zhang, X., Shi, J., Wang, D. and Fang, B. (2017a). Exploiting Investors Social Network for Stock Prediction in China’s Market. Zhang, X., Su, Y., Qu, S., Xie, S., Fang, B. and Yu, P. S. (2017b). 
IAD: Interaction-Aware Diffusion Framework in Social Networks, Tech. rep., Beijing University of Posts and Telecommunications, Key Laboratory of Trustworthy Computing and Service. Zhang, X. and Yao, Y. (2011). Guba Dataset, https://pan.baidu.com/s/1i5zAWh3, (accessed 17.01.15). Zhang, X., Yao, Y., Ji, Y. and Fang, B. (2016). Effective and Fast Near Duplicate Detection via Signature-based Compression Metrics. Mathematical Problems in Engineering 2016. Zhang, X. and Yunjia, Z. (2017). Financial Web News Dataset, https://pan.baidu.com/s/ 1mhCLJJi, (accessed 17.01.15). Zhang, X., Zhang, Y., Wang, S., Yao, Y., Fang, B. and Yu, P. S. (2018). Improving Stock Market Prediction via Heterogeneous Information Fusion. KnowledgeBased Systems, 143, 236–247, doi:https://doi.org/10.1016/j.knosys.2017.12.025, http://www.sciencedirect.com/science/article/pii/S0950705117306032. Zhao, W. X., Jiang, J., Yan, H. and Li, X. (2010). Jointly Modeling Aspects and Opinions with a Maxent-lda Hybrid. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 56–65.
page 736
July 6, 2020
11:59
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch19
Chapter 19

Sourcing Alpha in Global Equity Markets: Market Factor Decomposition and Market Characteristics

Subhransu S. Mohanty
President Emeritus, SMART International Holdings, Inc. (www.smartinternationalholdings.org)
University of Mumbai
e-mail: [email protected], [email protected]

Contents
19.1 Introduction
  19.1.1 Market risk, factor risk and sources of alpha
19.2 Dataset
19.3 Working with the Capital Asset Pricing Model (CAPM) and the Market Factor Models
  19.3.1 The capital asset pricing model (CAPM)
  19.3.2 The Fama–French 3-factor model
  19.3.3 Carhart's 4-factor model
  19.3.4 Fama and French 5-factor model
  19.3.5 Asness, Frazzini and Pedersen's 5-factor and 6-factor models
19.4 Working with Market and Market Factor Models: Where They Work, Where Not (Data Analysis and Interpretation)
  19.4.1 Market characteristics: Market risk and factor risks
  19.4.2 Factor loading and alpha enhancement
19.5 Conclusion and Relevance of Factor Risk
  19.5.1 Developed markets
  19.5.2 Emerging markets
Bibliography
Abstract
The sources of risk in a marketplace are systematic, cross-sectional and time-varying in nature. Though the CAPM provides an excellent risk-return framework and the market beta may reflect the risk associated with risky assets, there are opportunities for investors to take advantage of dimensional and time-varying return anomalies in order to improve their investment returns. In this paper, we restrict our analysis to return variations linked to market factor anomalies or factor/dimensional beta, using the Fama–French 3-factor, Carhart 4-factor, and Asness, Frazzini and Pedersen (AFP) 5- and 6-factor models. We find significant variations in explaining sources of risk across 22 developed and 21 emerging markets, with data over a long period from 1991 to 2016. Each market is unique in terms of factor risk characteristics, and market risk as explained by the CAPM is not the true risk measure. Hence, contrary to the risk-return efficiency framework, we find that lower market risk results in higher excess return in 19 out of the 22 developed markets, which is a major anomaly. Although in a majority of the markets the AFP models reduce market risk (15 countries) and enhance Alpha (11 countries), it is very interesting to note that the CAPM comes second in generating excess returns in the developed markets. We are also conscious of the fact that each market is unique in its composition and trend even over a long time horizon, and hence a generalized approach to asset allocation cannot be adopted across all the markets.

Keywords Capital asset pricing model • Small-minus-big • High-minus-low • Momentum • Robust-minus-weak • Conservative-minus-aggressive • Quality-minus-junk • Betting against beta.
19.1 Introduction

In a market, the investment behavior of investors is mainly guided by the expected return and the variance of return on their investments. Both being future-oriented and uncertain, investors assume a certain amount of risk. According to Hicks, expected returns from investments include an allowance for risk. This risk varies from security to security or from market to market. If market imperfection exists, an investor would like to maximize future returns by selecting the most robust portfolio of securities or markets which
are diversifiable based on the variation in risks, rather than holding a market where risk is non-diversifiable. Hence, we assume that each and every investor would like to create a portfolio with an optimal combination of risky assets which maximizes return. An investor would keep cash in the composition of investments only if there is a fear of loss on risky assets. In Keynesian terms of liquidity preference theory and the elasticity of demand for cash, this explains the speculative motives of investors.

19.1.1 Market risk, factor risk and sources of alpha

Though the origin of asset price behavior is the random walk theory, there is criticism of its assumptions of independent increments (Cootner) and stationarity (Osborne). Similarly, the Sharpe–Lintner capital asset pricing model (CAPM), the most significant development in modern capital market theory, is also criticized for assuming homogeneity of investors' return expectations within Markowitz's mean–variance optimization framework. While the model predicts that the expected excess return from holding an asset is proportional to the covariance of its return with the market portfolio (its "beta"), the empirical work of Black, Jensen, and Scholes demonstrated that "low beta" assets earn a higher return on average and "high beta" assets earn a lower return on average. Further, as Tobin observes, "for a given amount of risk, an investor always prefers a greater to a smaller expectation of return". Utility theorists, behavioral finance researchers and psychologists have a number of explanations for investors' varied risk and return preferences. Secondly, the model also assumes a single period, i.e., that the investment opportunity sets do not change over time, though in reality they do. Intertemporal studies show that the risk and return associated with assets and markets, as well as their correlations, change over time.
Despite the above, the CAPM provides a strong basis for the relationship among asset returns and explains a significant fraction of the variation in asset returns. Continued academic and non-academic empirical research shows that there are many anomalies that counter the risk-return efficiency of assets and their markets. Merton (1973) posits that up to four unspecified state variables lead to risk premiums that are not captured by the market risk. Since these unspecified state variables have not been identified and measured, the later empirical studies mostly deal with excess return (Alpha) generation
through factor portfolios providing different combinations of exposures to the unknown state variables within the relevant multifactor efficient set, along with the market portfolio and the risk-free asset. Notable among them are the Fama–French 3-factor model, the Carhart 4-factor model, the Fama–French 5-factor model and Asness, Frazzini and Pedersen's 6-factor model. All these models are highly intuitive and provide additional cross-sectional risk-return dimensions to the market risk. These factors are size, value, momentum, profitability, investment, quality and low beta. These factors may have some interrelationship with each other, as there may be some common characteristics between them too. In fact, many of these factors have been used by practicing investment managers in order to achieve excess return, and subsequently, index developers have started creating factor-specific (style) indices so that investors and investment managers can have an alternative way of enhancing Alpha by harvesting the risk-premia tilt associated with such factors. The decomposition of factor Alphas also helps in measuring and monitoring the risk exposure and performance attribution of investment portfolios in a more granular and accurate fashion. Summarizing the above, in a market context and within the risk-return efficiency framework, the size (Small Minus Big, SMB) factor typically divides the market into two segments, small and big. Hence it covers the entire market and, in our case, the country market representative, the index. Similarly, the valuation (High Minus Low, HML) factor divides an entire market into two segments based on their book-to-market price ratio. When these two factors are taken into account, we are evaluating the impact of these factor betas on overall market risk, and these factors have an interrelationship among themselves, as there could be high- or low-valuation situations in both small- and big-sized companies.
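The median-split logic behind the SMB and HML premia can be sketched in a few lines. This is a deliberately simplified illustration assuming a single cross-section of returns, market caps and book-to-market ratios; Fama and French's actual construction uses an independent 2x3 double sort, and the function name and sample figures here are our own:

```python
import numpy as np

def simple_smb_hml(returns, market_cap, book_to_market):
    """Stylized median-split factor returns: SMB = mean return of small
    stocks minus big stocks; HML = mean return of high book-to-market
    stocks minus low book-to-market stocks."""
    r = np.asarray(returns, dtype=float)
    cap = np.asarray(market_cap, dtype=float)
    bm = np.asarray(book_to_market, dtype=float)
    small = cap <= np.median(cap)   # size split at the median cap
    high = bm >= np.median(bm)      # value split at the median B/M
    smb = r[small].mean() - r[~small].mean()
    hml = r[high].mean() - r[~high].mean()
    return smb, hml
```

Because each split covers the whole cross-section, both factors span the entire market, as the text notes, and a stock can sit in the "small" leg of SMB and either leg of HML at the same time.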
In general, small companies are considered riskier than big ones as they are less liquid, though some of them may be highly profitable, stable and growing at a faster rate. Similarly, a high book-to-market ratio, though technically pointing to the possibility of delivering better returns in the future, could instead be due to a higher retention policy, a lack of opportunities to grow further, and so on. Hence, the impact of these two factor (SMB and HML) betas on market risk could be positive or negative. Two further factors, RMW (the profitability premium) and CMA (the investment premium), provide another dimension to stock characteristics, i.e., more profitable companies are expected to have a higher valuation compared to less profitable ones, and high book-equity growth means a lower valuation
growth. Both these factors, RMW and CMA, could be considered subfactors of the value premium in HML. In AFP's model, the QMJ (quality minus junk) factor or quality premium adds a few more parameters to profitability, such as growth (a higher price for stocks with growing profits), safety (both a return-based measure of safety, i.e., volatility risk relative to market risk, and fundamental-based measures such as stocks with low leverage, low volatility of profitability, and low credit risk) and payout ratio (a higher payout means fewer management agency problems). The momentum factor for a market is supposed to encompass the entire market, i.e., basically to improve the predictability of whether a market is overheated (overvalued), undervalued, or in line with the growth in the financial health of the companies listed therein. Hence, when momentum is added as a factor beta, positive momentum always increases market risk and vice versa. Similarly, the betting against beta (BAB) factor explains liquidity preference, liquidity funding risk and portfolio constraints in a generic market setting. Accordingly, low-beta stocks have high expected returns, and netting low-beta stocks against high-beta stocks takes away some amount of market risk. All the above factors are cross-sectional but interdependent in nature, change over time, and may be useful in defining country/market-specific systematic (market) risk and factor (dimensional idiosyncratic) risk. In this paper, we attempt to analyze country-specific idiosyncratic risk with the help of factor alphas and the risks associated with them, so that in a global equity investment setting, investors will be in a position to differentiate between the equity markets of the various countries and position them with a strategy to maximize their portfolio performance.

19.2 Dataset

We have used MSCI Global Equity Markets Standard Price Monthly Index Data (ACWI and its country components) from the date of availability till December 2016.
Our analysis and results will therefore reflect the construction methodology adopted by MSCI. All these indices are USD denominated, and we have used the US 3-month T-bills as the risk-free rate. We have used pricing and accounting data from the union of the CRSP tape and the Compustat/XpressFeed Global database for 22 developed markets and 21 emerging markets. All portfolio returns are in USD without currency hedging. Excess returns are above the US Treasury bill rate. One set
of portfolios is formed in each country, and aggregates are computed by weighting each country's portfolio by the country's total lagged (t − 1) market capitalization.

19.3 Working with the Capital Asset Pricing Model (CAPM) and the Market Factor Models

19.3.1 The capital asset pricing model (CAPM)

The capital asset pricing model (CAPM) is built on the Markowitz mean–variance efficiency proposition within a single-period time frame and with a risk-free asset. It propounds that all risk-averse investors choose only efficient portfolios, with minimum variance at a given expected return or maximum expected return at a given variance. The Sharpe–Lintner version assumes a risk-free rate, whereas the Fischer Black version of the CAPM allows unlimited short selling. Both imply that beta, the covariance of asset returns with the market relative to the variance of the market, is sufficient to explain differences in asset or portfolio expected returns, and that the relationship between beta and expected returns is positive. The risk-free rate is the intercept in the Sharpe–Lintner version, but the Black version requires only that the expected market return be greater than the expected return on assets that are uncorrelated with the market. We test market risk and excess return in global equity markets by using the CAPM, reproduced as follows:

CAPM: Ra = Rrf + Bmkt × (Rmkt − Rrf) + α,  (19.1)
in which Ra = Risky asset return, Rrf = Risk-free asset return, Bmkt = Market loading factor (exposure to market risk), Rmkt = Market return and α = Excess return over the benchmark. Following the mean–variance efficiency proposition, many cross-sectional and time-varying anomalies were discovered which show that market anomalies in their various forms exist in different markets around the globe. Evidence of major return anomalies in any form, whether based on time period (such as over specific days, weeks and months), on size (such as large, medium or small), on classifications such as style (Growth/Value/Momentum), across various sectors, or triggered by material announcements such as earnings, dividends, etc., is contradictory to any of the three forms of the Efficient Market Hypothesis (EMH). Apart from the controversies raised in testing the CAPM, most of the anomalies are cited and discussed in detail in Fama's early work, and could be grouped
together under (a) cross-sectional/dimensional return variations, (b) time-varying return variations and (c) a combination of both. Other researchers have also looked at the supply-side factors of the capital markets, mainly the macro-economic factors influencing market returns. Many of these controversies and anomalies have also been studied, and found to exist, by the behavioral scientists and utility theorists who look at the demand-side factors influencing returns in the markets. The idiosyncratic risk associated with various market factors is a natural outcome of such anomalies and controversial findings. In Fama–French's thought-provoking words, "with the deck of existing anomalies in hand, we should not be surprised when new studies show that yet other variables contradict the central prediction of the Sharpe–Lintner–Black model that Bmkt suffices to describe the cross-section of expected returns." Contradicting the above, in a later study Fama also notes that "market efficiency survives the challenge from the literature on long-term return anomalies. Consistent with the market efficiency hypothesis that the anomalies are chance results, apparent overreaction to information is about as common as underreaction, and post-event continuation of pre-event abnormal returns is about as frequent as post-event reversal. Most important, consistent with the market efficiency prediction that apparent anomalies can be due to methodology, most long-term return anomalies tend to disappear with reasonable changes in technique". From the above research it is interesting to note that, though the CAPM may hold in a single-period continuous model and the market beta may reflect the risk associated with risky assets, there are opportunities for investors to take advantage of dimensional and time-varying return anomalies in order to improve their investment returns.
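The CAPM test in equation (19.1) is a time-series regression, so α and Bmkt can be recovered by ordinary least squares on excess returns. The sketch below is an illustration of that estimation step under assumed, hypothetical return series, not the chapter's own code:

```python
import numpy as np

def capm_alpha_beta(r_asset, r_market, r_free):
    """OLS estimate of equation (19.1): regress excess asset returns on
    excess market returns. The intercept is alpha, the slope is Bmkt."""
    y = np.asarray(r_asset, dtype=float) - np.asarray(r_free, dtype=float)
    x = np.asarray(r_market, dtype=float) - np.asarray(r_free, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # [constant, excess market return]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]                     # (alpha, beta)
```

A positive fitted α is the excess return over the CAPM benchmark that the factor models in the following subsections try to explain with additional regressors.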
19.3.2 The Fama–French 3-factor model

Most prominent among the various empirical anomalies that contradict the Sharpe–Lintner–Black model is the size-effect theory of Banz (1981). He finds that market equity, ME (a stock's price times shares outstanding), adds to the explanation of the cross-section of average returns provided by market risk, i.e., average returns on small (low-ME) stocks are too high and average returns on large stocks are too low, given their market risk estimates. Stattman (1980) and Rosenberg et al. (1985) find that average returns on US stocks are positively related to the ratio of a firm's book value of common equity (BE) to its market value (ME). Reinforcing their findings in a similar study, Chan et al. (1991) find that book-to-market equity, BE/ME, also has
a strong role in explaining the cross-section of average returns on Japanese stocks. Basu (1983) shows that earnings-price ratios (E/P) help explain the cross-section of average returns on US stocks in tests that also include size and market risk. Ball (1978) argues that E/P is a catch-all proxy for unnamed factors in expected returns; E/P is likely to be higher for stocks with higher risks and expected returns, whatever the unnamed sources of risk may be. According to Fama, while the univariate relations between average return and size, leverage, E/P, and book-to-market equity are strong, in multivariate tests the negative relation between size and average return and the positive relation between book-to-market equity and average return are robust, which explains that (a) market risk does not seem to help explain the cross-section of average stock returns, and (b) the combination of size and book-to-market equity seems to absorb the roles of leverage and E/P in average stock returns. The Fama–French 3-factor model emphasizes through empirical tests that the small-minus-big (size effect) and high-minus-low (value effect) dimensional factor effects can enhance portfolio returns. We test market and factor risk and excess return in global equity markets by using the Fama–French 3-factor model, reproduced as follows:

3-factor model: Ra = Rrf + Bmkt × (Rmkt − Rrf) + Bsmb × SMB + Bhml × HML + α,  (19.2)
in which Ra = Asset return, Rrf = Risk-free return, Bmkt = Market loading factor (exposure to market risk, different from the CAPM beta), Rmkt = Market return, Bsmb = Size loading factor (the level of exposure to size risk), SMB = Small Minus Big (the size premium), Bhml = Value loading factor (the level of exposure to value risk), HML = High Minus Low (the value premium) and α = Excess return over the benchmark.

19.3.3 Carhart's 4-factor model

Jegadeesh and Titman (1993) find that buying stocks that have performed well in the past and selling stocks that have performed poorly in the past generates significant positive returns over 3- to 12-month holding periods, and that the profitability of these relative-strength strategies is not due to their systematic risk. Capturing the above findings, Carhart's (1997) work on the persistence of stock returns of mutual funds in the US equity markets from January 1962 to December 1993 adds a factor to Fama and French's 3-factor model, forming a 4-factor model under the premise that stock returns tend to exhibit some form of positive autocorrelation in the short to medium term. Hence, investment strategies following a rule of
buying past winners and selling past losers can generate abnormal returns in the short term. Termed the momentum anomaly, the difference between the winner and the loser portfolio provides significant loading on the 1-year momentum factor and sizeable time-series variation. We test the above in global equity markets by using Carhart's 4-factor model, reproduced as follows:

4-factor model: Ra = Rrf + Bmkt × (Rmkt − Rrf) + Bsmb × SMB + Bhml × HML + Bmom × UMD + α,  (19.3)
in which Ra = Asset return, Rrf = Risk-free return, Bmkt = Market loading factor (exposure to market risk, different from the CAPM beta), Rmkt = Market return, Bsmb = Size loading factor (the level of exposure to size risk), SMB = Small Minus Big (the size premium), Bhml = Value loading factor (the level of exposure to value risk), HML = High Minus Low (the value premium), Bmom = Momentum loading factor (the level of exposure to momentum), UMD = Up Minus Down (the momentum premium) and α = Excess return over the benchmark.

19.3.4 Fama and French 5-factor model

The Fama and French 5-factor model is inspired by their earlier work on the 3-factor model and the predictability that profitability and investment could have for stock returns, taking a cue from Gordon's dividend discount model and a re-engineered rationalization of the relationship between expected return and the internal rate of return derived from future dividend flows. Their rationale combines previous findings that stocks with high profitability outperform (Novy-Marx, 2013), stocks that repurchase tend to do well (Baker and Wurgler, 2002; Pontiff and Woodgate, 2008; McLean et al., 2009), growing firms outperform firms with poor growth (Mohanram, 2005), and firms with high accruals are more likely to suffer subsequent earnings disappointments and their stocks tend to underperform peers with low accruals (Sloan, 1996; Richardson et al., 2005). We test the above in global equity markets by using the Fama–French 5-factor model, reproduced as follows:

5-factor model: Ra = Rrf + Bmkt × (Rmkt − Rrf) + Bsmb × SMB + Bhml × HML + Brmw × RMW + Bcma × CMA + α,  (19.4)
in which Ra = Asset return, Rrf = Risk-free return, Bmkt = Market loading factor (exposure to market risk, different from the CAPM beta), Rmkt = Market return, Bsmb = Size loading factor (the level of exposure to size risk), SMB = Small Minus Big (the size premium), Bhml = Value loading factor (the level of exposure to value risk), HML = High Minus Low (the value premium), Brmw = Profitability loading factor, RMW = Robust Minus Weak (the profitability premium), Bcma = Investment loading factor, CMA = Conservative Minus Aggressive (the conservative investment premium) and α = Excess return over the benchmark.

19.3.5 Asness, Frazzini and Pedersen's 5-factor and 6-factor models

Asness, Frazzini and Pedersen, through a number of studies, find that leverage adds to risk, and hence that low-risk investments have higher expected returns (betting against beta, BAB), and that quality stocks earn higher returns than junk stocks (quality minus junk, QMJ). Their work on embedded leverage shows that asset classes with embedded leverage offer low risk-adjusted returns and, in the cross-section, higher embedded leverage is associated with lower returns. A portfolio which is long low-embedded-leverage securities and short high-embedded-leverage securities earns large abnormal returns. Additionally, they draw on well-known relationships between accounting variables and return anomalies: low beta is associated with high Alpha for stocks, bonds, credit, and futures (Black et al., 1972; Frazzini and Pedersen, 2013); dividend growth variables serve as the market's quality variable (Campbell and Shiller, 1988; Vuolteenaho, 2002; Fama and French, 2008); cash flow betas affect price levels (Cohen et al., 2009); firms with low leverage have high Alpha (George and Hwang, 2010; Penman et al., 2007); and firms with high credit risk tend to underperform (Altman, 1968; Ohlson, 1980; Campbell et al., 2008). From these, they find that the cross-sectional variation of price multiples can be explained by quality, and that quality can resurrect the size effect.
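The mechanics of the BAB spread can be illustrated with a stylized sketch: go long the low-beta leg levered to a beta of one, and short the high-beta leg delevered to a beta of one. This is our simplification (a median split with equal weights and hypothetical inputs) of Frazzini and Pedersen's rank-weighted, beta-scaled construction:

```python
import numpy as np

def bab_spread(returns, betas):
    """Stylized betting-against-beta return: each leg is scaled by the
    inverse of its average beta so both legs have unit market exposure."""
    r = np.asarray(returns, dtype=float)
    b = np.asarray(betas, dtype=float)
    low = b <= np.median(b)                      # median split on beta
    long_leg = r[low].mean() / b[low].mean()     # lever the low-beta leg up
    short_leg = r[~low].mean() / b[~low].mean()  # delever the high-beta leg
    return long_leg - short_leg
```

Because the two unit-beta legs offset, the spread is approximately market-neutral, which is why netting low-beta against high-beta stocks takes away market risk, as described above.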
Their findings show that the price of quality is positive though quite low, and that a quality-minus-junk factor has a negative market beta and can produce high returns. Both factors, BAB and QMJ, provide a further dimension to risk-based explanations. They were tested in a 5-factor model that expands the Fama-French 3-factor model and, thereafter, in a 6-factor model that adds these two factors to Carhart's 4-factor model, which includes momentum.
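Frazzini and Pedersen's actual BAB construction uses beta-ranked, rank-weighted portfolios that are levered (or delevered) to an ex-ante beta of one; the following is only a simplified, self-contained sketch of that idea, with all betas and returns simulated rather than taken from the chapter's data.

```python
import numpy as np

# Simplified Betting-Against-Beta (BAB) sketch: split stocks at the median
# estimated beta, equal-weight each leg, then lever the low-beta leg and
# delever the high-beta leg so both legs carry an ex-ante beta of one.
# All inputs below are simulated placeholders, not the chapter's data.
rng = np.random.default_rng(42)
betas = rng.uniform(0.5, 2.0, size=200)                  # estimated market betas
excess_ret = 0.004 * betas + rng.normal(0.0, 0.02, 200)  # monthly excess returns

low = betas <= np.median(betas)
high = ~low

beta_low = betas[low].mean()    # ex-ante beta of the long (low-beta) leg
beta_high = betas[high].mean()  # ex-ante beta of the short (high-beta) leg

# Scaling each equal-weighted leg by 1/beta makes the long-short spread
# approximately market-neutral, in the spirit of the BAB factor.
bab_return = excess_ret[low].mean() / beta_low - excess_ret[high].mean() / beta_high
print(f"one-period BAB spread: {bab_return:.4%}")
```

A positive spread here corresponds to the low-risk anomaly the chapter describes: levered low-beta stocks outperforming delevered high-beta stocks on a beta-adjusted basis.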
July 6, 2020, 11:59. Handbook of Financial Econometrics, Mathematics, Statistics, and Machine Learning (Vol. 1), b3568-v1-ch19: Sourcing Alpha in Global Equity Markets.
We test the above in global equity markets, with their models reproduced as follows:

AFP 5-factor model:

Ra = Rrf + Bmkt × (Rmkt − Rrf) + Bsmb × SMB + Bhmlwlab × HML-WLAB + Bqmj × QMJ + Bbab × BAB + α,   (19.5)

AFP 6-factor model:

Ra = Rrf + Bmkt × (Rmkt − Rrf) + Bsmb × SMB + Bhmlwlab × HML-WLAB + Bmom × MOM + Bqmj × QMJ + Bbab × BAB + α,   (19.6)

in which Ra = Asset return, Rrf = Risk-free return, Bmkt = Market loading factor (exposure to market risk, different from the CAPM beta), Rmkt = Market return, Bsmb = Size loading factor (the level of exposure to size risk), SMB = Small Minus Big (the size premium), Bhmlwlab = Value loading factor (the level of exposure to value risk without the look-ahead bias), HML-WLAB = High Minus Low (the value premium without the look-ahead bias), Bmom = Momentum loading factor (the level of exposure to momentum), MOM = Up Minus Down (the momentum premium), Bqmj = Quality loading factor, QMJ = Quality Minus Junk factor (the quality premium), Bbab = Low-beta loading factor, BAB = Betting Against Beta factor (the low-risk anomaly) and α = Excess return over the benchmark.

Note 1: Asness, Frazzini and Pedersen remove the look-ahead bias in the value premium (HML) by using the market price at the time the book-value information is released, and by monthly rebalancing of the book-to-price ratio, which they call "timely value". Note 2: Adding momentum as an additional factor, they find that value and momentum have a very strong negative correlation; intuitively, cheap stocks have more "value" and potentially stronger momentum, and thus serve better in capturing factor dimensional risks.

19.4 Working with Market and Market Factor Models: Where They Work, Where Not (Data Analysis and Interpretation)

An analysis of the regression results of the CAPM and all the market factor models under study is carried out in the following sections in order to analyze
the impact of various market factor risks and market characteristics, to evaluate the efficacy of each market model in a particular market, and to assess their contribution to enhancing Alpha in various markets. These results are presented in Tables 19.1 and 19.2 and Exhibits 19I and 19II.

19.4.1 Market characteristics: Market risk and factor risks

19.4.1.1 Developed markets

The regression results for 22 developed markets are presented in Table 19.1.

Australia: In Australia, the overall market is more risk-return efficient, as the CAPM generates the highest Alpha. Sourcing additional Alpha with factor betas is not possible, though the SMB, HML and RMW factor betas are statistically significant, with t-statistics greater than 2. The MOM factor has a positive beta and the CMA factor a negative beta, as observed from the FF 3-factor, CH 4-factor and FF 5-factor models. An analysis of the AFP 5-factor model shows that both the HML-WLAB and BAB factors have statistically significant betas, though in the AFP 6-factor model the BAB factor beta drops to a less significant level with the addition of the positive MOM factor beta and an increase in the QMJ factor's negative beta. Interestingly, market risk goes down in the AFP models compared with all the other models, though they do not generate the highest Alpha. Hence, the market may have potential for sourcing a strong value premium (beta of 0.4, t-statistic of 3.403) if high-quality stocks driven by profitability-led momentum are chosen, and market risk can be brought down by reducing exposure to aggressive and junk stocks. Overall, however, the interactive relationship of these factors is mostly priced into market risk, and index investing could prove to be the better option.
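The regressions summarized in Table 19.1 are ordinary least squares fits of monthly excess returns on factor returns. A minimal sketch of one such fit, here an FF 3-factor style regression on simulated data (all series and coefficients are illustrative placeholders, not the chapter's sample), reporting the quantities tabulated: factor betas, Alpha, t-statistics and R².

```python
import numpy as np

# OLS time-series regression of asset excess returns on factor returns,
# of the kind behind Table 19.1. Data are simulated placeholders.
rng = np.random.default_rng(1)
T = 312                                    # months, e.g. Jan-1991 to Dec-2016
mkt = rng.normal(0.005, 0.04, T)           # Rmkt - Rrf
smb = rng.normal(0.002, 0.03, T)           # SMB
hml = rng.normal(0.003, 0.03, T)           # HML
y = 0.001 + 1.1 * mkt + 0.3 * smb + 0.2 * hml + rng.normal(0.0, 0.02, T)

X = np.column_stack([np.ones(T), mkt, smb, hml])
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

resid = y - X @ coef
dof = T - X.shape[1]
sigma2 = resid @ resid / dof                           # residual variance
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X))) # standard errors
t_stats = coef / se
r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

for name, b, t in zip(["Alpha", "Rm-Rf", "SMB", "HML"], coef, t_stats):
    print(f"{name:6s} coef={b: .4f}  t={t: .2f}")
print(f"R2 = {r2:.1%}")
```

Extending the design matrix with MOM, RMW/CMA, or QMJ/BAB columns gives the CH 4-factor, FF 5-factor and AFP specifications in the same way; "statistically significant" in the discussion below corresponds to |t| greater than roughly 2.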
Austria: Like Australia, the Austrian equity market is also risk-return efficient, with the CAPM generating the highest Alpha, followed by the AFP 5-factor and AFP 6-factor models, respectively, though market risk is lowest in the AFP 5-factor model and the AFP 6-factor model, in that order. The FF 3-factor, CH 4-factor and FF 5-factor models show an increase in market risk compared with the above three models, and statistically significant positive SMB, HML and RMW betas and a negative CMA beta. In the AFP 5-factor model, the HML-WLAB and BAB factor betas are positive and statistically significant, while the SMB factor beta becomes insignificant and the QMJ factor beta remains negative. Hence, it can be concluded that a low-risk strategy that includes the value premium and profitability-led growth stocks and avoids aggressive growth stocks might deliver better results in Austria, and
Table 19.1: Developed markets. [Regression results for the CAPM, FF 3-factor, CH 4-factor, FF 5-factor, AFP 5-factor and AFP 6-factor models across 22 developed markets and the World index, sample period January 1991 to December 2016 (Israel: from January 1993). For each market and model, the table reports the market beta (Rm-Rf), the applicable factor betas (SMB, HML, MOM, RMW, CMA, HML-WLAB (labeled HMLDEV), QMJ and BAB), the monthly Alpha, the t-statistics and the R².]
some momentum risk can also be captured. The overall market, however, captures these risks better, as reflected in the CAPM market beta; hence, index investing could be a preferable option in Austria too.

Belgium: Similar to Australia and Austria, the Belgian equity market is also risk-return efficient and the CAPM generates the highest Alpha. Moreover, contrary to the previous two markets, it also has the lowest market beta. Interestingly, this market has highly statistically significant and positive HML, RMW, QMJ and BAB factor betas and a negative but not very significant SMB factor beta according to all the models, except the FF 5-factor model, in which it turns somewhat positive. These observations show that, though the market has positive momentum and investing in profitable, quality, value and low-risk stocks may provide better results, the overall market risk without these dimensional betas is lower and provides better results. Hence, in this market also, index investing is more desirable.

Canada: In Canada, the FF 5-factor model generates the highest Alpha, with the CMA factor having the highest, and statistically significant, negative beta, followed by statistically significant and positive HML and SMB betas. Interestingly, the RMW factor beta is also negative. All of the above shows that there is a valuation overdrive in this market, particularly in investment factor-led growth companies. This finding is further supported by the AFP model results, which generate small but positive betas on the SMB and HML-WLAB factors but a negative and statistically significant QMJ beta. Additionally, the market has positive momentum, and the BAB factor beta is positive but not statistically significant. Hence, though the market is momentum-positive, there is certainly a valuation overdrive, particularly in highly investment-led junk stocks.
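Throughout these country discussions, the specifications are compared on two summary quantities: which model delivers the highest Alpha and which carries the lowest market beta. A trivial sketch of that ranking logic, using hypothetical fitted values rather than the chapter's estimates:

```python
# Hypothetical per-model fitted values (monthly Alpha and market beta)
# for one market; these numbers are illustrative, not the chapter's.
fitted = {
    "CAPM":         {"alpha": -0.0014, "mkt_beta": 1.11},
    "FF 3-factor":  {"alpha": -0.0025, "mkt_beta": 1.13},
    "CH 4-factor":  {"alpha": -0.0030, "mkt_beta": 1.14},
    "FF 5-factor":  {"alpha": -0.0044, "mkt_beta": 1.20},
    "AFP 5-factor": {"alpha": -0.0028, "mkt_beta": 0.96},
    "AFP 6-factor": {"alpha": -0.0040, "mkt_beta": 0.98},
}

# Highest Alpha identifies the most risk-return efficient specification;
# lowest market beta identifies the model that most reduces market risk.
best_alpha = max(fitted, key=lambda m: fitted[m]["alpha"])
lowest_risk = min(fitted, key=lambda m: fitted[m]["mkt_beta"])
print(f"highest Alpha: {best_alpha}; lowest market beta: {lowest_risk}")
```

With these illustrative numbers, the CAPM wins on Alpha while an AFP specification wins on market risk, the pattern described for several of the markets below.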
Denmark: In Denmark, based on the FF 3-factor model, only HML has a statistically significant beta. In the FF 5-factor model, however, the impact of SMB is statistically significant, the HML and RMW factor betas are high and positive, and CMA has a negative factor beta. The market is momentum-positive; however, when we apply AFP's 5-factor and 6-factor models, the BAB factor is statistically significant without the momentum factor, and the HML-WLAB factor shows a significant impact on Alpha together with the negative QMJ factor beta. Hence, investing in low-risk and profitability-led quality stocks in the high-value segment, while avoiding overvalued investment-led junk stocks, can enhance Alpha.

Finland: In Finland, the AFP 6-factor model generates the highest Alpha, followed by the AFP 5-factor model, the FF 5-factor model, the FF 3-factor
model and the CH 4-factor model, in that order. The CAPM generates the lowest Alpha. Interestingly, in Finland all the factor dimensional betas are negative, i.e., SMB, HML, RMW, CMA, BAB and QMJ. The MOM factor beta, which is slightly positive in the CH 4-factor model, turns negative in the AFP 6-factor model. The factor betas also reduce market risk, except for the MOM factor beta. However, only the HML (t-statistic of −3.22 in the FF 3-factor model and −2.89 in the CH 4-factor model), SMB (−2.11 in the FF 5-factor model) and BAB (−2.962 and −2.5 in the AFP 5-factor and 6-factor models, respectively) betas are statistically significant. Hence, for Finland, we believe only a very low-risk strategy will enhance Alpha.

France: In France, the AFP 5-factor model provides the best results, followed by the AFP 6-factor model, though the CAPM reduces market risk to its lowest. France has a positive and statistically significant HML factor beta, a positive MOM factor beta and a statistically significant negative SMB factor beta, as per the results of the FF 3-factor, CH 4-factor and FF 5-factor models. The observations also show that the negative SMB factor beta improves a bit, due to the negative CMA (investment) factor beta in the FF 5-factor model. This suggests that some large stocks may be overvalued on expectations of investment factor-led valuation. As per the AFP models, none of the factor betas is statistically significant, though they produce positive HML-WLAB, QMJ and MOM betas and negative SMB and BAB betas. This suggests that in France, profitability and quality factor-led stocks in the value segment will provide the best results, and a low-risk strategy may not be appropriate.

Germany: In Germany too, the AFP 5-factor model produces the highest Alpha, followed by the AFP 6-factor model, and they reduce market risk to the lowest, in that order.
A review of the factor betas obtained from all the models shows that the FF 3-factor, CH 4-factor and FF 5-factor models have no statistically significant factor betas; SMB has a negative beta, while HML and MOM have positive factor betas. When the profitability (RMW) and investment (CMA) factors are added in the FF 5-factor model, the HML factor beta becomes slightly more pronounced, with a negative CMA beta and a positive RMW beta, and market risk goes down slightly. However, the AFP 5-factor model shows that the HML-WLAB factor beta is statistically significant together with negative SMB, QMJ and BAB factor betas, and it reduces market risk to the lowest among all the factor models. Interestingly, when the MOM factor is added in the AFP 6-factor model, the HML-WLAB factor beta goes up, with a negative but statistically significant BAB beta and an equally statistically significant positive MOM beta.
The QMJ factor's negative beta also increases slightly in the AFP 6-factor model. This means that the HML factor without the look-ahead bias, i.e., a value strategy, may work best in Germany.

Holland: Holland's equity market is broadly similar to Germany's, though the AFP 5-factor model generates the highest Alpha and reduces market risk to its lowest. The CAPM generates the second highest Alpha, though market risk is higher compared with the AFP 6-factor model. Interestingly, in the AFP factor models the SMB factor has a statistically significant negative beta, while it is negative but not statistically significant, gradually tapering down, in the CH 4-factor, FF 3-factor and FF 5-factor models. On the other hand, the HML factor beta is very high, positive and statistically significant in the FF 3-factor and CH 4-factor models, but tapers to a non-significant level once the RMW and CMA factor betas, both positive, are included. This shows that the high risk associated with the value premium gets distributed over the premia on the profitability and investment factor risks. The AFP models show that the HML-WLAB factor beta is positive and statistically significant, with a positive BAB factor beta, a negative QMJ beta and a very small positive MOM factor beta. These observations indicate mispricing opportunities in the value segment, typically in stocks of good quality that are less volatile and driven by both high profitability and aggressive investment policies, though some large companies may be overvalued. Since the market is momentum-positive, momentum adds to market risk. Hence, a value strategy that ranks stocks on quality, profitability and conservative-investment factors may provide the best results.

Hong Kong: In Hong Kong, the FF 5-factor model delivers the highest Alpha with the lowest market risk, followed by the AFP 5-factor model and the CAPM, in that order.
The market is momentum-positive, though not statistically significantly so, and both the SMB and HML factor betas are positive as well. However, once the RMW and CMA factors are added, the HML factor beta improves to a high, positive value in the FF 5-factor model, from a small but negative beta in the FF 3-factor model, mainly due to the statistically significant negative beta of the CMA (investment) factor and a negative, though not statistically significant, beta of the RMW (profitability) factor. Interestingly, in the AFP 6-factor model the HML-WLAB factor beta turns statistically significant, from a high but not statistically significant beta in the AFP 5-factor model, while the MOM factor produces the second highest positive beta, the QMJ factor beta turns negative, and the negative BAB factor beta accentuates to a higher level. These observations show that
in Hong Kong there is a valuation overdrive in investment-led, highly risky and low-quality growth stocks that are also momentum-positive, though there are some mispricing opportunities in value and large stocks.

Ireland: In Ireland, the AFP 5-factor and 6-factor model betas are the lowest, in that order, and also lower than the CAPM beta, though the CAPM has provided the highest Alpha. Technically, the AFP models show overvaluation in larger companies, as the SMB beta turns negative in the AFP 5-factor and 6-factor models, compared with its positive values in the FF 3-factor, CH 4-factor and FF 5-factor models. Moreover, the book-to-price or value premium (both HML and HML-WLAB) has statistically significant positive betas in all the models, while the RMW (profitability) factor beta is positive and the CMA (investment) factor beta is negative in the FF 5-factor model. As per the AFP models, the QMJ (quality) factor has a negative beta and the BAB (low-risk) factor has a positive and statistically significant beta. All this indicates that there could be overvaluation in some large companies, led by investment-driven rather than profitability-driven growth. Hence, though the market is risk-return efficient as per the CAPM, a low-risk, profitability-led and high book-to-market value stock portfolio may be a better strategy in the Irish equity market.

Israel: In Israel, market risk is lowest in the AFP 6-factor model, which maximizes Alpha at 0.75%, with very high and statistically significant negative betas on HML-WLAB followed by QMJ. Interestingly, all the other factor betas, including the BAB factor, are negative. This shows that all these factor strategies would reduce market risk, though in the AFP 5-factor model the SMB factor is positive and increases market risk by 4 bps.
According to the FF 5-factor model, the SMB factor has a much higher positive beta, the HML factor a statistically significant negative beta, the CMA factor a less pronounced negative beta and the RMW factor a smaller negative beta, though market risk remains higher compared with the AFP models. This is also evident in the statistically significant positive SMB factor beta and negative HML factor beta in the FF 3-factor and CH 4-factor models, which further enhance market risk. Hence, this market could have some larger companies that are more risk-prone, as they could be missing out on quality, and their valuations may be driven by the investment and momentum factors.

Italy: In the Italian equity market, market risk goes down to its lowest in the AFP 5-factor model as compared with the other models, with negative betas on the SMB and QMJ factors and positive betas on the HML-WLAB and BAB
factors. The AFP 6-factor model shows that market risk goes up slightly once the positive beta of the MOM factor is added. However, market risk is highest in the CH 4-factor model, with its very high and statistically significant positive HML beta and a somewhat positive MOM beta. The FF 5-factor model shows that market risk can be reduced to some extent by excluding companies that are not profitable (negative RMW beta) or investment-return efficient (negative CMA beta). Hence, a strategy of concentrating on high book-to-market stocks that are of quality but have lower risk could lead to better returns.

Japan: In Japan, the AFP 6-factor model produces the highest Alpha at the lowest market risk, as it has a statistically significant positive BAB factor beta, a statistically significant negative HML-WLAB factor beta and, to some extent, contributions from the negative betas of the MOM, SMB and QMJ factors. Notably, the Japanese market has a very high and statistically significant positive CMA factor beta, a high positive SMB factor beta and negative HML and RMW factor betas, as per the FF 5-factor model. In the FF 3-factor and CH 4-factor models, the SMB, HML and MOM factors all have positive betas, and they increase market risk. Hence, in Japan, Alpha can be enhanced only by focusing on low-risk, conservatively investing companies, as mispricing opportunities are found only in the low-risk (BAB) and CMA factors, while the market has already priced in the risk associated with HML, QMJ, SMB, RMW and MOM.

New Zealand: New Zealand's equity market is quite risk-return efficient, as the CAPM produces the highest Alpha with the lowest risk, followed by the CH 4-factor model, as MOM has a negative factor beta. In all the factor models, the HML factor, with and without the look-ahead bias, is statistically significant and has a positive beta.
In addition, in the FF 5-factor model, the SMB and RMW factor betas are statistically significant and positive as well. Similarly, in the AFP 5-factor and 6-factor models, the BAB factor beta is high, positive and statistically significant. Notably, though market risk in the AFP models goes down significantly with the strong BAB factor, they do not generate excess Alpha. This suggests that high book-to-price stocks led by strong profitability growth, together with low-risk stocks, would probably provide better Alpha, though index investing could be a better option than stock-picking in this market.

Norway: In Norway, the AFP 5-factor model reduces market risk to the lowest and generates the highest Alpha, with its high, positive and statistically significant HML-WLAB and BAB factor betas and its high, negative
page 759
July 6, 2020
11:59
760
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch19
S. S. Mohanty
and statistically significant QMJ factor beta. Though the CAPM generates the second highest Alpha, the AFP 6-factor model interestingly provides the second lowest market beta, owing to its high, positive and statistically significant MOM beta. The CH 4-factor model shows that the market is momentum positive, while the FF 5-factor model shows that the HML, RMW and SMB factor betas are very high, positive and statistically significant, whereas the CMA factor beta is very high, negative and statistically significant. In the AFP 5-factor model, the SMB factor has a negative beta, though it turns positive when we add the MOM factor in the AFP 6-factor model. This shows that some large and aggressively growing companies are probably overvalued on the back of momentum, while there may be mispricing opportunities in some positive-momentum stocks in the high book-to-price segment which are of good quality and have low volatility too.

Portugal: Portugal's equity markets are more or less similar to Norway's. Though the CAPM generates the highest Alpha, its market beta of 1.03 is higher than the market betas of the AFP models. Market risk is reduced to a great extent in the AFP 6-factor model due to the positive and statistically significant BAB beta and the negative betas of the QMJ, SMB, HML-WLAB and MOM factors. We also observe that the small positive beta of HML-WLAB in the AFP 5-factor model becomes negative when the MOM factor is added. These observations are also supported by the results of the FF 3-factor, CH 4-factor and FF 5-factor models, which show that with the addition of the SMB, HML, MOM, RMW and CMA factors, market risk increases, as all these factors contribute positively to it. Hence, index investing could be a better strategy in Portugal, though there may be overvaluation in large, low book-to-market and junk stocks.
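The sign flip noted for Norway and Portugal, where a small positive HML-WLAB beta turns negative once MOM enters the regression, is a standard omitted-variable effect when the added factor is correlated with an included one. A minimal sketch of the mechanism (all series, loadings and the two-factor simplification are invented for illustration; they are not the chapter's data):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000  # deliberately long sample so the sign flip is unambiguous
mkt = rng.normal(0.005, 0.04, n)
hml_w = rng.normal(0.0, 0.02, n)               # value factor (no look-ahead bias)
mom = 0.6 * hml_w + rng.normal(0.0, 0.015, n)  # momentum, correlated with value here
ret = 1.0 * mkt - 0.3 * hml_w + 1.0 * mom + rng.normal(0.0, 0.03, n)

def coefs(y, cols):
    """Intercept-plus-slopes OLS via least squares."""
    X = np.column_stack([np.ones(len(y))] + cols)
    return np.linalg.lstsq(X, y, rcond=None)[0]

without_mom = coefs(ret, [mkt, hml_w])    # 5-factor-style fit, MOM omitted
with_mom = coefs(ret, [mkt, hml_w, mom])  # same fit with MOM added

# Omitting MOM pushes its positive loading onto the correlated value factor,
# so the value beta looks positive; adding MOM reveals the negative loading.
```

With the invented loadings above, the value-factor coefficient is positive in the short regression and negative once MOM is included, mirroring the pattern the text describes.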
Spain: In Spain, the CAPM produces a market beta of 1.27, which goes up to 1.3 in the FF 3-factor and CH 4-factor models. Both of the latter models highlight the statistical significance of the HML factor beta, a very small positive SMB beta and a small negative MOM beta. The market beta goes down a notch to 1.25 in the FF 5-factor model with negative betas on the CMA, RMW and SMB factors, though the HML factor beta remains highly statistically significant and positive. The AFP 5-factor model generates the highest Alpha with a statistically significant positive HML-WLAB factor beta and a very small but positive BAB factor beta, as some market risk goes away with the negative betas of the SMB and QMJ factors. In the AFP 6-factor model, market risk goes up a bit due to a small but positive MOM factor beta. Hence, in Spain the high book-to-price stocks
can generate excess returns if overvalued junk stocks are avoided in the large-company segment.

Sweden: In Sweden, the AFP 6-factor model reduces market risk and enhances Alpha to its highest level. Interestingly, it has negative betas for the BAB, QMJ, HML-WLAB and MOM factors and a positive beta only on the SMB factor. In the AFP 5-factor model, without the MOM factor, the BAB factor beta is negative and statistically significant as well. This is also supported by the negative betas of the HML, RMW and CMA factors and the lone positive beta of the SMB factor in the FF 5-factor model, though none of them is statistically significant. In the FF 3-factor and CH 4-factor models, the market beta is gradually reduced, by 2 bps, relative to the CAPM market beta, but both models show a statistically significant positive beta on the SMB factor and a statistically significant negative beta on the HML factor. Hence, in Sweden, overvaluation may be found in low book-to-price stocks that are also low on profitability and follow an aggressive investment policy. Some large stocks may offer mispricing opportunities but may not necessarily be of low risk.

Switzerland: In Switzerland, the CAPM generates the highest Alpha with the lowest market risk, while the FF 3-factor model generates the second highest Alpha with its very high, statistically significant positive HML factor beta and statistically significant negative SMB factor beta. The SMB and HML factor betas increase further in the CH 4-factor model with some positive beta on the MOM factor. Market risk is further accentuated in the FF 5-factor model by positive and statistically significant betas on the HML and RMW factors and a less significant but positive beta on the CMA factor, though the SMB factor beta remains negative.
In the AFP models, the BAB and HML-WLAB factor betas are positive and statistically significant, while the SMB factor beta turns statistically significant and negative, and the QMJ and MOM factors have less significant positive and negative betas, respectively. This shows that there are mispricing opportunities in high book-to-price companies with strong profitability growth combined with some investment growth. These companies may also be better in quality and may have low risk as well.

United Kingdom (UK): The UK market is similar to the Swiss market: its CAPM beta is the lowest among all the models, and the CAPM generates the highest Alpha as well. All the factor models show a statistically significant negative SMB factor beta and statistically significant positive HML/HML-WLAB factor betas. In addition to the above, the FF 5-factor model
shows a very high, positive and statistically significant RMW factor beta and some negative CMA factor beta, while the AFP models show a statistically significant positive QMJ factor beta and some positive MOM and BAB factor betas. This shows that there is an investment-led overvaluation in some of the large companies, but the overall market is more risk-return efficient, and index investing could be a better option here, though betting on high-value stocks with selective screening for profitability, quality and low-risk parameters may also work well.

United States of America (US): The US market is slightly different from the UK and Swiss markets, as it generates the highest Alpha in the CH 4-factor model with its negative MOM factor beta, followed by the FF 5-factor model, though market risk is lowest in the latter. Interestingly, the SMB and HML factor betas are negative and statistically significant in both these models, with the former being the highest among all the models. Additionally, in the FF 5-factor model, the investment factor (CMA) has a high, negative and statistically significant beta and the RMW factor has a positive beta. This is also supported by the AFP 6-factor model, in which QMJ has a positive and statistically significant factor beta, HML-WLAB a high, positive and statistically significant beta, MOM some positive beta, and BAB a negative and statistically significant beta. Hence, in the US market, high book-to-market quality stocks led by profitability growth, but not investment growth, would provide the best results, and these stocks may have some positive momentum as well.

19.4.1.2 Emerging markets

The regression results for emerging markets are presented in Table 19.2.

Brazil: In Brazil, the CAPM beta is 1.63, whereas when factor betas are added in the FF 3-factor and CH 4-factor models, market risk goes up to 1.65 and 1.66, respectively, and the Alpha value goes down in that order.
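The country-by-country comparisons here rest on time-series regressions of a market's excess return on the factor returns. A minimal sketch of how the tabulated Alpha, factor betas, t-statistics and R² are obtained (plain OLS on synthetic data; the series, loadings and estimation details are assumptions for illustration, not the chapter's actual data):

```python
import numpy as np

def ols_alpha_betas(excess_ret, factors):
    """Regress country excess returns on factor returns with an intercept.
    Returns (alpha, betas, t_stats, r_squared) -- the quantities reported
    per country/model pair in Tables 19.1 and 19.2."""
    X = np.column_stack([np.ones(len(excess_ret)), factors])
    coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
    resid = excess_ret - X @ coef
    dof = len(excess_ret) - X.shape[1]
    sigma2 = resid @ resid / dof                              # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))    # classical OLS s.e.
    t_stats = coef / se
    tss = ((excess_ret - excess_ret.mean()) ** 2).sum()
    r2 = 1.0 - (resid @ resid) / tss
    return coef[0], coef[1:], t_stats, r2

# Synthetic monthly series with known loadings (illustration only).
rng = np.random.default_rng(0)
n = 312                                  # e.g., Jan-91 to Dec-16
mkt = rng.normal(0.005, 0.04, n)         # Rm - Rf
smb = rng.normal(0.002, 0.02, n)
hml = rng.normal(0.003, 0.02, n)
ret = 0.004 + 1.6 * mkt + 0.2 * smb + 0.2 * hml + rng.normal(0.0, 0.05, n)

alpha, betas, t_stats, r2 = ols_alpha_betas(ret, np.column_stack([mkt, smb, hml]))
```

Running each nested specification (CAPM through the 6-factor models) on the same return series and comparing the estimated alpha, market beta and R² reproduces the kind of comparison made throughout this section.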
The market risk which generates an Alpha of 0.65% under the CAPM increases by 2 bps when the positive size (SMB) and value (HML) factor betas are added, as they are positively correlated with market risk. Alpha is thereby reduced by 8 bps, which shows that Brazil's market risk gets accentuated almost fourfold when we take into consideration the return spread between small-minus-big stocks (the size effect) and the return spread between high-minus-low stocks (the value effect). When we add a momentum factor in the
Table 19.2: Emerging markets. OLS regression estimates by model; t-statistics in parentheses; Alpha in %. Sample periods (per country, all models): Brazil Jan-91, Chile Jan-91, China Jan-93, Columbia Jan-93, Egypt Jan-95, Greece Jan-91, Hungary Jan-95, India Jan-93, Indonesia Jan-91, Malaysia Jan-91, Mexico Jan-91, Peru Jan-93, Philippines Jan-91, Poland Jan-93, Russia Jan-95, S. Africa Jan-93, S. Korea Jan-91, Taiwan Jan-91, Thailand Jan-91, Turkey Jan-91, Czech Rep. Jan-95; all through Dec-16.

CAPM:

| Country | Beta | Alpha | R² |
| BRAZIL | 1.63 (12.177) | 0.65% (1.133) | 32.40% |
| CHILE | 0.85 (10.739) | 0.22% (0.651) | 27.10% |
| CHINA | 1.13 (9.628) | −0.52% (−1.022) | 24.50% |
| COLUMBIA | 0.80 (7.015) | 0.38% (0.760) | 14.70% |
| EGYPT | 0.86 (6.858) | 0.49% (0.883) | 15.20% |
| GREECE | 1.46 (12.943) | −1.21% (−2.523) | 35.10% |
| HUNGARY | 1.60 (14.288) | 0.25% (0.511) | 43.80% |
| INDIA | 1.04 (10.451) | 0.13% (0.294) | 27.60% |
| INDONESIA | 1.26 (8.753) | −0.05% (−0.085) | 19.80% |
| MALAYSIA | 0.80 (8.312) | −0.10% (−0.248) | 18.20% |
| MEXICO | 1.18 (13.357) | 0.29% (0.755) | 36.50% |
| PERU | 0.97 (8.913) | 0.57% (1.208) | 21.70% |
| PHILIPPINES | 0.89 (8.610) | 0.16% (0.360) | 19.30% |
| POLAND | 1.64 (10.659) | 0.28% (0.418) | 28.40% |
| RUSSIA | 1.81 (10.290) | 0.69% (0.898) | 28.80% |
| S. AFRICA | 1.19 (14.944) | 0.00% (0.000) | 43.80% |
| S. KOREA | 1.36 (11.595) | −0.05% (−0.091) | 30.30% |
| TAIWAN | 1.04 (10.537) | −0.21% (−0.490) | 26.40% |
| THAILAND | 1.19 (9.643) | −0.07% (−0.140) | 23.10% |
| TURKEY | 1.50 (8.308) | 0.28% (0.369) | 18.20% |
| CZECH REP. | 1.00 (10.258) | −0.03% (−0.063) | 28.70% |

FF-3 FACTOR:

| Country | Rm−Rf | SMB | HML | Alpha | R² |
| BRAZIL | 1.65 (12.11) | 0.17 (0.605) | 0.17 (0.660) | 0.57% (0.979) | 32.50% |
| CHILE | 0.86 (10.801) | 0.40 (2.41) | 0.12 (0.809) | 0.15% (0.437) | 28.50% |
| CHINA | 1.14 (9.672) | 0.46 (1.874) | 0.22 (0.998) | −0.63% (−1.229) | 25.50% |
| COLUMBIA | 0.85 (7.537) | 0.73 (3.074) | 0.68 (3.258) | 0.05% (0.105) | 19.50% |
| EGYPT | 0.87 (6.898) | 0.71 (2.698) | 0.17 (0.746) | 0.40% (0.734) | 17.50% |
| GREECE | 1.52 (13.757) | 0.56 (2.403) | 0.84 (4.043) | −1.58% (−3.324) | 38.90% |
| HUNGARY | 1.65 (14.688) | 0.48 (2.06) | 0.54 (2.631) | 0.03% (0.063) | 45.80% |
| INDIA | 1.02 (10.589) | 1.04 (5.126) | −0.21 (−1.184) | 0.17% (0.411) | 34.80% |
| INDONESIA | 1.31 (9.1) | 0.86 (2.857) | 0.58 (2.163) | −0.33% (−0.542) | 22.60% |
| MALAYSIA | 0.82 (8.503) | 0.58 (2.882) | 0.24 (1.316) | −0.23% (−0.556) | 20.60% |
| MEXICO | 1.17 (13.113) | 0.29 (1.544) | −0.13 (−0.790) | 0.32% (0.841) | 37.20% |
| PERU | 0.99 (9.107) | 0.76 (3.347) | 0.27 (1.367) | 0.42% (0.886) | 24.90% |
| PHILIPPINES | 0.91 (8.723) | 0.28 (1.273) | 0.26 (1.331) | 0.04% (0.087) | 20.10% |
| POLAND | 1.67 (10.966) | 1.13 (3.531) | 0.51 (1.829) | 0.01% (0.010) | 31.80% |
| RUSSIA | 1.84 (10.275) | 0.39 (1.036) | 0.31 (0.934) | 0.57% (0.723) | 29.20% |
| S. AFRICA | 1.20 (15.474) | 0.71 (4.366) | 0.29 (2.016) | −0.16% (−0.461) | 47.60% |
| S. KOREA | 1.35 (11.498) | 0.70 (2.857) | −0.14 (−0.618) | −0.03% (−0.060) | 32.30% |
| TAIWAN | 1.04 (10.6) | 0.63 (3.074) | 0.08 (0.438) | −0.27% (−0.647) | 28.60% |
| THAILAND | 1.18 (9.515) | 0.34 (1.323) | −0.02 (−0.081) | −0.08% (−0.159) | 23.50% |
| TURKEY | 1.44 (7.914) | −0.20 (−0.533) | −0.76 (−2.244) | 0.60% (0.773) | 19.50% |
| CZECH REP. | 1.04 (10.977) | 0.92 (4.655) | 0.52 (2.985) | −0.25% (−0.594) | 35.20% |

CH-4 FACTOR:

| Country | Rm−Rf | SMB | HML | MOM | Alpha | R² |
| BRAZIL | 1.66 (11.676) | 0.16 (0.566) | 0.19 (0.713) | 0.04 (0.286) | 0.53% (0.886) | 32.50% |
| CHILE | 0.81 (9.861) | 0.44 (2.63) | 0.04 (0.249) | −0.17 (−1.904) | 0.30% (0.866) | 29.30% |
| CHINA | 1.07 (8.749) | 0.53 (2.155) | 0.08 (0.351) | −0.30 (−2.239) | −0.37% (−0.703) | 26.80% |
| COLUMBIA | 0.78 (6.661) | 0.80 (3.369) | 0.54 (2.515) | −0.30 (−2.339) | 0.31% (0.629) | 21.00% |
| EGYPT | 0.87 (6.59) | 0.70 (2.656) | 0.18 (0.74) | 0.01 (0.095) | 0.39% (0.692) | 17.50% |
| GREECE | 1.49 (12.928) | 0.58 (2.495) | 0.78 (3.625) | −0.12 (−0.917) | −1.47% (−3.022) | 39.10% |
| HUNGARY | 1.70 (14.438) | 0.44 (1.862) | 0.63 (2.923) | 0.17 (1.375) | −0.13% (−0.250) | 46.20% |
| INDIA | 1.01 (10.081) | 1.04 (5.116) | −0.23 (−1.221) | −0.03 (−0.304) | 0.20% (0.469) | 34.80% |
| INDONESIA | 1.32 (8.803) | 0.85 (2.795) | 0.60 (2.163) | 0.05 (0.318) | −0.38% (−0.600) | 22.60% |
| MALAYSIA | 0.82 (8.074) | 0.59 (2.885) | 0.23 (1.196) | −0.03 (−0.237) | −0.21% (−0.485) | 20.60% |
| MEXICO | 1.19 (12.772) | 0.27 (1.445) | −0.10 (−0.554) | 0.08 (0.732) | 0.26% (0.649) | 37.40% |
| PERU | 1.04 (9.212) | 0.72 (3.13) | 0.36 (1.751) | 0.19 (1.577) | 0.25% (0.510) | 25.50% |
| PHILIPPINES | 0.86 (7.932) | 0.32 (1.458) | 0.17 (0.835) | −0.19 (−1.602) | 0.21% (0.454) | 20.70% |
| POLAND | 1.68 (10.578) | 1.12 (3.473) | 0.53 (1.808) | 0.03 (0.191) | −0.02% (−0.034) | 31.80% |
| RUSSIA | 1.86 (9.889) | 0.37 (0.975) | 0.34 (1.003) | 0.08 (0.377) | 0.50% (0.618) | 29.30% |
| S. AFRICA | 1.23 (15.237) | 0.69 (4.177) | 0.34 (2.284) | 0.11 (1.252) | −0.25% (−0.732) | 47.90% |
| S. KOREA | 1.24 (10.3) | 0.79 (3.245) | −0.33 (−1.465) | −0.41 (−3.098) | 0.33% (0.656) | 34.40% |
| TAIWAN | 0.96 (9.467) | 0.70 (3.432) | −0.07 (−0.379) | −0.32 (−2.883) | 0.01% (0.028) | 30.40% |
| THAILAND | 1.12 (8.646) | 0.40 (1.533) | −0.14 (−0.581) | −0.26 (−1.804) | 0.14% (0.261) | 24.30% |
| TURKEY | 1.30 (6.914) | −0.08 (−0.22) | −1.02 (−2.911) | −0.55 (−2.637) | 1.09% (1.368) | 21.30% |
| CZECH REP. | 1.05 (10.525) | 0.91 (4.569) | 0.53 (2.929) | 0.03 (0.275) | −0.27% (−0.640) | 35.20% |

FF-5 FACTOR:

| Country | Rm−Rf | SMB | HML | RMW | CMA | Alpha | R² |
| BRAZIL | 1.53 (8.959) | 0.15 (0.481) | 0.81 (2.157) | 0.28 (0.605) | −1.23 (−2.568) | 0.61% (0.999) | 34.20% |
| CHILE | 0.82 (8.151) | 0.41 (2.256) | 0.32 (1.444) | 0.12 (0.425) | −0.49 (−1.734) | 0.15% (0.412) | 29.30% |
| CHINA | 0.96 (6.441) | 0.44 (1.654) | 0.98 (3.06) | 0.14 (0.36) | −1.51 (−3.654) | −0.48% (−0.907) | 29.20% |
| COLUMBIA | 0.85 (5.919) | 0.84 (3.312) | 0.92 (2.984) | 0.59 (1.536) | −0.75 (−1.861) | −0.10% (−0.191) | 21.60% |
| EGYPT | 0.83 (5.183) | 0.89 (3.217) | 0.69 (2.043) | 1.08 (2.584) | −1.38 (−3.079) | 0.11% (0.190) | 23.30% |
| GREECE | 1.39 (10.017) | 0.50 (1.982) | 1.33 (4.358) | 0.00 (−0.013) | −1.04 (−2.659) | −1.46% (−2.925) | 40.30% |
| HUNGARY | 1.57 (10.8) | 0.54 (2.161) | 0.93 (3.054) | 0.41 (1.091) | −0.95 (−2.342) | −0.03% (−0.063) | 47.40% |
| INDIA | 1.03 (8.455) | 1.15 (5.327) | −0.07 (−0.251) | 0.56 (1.735) | −0.66 (−1.928) | 0.01% (0.016) | 36.80% |
| INDONESIA | 1.41 (7.891) | 1.15 (3.563) | 0.74 (1.888) | 1.21 (2.496) | −0.74 (−1.465) | −0.77% (−1.196) | 25.40% |
| MALAYSIA | 0.86 (7.037) | 0.68 (3.096) | 0.26 (0.972) | 0.42 (1.254) | −0.28 (−0.817) | −0.39% (−0.882) | 21.40% |
| MEXICO | 1.06 (9.431) | 0.22 (1.084) | 0.26 (1.044) | −0.09 (−0.288) | −0.77 (−2.415) | 0.45% (1.103) | 38.40% |
| PERU | 1.01 (7.272) | 0.84 (3.387) | 0.24 (0.786) | 0.27 (0.714) | −0.20 (−0.526) | 0.31% (0.625) | 25.20% |
| PHILIPPINES | 0.97 (7.315) | 0.40 (1.679) | 0.27 (0.922) | 0.47 (1.296) | −0.17 (−0.46) | −0.14% (−0.298) | 20.70% |
| POLAND | 1.68 (8.545) | 1.14 (3.263) | 0.37 (0.865) | 0.00 (−0.007) | −0.08 (−0.137) | −0.02% (−0.033) | 31.70% |
| RUSSIA | 1.43 (6.375) | 0.34 (0.874) | 1.83 (3.871) | 0.42 (0.709) | −2.99 (−4.755) | 0.80% (0.998) | 35.40% |
| S. AFRICA | 1.28 (12.781) | 0.82 (4.614) | 0.14 (0.642) | 0.38 (1.42) | −0.01 (−0.018) | −0.33% (−0.927) | 48.10% |
| S. KOREA | 1.37 (9.259) | 0.79 (2.935) | −0.13 (−0.396) | 0.33 (0.809) | −0.27 (−0.653) | −0.16% (−0.295) | 32.70% |
| TAIWAN | 0.92 (7.429) | 0.55 (2.451) | 0.43 (1.578) | −0.17 (−0.516) | −0.78 (−2.236) | −0.13% (−0.283) | 29.60% |
| THAILAND | 1.27 (8.114) | 0.51 (1.811) | −0.01 (−0.039) | 0.73 (1.718) | −0.23 (−0.52) | −0.37% (−0.649) | 24.50% |
| TURKEY | 1.14 (4.993) | −0.41 (−0.989) | 0.39 (0.788) | −0.24 (−0.387) | −1.96 (−3.056) | 0.97% (1.180) | 21.90% |
| CZECH REP. | 0.91 (7.444) | 0.89 (4.208) | 0.85 (3.288) | 0.02 (0.071) | −0.90 (−2.624) | −0.17% (−0.381) | 36.90% |

AFP-5 FACTOR:

| Country | Rm−Rf | SMB | HMLDEV | QMJ | BAB | Alpha | R² |
| BRAZIL | 1.24 (5.694) | 0.10 (0.331) | 0.08 (0.42) | −0.49 (−1.25) | 0.16 (0.797) | 0.84% (1.085) | 19.10% |
| CHILE | 0.70 (6.829) | 0.23 (1.567) | 0.16 (1.704) | −0.08 (−0.456) | 0.04 (0.374) | 0.25% (0.681) | 24.40% |
| CHINA | 1.18 (7.383) | 0.23 (1.02) | 0.16 (1.207) | 0.14 (0.525) | 0.12 (0.826) | −0.90% (−1.620) | 26.30% |
| COLUMBIA | 0.55 (3.627) | 0.41 (1.911) | 0.54 (4.243) | −0.18 (−0.688) | 0.23 (1.727) | 0.13% (0.241) | 19.90% |
| EGYPT | 0.35 (2.07) | −0.34 (−1.418) | 0.11 (0.821) | −1.16 (−3.883) | 0.42 (2.75) | 0.92% (1.525) | 18.20% |
| GREECE | 0.92 (5.407) | −0.02 (−0.068) | 0.24 (1.588) | −0.65 (−2.122) | 0.34 (2.168) | −0.68% (−1.139) | 21.10% |
| HUNGARY | 1.24 (7.944) | −0.07 (−0.301) | 0.10 (0.775) | −0.78 (−2.845) | 0.51 (3.645) | 0.17% (0.315) | 42.70% |
| INDIA | 0.72 (5.257) | 0.48 (2.531) | −0.07 (−0.622) | −0.47 (−1.985) | 0.23 (1.965) | 0.16% (0.335) | 28.90% |
| INDONESIA | 1.10 (5.167) | 0.45 (1.498) | 0.26 (1.34) | 0.05 (0.119) | 0.25 (1.254) | 0.11% (0.151) | 13.80% |
| MALAYSIA | 0.78 (6.407) | 0.25 (1.466) | 0.12 (1.089) | 0.06 (0.288) | 0.18 (1.544) | −0.35% (−0.805) | 18.00% |
| MEXICO | 1.10 (9.473) | 0.16 (0.962) | −0.06 (−0.584) | −0.27 (−1.301) | 0.06 (0.595) | 0.58% (1.412) | 37.30% |
| PERU | 0.65 (4.285) | 0.21 (1.004) | 0.00 (−0.022) | −0.56 (−2.134) | 0.37 (2.834) | 0.56% (1.062) | 20.90% |
| PHILIPPINES | 0.90 (6.844) | −0.02 (−0.123) | 0.20 (1.697) | 0.01 (0.055) | 0.17 (1.369) | −0.19% (−0.416) | 19.30% |
| POLAND | 1.19 (5.488) | 0.22 (0.718) | 0.04 (0.207) | −0.61 (−1.638) | 0.31 (1.634) | 0.34% (0.449) | 24.50% |
| RUSSIA | 1.61 (6.553) | 0.17 (0.509) | 0.00 (0.007) | −0.27 (−0.631) | 0.27 (1.214) | 0.41% (0.466) | 27.10% |
| S. AFRICA | 0.92 (7.956) | 0.18 (1.105) | 0.07 (0.77) | −0.32 (−1.635) | 0.19 (1.835) | −0.02% (−0.047) | 37.90% |
| S. KOREA | 0.95 (6.245) | 0.34 (1.584) | 0.24 (1.78) | −0.08 (−0.308) | −0.38 (−2.672) | 0.25% (0.471) | 25.90% |
| TAIWAN | 0.76 (5.026) | 0.34 (1.604) | 0.40 (2.924) | −0.15 (−0.539) | −0.06 (−0.434) | 0.03% (0.058) | 19.20% |
| THAILAND | 1.26 (8.193) | 0.32 (1.479) | 0.26 (1.894) | 0.37 (1.339) | 0.04 (0.29) | −0.45% (−0.836) | 24.00% |
| TURKEY | 0.66 (2.697) | −0.36 (−1.026) | 0.22 (0.982) | −1.18 (−2.646) | −0.37 (−1.625) | 1.60% (1.839) | 14.20% |
| CZECH REP. | 0.57 (4.226) | 0.26 (1.372) | 0.14 (1.255) | −0.65 (−2.734) | 0.25 (2.075) | 0.12% (0.253) | 28.40% |

AFP-6 FACTOR:

| Country | Rm−Rf | SMB | HMLDEV | MOM | QMJ | BAB | Alpha | R² |
| BRAZIL | 1.26 (5.669) | 0.12 (0.387) | 0.18 (0.569) | 0.10 (0.392) | −0.48 (−1.216) | 0.12 (0.534) | 0.77% (0.976) | 19.10% |
| CHILE | 0.66 (6.378) | 0.18 (1.213) | −0.09 (−0.63) | −0.26 (−2.229) | −0.11 (−0.622) | 0.14 (1.339) | 0.42% (1.140) | 25.50% |
| CHINA | 1.12 (6.971) | 0.18 (0.805) | −0.16 (−0.786) | −0.35 (−2.036) | 0.11 (0.399) | 0.26 (1.657) | −0.69% (−1.234) | 27.40% |
| COLUMBIA | 0.52 (3.35) | 0.38 (1.76) | 0.33 (1.682) | −0.22 (−1.356) | −0.20 (−0.772) | 0.32 (2.154) | 0.26% (0.483) | 20.40% |
| EGYPT | 0.33 (1.937) | −0.35 (−1.466) | 0.00 (−0.018) | −0.13 (−0.677) | −1.16 (−3.893) | 0.47 (2.774) | 0.98% (1.609) | 18.30% |
| GREECE | 0.89 (5.176) | −0.05 (−0.197) | 0.09 (0.358) | −0.17 (−0.844) | −0.67 (−2.178) | 0.41 (2.315) | −0.57% (−0.934) | 21.30% |
| HUNGARY | 1.24 (7.867) | −0.06 (−0.282) | 0.14 (0.661) | 0.04 (0.229) | −0.78 (−2.834) | 0.49 (3.196) | 0.15% (0.275) | 42.70% |
| INDIA | 0.70 (5.026) | 0.46 (2.421) | −0.19 (−1.088) | −0.13 (−0.898) | −0.48 (−2.037) | 0.29 (2.158) | 0.24% (0.492) | 29.10% |
| INDONESIA | 1.10 (5.083) | 0.45 (1.479) | 0.26 (0.857) | 0.00 (0.007) | 0.05 (0.119) | 0.25 (1.115) | 0.11% (0.146) | 13.80% |
| MALAYSIA | 0.78 (6.292) | 0.25 (1.438) | 0.11 (0.651) | −0.01 (−0.053) | 0.06 (0.282) | 0.18 (1.401) | −0.34% (−0.775) | 18.00% |
| MEXICO | 1.09 (9.277) | 0.15 (0.914) | −0.09 (−0.545) | −0.03 (−0.225) | −0.28 (−1.312) | 0.08 (0.632) | 0.60% (1.425) | 37.30% |
| PERU | 0.66 (4.32) | 0.23 (1.061) | 0.09 (0.45) | 0.10 (0.604) | −0.55 (−2.09) | 0.33 (2.257) | 0.50% (0.933) | 21.00% |
| PHILIPPINES | 0.87 (6.514) | −0.06 (−0.329) | 0.01 (0.04) | −0.21 (−1.349) | −0.01 (−0.045) | 0.25 (1.831) | −0.06% (−0.122) | 19.80% |
| POLAND | 1.20 (5.432) | 0.22 (0.733) | 0.08 (0.278) | 0.04 (0.19) | −0.61 (−1.62) | 0.29 (1.372) | 0.31% (0.406) | 24.50% |
| RUSSIA | 1.56 (6.302) | 0.14 (0.419) | −0.28 (−0.862) | −0.30 (−1.109) | −0.28 (−0.656) | 0.38 (1.571) | 0.56% (0.632) | 27.50% |
| S. AFRICA | 0.95 (8.185) | 0.21 (1.298) | 0.28 (1.905) | 0.23 (1.835) | −0.30 (−1.524) | 0.09 (0.821) | −0.15% (−0.379) | 38.60% |
| S. KOREA | 0.92 (5.965) | 0.30 (1.4) | 0.07 (0.306) | −0.19 (−1.072) | −0.11 (−0.387) | −0.30 (−1.904) | 0.38% (0.687) | 26.20% |
| TAIWAN | 0.73 (4.777) | 0.31 (1.43) | 0.23 (1.089) | −0.18 (−0.999) | −0.17 (−0.611) | 0.01 (0.062) | 0.15% (0.268) | 19.50% |
| THAILAND | 1.22 (7.798) | 0.26 (1.204) | −0.02 (−0.096) | −0.30 (−1.689) | 0.34 (1.214) | 0.16 (1.019) | −0.26% (−0.462) | 24.60% |
| TURKEY | 0.58 (2.327) | −0.46 (−1.321) | −0.31 (−0.887) | −0.56 (−1.963) | −1.24 (−2.795) | −0.15 (−0.574) | 1.96% (2.219) | 15.20% |
| CZECH REP. | 0.59 (4.355) | 0.27 (1.46) | 0.30 (1.676) | 0.17 (1.142) | −0.64 (−2.71) | 0.19 (1.394) | 0.04% (0.075) | 28.80% |
CH 4-factor model, with an additional positive MOM factor beta of 4 bps, Alpha falls by another 4 bps and market risk goes up by 1 bp. In the case of the FF 5-factor model, we observe that Alpha improves by 4 bps and market risk goes down by 12 bps compared to the FF 3-factor model, largely due to the statistically significant negative CMA factor beta and the improvement of the HML factor beta to a statistically significant positive level, together with a positive RMW factor beta. This shows that not all the value drivers have a positive investment-factor bias; in other words, companies following aggressive investment strategies do not necessarily add value. Similarly, the AFP 5-factor and 6-factor models significantly reduce market risk when we add the QMJ, BAB and HML-WLAB factor betas, and Alpha increases significantly to 0.84%, mainly due to a significant reduction in the value premium once the look-ahead bias is removed and the impact of a large negative QMJ factor beta on market risk. This means there are overvalued junk stocks that can be avoided, and investors need to identify quality stocks, or stocks of companies that follow prudent investment policies that enhance their book-to-market ratio. Hence, a value strategy with stock-screening parameters of profitability, quality and low risk should work well in Brazil. If investors are using market factor models, then AFP's 5-factor model delivers better results.

Chile: In Chile, the AFP 6-factor model reduces market risk to its lowest level and generates the highest Alpha, followed by the CH 4-factor model. Interestingly, in both models the MOM factor has a high negative beta, and in the case of the former it is statistically significant as well. Hence, momentum should be avoided in Chile. The SMB factor beta, which is positive and statistically significant in the FF 3-factor, CH 4-factor and FF 5-factor models, goes down to a non-significant level in the AFP models.
In the FF 5-factor model, RMW, the profitability factor, has a positive beta, while CMA, the investment factor, has a negative beta. Interestingly, the HML-WLAB factor beta, which is positive in the AFP 5-factor model, turns negative in the AFP 6-factor model alongside the high negative MOM beta and an improvement in the BAB factor beta. The above observations show that the MOM, HML, CMA and QMJ factors have a direct relationship with market risk, while BAB has an inverse relationship with market risk and a strong positive impact on sourcing long-term Alpha. Hence, removing small stocks with a momentum bias and concentrating on small but high-quality stocks with high book-to-market ratios and low volatility may provide better performance.
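The direct-versus-inverse relationships just described can be checked empirically by correlating each factor's return series with the market excess return. A toy sketch with invented series (the loadings and magnitudes are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 312
mkt = rng.normal(0.005, 0.04, n)           # Rm - Rf
# Hypothetical factor series: HML co-moving with the market, BAB defensive.
hml = 0.3 * mkt + rng.normal(0.0, 0.02, n)
bab = -0.2 * mkt + rng.normal(0.0, 0.02, n)

corr_hml = np.corrcoef(mkt, hml)[0, 1]     # positive: direct relationship
corr_bab = np.corrcoef(mkt, bab)[0, 1]     # negative: inverse relationship
```

A factor whose returns correlate positively with the market adds to the fitted market exposure, while a defensively constructed factor such as BAB offsets it, which is the pattern the text attributes to Chile.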
China: In China, the CH 4-factor model has the highest Alpha, though its market risk is higher than in the FF 5-factor and all the other models. It is further observed that the statistically significant positive beta of the SMB factor and the statistically significant negative beta of MOM can enhance Alpha more than any other combination, though in the FF 5-factor model the HML and CMA betas are also statistically significant. The high negative beta of the CMA factor, as opposed to the high positive beta of the HML factor, shows that the risk associated with high book-to-price stocks is not matched proportionately by return. In AFP's 5-factor and 6-factor models, the factor betas of HML without the look-ahead bias and of MOM are again statistically significant. China has a significant direct relationship between the CMA factor and market risk (t-stat of −3.654), followed by MOM and HML in that order, and hence Alpha can be enhanced by betting on the SMB, QMJ, RMW and BAB factors. Interestingly, the mispricing impact of the look-ahead bias in HML is very large, with a significant variation between the HML factor of Fama–French and that of Asness, Frazzini and Pedersen. It also means that China is a growth market with possible mispricing in quality value stocks driven by the profitability factor but not the investment factor. Hence, such value stocks, preferably with low volatility, can be selected for better results. MOM should be avoided (t-stat of −2.04).

Columbia: In Columbia, the CAPM provides the highest Alpha and the overall market is more risk-return efficient. It is followed by the CH 4-factor model, with a statistically significant negative momentum beta. The SMB and HML factor betas of the FF 3-factor, CH 4-factor and FF 5-factor models are also statistically significant, while in the AFP 5-factor model only the value premium beta (HML without the look-ahead bias) is statistically significant and in the AFP 6-factor model only the BAB beta is statistically significant.
Interestingly, the market beta goes down drastically on adding the QMJ and BAB factors in AFP's 5-factor model, and additionally MOM in AFP's 6-factor model. Hence, Columbia appears to be a small market in which cross-dimensional factor risks are very prevalent. It could be worth building a new model with the SMB, HML and BAB factors, with or without the MOM factor, to determine whether it enhances the Alpha value further, though index investing is a suitable option.

Egypt: In Egypt, the CAPM beta of 0.86 is significantly reduced to 0.35 in the AFP 5-factor and to 0.33 in the AFP 6-factor model, which have a very high, statistically significant negative dimensional beta (−1.16 with t-stats
of −3.893 for the 6-factor and −1.16 with t-stats of −3.883 for the 5-factor model) on QMJ, and a very high, statistically significant positive dimensional beta (+0.47 with t-stats of +2.774 for the 6-factor and +0.42 with t-stats of +2.75 for the 5-factor model) on BAB. This highlights the positive impact of quality stocks and a low-risk strategy on alpha in Egypt, as alpha doubles under AFP's 6-factor model and nearly doubles under the 5-factor model, compared to the CAPM alpha. Interestingly, market risk increases in the FF 3-factor and CH 4-factor models, where only the SMB factor is statistically significant. In the FF 5-factor model, the SMB, HML and RMW factor betas are all high and positive, and the CMA factor beta has a very high, statistically significant negative value, with a t-stat of −3.079. Hence, if QMJ also proxies for the profitability and investment policies of companies listed on the Egyptian equity market, then adopting a low-risk strategy in small as well as high-value quality stocks would provide better results.

Greece: In Greece, the CAPM market beta is very high at 1.46 and gets further accentuated to 1.52 and 1.49 in the FF 3-factor and CH 4-factor models, respectively, with statistically significant positive betas on the SMB and HML factors. Though market risk goes down slightly in the FF 5-factor model under the impact of a statistically significant positive HML beta and a statistically significant negative CMA beta, its negative Alpha is only a notch smaller than those of the FF 3-factor and CH 4-factor models, and still far larger than the CAPM's negative Alpha. Only when the QMJ and BAB factors are added to SMB and HML in the AFP 5-factor model, and MOM in the 6-factor model, does the negative Alpha shrink significantly, to −0.68% and further to −0.57%, highlighting the advantage of a quality and low-risk strategy in the Greek equity market. In both cases, the QMJ and BAB factor betas are high and statistically significant.
The HML factor beta, which was very high and statistically significant in the FF 5-factor model, probably owed this to the look-ahead bias, as it drops drastically to a statistically insignificant level in the AFP models. Hence, in Greece, betting against beta, i.e., a low-risk strategy (BAB has a t-stat of 2.32), among quality and value stocks, while avoiding momentum- and investment-factor-led large junk stocks, may provide enhanced Alpha.

Hungary: Hungary has a market structure more or less similar to Greece's, though the CAPM is more efficient, as it has the highest Alpha. The AFP 5-factor and 6-factor models have generated better alphas with their
statistically significant QMJ and BAB factor betas than the FF 3-factor, CH 4-factor and FF 5-factor models. Moreover, these factor betas have reduced market risk to 1.24, compared to the FF 3-factor and CH 4-factor models. This shows that the factor risks associated with the size, value and momentum premiums, though statistically significant in the FF 3-factor, CH 4-factor and FF 5-factor models, have a complementary impact on market risk. The high, statistically significant negative beta of the CMA factor has reduced market risk to some extent, but not to the extent achieved by the QMJ and BAB factor betas in the AFP 5-factor and 6-factor models. This means that there may be companies which are following aggressive investment policies but not generating returns commensurate with their investments. Hence, in Hungary both the QMJ and BAB factors work best, and market risk gets accentuated by the CMA, QMJ and SMB factors. There is possibly some benefit in HML and MOM strategies, though to a less significant extent. On the whole, index investing could be a better option.

India: India is typically a growth market, where the SMB factor beta is very high and statistically significant across all the models. Market risk is diluted by 1 bp in the FF 3-factor model and 1 bp in the CH 4-factor model, heightened by 1 bp in the FF 5-factor model, but goes down significantly, by 31 bps, in the AFP 5-factor model and further, by 33 bps, in the AFP 6-factor model. The overall analysis shows that the value premium does not exist in India or, at best, is in excess, given its negative beta. The CMA factor takes away the majority of the excess value premium with its negative beta. However, the size premium is complemented by the statistically significant low-risk premium (the BAB factor beta), alongside a very high and statistically significant negative beta on QMJ, as the Alpha value almost doubles compared to the CAPM Alpha.
Hence, in India market inefficiency exists in the small-company and low-risk segments, and there is considerable scope for additional Alpha sourcing by adopting SMB (t-stat 2.42) and BAB (t-stat 2.16) strategies.

Indonesia: In Indonesia, SMB and HML in the FF 3-factor and CH 4-factor models, as well as HML and RMW in the FF 5-factor model, add statistically significant dimensional risk on top of market risk, which rises as these factors are added, and the alpha accordingly gets further diluted. Interestingly, the factor betas of the AFP 5-factor and 6-factor models, though not statistically significant, reduce market risk significantly, by 30 bps compared to the market risk of the FF 5-factor model, thereby enhancing
alpha from negative territory, −0.77%, to positive territory, 0.11%. Indonesia being a small market, the SMB (size), HML (value) and RMW (profitability) segments can add positive Alpha if further stock-picking is done with the low-risk (BAB) and quality (QMJ) factors in mind.

Malaysia: In Malaysia, the CAPM appears to be more efficient in delivering better alpha, though SMB has a statistically significant dimensional beta, as observed from the FF 3-factor, CH 4-factor and FF 5-factor models. Interestingly, the AFP 5-factor and 6-factor models do not provide any statistically significant indication of the dimensional betas' impact on Alpha, though they reduce market risk to some extent. Overall, we can say that the SMB, HML-WLAB and BAB dimensional betas have some impact on reducing market risk, and if investment risk (negative CMA beta) and momentum risk (negative MOM beta) are minimized within the SMB and HML segments, Alpha can be further enhanced.

Mexico: In Mexico, though the CAPM, FF 3-factor, CH 4-factor and FF 5-factor models highlight the overall impact of the SMB, HML, RMW and MOM factor betas on market risk, it appears that only the statistically significant negative CMA factor beta might have reduced overall market risk. However, these models have not been able to generate excess return in comparison to AFP's 5-factor and 6-factor models. In the latter models, a combination of SMB and BAB with HML-WLAB (without the look-ahead bias) and a high negative QMJ beta has worked better in enhancing Alpha. It also shows that the impact of the look-ahead bias is pretty large. Hence, a strategy focusing on small companies with profitability growth, but without the look-ahead bias, and a low-risk strategy focusing on large quality stocks while avoiding junk and investment-led risky stocks, can enhance Alpha.
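The per-country conclusions in this section follow a simple ranking: which model delivers the highest Alpha and which carries the lowest market beta. A minimal sketch of that ranking logic, using Mexico's figures from Table 19.2 (the dictionary layout is an assumption for illustration; the numbers are the table's):

```python
# Mexico's estimates from Table 19.2: Alpha (%) and market beta per model.
mexico = {
    "CAPM":  {"alpha": 0.29, "beta": 1.18},
    "FF-3":  {"alpha": 0.32, "beta": 1.17},
    "CH-4":  {"alpha": 0.26, "beta": 1.19},
    "FF-5":  {"alpha": 0.45, "beta": 1.06},
    "AFP-5": {"alpha": 0.58, "beta": 1.10},
    "AFP-6": {"alpha": 0.60, "beta": 1.09},
}

best_alpha_model = max(mexico, key=lambda m: mexico[m]["alpha"])
lowest_beta_model = min(mexico, key=lambda m: mexico[m]["beta"])
```

Consistent with the discussion above, the AFP specifications deliver the higher Alpha for Mexico, while the FF 5-factor model carries the lowest market beta.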
Peru: In Peru, the SMB betas are very high and statistically significant in FF 3-factor, CH 4-factor and FF 5-factor models, though they have increased market risk without being able to enhance Alpha. On the other hand, the AFP 5-factor and 6-factor models have significantly reduced market risk with high and statistically significant QMJ and BAB factor betas. Though these factors take away more than two-thirds of the risk associated with small companies, still, the alphas generated by them are not the highest as the CAPM model generates the highest Alpha. Hence, the overall market
S. S. Mohanty
appears to be more efficient than what the factor betas might contribute, and we conclude that index investing could be a better choice for Peru.

Philippines: In the Philippines, none of the factor betas are statistically significant, though the SMB, HML (including HML-WLAB), MOM, RMW and BAB factor betas are quite high and worth a mention. As observed, the CH 4-factor model has produced the highest Alpha and has reduced market risk to its lowest by taking away very high negative momentum risk from the pack of SMB and HML factors. Interestingly, the CAPM has produced the second highest Alpha. Between the AFP 5-factor and 6-factor models, the latter works better, as a considerable amount of market risk is taken off due to a high negative MOM factor beta and a high positive BAB factor beta. Hence, in this market, low risk and profitability factors would work well within the SMB and HML segments without taking advantage of momentum gains.

Poland: In Poland, the AFP 5-factor model produces the best Alpha, followed by the AFP 6-factor model, with quite high but not statistically significant QMJ and BAB factor betas as they reduce market risk considerably. Both models show some impact of the HML-WLAB factor too. The CAPM produces the 3rd largest Alpha. Hence, quality and low risk factors are most important in this market. On the other hand, although the FF 3-factor, CH 4-factor and FF 5-factor models indicate that the SMB factor has a statistically significant beta, the overall impact of all the factors does not improve Alpha. Hence, a low risk strategy with quality as a ranking parameter, particularly in select small and high book-to-price stocks, may provide the best results.

Russia: In Russia, the FF 5-factor model greatly enhances Alpha and reduces market risk to its lowest, and the CAPM produces the second best Alpha, though at a higher market risk as compared to the AFP models.
The FF 5-factor model with its very high and statistically significant negative CMA factor beta (−2.99, t-stats −4.755) as well as a high and statistically significant positive HML beta (1.83, t-stats 3.871), shows that considerable mispricing opportunity exists in the high book-to-price stocks led by profitability growth but not investment-led growth. Similarly, the AFP models’ results show that HML beta without the look-ahead bias drops significantly from 1.83 level to 0 and −0.28 levels and MOM factor beta turns negative with a negative QMJ factor beta in the AFP 5-factor and 6-factor models, respectively. This indicates overvaluation in
junk and momentum-led low book-to-price stocks. Hence, concentrating on low risk, high book-to-price and profitable small stocks may provide the best results.

South Africa: In South Africa, the CAPM produces the best Alpha, followed by the AFP 5-factor and 6-factor models. The AFP 5-factor model has a high but not statistically significant negative QMJ factor beta, and a high but not statistically significant positive BAB factor beta. The CH 4-factor and FF 5-factor models portray the statistically significant impact of the SMB and HML factor betas and the high but not statistically significant impact of a positive RMW factor beta. Interestingly, the momentum risk is also positive in the CH 4-factor and AFP 6-factor models. Hence, some amount of momentum risk is desirable in this market and probably there is value in profitability-driven stocks. It also shows that probably a large portion of small stocks are junk. South Africa is a typical market wherein the market risk has priced in all the factor risks in a more risk-return efficient manner; hence index investing can be a better option, although on a risk-averse and very selective basis some large value stocks with sound profitability growth may also enhance Alpha.

South Korea: In this market, the AFP 6-factor model provides the best result, followed by the AFP 5-factor model. As we observe from these two models, the BAB factor beta, in contrast to many other markets, is negative, and statistically significant in the latter model. The HML factor beta without the look-ahead bias is high; however, its impact diminishes after adding the momentum factor, which has a negative beta. Typically, the CH 4-factor model provides the 3rd highest Alpha due to its high and statistically significant positive SMB factor beta, and high and statistically significant negative MOM factor beta. Similarly, the FF 3-factor and 5-factor models also show a very high, statistically significant positive SMB factor beta and some positive RMW or profitability beta.
Overall, this market is highly influenced by HML (with look-ahead bias), MOM and CMA, and is not even suitable for a low risk strategy. Hence, return enhancement opportunities can only be found among selective large and high book-to-price stocks that are profitable, and momentum investing may be avoided.

Taiwan: In Taiwan, market risk is reduced and Alpha is enhanced to its highest with the AFP 6-factor model, followed by the AFP 5-factor model and the CH 4-factor model. As observed, market risk can be reduced by removing the momentum risk (the MOM factor beta is negative in the AFP 6-factor
model, and negative and statistically significant in the CH 4-factor model). Additionally, in the AFP 5-factor model, the HML factor beta without the look-ahead bias is positive and statistically significant too, though it goes down to a non-significant level in the AFP 6-factor model due to the negative MOM and QMJ factor betas. The BAB factor also has a negative beta and a very small positive beta in the AFP 5-factor and 6-factor models, respectively. Interestingly, the SMB factor has a statistically significant positive beta in the FF 3-factor, CH 4-factor and FF 5-factor models, but it goes down to a statistically non-significant level in the AFP 5-factor and 6-factor models. The FF 5-factor model also provides a statistically significant negative CMA factor beta and a negative RMW factor beta. Hence, stock selection in the small and high book-to-price segment with low volatility may provide enhanced Alpha.

Thailand: In Thailand, the CH 4-factor model provides the highest Alpha with the lowest market risk, contributed by its high negative MOM factor beta, negative HML factor beta and positive SMB factor beta. The CAPM provides the second highest Alpha, though its market risk ranks third. Results of the other models show that none of the factors are statistically significant. Interestingly, the HML-WLAB factor, which has a high positive beta in the AFP 5-factor model, turns into a small negative beta in the AFP 6-factor model after inclusion of the MOM factor. Hence, concentrating mainly on the SMB segment and investing in good-quality, profitable companies would provide the best results, as there may be some profitability-led quality, small and low volatility stocks in this market.

Turkey: In Turkey, the AFP 6-factor model enhances Alpha to its highest by reducing a significant amount of market risk, with all its factor betas being negative, of which the QMJ factor beta is the highest and statistically significant.
The AFP 5-factor model produces the second highest Alpha and its HML-WLAB factor beta is positive. Hence, the MOM factor is the key one that reduces market risk, as it turns the HML-WLAB factor beta negative in the AFP 6-factor model. Both the MOM and HML factor betas are negative and statistically significant in the CH 4-factor model too, and the latter is negative and statistically significant in the FF 3-factor model. This shows that some low book-to-price stocks may have high momentum associated with them and should be avoided. Interestingly, in the FF 5-factor model, the CMA factor beta is also high, statistically significant and negative; the RMW factor beta is negative and the SMB factor beta is negative. Hence,
Turkey appears to be an overheated market, particularly in the aggressively growing companies segment, where investment returns are not commensurate with market risk.

Czech Republic: In the Czech Republic, the AFP 5-factor model produces the highest Alpha, followed by the 6-factor model, by reducing market risk drastically, as QMJ has a high and statistically significant negative factor beta while the BAB factor has a high and statistically significant positive beta. Secondly, both the SMB and HML factor betas are high and statistically significant in the FF 3-factor, CH 4-factor and FF 5-factor models, with the latter having a high and statistically significant negative CMA factor beta as well. Overall, this market appears to have overvaluation situations in aggressive growth stocks that lack quality and could be highly speculative. Hence, a low risk strategy among large and high book-to-price stocks would provide the best results.

19.4.2 Factor loading and alpha enhancement

After the Gold Rush, chasing Alpha has been the greatest rush among academicians and practitioners across the global equity markets. Experts have developed many different models, both from the investors' perspective (demand) and the assets' perspective (supply), including external factors influencing the demand and supply of financial assets. Most of their efforts have been directed towards enhancing Alpha, or excess return. The various market models that we have considered here are all products of logical financial reasoning.

19.4.2.1 Developed markets

The Alphas obtained from the regression results of the above-mentioned 6 models across 22 developed markets show very interesting but divergent results, as depicted in Exhibit 19I. The CAPM generated the highest Alpha in 8 markets (Australia, Austria, Belgium, Ireland, New Zealand, Portugal, Switzerland and the UK). It was followed by the AFP 5-factor model, having the highest Alpha in 7 markets (Denmark, France, Germany, Holland, Italy, Norway and Spain).
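In each of these models, the Alpha is the intercept of a time-series regression of a market's excess return on the relevant factor returns. The following sketch is illustrative only: it uses synthetic factor series rather than the chapter's actual data, and the function name `ols_alpha` is our own. It estimates a CAPM Alpha and a Carhart 4-factor Alpha by ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 240  # 20 years of hypothetical monthly observations

# Synthetic factor returns: market excess return, SMB, HML, MOM
factors = rng.normal(0.0, 0.04, size=(T, 4))
true_betas = np.array([1.1, 0.4, 0.3, -0.2])
true_alpha = 0.002  # 0.2% per month
excess_ret = true_alpha + factors @ true_betas + rng.normal(0.0, 0.01, T)

def ols_alpha(y, X):
    """Regress y on a constant plus the columns of X; return (alpha, betas)."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[0], coef[1:]

capm_alpha, capm_beta = ols_alpha(excess_ret, factors[:, :1])  # market factor only
ch4_alpha, ch4_betas = ols_alpha(excess_ret, factors)          # Carhart 4-factor

print(f"CAPM alpha: {capm_alpha:.4%}, Carhart alpha: {ch4_alpha:.4%}")
```

The same routine extends to the 5- and 6-factor specifications simply by appending the RMW, CMA, QMJ or BAB columns to the factor matrix.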
The AFP 6-factor model generated the highest Alpha in 4 markets (Finland, Israel, Japan and Sweden). The Carhart 4-factor model generated the highest Alpha in the largest equity market, the US, and it was higher than the Alphas produced by the Fama–French 5-factor model, the Fama–French 3-factor model
Exhibit 19I: A Developed markets: CAPM and factor model Alpha.

COUNTRY        | CAPM   | FF-3 FACTOR | CH-4 FACTOR | FF-5 FACTOR | AFP-5 FACTOR | AFP-6 FACTOR
AUSTRALIA      | –0.14% | –0.25% | –0.30% | –0.44% | –0.28% | –0.40%
AUSTRIA        | –0.56% | –0.98% | –1.05% | –1.18% | –0.62% | –0.76%
BELGIUM        | –0.22% | –0.34% | –0.38% | –0.65% | –0.49% | –0.51%
CANADA         | –0.10% | –0.12% | –0.13% |  0.03% |  0.00% | –0.02%
DENMARK        |  0.16% |  0.07% |  0.04% | –0.04% |  0.19% |  0.18%
FINLAND        |  0.08% |  0.31% |  0.25% |  0.52% |  0.71% |  0.75%
FRANCE         | –0.24% | –0.30% | –0.34% | –0.36% | –0.11% | –0.17%
GERMANY        | –0.24% | –0.26% | –0.28% | –0.25% |  0.08% | –0.05%
HOLLAND        | –0.16% | –0.26% | –0.27% | –0.37% | –0.15% | –0.18%
HONGKONG       |  0.09% |  0.08% |  0.08% |  0.17% |  0.11% | –0.01%
IRELAND        | –0.46% | –0.64% | –0.61% | –0.71% | –0.59% | –0.61%
ISRAEL         | –0.22% |  0.11% |  0.08% |  0.26% |  0.59% |  0.75%
ITALY          | –0.51% | –0.70% | –0.71% | –0.56% | –0.26% | –0.32%
JAPAN          | –0.50% | –0.57% | –0.61% | –0.53% | –0.31% | –0.23%
NEW ZEALAND    | –0.20% | –0.34% | –0.31% | –0.58% | –0.67% | –0.67%
NORWAY         | –0.32% | –0.56% | –0.67% | –0.70% | –0.19% | –0.34%
PORTUGAL       | –0.55% | –0.65% | –0.67% | –0.78% | –0.67% | –0.59%
SPAIN          | –0.25% | –0.39% | –0.39% | –0.30% | –0.15% | –0.17%
SWEDEN         |  0.00% |  0.13% |  0.19% |  0.25% |  0.53% |  0.57%
SWITZERLAND    |  0.16% |  0.04% |  0.00% | –0.08% | –0.05% | –0.05%
UNITED KINGDOM | –0.34% | –0.41% | –0.42% | –0.61% | –0.49% | –0.56%
UNITED STATES  |  0.06% |  0.14% |  0.18% |  0.16% |  0.02% |  0.00%

ALPHA RANKING (count of markets) | CAPM | FF-3 | CH-4 | FF-5 | AFP-5 | AFP-6
HIGHEST ALPHA                    |  8   |  0   |  1   |  2   |  7    |  4
2ND HIGHEST ALPHA                |  2   |  4   |  1   |  1   |  8    |  6
3RD HIGHEST ALPHA                |  7   |  3   |  4   |  3   |  1    |  5
4TH HIGHEST ALPHA                |  2   |  9   |  4   |  5   |  4    |  1
5TH HIGHEST ALPHA                |  0   |  5   |  7   |  0   |  1    |  3
LOWEST ALPHA                     |  3   |  1   |  5   |  11  |  1    |  3
TOTAL                            |  22  |  22  |  22  |  22  |  22   |  22
and the CAPM, in that order. Hence, if we go by the efficient market hypothesis, the first 8 markets, in which the CAPM Alpha is highest, are probably the most efficient markets, as market risk explains most of their characteristics, while the other markets are prone to factor-specific risk. We present in Exhibit 19I: B a chart depicting the edge that factor models have over the CAPM in generating Alpha in developed markets, so that readers can gain some insight into the markets where factor models work and where they do not.
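The enhancement plotted in Exhibit 19I: B is simply each factor model's Alpha minus the CAPM Alpha for the same market. A minimal sketch of that computation, using the Sweden and Austria rows of Exhibit 19I as sample inputs (the helper name `excess_over_capm` is ours):

```python
# Monthly Alphas (%) for two markets, taken from Exhibit 19I
alphas = {
    "SWEDEN":  {"CAPM": 0.00, "FF-3": 0.13, "CH-4": 0.19,
                "FF-5": 0.25, "AFP-5": 0.53, "AFP-6": 0.57},
    "AUSTRIA": {"CAPM": -0.56, "FF-3": -0.98, "CH-4": -1.05,
                "FF-5": -1.18, "AFP-5": -0.62, "AFP-6": -0.76},
}

def excess_over_capm(row):
    """Each factor model's Alpha enhancement relative to CAPM, in % per month."""
    capm = row["CAPM"]
    return {model: round(a - capm, 2) for model, a in row.items() if model != "CAPM"}

for country, row in alphas.items():
    print(country, excess_over_capm(row))
```

A positive entry means the factor model added Alpha over the CAPM in that market; a negative entry means the CAPM did better.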
Exhibit 19I: B Alpha enhancement in developed markets: Market factor models' excess Alpha over CAPM Alpha. [Bar chart covering the 22 developed markets; for each market, bars show the FF 3-factor, CH 4-factor, FF 5-factor, AFP 5-factor and AFP 6-factor Alphas minus the CAPM Alpha, on an axis running from about −0.8% to +1.0% per month.]
19.4.2.2 Emerging markets

The Alphas obtained from the regression results of the above-mentioned 6 models across 21 emerging markets show very interesting but quite different results than in the developed markets, as depicted in Exhibit 19II. Our findings show that, among the emerging markets, the AFP 6-factor model has generated the highest Alpha in 9 markets (Chile, Egypt, Greece, India, Indonesia, Mexico, South Korea, Taiwan and Turkey). The CAPM has generated the highest Alpha in 5 markets (Columbia, Hungary, Malaysia, Peru and South Africa); the AFP 5-factor model has generated the highest Alpha in 4 markets (Brazil, Indonesia, Poland and the Czech Republic); and the CH
Exhibit 19II: A Emerging markets: CAPM and factor model Alpha.

COUNTRY     | CAPM   | FF-3 FACTOR | CH-4 FACTOR | FF-5 FACTOR | AFP-5 FACTOR | AFP-6 FACTOR
BRAZIL      |  0.65% |  0.57% |  0.53% |  0.61% |  0.84% |  0.77%
CHILE       |  0.22% |  0.15% |  0.30% |  0.15% |  0.25% |  0.42%
CHINA       | –0.52% | –0.63% | –0.37% | –0.48% | –0.90% | –0.69%
COLUMBIA    |  0.38% |  0.05% |  0.31% | –0.10% |  0.13% |  0.26%
EGYPT       |  0.49% |  0.40% |  0.39% |  0.11% |  0.92% |  0.98%
GREECE      | –1.21% | –1.58% | –1.47% | –1.46% | –0.68% | –0.57%
HUNGARY     |  0.25% |  0.03% | –0.13% | –0.03% |  0.17% |  0.15%
INDIA       |  0.13% |  0.17% |  0.20% |  0.01% |  0.16% |  0.24%
INDONESIA   | –0.05% | –0.33% | –0.38% | –0.77% |  0.11% |  0.11%
MALAYSIA    | –0.10% | –0.23% | –0.21% | –0.39% | –0.35% | –0.34%
MEXICO      |  0.29% |  0.32% |  0.26% |  0.45% |  0.58% |  0.60%
PERU        |  0.57% |  0.42% |  0.25% |  0.31% |  0.56% |  0.50%
PHILIPPINES |  0.16% |  0.04% |  0.21% | –0.14% | –0.19% | –0.06%
POLAND      |  0.28% |  0.01% | –0.02% | –0.02% |  0.34% |  0.31%
RUSSIA      |  0.69% |  0.57% |  0.50% |  0.80% |  0.41% |  0.56%
S. AFRICA   |  0.00% | –0.16% | –0.25% | –0.33% | –0.02% | –0.15%
S. KOREA    | –0.05% | –0.03% |  0.33% | –0.16% |  0.25% |  0.38%
TAIWAN      | –0.21% | –0.27% |  0.01% | –0.13% |  0.03% |  0.15%
THAILAND    | –0.07% | –0.08% |  0.14% | –0.37% | –0.45% | –0.26%
TURKEY      |  0.28% |  0.60% |  1.09% |  0.97% |  1.60% |  1.96%
CZECH REP.  | –0.03% | –0.25% | –0.27% | –0.17% |  0.12% |  0.04%

ALPHA RANKING (count of markets) | CAPM | FF-3 | CH-4 | FF-5 | AFP-5 | AFP-6
HIGHEST ALPHA                    |  5   |  0   |  3   |  1   |  4    |  9
2ND HIGHEST ALPHA                |  4   |  0   |  5   |  1   |  8    |  3
3RD HIGHEST ALPHA                |  6   |  6   |  2   |  1   |  2    |  4
4TH HIGHEST ALPHA                |  1   |  8   |  1   |  5   |  2    |  4
5TH HIGHEST ALPHA                |  4   |  4   |  4   |  4   |  1    |  1
LOWEST ALPHA                     |  1   |  3   |  6   |  9   |  4    |  0
TOTAL                            |  21  |  21  |  21  |  21  |  21   |  21
4-factor model in 3 markets (China, Philippines and Thailand). While the Fama–French 3-factor model has not generated the highest Alpha in any of the emerging markets, their 5-factor model has generated the highest Alpha in one market, Russia. Hence, going by the efficient market hypothesis, emerging markets are less mature and prone to factor risk (beta), except probably Columbia, Hungary, Malaysia, Peru and South Africa, which appear to be more efficient. We present in Exhibit 19II: B a chart depicting the edge that factor models have over the CAPM in generating Alpha in emerging markets, so that readers can get an overview of the countries where the factor models work and where they do not.
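The tallies above can be reproduced mechanically: for each market, pick the model with the largest Alpha, then count the wins per model. A sketch over three rows of Exhibit 19II (values as reported there; the helper name `best_model` is ours):

```python
from collections import Counter

# A subset of Exhibit 19II: monthly Alphas (%) per model and market
alphas = {
    "BRAZIL": {"CAPM": 0.65, "FF-3": 0.57, "CH-4": 0.53,
               "FF-5": 0.61, "AFP-5": 0.84, "AFP-6": 0.77},
    "RUSSIA": {"CAPM": 0.69, "FF-3": 0.57, "CH-4": 0.50,
               "FF-5": 0.80, "AFP-5": 0.41, "AFP-6": 0.56},
    "TURKEY": {"CAPM": 0.28, "FF-3": 0.60, "CH-4": 1.09,
               "FF-5": 0.97, "AFP-5": 1.60, "AFP-6": 1.96},
}

def best_model(row):
    """Name of the model with the highest Alpha for one market."""
    return max(row, key=row.get)

wins = Counter(best_model(row) for row in alphas.values())
print(wins)
```

Run over all 21 emerging markets, this count reproduces the 9/5/4/3/1/0 split reported in the text (ties, as in Indonesia, being credited to both models).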
Exhibit 19II: B Alpha enhancement in emerging markets: Market factor models' excess Alpha over CAPM Alpha. [Bar chart covering the 21 emerging markets; for each market, bars show the FF 3-factor, CH 4-factor, FF 5-factor, AFP 5-factor and AFP 6-factor Alphas minus the CAPM Alpha, on an axis running from about −1.0% to +2.0% per month.]
19.5 Conclusion and Relevance of Factor Risk

In order to provide readers a bird's-eye view of market characteristics, the relevance of factor risk and the recommended investment strategy, we have provided a summarized chart for the developed markets in Exhibit 19III: A and for the emerging markets in Exhibit 19III: B.

19.5.1 Developed markets

As far as the various market factor models and their impact on market risk are concerned, it is observed that among the 22 developed markets studied, the AFP 5-factor model has reduced market risk to its lowest in 10 countries, of which 7 have generated the highest Alpha. This is followed by the AFP 6-factor model, which has reduced market risk to its minimum in 5 countries, of which 4 have generated the highest Alpha. The CAPM has the lowest market risk in 5 countries, but the highest Alpha in 8 countries. Compared to the above, the FF 5-factor model has produced the lowest market risk in 2 countries and the highest Alpha in 2 countries.
Exhibit 19III: A Developed markets.

Country | Model that generated highest alpha | Statistically significant market factors | Recommended investment strategy
AUSTRALIA | CAPM | Positive: SMB, HML, RMW, HML-WLAB, BAB | INDEX INVESTING, QUALITY WITH PROFITABILITY GROWTH AND LOW RISK
AUSTRIA | CAPM | Positive: SMB, HML, RMW, HML-WLAB, BAB; Negative: CMA | INDEX INVESTING, LOW RISK STRATEGY WITH VALUE PREMIUM AND PROFITABILITY GROWTH
BELGIUM | CAPM | Positive: HML, RMW, HML-WLAB, QMJ, BAB | INDEX INVESTING, LOW RISK STRATEGY WITH VALUE PREMIUM AND PROFITABILITY GROWTH
CANADA | FF 5-FACTOR | Positive: SMB, HML; Negative: CMA, QMJ | STOCK SELECTION WITH VALUE AND QUALITY BASED STRATEGY IN SMALL STOCK SEGMENT; HIGHLY AGGRESSIVE AND JUNK STOCKS SHOULD BE AVOIDED
DENMARK | AFP 5-FACTOR | Positive: SMB, HML, BAB | STOCK SELECTION WITH LOW RISK, PROFITABILITY-LED HIGH VALUE STRATEGY
FINLAND | AFP 6-FACTOR | Negative: SMB, HML, BAB | LOW RISK STRATEGY
FRANCE | AFP 5-FACTOR | Negative: SMB; Positive: HML | STOCK SELECTION WITH HIGH VALUE, QUALITY AND PROFITABILITY CRITERIA
GERMANY | AFP 5-FACTOR | Positive: HML-WLAB, MOM | VALUE WITH MARKET NEUTRAL STRATEGY
HOLLAND | AFP 5-FACTOR | Negative: BAB; Positive: HML, HML-WLAB | HIGH VALUE LARGE STOCK
HONGKONG | FF 5-FACTOR | Negative: SMB, CMA; Positive: HML-WLAB | CONSERVATIVE HIGH VALUE STOCK
IRELAND | CAPM | Positive: HML, HML-WLAB, BAB | VALUE AND LOW RISK STRATEGY
ISRAEL | AFP 6-FACTOR | Positive: SMB; Negative: HML, HML-WLAB, QMJ | AVOID LOW VALUE JUNK STOCKS
ITALY | AFP 5-FACTOR | Positive: HML | VALUE STRATEGY
JAPAN | AFP 6-FACTOR | Positive: SMB, CMA, BAB; Negative: HML-WLAB | LOW RISK AND CONSERVATIVELY GROWING STOCKS
NEW ZEALAND | CAPM | Positive: SMB, HML, HML-WLAB, BAB | INDEX INVESTING, HIGH VALUE, LOW RISK AND PROFITABILITY-LED GROWTH STOCKS
NORWAY | AFP 5-FACTOR | Positive: SMB, HML, RMW, HML-WLAB, MOM, BAB; Negative: CMA, QMJ | HIGH VALUE, LOW RISK AND PROFITABILITY-LED GROWTH STOCKS
PORTUGAL | CAPM | Positive: BAB | INDEX INVESTING, LOW RISK STRATEGY
SPAIN | AFP 5-FACTOR | Positive: HML, HML-WLAB | VALUE STRATEGY
SWEDEN | AFP 6-FACTOR | Positive: SMB; Negative: HML, BAB | STOCK SELECTION WITH SMALL STOCKS, PROFITABILITY AND CONSERVATIVE INVESTMENT STRATEGY
SWITZERLAND | CAPM | Positive: HML, RMW, HML-WLAB, BAB; Negative: SMB | VALUE STRATEGY WITH PROFITABILITY AND INVESTMENT-LED GROWTH
UNITED KINGDOM | CAPM | Positive: HML, RMW, HML-WLAB, QMJ, BAB; Negative: SMB | INDEX INVESTING, HIGH VALUE, LOW RISK AND PROFITABILITY-LED GROWTH STOCKS
UNITED STATES | CH 4-FACTOR | Positive: HML-WLAB, QMJ; Negative: SMB, HML, BAB, CMA | QUALITY VALUE STOCKS
Exhibit 19III: B Emerging markets.

Country | Model that generated highest alpha | Statistically significant market factors | Recommended strategy
BRAZIL | AFP 5-FACTOR | Positive: HML; Negative: CMA | VALUE, QUALITY, PROFITABILITY, LOW RISK
CHILE | AFP 6-FACTOR | Positive: SMB; Negative: MOM | SMALL, QUALITY, LOW RISK
CHINA | CH 4-FACTOR | Positive: SMB, HML; Negative: MOM, CMA | SMALL, VALUE, QUALITY, LOW RISK
COLUMBIA | CAPM | Positive: SMB, HML, HML-WLAB, BAB; Negative: MOM | INDEX INVESTING, VALUE, LOW RISK
EGYPT | AFP 6-FACTOR | Positive: SMB, HML, RMW, BAB; Negative: CMA, QMJ | LOW RISK, QUALITY, PROFITABLE, VALUE
GREECE | AFP 6-FACTOR | Positive: SMB, HML, RMW, BAB; Negative: CMA, QMJ | LOW RISK, QUALITY, PROFITABLE, VALUE
HUNGARY | CAPM | Positive: SMB, HML, BAB; Negative: CMA, QMJ | LOW RISK, SMALL, VALUE
INDIA | AFP 6-FACTOR | Positive: SMB, BAB; Negative: QMJ | SMALL AND LOW RISK
INDONESIA | AFP 5-FACTOR, AFP 6-FACTOR | Positive: SMB, HML, RMW | SMALL, VALUE, PROFITABLE
MALAYSIA | CAPM | Positive: SMB | SMALL
MEXICO | AFP 6-FACTOR | Negative: CMA | LOW RISK, LARGE, QUALITY
PERU | CAPM | Positive: SMB, BAB; Negative: QMJ | INDEX INVESTING, SMALL, LOW RISK
PHILIPPINES | CH 4-FACTOR | No significant factors | LOW RISK, PROFITABLE
POLAND | AFP 5-FACTOR | Positive: SMB | SMALL, LOW RISK, QUALITY
RUSSIA | FF 5-FACTOR | Positive: HML; Negative: CMA | VALUE, PROFITABILITY, LOW RISK
S. AFRICA | CAPM | Positive: SMB, HML | INDEX INVESTING, SMALL, VALUE
S. KOREA | AFP 6-FACTOR | Positive: SMB; Negative: MOM, BAB | SMALL, PROFITABLE AND LOW RISK
TAIWAN | AFP 6-FACTOR | Positive: SMB, HML-WLAB; Negative: MOM, CMA | SMALL, MOMENTUM, VALUE, LOW RISK
THAILAND | CH 4-FACTOR | No significant factors | AVOID MOMENTUM
TURKEY | AFP 6-FACTOR | Negative: HML, MOM, CMA, QMJ | OVER-HEATED
CZECH REP. | AFP 5-FACTOR | Positive: SMB, HML, BAB; Negative: CMA, QMJ | SMALL, LOW RISK, VALUE
WORLD | CAPM | Positive: HML, HML-WLAB, RMW, QMJ; Negative: SMB | VALUE, PROFITABLE AND QUALITY
Hence, contrary to the risk-return efficiency framework, we find that lower market risk results in higher excess return in 19 out of the 22 developed markets, which is a major anomaly. However, although in the majority of the markets the AFP models have reduced market risk (15 countries) and enhanced Alpha (11 countries), it is also very interesting to note that the CAPM ranks second in generating excess returns in the developed markets. We find that each market is unique in its composition and trend, even over a long time horizon, and hence a generalized approach to asset allocation cannot be adopted across all the markets.

Considering the factor impact on market risk, the HML factor, both with and without its look-ahead bias, is the most pervasive factor, with a statistically significant beta in 17 of the 22 developed markets in the case of the FF 3-factor and CH 4-factor models; in 13 of the 22 developed markets in the case of the AFP 5-factor and 6-factor models; and in half of them as per the FF 5-factor model. The beta values of the HML factor are also mostly positive, except for 5 markets (Finland, Hong Kong, Israel, Sweden and the US) as per the FF 3-factor model. After adding the other factors (MOM in the CH 4-factor model; RMW and CMA in the FF 5-factor model; QMJ and BAB in the AFP 5-factor model; QMJ, BAB and MOM in the AFP 6-factor model), the negative beta value of HML gets compensated by the negative beta values of one or more of these factors. Hence, it can be concluded that the profitability, investment and quality factors could have contributed to the high-minus-low book-to-market factor, or the value premium, and some of the value premium could have positive momentum as well (for instance, compensated by a positive momentum beta in the CH 4-factor model).
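The "statistically significant beta" statements throughout this section rest on the OLS t-statistics of the factor loadings. A minimal sketch of how such t-statistics are computed, again on synthetic series and with conventional homoskedastic standard errors (the chapter's own estimates may use different inference):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 240                                    # months of hypothetical data
X = rng.normal(0.0, 0.04, size=(T, 2))     # two synthetic factor return series
y = 0.001 + X @ np.array([1.0, 0.5]) + rng.normal(0.0, 0.02, T)

Z = np.column_stack([np.ones(T), X])       # prepend an intercept (the Alpha)
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
resid = y - Z @ coef
sigma2 = resid @ resid / (T - Z.shape[1])  # residual variance, dof-corrected
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Z.T @ Z)))
t_stats = coef / se

# |t| > 1.96 is the usual 5% two-sided significance cut-off
significant = np.abs(t_stats) > 1.96
```

With monthly data over long horizons, loadings as large as those reported for Russia (CMA beta −2.99, t-statistic −4.755) clear this cut-off comfortably, while many of the "high but not statistically significant" betas discussed above do not.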
The second most pervasive factor is the BAB factor, as it generates a statistically significant beta in 13 markets in the case of the AFP 5-factor model and in 11 markets in the case of the AFP 6-factor model. Interestingly, BAB has a negative beta in 8 of these markets as per the AFP 5-factor model and in 7 of them as per the AFP 6-factor model. In most of these markets, due to the negative BAB factor risk, the market risk also goes down and hence Alpha is enhanced. According to the observations made from the FF 3-factor and CH 4-factor models, the size factor SMB is the third most pervasive factor among the developed markets, as it generates a statistically significant beta in 11 markets; its statistical significance goes down to 8 markets in the FF 5-factor model and further to 4 markets in the AFP models. The RMW factor beta is positive in 14 markets and statistically significant in 7 markets: the UK, Switzerland, Norway, New Zealand,
Belgium, Austria and Australia. Incidentally, there are 8 markets in which the RMW beta is negative, but none of them are statistically significant. On the other hand, the CMA factor beta is negative in 17 markets and positive in the other 5, and it is negative and statistically significant in Austria, Canada, Hong Kong, Norway and the US, but positive and statistically significant in Japan. We found negative QMJ factor betas in 15 and 16 markets, and positive QMJ factor betas in 7 and 6 markets, as per the AFP 5-factor and 6-factor models, respectively, out of which Canada, Israel and Norway have statistically significant negative QMJ factor betas, and Belgium, the UK and the US have statistically significant positive QMJ factor betas.

19.5.2 Emerging markets

As far as the emerging markets are concerned, most of them are not risk/return efficient, or the CAPM beta is not the true measure of market risk. Betting Against Beta (or the low risk strategy) has a positive beta and has enhanced Alpha in 18 of the emerging markets, the exceptions being South Korea, Taiwan and Turkey. This highlights the riskier aspects of emerging markets. It is also positive and statistically significant in 5 markets (Egypt, Greece, Hungary, Peru and the Czech Republic) and in India too (India has a negative and statistically significant QMJ beta when momentum is added in the AFP 6-factor model). We also observe that when we add the momentum premium to the model, the significant negative beta of the low risk strategy gets diluted to such an extent that it becomes positive for Taiwan. Hence, while allocating funds among emerging markets, a low risk strategy could provide additional incremental return, although in South Korea, Taiwan and Turkey going against such a strategy may provide better returns. Typically, emerging markets are driven by small companies as they are growth markets.
Hence, the SMB factor is positive and statistically significant in 14 out of the 21 markets in the FF 3-factor and CH 4-factor models, and slightly less significant in Greece in the FF 5-factor model. In the AFP 5-factor and 6-factor models, the SMB factor loses much of its significance, except in India, where it is still statistically significant, indicating strong undervaluation in the small companies segment. The AFP models also indicate that in countries like Egypt, Greece, Hungary, the Philippines and Turkey, small companies have overvaluation situations, as they have negative factor betas. The investment factor (CMA) is the next most pervasive factor, being statistically significant and high in 10 markets. Interestingly, the CMA factor is negative in all the emerging markets, and it shows that investment-led
growth has overdriven valuation in most of these emerging markets, to the extent of a highly euphoric overreaction in Russia, China, Egypt, Turkey, Greece, Brazil and the Czech Republic, compared to other emerging markets such as South Africa, Poland, the Philippines, Peru, South Korea, Thailand, Chile and India. This could also be interpreted as investment-factor-led price returns being below market expectations; in other words, there could be some major structural issues in these economies which have impacted the investment factor, causing actual returns to remain significantly below expectation. Thus, the investment factor beta has increased market risk in all the emerging markets.

The HML factor is also a very important factor, to the extent that it is high and statistically significant in 8 of the 21 emerging markets, as per the FF 3-factor, CH 4-factor and FF 5-factor models. It is a positive beta factor in most of the emerging markets, except for India, Mexico, South Korea, Thailand and Turkey. It is observed that momentum as a factor has also increased market risk and is statistically significant and highly negative in 5 markets: South Korea, Taiwan, Turkey, Columbia and China. Momentum has a negative beta in 11 out of the 21 emerging markets as per the CH 4-factor model, which expands further to 14 in the AFP 6-factor model. Hence, it is noted that a momentum-led return strategy may not be the right strategy in the emerging markets.

The impact of profitability as a factor is risk-positive in most of the countries, but statistically significant only in 2 countries, Egypt and Indonesia. However, it is risk-negative in Mexico, Poland, Taiwan and Turkey. In a multi-factor setting, while other factors also influence this factor, typically this means a positive profitability factor beta can reduce market risk and enhance Alpha in most of these markets, whereas it is just the opposite in the 4 markets mentioned earlier.
It also shows a profitability overdrive in these 4 markets, as compared to the other markets. Egypt and Indonesia could be the best countries where there is a lot of scope for reaping the profitability premium. It is observed that though HML is highly statistically significant in 8 markets in the FF 3-factor, CH 4-factor and FF 5-factor models, the HML-WLAB results show that much of the value premium significance is lost in the AFP models, except for Columbia and Taiwan, where this factor beta remains statistically significant. Hence, when look-ahead bias is removed from the sample data by considering historical earnings and prices of the same date, much of the value impact goes away. However, our observations
on value risk-positive or risk-negative countries remain the same, and the value premium remains a very important factor beta. The most interesting of all the findings is that the QMJ factor beta is statistically significant in 7 of the 21 emerging markets. Moreover, the quality premium, which adds a few more factors such as growth, safety and a high payout ratio to the profitability factor, is risk-negative in most of the countries and risk-positive only in China, Malaysia, Indonesia, the Philippines and Thailand. This shows that in these 5 markets junk stocks have pushed up valuation, whereas in the other 16 markets the quality premium still holds good and will enhance alpha with its positive risk premium. Overall for the emerging markets, the AFP 6-factor model has reduced market risk in 17 of the 21 countries and hence enhanced alpha significantly, whereas a combination of the FF 5-factor betas has reduced market risk in 3 markets (China, Mexico and Russia), and the CH 4-factor betas have reduced market risk to the lowest level in Thailand only. As far as the world equity markets as a whole (both developed and emerging) are concerned, interestingly, all the models generate more or less equal alpha and carry similar market risk. However, the SMB factor has the most pronounced and statistically significant negative beta, followed by a positive and statistically significant HML-WLAB beta (HML beta), a statistically significant positive QMJ beta, a statistically significant positive RMW beta, a negative but high BAB beta, a small but negative CMA beta and a very negligible positive momentum beta. This shows that in the world equity markets, large companies are overvalued and most sought after, even though the markets have some positive momentum. Hence, the value premium combined with the profitability and quality factors can generate excess returns with some amount of momentum. Low volatility could be associated with some of the large stocks that are led by aggressive investment policies.
Bibliography

Asness, C. S., Frazzini, A. and Pedersen, L. H. (2012). Leverage Aversion and Risk Parity, Financial Analysts Journal, 68(1), 47–59.
Asness, C. S., Frazzini, A. and Pedersen, L. H. (2017). Quality Minus Junk, Working Paper (available at http://dx.doi.org/10.2139/ssrn.2312432).
Black, F., Jensen, M. C. and Scholes, M. (1972). The Capital Asset Pricing Model: Some Empirical Tests, in Studies in the Theory of Capital Markets, Praeger, 79–121.
Carhart, M. M. (1997). On Persistence in Mutual Fund Performance, Journal of Finance, 52(1), 57–82.
Cootner, P. (1964). Stock Prices: Random vs. Systematic Changes, in The Random Character of Stock Market Prices, MIT Press, Cambridge, MA, pp. 231–252.
Fama, E. F. (1970). Efficient Capital Markets: A Review of Theory and Empirical Work, Journal of Finance, 25(2), 383–417.
Fama, E. F. (1991). Efficient Capital Markets: II, Journal of Finance, 46(5), 1575–1617.
Fama, E. F. (1998). Market Efficiency, Long-term Returns, and Behavioral Finance, Journal of Financial Economics, 49(3), 283–306.
Fama, E. F. and French, K. R. (2015). A Five-factor Asset Pricing Model, Journal of Financial Economics, 116, 1–22.
Frazzini, A. and Pedersen, L. H. (2014). Betting against Beta, Journal of Financial Economics, 111, 1–25.
Hicks, J. R. (1939). Value and Capital, Oxford University Press, p. 126.
Jegadeesh, N. and Titman, S. (1993). Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency, Journal of Finance, 48, 65–91.
Jensen, M. C. (1986). Agency Costs of Free Cash Flow, Corporate Finance, and Takeovers, American Economic Review, 76(2), 323–329.
Markowitz, H. (1959). Portfolio Selection: Efficient Diversification of Investments, John Wiley & Sons, New York.
Markowitz, H. (1991). Foundations of Portfolio Theory, Journal of Finance, 46, 469–477.
Merton, R. C. (1973). An Intertemporal Capital Asset Pricing Model, Econometrica, 41, 867–887.
Osborne, M. F. M. (1959). Brownian Motion in the Stock Market, Operations Research, 7(2), 145–173.
Tobin, J. (1958). Liquidity Preference as Behavior Towards Risk, Review of Economic Studies, 25(2), 65–86.
Chapter 20

Support Vector Machines Based Methodology for Credit Risk Analysis

Jianping Li, Mingxi Liu, Cheng Few Lee and Dengsheng Wu

Jianping Li, Institutes of Science and Development, Chinese Academy of Sciences, e-mail: [email protected]
Mingxi Liu, Institutes of Science and Development, Chinese Academy of Sciences, e-mail: [email protected]
Cheng Few Lee, Rutgers University, e-mail: [email protected]
Dengsheng Wu, Institutes of Science and Development, Chinese Academy of Sciences, e-mail: [email protected]

Contents
20.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 793
20.2 Support Vector Machines . . . . . . . . . . . . . . . . . . . . 796
     20.2.1 Standard support vector machine . . . . . . . . . . . 796
     20.2.2 Feature extraction methods . . . . . . . . . . . . . . 797
     20.2.3 Kernel functions of SVM . . . . . . . . . . . . . . . . 800
     20.2.4 Hyper-parameter optimization methods . . . . . . . . . 800
20.3 Data and SVM-Based Methodology . . . . . . . . . . . . . . . . 805
     20.3.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . 805
     20.3.2 SVM-based methodology . . . . . . . . . . . . . . . . 806
20.4 Experiment Results . . . . . . . . . . . . . . . . . . . . . . 808
     20.4.1 Experiment results — Different SVM-based models . . . 809
     20.4.2 Experiment results — SVM-based models with other major models . . . 810
     20.4.3 Experimental results — Computation complexity of different SVM-based methods . . . 811
20.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 812
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . 813
Appendix 20A: Data Normalization Method . . . . . . . . . . . . . . 818
Appendix 20B: Detailed Procedure of SVM-Based Methodology . . . . . 818
Abstract

Credit risk analysis is a classical and crucial problem which has attracted great attention from both academic researchers and financial institutions. Through the accurate classification of borrowers, it enables financial institutions to develop lending strategies that obtain optimal profit and avoid potential risk. In recent decades, several different kinds of classification methods have been widely used to solve this problem. Owing to the specific attributes of credit data, such as small sample size and nonlinear characteristics, support vector machines (SVMs) show clear advantages and have been widely used for many years. SVM adopts the principle of structural risk minimization (SRM), which avoids the "dimension disaster" (the curse of dimensionality) and has great generalization ability. In this study, we systematically review and analyze SVM-based methodology in the field of credit risk analysis, which is composed of feature extraction methods, kernel function selection and hyper-parameter optimization methods. For verification purposes, two UCI credit datasets and a real-life credit dataset are used to compare the effectiveness of SVM-based methods with other frequently used classification methods. The experimental results show that the adaptive Lq SVM model with Gauss kernel and ES hyper-parameter optimization approach (ES-ALqG-SVM) outperforms all the other models listed in this study, and its average classification accuracy on the two UCI datasets reaches 90.77% and 75.21%, respectively. Moreover, the classification accuracy of SVM-based methods is generally better than or equal to that of other kinds of methods, such as See5, DT, MCCQP and other popular algorithms.
Besides, Gauss kernel based SVM models show better classification accuracy than models with linear and polynomial kernel functions when the same penalty form is chosen, and the classification accuracy of Lq-based methods is generally better than or equal to that of L1- and L2-based methods. In addition, for a given SVM model, hyper-parameter optimization using the evolution strategy (ES) can effectively reduce computing time while guaranteeing higher accuracy, compared with grid search (GS), particle swarm optimization (PSO) and simulated annealing (SA).

Keywords: Support vector machines • Feature extraction • Kernel function selection • Hyper-parameter optimization • Credit risk classification.
20.1 Introduction

Credit risk analysis is a classical and crucial problem which has attracted great attention from both academic researchers and financial institutions (e.g., see Altman and Saunders, 1997; Crouhy et al., 2000; Gilchrist and Mojon, 2018; Kumar et al., 2016). The main task of credit risk analysis is to build a model that is able to distinguish between a good creditor and a bad one. In general, the aim is to use past credit datasets to learn rules generally applicable to the future discrimination between the two creditor classes, with a minimal number of mistakes, both false positives and false negatives. In the past decades, numerous methodologies have been introduced to address the challenge and importance of credit risk analysis, risk evaluation, and risk management (e.g., see Abdou et al., 2008; Angelini et al., 2008; Grace and Williams, 2016; Guo et al., 2016; Oreski and Oreski, 2014; Maldonado et al., 2017). In general, statistical and neural network-based approaches are among the most popular paradigms (e.g., see Baesens et al., 2003; Doumpos et al., 2002; Laha, 2007; Lin, 2009; Yu et al., 2008; Yu et al., 2018). The support vector machine (SVM) method was first proposed by Vapnik (1995), and it has been widely used in credit evaluation (e.g., see Chen et al., 2009; Li et al., 2004; Zhang et al., 2015) and in other fields, such as pattern classification, bioinformatics, and text categorization (e.g., see Diederich et al., 2007; Jindal et al., 2016; Kumar and Gopal, 2009; Langone et al., 2015; Mitra et al., 2007; Vanitha et al., 2015), owing to its good generalization performance and strong theoretical foundations. In particular, owing to the specific attributes of credit data, such as small sample size and nonlinear characteristics, SVM shows advantages over many other classification methods.
SVM adopts the principle of structural risk minimization (SRM), which avoids the "dimension disaster" and has great generalization ability. Although their strong theoretical foundations have been illustrated (e.g., see Vapnik, 1995; Vapnik, 1998), standard SVMs still have several drawbacks. In recent years, several variants of SVM have been proposed to improve the standard SVM method from different standpoints, such as LSSVM, L1 SVM, Lq SVM, and so on (e.g., see Chen et al., 2007, 2011; Debnath et al., 2005; Li et al., 2007, 2012; Müller et al., 2001; Sánchez, 2003; Suykens and Vandewalle, 1999; Wang et al., 2003). Overall, feature extraction methods, kernel function selection and hyper-parameter optimization methods are the three most important components of the algorithmic procedure of SVM-based methods.
Feature extraction is a crucial problem in the development of machine learning algorithms and is the key to effective model construction. Feature extraction involves reducing the amount of resources required to describe a large set of data. When performing analysis of complex data, one of the major problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computation power; it may also cause a classification algorithm to overfit the training samples and generalize poorly to new samples. Feature extraction is a general term for methods of constructing combinations of the variables to get around these problems while still describing the data with sufficient accuracy. For SVM-based methods, several different kinds of feature extraction methods have been developed. For instance, regularization methods have been widely used for automatic feature selection in SVM. For example, L1 SVM uses L1 regularization for feature extraction, adding the sum of the absolute values of the model parameters to the objective function (e.g., see Gao et al., 2011; Zhu et al., 2004). Compared with the standard SVM method, it penalizes unnecessary model complexity and focuses on the most relevant features by driving most of the coefficients to 0, thus avoiding overfitting of the training data. As the L1 penalty is less sensitive to outliers, many researchers have chosen it to improve generalization performance, and they have achieved better results on datasets with redundant samples or outliers (e.g., see Bradley and Mangasarian, 2000; Müller et al., 2001; Sánchez, 2003). However, it is worth noting that L1 penalty SVM is not preferred if the dataset has a non-sparse structure.
Considering that the form of the penalty should match the original data structure, rather than remain fixed, an adaptive Lq SVM was proposed by Liu et al., which adaptively selects the optimal penalty, driven by the data (e.g., see Liu et al., 2007, 2011). The kernel function is another effective means of enhancing the classification ability of SVM. For standard linearly separable problems, SVM attempts to optimize generalization performance by separating the data with a maximal-margin classifier, which can be transformed into a constrained quadratic optimization problem. However, real-life data are so complicated that most original credit data are not linearly separable; therefore, the input space needs to be mapped into a higher-dimensional feature space to make the data linearly separable (e.g., see Chen et al., 2009; Müller et al., 2001; Vapnik, 1995). In addition, with the introduction of kernel functions, the dimension disaster that usually appears in the process of calculating inner products can be avoided. As increasing attention is being paid to learning
kernel functions from data structures, kernels are introduced into the SVM for nonlinear transformations, in order to improve classification accuracy (e.g., see Müller et al., 2001). Many researchers have observed that kernels greatly improve the performance of SVM (e.g., see Jiang et al., 2017; Li et al., 2012; Müller et al., 2001; Sánchez, 2003; Shankar et al., 2018; Wang et al., 2003). In addition, hyper-parameter optimization also needs to be considered in SVM-based methods. In these methods, many hyper-parameters must be optimized simultaneously, such as the norm of the penalty function when using the adaptive Lq norm, the regularization factor, the parameters of the kernel function, and so on. Apart from the classification accuracy of the model, the computational complexity should also be considered. In practice, credit data are massive, so computation time increases with the size of the training data and the performance of SVM deteriorates. Thus, hyper-parameter optimization methods need to be introduced to reduce the computational complexity of credit analysis. The grid search (GS) method is the most traditional and reliable optimization method for the parameter selection problem (e.g., see Syarif et al., 2016). To select optimal SVM parameters, GS varies the SVM parameters through a wide range of values, with a fixed step size, assesses the performance of every parameter combination based on a certain criterion, and finds the best setting in the end. However, the grid size increases dramatically as the number of parameters increases. It is an exhaustive search method and very time-consuming (e.g., see Huang and Wang, 2006; Syarif et al., 2016). Considering the computational costs, several other parameter optimization methods can be used, such as the evolution strategy (ES), particle swarm optimization (PSO), and simulated annealing (SA).
In brief, in order to improve the performance of credit classification and control computational costs, different combinations of feature extraction methods, kernel function selection methods and hyper-parameter optimization methods show different characteristics and fit different credit datasets. In this chapter, two UCI credit datasets and a real-life credit dataset are used to compare the effectiveness of SVM-based methods with other frequently used classification methods. The chapter is divided into five sections. Section 20.2 reviews the standard SVM method and the three important components used in the algorithm: feature extraction methods, kernel function selection and hyper-parameter optimization methods. Section 20.3 introduces the three datasets we use to compare classification accuracy between different methods, and the whole computing framework of SVM-based
methodology in credit risk analysis. The experimental results obtained using different SVM-based models, compared with other popular models, are shown in Section 20.4. Conclusions are finally drawn in Section 20.5, with proposals for future research.

20.2 Support Vector Machines

20.2.1 Standard support vector machine

The general form of the support vector machine is used to solve binary classification problems. Given a set of credit data sample points $G = \{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^m$ is the $i$th input record and $y_i \in \{1, -1\}$ is the corresponding observed result, the main goal of SVM is to find an optimal separating hyper-plane, which can be represented as $\langle w, x\rangle + b = 0$, that maximizes the separation margin (the distance between the hyper-plane and the nearest data point of each class) and minimizes the empirical classification error (e.g., see Vapnik, 1995). Considering that the original data may contain noise, which makes the points linearly inseparable, non-negative slack variables can be introduced. Thus, the problem of seeking the optimal separating hyper-plane can be transformed into the following optimization problem:

$$\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i,$$
$$\text{s.t.}\quad y_i(w^T\phi(x_i)+b) \ge 1-\xi_i,\quad \xi_i \ge 0,\quad i=1,\ldots,n, \tag{20.1}$$
where the variables $\xi_i$ are non-negative slack variables representing the classification error, and the regularization parameter $C$ is a constant denoting a trade-off between the maximum margin and the minimum empirical risk, which should be chosen empirically before using this classification algorithm. The above quadratic optimization problem can be solved by transforming it into the Lagrange function:

$$L(w,b,\xi,\alpha,\theta) = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i - \sum_{i=1}^{n}\alpha_i\big(y_i(w^T\phi(x_i)+b) - 1 + \xi_i\big) - \sum_{i=1}^{n}\theta_i\xi_i, \tag{20.2}$$

where $\alpha_i, \theta_i$ denote the Lagrange multipliers, with $\alpha_i \ge 0$ and $\theta_i \ge 0$.
At the optimal point, utilizing the Karush–Kuhn–Tucker optimality conditions, we have the following saddle-point equations:

$$\frac{\partial L}{\partial w} = 0,\quad \frac{\partial L}{\partial b} = 0\quad \text{and}\quad \frac{\partial L}{\partial \xi_i} = 0.$$

Combined with (20.2), we obtain the following equations:

$$w = \sum_{i=1}^{n}\alpha_i y_i \phi(x_i),\quad \sum_{i=1}^{n}\alpha_i y_i = 0\quad \text{and}\quad \alpha_i = C - \theta_i,\quad i=1,\ldots,n. \tag{20.3}$$

By substituting (20.3) into (20.2), and by replacing $(\phi(x_i)\cdot\phi(x_j))$ with kernel functions $K(x_i, x_j)$, we get the dual optimization problem of the original problem:

$$\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j K(x_i,x_j),$$
$$\text{s.t.}\quad \sum_{i=1}^{n}\alpha_i y_i = 0,\quad 0 \le \alpha_i \le C,\quad i=1,\ldots,n. \tag{20.4}$$

On solving the dual optimization problem, the solution $\alpha_i$ determines the parameters of the optimal hyper-plane. This leads to the decision function, expressed as

$$f(x) = \mathrm{sgn}\left(\sum_{i=1}^{n} y_i\alpha_i K(x_i,x) + b\right), \tag{20.5}$$

where $b = y_j - \sum_{i=1}^{n} y_i\alpha_i K(x_i,x_j)$ for some $j$ with $0 < \alpha_j < C$. Usually, only a small subset of the Lagrange multipliers $\alpha_i$ is greater than zero in a classification problem. Training vectors with nonzero $\alpha_i^{*}$ are called support vectors; that is, the optimal decision hyper-plane depends on the support vectors exclusively. Finally, the decision function can be written as follows:

$$f(x) = \mathrm{sgn}\left(\sum_{i\in SV} y_i\alpha_i^{*} K(x_i,x) + b^{*}\right). \tag{20.6}$$
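The soft-margin classifier of (20.1)-(20.6) can be sketched with scikit-learn's `SVC`, which solves the dual problem (20.4) internally; the two-cluster toy data below are illustrative, not the chapter's credit datasets.

```python
# Minimal sketch of the soft-margin SVM of (20.1)-(20.6) via scikit-learn.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.6, size=(40, 2)),   # class -1 cluster
               rng.normal(+1.0, 0.6, size=(40, 2))])  # class +1 cluster
y = np.array([-1] * 40 + [+1] * 40)

# C is the trade-off parameter of (20.1); the RBF ("Gauss") kernel
# corresponds to (20.15) with parameter gamma.
clf = SVC(C=1.0, kernel="rbf", gamma=0.5).fit(X, y)

# Only support vectors (alpha_i > 0) enter the sum in (20.5)/(20.6).
print(len(clf.support_))
print(clf.predict([[-1.0, -1.0], [1.0, 1.0]]))
```

Changing `C` tightens or relaxes the slack penalty, directly tracing the trade-off described after (20.1).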
20.2.2 Feature extraction methods

In machine learning, feature extraction is the first and critical step: it starts from an initial set of original data and extracts effective features intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps and in some cases leading to
better human interpretations (e.g., see Hira and Gillies, 2015; Wiatowski and Bölcskei, 2018). From another perspective, dealing with a large number of variables generally requires a large amount of computation power and may cause the classification algorithm to overfit, reducing the generalization ability of the model (e.g., see Chen et al., 2011; Frohlich et al., 2004; Li et al., 2012; Liu et al., 2007; Suykens and Vandewalle, 1999). There are several different kinds of feature extraction methods, such as principal component analysis (PCA), singular value decomposition (SVD), multi-dimensional scaling (MDS), isometric feature mapping (IsoMap), and so on. Among all these feature extraction methods, regularization is an effective and frequently used method in recent research, adding a penalty term on the model to the objective function (e.g., see Gao et al., 2011; Liu et al., 2007; Maldonado et al., 2017). Here, we mainly adopt the L1, L2 and adaptive Lq regularization methods.

20.2.2.1 L1 and L2 regularization
In general, the regularization terms frequently used now are with p = 1 and p = 2 called L1 (Lasso) and L2 (Ridge regression) regularization, respectively. The corresponding form is as follows: w1 =
m
|wi |,
(20.8)
i=1
w2 =
m i=1
1/2 2
|wi |
.
(20.9)
Considering the quadratic form of the L2 regularization term, none of the model weights will be set exactly to zero during learning, and hence no explicit feature selection is done. On the other hand, L1 regularization leads to sparse models, due to its linear form, in which the weights of many features are set rigorously to zero, resulting in efficient feature selection (e.g., see Chen et al., 2011; Li et al., 2012). When feature selection and model building are considered simultaneously, the standard SVM for the binary classification problem can be written as follows:

$$\min_{f}\ \frac{1}{n}\sum_{i=1}^{n} l(f(x_i),y_i) + \lambda\|w\|_2^2, \tag{20.10}$$

where $l(f(x_i), y_i) = [1 - y_i f(x_i)]_+$ is the convex hinge loss; $\|w\|_2^2$, the L2 penalty of $f$, is a regularization term serving as the roughness penalty of $f$; and $\lambda > 0$ is a parameter that controls the trade-off between the fitness of the data, measured by $l$, and the complexity of $f$, in terms of $\|w\|_2^2$.

20.2.2.2 Adaptive Lq regularization

In practice, different data tend to have different structures and may suit different types of penalty forms. In other words, a fixed penalty form can affect the performance of the model. In 2007, Liu et al. proposed an adaptive Lq penalty SVM to find the best penalty $q$ for different data structures (e.g., see Liu et al., 2007). Based on the SVM model (e.g., see Vapnik, 1995), the adaptive Lq penalty SVM can be briefly described as

$$\min_{f}\ \frac{1}{n}\sum_{i=1}^{n} c(-y_i)[1 - y_i f(x_i)]_+ + \lambda\|w\|_q^q, \tag{20.11}$$
where $f(x)$ is the decision function, and $c(+1)$ and $c(-1)$ are the costs for false positives and false negatives, respectively. Unlike the standard binary SVM, there are two parameters, $\lambda$ and $q$, to be tuned. The parameter $\lambda$, playing the same role as in the non-adaptive SVM, controls the trade-off between minimizing the hinge loss and the penalty on $f$, while the parameter $q$ determines the penalty function on $f$. Here, $q \in (0, 2)$ is considered a tuning parameter, and it can be chosen adaptively from the data, together with $\lambda$. The best $q$ depends on the nature of the data structure. If the inputs contain many noisy attributes, an Lq penalty with $q \le 1$ is chosen, since it automatically selects important attributes and removes noisy ones. Otherwise, if all the attributes are important, $q > 1$ is usually used to avoid deleting any attributes.
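The sparsity contrast between the L1 and L2 penalties described above can be sketched with scikit-learn's `LinearSVC` as a stand-in solver (not the chapter's own implementation); the synthetic dataset and penalty strength are illustrative assumptions.

```python
# Sparsity effect of L1 vs. L2 regularization in a linear SVM:
# with L1 many coefficients are driven exactly to zero; with L2 they are not.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification

# 20 features but only 4 informative, so an L1 penalty should prune heavily.
X, y = make_classification(n_samples=300, n_features=20, n_informative=4,
                           n_redundant=0, random_state=0)

l1 = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000).fit(X, y)
l2 = LinearSVC(penalty="l2", dual=False, C=0.1, max_iter=5000).fit(X, y)

print(int(np.sum(l1.coef_ == 0)))  # count of exactly-zero weights under L1
print(int(np.sum(l2.coef_ == 0)))  # count under L2, typically none
```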
20.2.3 Kernel functions of SVM

In recent years, owing to the increased popularity of SVM-based methods, kernel functions have received more and more attention (e.g., see Chen et al., 2011, 2018; Li et al., 2012; Sánchez, 2003; Wang et al., 2003). Through kernel functions, researchers can map the data into higher-dimensional spaces in which the data become more easily separated or better structured. The idea of the kernel function is to enable operations to be performed in the input space rather than the potentially high-dimensional feature space (e.g., see Ravale et al., 2015). Hence, the inner product does not need to be evaluated in the feature space. In a word, the principle of the kernel function is that an inner product in feature space has an equivalent kernel in input space, which can be illustrated by the following formula:

$$K(x,y) = \langle\phi(x),\phi(y)\rangle. \tag{20.12}$$
Usually, there are also no constraints on the form of this mapping, which could even lead to infinite-dimensional spaces. Commonly used kernel functions include the linear, polynomial, Gauss and sigmoid kernels, shown in (20.13), (20.14), (20.15) and (20.16), respectively:

$$K(x,y) = x\cdot y + c \quad\text{for } c \ge 0, \tag{20.13}$$

$$K(x,y) = (\gamma(x\cdot y) + c)^d \quad\text{for } \gamma > 0,\ d\in\mathbb{N},\ c \ge 0, \tag{20.14}$$

$$K(x,y) = \exp(-\gamma\|x-y\|^2) \quad\text{for } \gamma > 0, \tag{20.15}$$

$$K(x,y) = \tanh(\gamma(x\cdot y) + c) \quad\text{for } \gamma > 0,\ c \ge 0. \tag{20.16}$$
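The four kernels (20.13)-(20.16) can be written out directly in numpy; the parameter values below are illustrative defaults, not prescriptions.

```python
# Direct implementations of the kernels in (20.13)-(20.16).
import numpy as np

def linear_k(x, y, c=1.0):
    return np.dot(x, y) + c                        # (20.13)

def poly_k(x, y, gamma=1.0, c=1.0, d=2):
    return (gamma * np.dot(x, y) + c) ** d         # (20.14)

def gauss_k(x, y, gamma=0.5):
    return np.exp(-gamma * np.sum((x - y) ** 2))   # (20.15)

def sigmoid_k(x, y, gamma=0.1, c=0.0):
    return np.tanh(gamma * np.dot(x, y) + c)       # (20.16)

x = np.array([1.0, 2.0])
y = np.array([0.0, 1.0])
print(linear_k(x, y))   # 1*0 + 2*1 + 1 = 3.0
print(poly_k(x, y))     # (1*2 + 1)^2 = 9.0
print(gauss_k(x, x))    # zero distance -> exp(0) = 1.0
```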
Every kernel function has its own characteristics and suits different kinds of datasets. For instance, the polynomial kernel allows us to model feature conjunctions up to the order of the polynomial. The Gauss kernel allows us to pick out circles (or hyperspheres), in contrast with the linear kernel, which allows only picking out lines (or hyperplanes). Note that there are parameters in the kernel function that may influence the classification ability of SVM models, which is another important factor in the selection of kernel functions. In our research, considering the structure of the credit datasets, we mainly adopted the linear, polynomial and Gauss kernel functions.

20.2.4 Hyper-parameter optimization methods

Obviously, several different kinds of hyper-parameters need to be determined when using SVM-based methods, such as the norm of the penalty term,
the parameters in different kernel functions, and so on. In machine learning, it is possible and recommended to search the hyper-parameter space for the best cross-validation score. As a critical and necessary process, several different kinds of hyper-parameter optimization methods have been proposed (e.g., see Mantovani et al., 2015; Xia et al., 2017). The main objective of hyper-parameter optimization is to reduce the computational complexity, so as to speed up the execution time of the algorithm, while maintaining the accuracy of the models. In the framework of SVM-based methodology, we mainly choose among four different hyper-parameter optimization methods: grid search (GS), particle swarm optimization (PSO), simulated annealing (SA) and the evolution strategy (ES).

20.2.4.1 Grid search (GS)

The grid search method is a brute-force search method (e.g., see Syarif et al., 2016). The hyper-parameters are specified using a minimal value (lower bound), a maximal value (upper bound) and a number of steps. That is, the hyper-parameters are varied with a fixed step size through a wide range of values, and the performance of every combination is assessed using some performance measure. For example, if there are two hyper-parameters to be determined, one with M candidate values and the other with N, then the main idea of the grid search algorithm is to evaluate every point of a grid of size M × N. As for the SVM-based method, grid search optimizes the SVM parameters (C, γ, degree, etc.) using a cross-validation (CV) technique as a performance metric. The goal is to identify a good hyper-parameter combination so that the classifier can predict unknown data accurately. The advantage of this method lies in the completeness of the combinations of different hyper-parameters.
However, one of the biggest problems of SVM parameter optimization is that there are no exact ranges for the values of C and γ. The wider the parameter range, the better the chance that grid search finds the best combination of parameters. Besides, although the optimal parameters can ultimately be found with this method, the training process involves a large amount of computation. For example, if we choose to optimize 5 parameters with 10 steps each, the total number of combinations is 10^5, which requires a huge amount of time. Because of this computational complexity, grid search is only suitable for tuning very few parameters.
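The M x N grid search with cross-validation described above can be sketched in a few lines with scikit-learn; the grid values and dataset are illustrative choices.

```python
# Bare-bones grid search over (C, gamma): every combination on a 3 x 3 grid
# is scored by 5-fold cross-validation and the best one is kept.
import itertools
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

C_grid = [0.1, 1.0, 10.0]        # M = 3 candidate values for C
gamma_grid = [0.01, 0.1, 1.0]    # N = 3 candidate values for gamma

best_score, best_params = -np.inf, None
for C, gamma in itertools.product(C_grid, gamma_grid):  # all M x N points
    score = cross_val_score(SVC(C=C, kernel="rbf", gamma=gamma),
                            X, y, cv=5).mean()
    if score > best_score:
        best_score, best_params = score, (C, gamma)

print(best_params, round(best_score, 3))
```

In practice scikit-learn's `GridSearchCV` wraps this same loop; the point here is only to show why the cost grows as the product of the grid sizes.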
20.2.4.2 Particle swarm optimization (PSO)

Particle swarm optimization is a swarm intelligence optimization method that simulates the group behaviors of social insects, exploiting information interaction and cooperation among the individuals of a group (e.g., see Demidova et al., 2016; Ghamisi and Benediktsson, 2015; Kennedy and Eberhart, 2011; Zhang et al., 2015). Like evolutionary algorithms, PSO searches with a population (called a swarm) of individuals (called particles) that are updated from iteration to iteration. The particles are moved around in the search space according to a few simple formulae. Each particle's movement is guided by its own best known position in the search space as well as by the entire swarm's best known position. The process is repeated, and by doing so it is hoped, but not guaranteed, that a satisfactory solution will eventually be discovered. To find the optimal solution, each particle adjusts its search direction according to two factors: its own best previous experience (P_best) and the best experience of all other members (G_best). In general, the particles (i.e., the candidate solutions) start out randomly distributed over the search space; as the iterations proceed, they keep moving and eventually gather near the optimal solution. Specifically, each particle's movement combines its own current velocity with two randomly weighted influences: individualism, the tendency to return to its best previous position, and the tendency to move towards the neighborhood's best previous position (e.g., see Sands et al., 2015). The movement of a particle can be expressed as follows (e.g., see Shi and Eberhart, 1999):

v_{id}^{t+1} = w·v_{id}^{t} + c_1·ψ_1·(p_{id}^{t} − x_{id}^{t}) + c_2·ψ_2·(p_{gd}^{t} − x_{id}^{t}),    (20.17)
where v_{id}^{t} is the velocity component in dimension d of the ith particle, x_{id}^{t} is the position component in dimension d of the ith particle, c_1 and c_2 are the individual and social constant weighting factors, p_i is the best position achieved so far by the particle, p_g is the best position achieved so far by the particle's neighbors, ψ_1 and ψ_2 are random weighting factors between 0 and 1, and w is the inertia weight. PSO has some disadvantages: (1) it easily gets trapped at a local extreme point, so the correct result may not be obtained; (2) although PSO makes a global search possible, it does not guarantee convergence to the global optimum. The PSO algorithm is therefore generally suited to high-dimensional optimization problems with multiple local extrema that do not require high-precision solutions.
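The velocity update (20.17) can be sketched on a toy one-dimensional problem as follows; the function name, search bounds and coefficient values (w = 0.7, c_1 = c_2 = 1.5) are illustrative assumptions, not values from the chapter:

```python
import random

def pso_minimize(f, n_particles=20, n_iters=100, w=0.7, c1=1.5, c2=1.5):
    """Minimize f over [-10, 10] using the velocity update of equation (20.17)."""
    rng = random.Random(0)
    x = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]  # positions x_i
    v = [0.0] * n_particles                                     # velocities v_i
    pbest = x[:]                                                # personal bests (P_best)
    gbest = min(x, key=f)                                       # swarm best (G_best)
    for _ in range(n_iters):
        for i in range(n_particles):
            psi1, psi2 = rng.random(), rng.random()
            # v_i(t+1) = w*v_i(t) + c1*psi1*(pbest_i - x_i) + c2*psi2*(gbest - x_i)
            v[i] = w * v[i] + c1 * psi1 * (pbest[i] - x[i]) + c2 * psi2 * (gbest - x[i])
            x[i] += v[i]
            if f(x[i]) < f(pbest[i]):
                pbest[i] = x[i]
            if f(x[i]) < f(gbest):
                gbest = x[i]
    return gbest

best = pso_minimize(lambda z: (z - 3.0) ** 2)  # minimum at z = 3
```

With these commonly used coefficient choices the swarm contracts around the best known position, but as noted above, convergence to the global optimum is not guaranteed in general.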
20.2.4.3 Simulated annealing (SA)

The simulated annealing algorithm, as its name suggests, is similar to the annealing process in thermodynamics. Starting from a given initial temperature, the algorithm finds an approximate optimal solution of the problem in polynomial time as the temperature decreases (e.g., see González-Mancha et al., 2017; Kirkpatrick et al., 1983; Lin et al., 2008; Welsh, 1988). Simulated annealing accepts, with a certain probability, a solution worse than the current one, so it can escape a local optimal solution and reach the global optimal solution. In general, the algorithm works as follows. At each time step, it randomly selects a solution close to the current one, measures its quality, and then decides whether to move to it or to stay with the current solution, according to one of two probabilities chosen depending on whether the new solution is better or worse than the current one. During the search, the temperature is progressively decreased from an initial positive value to zero, and it affects the two probabilities: at each step, the probability of moving to a better new solution is either kept at 1 or changed towards a positive value, while the probability of moving to a worse new solution is progressively driven towards zero. The implementation steps of the simulated annealing algorithm are as follows (e.g., see Liu et al., 2013):

(1) Let x stand for a certain parameter of the model. Set a variation range for x, choose an initial solution x_0 randomly within the range and calculate the corresponding target value E(x_0); set the initial temperature T_0 and the final temperature T_f; generate a random number ε ∈ (0, 1) as the probability threshold; and set the cooling law T(t + 1) = γ × T(t), where 0 < γ < 1 is the annealing factor and t is the iteration number.
(2) At temperature T, apply a perturbation Δx to the current solution, which produces a new solution x′ = x + Δx. Calculate the difference between the target values E(x′) and E(x): ΔE(x) = E(x′) − E(x).
(3) If ΔE(x) < 0, that is, the new target value is lower, accept the new solution x′; if ΔE(x) > 0, calculate the probability p = exp(−ΔE/(K × T)), where K is a constant usually set to 1 and T is the temperature. If p > ε, accept x′ as the new solution. Set x = x′ whenever x′ is accepted.
(4) At the given temperature, repeat step (3). Then cool down slowly following the temperature reduction law.
(5) Repeat steps (2)–(4) until the convergence conditions are satisfied.

When tuning the hyper-parameters of an SVM-based method, the target value E(x) here is the MAPE value. The advantage of SA is that its local search capability is strong and its running time is short. Its disadvantage is that its global search ability is poor and it is easily affected by its parameters.

20.2.4.4 Evolution strategy (ES)

As one of the important classes of evolution algorithms (EA), evolution strategy (ES) is often used as an optimization tool because of its self-adaptive search capacity, which quickly and reliably guides the search toward optimal points (e.g., see Li et al., 2012; Papageorgiou and Groumpos, 2005; Rechenberg, 1973; Schwefel, 1975; Suykens and Vandewalle, 1999). The ES algorithm has three basic operations: mutation, recombination and selection. An ES run consists of several iterations of the basic evolution cycle composed of these three steps (e.g., see Beyer and Schwefel, 2002; Liu et al., 2006; Papadrakakis and Lagaros, 2003). Moreover, the fitness function, the criterion by which search results are evaluated, is very important to the ES: a chromosome with a high fitness value has a high probability of being preserved into the next generation. The ES algorithm has two main characteristics: (1) natural selection in evolution strategy is conducted in a deterministic way, unlike the random selection methods of genetic algorithms and evolutionary programming; (2) evolution strategy provides recombination operators, but recombination in evolution strategy differs from crossover in genetic algorithms: it does not exchange parts of individuals but blends whole individuals, so that each new individual contains the corresponding information of the two old individuals (e.g., see Burges, 1998; Li et al., 2011; Mersch et al., 2007; Wei et al., 2011).
The process of a basic evolution cycle of the ES algorithm is as follows (e.g., see Li et al., 2012):
(1) Mutation: ES mutation is implemented by making perturbations, that is, by adding random numbers that follow a Gaussian distribution to the parameter vector (chromosome).
Let (p_1, . . . , p_n) be the chromosome involving n parameters. Then the mutation operator is defined as

p_t^{g+1} = p_t^{g} + N(0, ε_t^{g+1}),    (20.18)
where the strategy parameters ε_t are the step sizes of the mutation and g is the number of generations. The parameters ε_t also evolve, as follows:

ε_t^{g+1} = ε_t^{g} · exp(τ · N(0, 1)),    (20.19)
where τ is the learning rate controlling the self-adaptation speed of ε_t.
(2) Recombination: The recombination process reproduces a single offspring from ρ parents. Specifically, ES has two recombination techniques: discrete recombination and intermediate recombination. Discrete recombination selects each component of the offspring randomly from the relevant components of the ρ parents. In contrast, intermediate recombination gives all ρ parents an equal weight in reproduction: the offspring takes the average of the ρ parent vectors as its value. When the variables are discrete, a rounding operation is needed.
(3) Selection: There are two selection techniques, the comma selection method, denoted by (μ, θ), and the plus selection method, denoted by (μ + θ). Under (μ, θ) selection, only the θ offspring individuals can be selected, while the parental individuals are excluded from the selection set. This technique forgets the information of the parent generation, so premature convergence to local optimal points can be avoided. The plus selection (μ + θ), by contrast, chooses from a selection set comprising both the parents and the offspring, which ensures that the best individuals survive and are preserved.

In our research, we use the total classification accuracy of the SVM methods as the fitness function.

20.3 Data and SVM-Based Methodology

20.3.1 Data

In recent decades, credit risk analysis has always been a research hotspot, and several standardized datasets have been used for the purpose of comparing
Table 20.1: Basic information of three credit datasets.

                            ACD     GCD     AMCD
No. of Total Instances      690     1000    5000
No. of Good Instances       383     700     4185
No. of Bad Instances        307     300     815
No. of Attributes           14      24      65
No. of Classes              2       2       2
different models' performance in the given space and time. Notable examples include the Australian credit dataset (ACD) and the German credit dataset (GCD), which can be downloaded from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml). ACD has 690 instances in total, composed of 383 good instances and 307 bad instances. GCD has 1,000 instances in total, composed of 700 good instances and 300 bad instances. The numbers of attributes of ACD and GCD are 14 and 24, respectively, much smaller than the numbers of instances. These two datasets are therefore balanced datasets, in which the positive and negative samples are of roughly the same size. In addition, we also utilized a real-life credit dataset of a US commercial bank (AMCD) to compare the classification ability of different models. AMCD has 5,000 instances in total, composed of 4,185 good instances and 815 bad instances. It is worth noting that the proportion of bad instances in AMCD is relatively low, only 16.3%; AMCD is thus a typical unbalanced dataset and closer to the actual situation. The three datasets used in the experiment are described in Table 20.1.

20.3.2 SVM-based methodology

In order to classify good and bad borrowers using SVM-based methods, the whole framework is constructed as shown in Figure 20.1. As shown in Figure 20.1, the specific algorithm steps for the classification problem are as follows:

(1) Data collection and collation: In this step, all the values of the different attributes should be normalized, that is, mapped to [0, 1]. When numerical levels vary considerably between attributes, a model fitted directly on the original data would largely be influenced by the attributes with higher levels. Therefore, in order to ensure the reliability of the results, it is necessary to normalize the
Figure 20.1: The flow chart of utilizing SVM-based methods in credit risk analysis.
original data. This effectively avoids errors arising from values with different scales. The min–max normalization method is commonly used for this purpose; the specific computational process can be found in Appendix 20A.
(2) Choose an appropriate feature extraction method for the SVM: In our research, we mainly focus on regularization methods, which penalize model complexity. Hence, in this step, we choose an appropriate regularization method to extract important and effective features for the SVM model. In detail, we choose a regularization method from L1, L2 and adaptive Lq.
(3) Choose appropriate kernel functions: Based on the intrinsic structure of the data, we can choose among the different kernel functions introduced in Section 20.2.3. In general, if the number of features is much larger than the sample size, or if both the number of features and the sample size are large, the linear kernel function works better. The Gauss kernel function is generally used when the number of features is much smaller than the sample size.
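The min–max normalization of step (1), a per-attribute linear mapping onto [0, 1], can be sketched as follows; the function name and sample values are our own illustrative choices, and the formal definition is given in the chapter's Appendix 20A:

```python
def min_max_normalize(column):
    """Map one attribute's values linearly onto [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:                        # constant attribute: avoid dividing by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

ages = [19, 35, 52, 23, 61]             # one raw attribute column (made up)
scaled = min_max_normalize(ages)        # smallest value maps to 0.0, largest to 1.0
```

Applied column by column, this puts every attribute on the same [0, 1] scale before the SVM is trained.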
(4) Choose an appropriate hyper-parameter optimization method: For the combination of penalty and kernel function determined above, the GS, PSO, SA or ES method can be used to carry out the hyper-parameter optimization. Generally speaking, the hyper-parameter optimization methods themselves have different numbers of parameters that must be set in advance; therefore, we should choose an appropriate method in view of the structure of the dataset at hand.
(5) Training and predicting: To obtain an independent test set, we randomly divide the whole dataset into two parts, with proportions of 80% and 20%, respectively. After partitioning into training and test sets randomly several times, the SVM model is trained and used for prediction repeatedly, until satisfactory accuracy is obtained. In other words, using the SVM model constructed through the combination of steps (2), (3) and (4) above, we carry out the model training and prediction.

In total, we can use the SVM-based methodology to classify a credit dataset through the above five steps. Every combination of feature extraction method, kernel function and hyper-parameter optimization method defines a particular SVM-based algorithm. For example, GS-L1G-SVM stands for the algorithm composed of the L1 regularization method, the Gauss kernel function and the GS hyper-parameter optimization method. The detailed procedure of the whole algorithm can be found in Appendix 20B of this chapter.
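The random 80/20 partition of step (5), together with a k-fold splitter of the training indices (the tuning stage validates on folds of the training set), can be sketched in pure Python; all names, the seed and the round-robin fold assignment are illustrative assumptions:

```python
import random

def train_test_split(indices, test_ratio=0.2, seed=42):
    """Randomly partition sample indices into training (80%) and test (20%) sets."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_ratio))
    return idx[:cut], idx[cut:]

def k_folds(indices, k=5):
    """Assign the training indices round-robin to k cross-validation folds."""
    return [indices[i::k] for i in range(k)]

train, test = train_test_split(range(690))   # e.g., the 690 ACD instances
folds = k_folds(train, k=5)                  # 5 folds for hyper-parameter tuning
```

Repeating the split with different seeds gives the "several random partitions" described in step (5).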
20.4 Experiment Results

To guarantee valid results for making predictions, each dataset was randomly partitioned into a training set and an independent test set, with proportions of 80% and 20%, respectively. For a given SVM-based method, that is, a particular combination of feature extraction method, kernel function and hyper-parameter optimization method, 5-fold cross-validation is used when training the SVM models to choose the hyper-parameters for the binary-class datasets. With 5-fold cross-validation, each sample is used to validate the model trained on the other samples in the training set. Then we use the model with the optimal parameters on the test sets
page 808
July 6, 2020
12:0
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
b3568-v1-ch20
Support Vector Machines Based Methodology for Credit Risk Analysis
809
to make classification. In this way we obtain the corresponding average validation accuracy and avoid overfitting. We used Matlab R2015b to perform all the computations. The performance is measured by Type 1 accuracy (T1), Type 2 accuracy (T2) and Total accuracy (T). They stand for the percentage of correctly classified good samples, the percentage of correctly classified bad samples, and the percentage of correctly classified samples in total, respectively. The formulas are as follows (e.g., see Chen et al., 2011; Li et al., 2012):

Total Accuracy (T) = (TP + TN) / (TP + FN + TN + FP),
Type 1 Accuracy (T1) = TP / (TP + FN),
Type 2 Accuracy (T2) = TN / (TN + FP),
where TP is the number of good credit samples that are correctly classified, TN is the number of bad instances correctly classified, FN is the number of good credit samples that are misclassified, and FP is the number of misclassified bad samples.

20.4.1 Experiment results — Different SVM-based models

Firstly, we compare the results of several different SVM-based models derived from the framework constructed above, which are combinations of different penalty forms (L1, L2 and adaptive Lq), different kernel functions (linear, polynomial, Gauss and Sigmoid) and different hyper-parameter optimization methods (GS, PSO, SA, ES). In total, there are theoretically 48 different methods under our computational framework. In practice, however, many of these methods are obviously not suitable for the credit datasets; for example, the linear kernel function may not be suitable for nonlinear classification tasks. After comparison and selection, the results of the experiments with relatively high precision on the three credit datasets (ACD, GCD and AMCD) are shown in Table 20.2. The experiment results show that Gauss kernel-based SVM models achieve better accuracy than models with linear and polynomial kernel functions, and that the classification accuracy of Lq-based methods is generally better than or equal to that of L1- and L2-based methods. As for the hyper-parameter optimization method, ES-based and GS-based methods generally outperform the other methods, such as PSO and SA.
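The three accuracy measures can be computed directly from the four confusion counts; a minimal sketch (the function name and sample counts are made up for illustration):

```python
def accuracies(tp, tn, fp, fn):
    """Return (Total, Type 1, Type 2) accuracy from the confusion counts."""
    total = (tp + tn) / (tp + fn + tn + fp)   # correctly classified overall
    type1 = tp / (tp + fn)                    # correctly classified good samples
    type2 = tn / (tn + fp)                    # correctly classified bad samples
    return total, type1, type2

# Made-up confusion counts: 90 of 100 good and 30 of 40 bad samples correct.
t, t1, t2 = accuracies(tp=90, tn=30, fp=10, fn=10)
```

Reporting T1 and T2 separately matters for the unbalanced AMCD dataset, where a high total accuracy can hide poor detection of bad borrowers.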
Table 20.2: Experiment results of SVM-based models on ACD, GCD and AMCD.

                         ACD                       GCD                       AMCD
Methods          T1(%)  T2(%)  T(%)        T1(%)  T2(%)  T(%)        T1(%)  T2(%)  T(%)
GS-L1G-SVM       73.44  77.87  75.03       66.22  62.67  64.99       60.33  71.15  65.00
PSO-L1G-SVM      74.09  77.56  75.08       65.77  62.22  63.35       68.55  60.44  64.46
SA-L1G-SVM       73.68  71.18  72.12       60.57  67.33  62.20       67.00  60.23  64.34
ES-L1G-SVM       70.99  76.69  73.56       83.00  44.67  65.20       63.21  67.48  65.67
GS-L2G-SVM       82.13  86.21  84.06       73.43  50.67  68.80       74.33  70.15  72.30
PSO-L2G-SVM      79.40  81.72  80.57       90.57  27.00  71.50       69.67  73.45  71.17
SA-L2G-SVM       81.00  85.52  83.38       74.38  72.00  73.50       68.11  75.77  72.25
ES-L2G-SVM       81.78  84.33  82.25       74.00  68.00  71.25       69.09  73.45  72.26
GS-ALqG-SVM      87.72  89.18  88.10       75.60  72.00  75.00       66.89  80.35  74.01
PSO-ALqG-SVM     85.97  84.35  86.60       74.34  70.17  73.50       68.66  73.45  70.65
SA-ALqG-SVM      85.72  87.18  86.10       68.60  72.62  70.60       70.98  75.77  72.34
ES-ALqG-SVM      93.21  86.58  90.77       70.38  86.20  75.21       72.37  80.15  73.77
20.4.2 Experiment results — SVM-based models with other major models

Apart from the SVM-based methods, we also compare some SVM-based models with eight other major credit analysis models proposed by other researchers on the two UCI datasets: linear discriminant analysis (LDA), decision-based See5, logistic regression (LogR), decision tree (DT), the k-nearest neighbor classifier (k-NN) with k = 10, MCCQP, SVM light and MK-MCP. The results for LDA, See5, LogR, DT, k-NN, MCCQP and SVM light are from previous studies (e.g., see Li et al., 2011; Peng et al., 2008); the MK-MCP results are from the research of Wei (e.g., see Wei, 2008). The results of the experiments on the two UCI datasets (ACD and GCD) are shown in Table 20.3. The experiment results show that the classification accuracy of the SVM-based methods is generally better than or equal to that of the other kinds of methods, such as See5, DT, MCCQP and other popular algorithms. The adaptive Lq SVM model with Gauss kernel and ES hyper-parameter optimization (ES-ALqG-SVM) outperforms all the other models listed in this study, and its average total classification accuracy on the two UCI datasets reaches 90.77% and 75.21%, respectively. On ACD, ES-ALqG-SVM performs best in Total accuracy and Type 1 accuracy, at 90.77% and 93.21%; its Type 2 accuracy of 86.58% is also competitive. On GCD, the results of ES-ALqG-SVM are likewise satisfactory. Though its Type 1 accuracy is a little lower, its Type 2 accuracy of 86.2% is the best, and its Total accuracy of 75.21%, while not the best, beats the popular LDA, See5, DT, k-NN, MCCQP, SVM light and MK-MCP. In general, the experiment results of ACD and
Table 20.3: Experiment results between SVM-based models and other models on ACD and GCD.

                         ACD                        GCD
Methods          T1(%)  T2(%)  T(%)         T1(%)  T2(%)  T(%)
LDA              80.68  92.18  85.80        72.57  71.33  72.20
See5             87.99  84.69  86.52        84.00  44.67  72.20
LogR             86.32  85.90  86.09        88.14  50.33  76.80
DT               80.13  87.21  84.06        77.43  48.67  68.80
k-NN             54.40  81.72  69.57        90.57  27.00  71.50
MCCQP            87.00  85.52  86.38        74.38  72.00  73.50
SVM light        18.03  90.65  44.83        77.00  42.00  66.50
MK-MCP           87.58  90.91  88.70        75.60  72.00  75.00
GS-L2G-SVM       82.13  86.21  84.06        73.43  50.67  68.80
GS-ALqG-SVM      87.72  89.18  88.10        75.60  72.00  75.00
ES-ALqG-SVM      93.21  86.58  90.77        70.38  86.20  75.21
GCD illustrate that the ES-based ALqG-SVM model is more competitive than other popular models. In total, the average classification accuracy of the SVM-based methods is competitive, and this kind of method is an effective and robust means of dealing with credit risk analysis problems.

20.4.3 Experimental results — Computation complexity of different SVM-based methods

To compare the computing speed of these different SVM-based methods, we also measured the time consumed in completing the classification tasks, indicated by the CPU time. The CPU time denotes the time required for each algorithm to run one 5-fold cross-validation within the training set. In our experiment, the average CPU time is the average over 10 training runs (5-fold cross-validation repeated 10 times), as shown in Table 20.4. In fact, compared with the SVM models' internal training and prediction, most of the CPU time is spent on the hyper-parameter optimization process. Among all the SVM-based methods, the ALqG-SVM approach is clearly the most time-consuming, because it has more hyper-parameters to be optimized simultaneously than the L1- or L2-norm-based methods. For consistency, we choose the Gauss kernel function and the Lq regularization term to compare the computation complexity of all the SVM-based methods. Table 20.4 shows the results when the different hyper-parameter optimization methods are applied to the binary-class datasets, with the 5-fold training process.
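An average CPU-time figure of this kind can be gathered with a standard process-time clock; below is a sketch of such a measurement harness, where the function name is ours and the task is only a cheap stand-in for one 5-fold cross-validation run:

```python
import time

def average_cpu_time(task, runs=10):
    """Average CPU seconds consumed by task over several runs."""
    total = 0.0
    for _ in range(runs):
        start = time.process_time()
        task()                        # stand-in for one 5-fold cross-validation
        total += time.process_time() - start
    return total / runs

avg = average_cpu_time(lambda: sum(i * i for i in range(100000)), runs=3)
```

Using process time rather than wall-clock time keeps the comparison insensitive to other load on the machine.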
Table 20.4: Experiment results of different hyper-parameter optimization methods.

Methods                              ACD       GCD       AMCD
ES based     T1(%)                   93.21     70.38     72.37
             T2(%)                   86.58     86.20     80.15
             T(%)                    90.77     75.21     73.77
             average CPU time (s)    128       220       23022
GS based     T1(%)                   87.72     75.60     66.89
             T2(%)                   89.18     72.00     80.35
             T(%)                    88.10     75.00     74.01
             average CPU time (s)    18545     43788     253300
PSO based    T1(%)                   85.97     74.34     68.66
             T2(%)                   84.35     70.17     73.45
             T(%)                    86.60     73.50     70.65
             average CPU time (s)    158       244       25211
SA based     T1(%)                   85.72     68.60     70.98
             T2(%)                   87.18     72.62     75.77
             T(%)                    86.10     70.60     72.34
             average CPU time (s)    207       458       32311
The experiment results show that ES-based ALqG-SVM speeds up the training process prominently. GS-based methods take the most time, because they must try all the possible combinations of the different parameters. For a given SVM model, hyper-parameter optimization with ES effectively reduces the computing time while still guaranteeing high accuracy, compared with GS, PSO and SA. In general, the experiment results confirm that ES-based ALqG-SVM achieves satisfactory performance and saves much time at the same time. Thus, for real-world datasets, which usually have large numbers of samples, it is an excellent choice.

20.5 Conclusions

Credit risk analysis, in other words, the classification of good and bad borrowers from a mass of data, is a critical and classical problem. With the development of machine learning, several different kinds of classification methods have been used to solve this problem, among which SVM-based methods have shown their strong ability. In this study, we systematically review and analyze SVM-based methodology in the field of credit risk analysis, which is composed of feature extraction methods, kernel function
page 812
July 6, 2020
12:0
Handbook of Financial Econometrics,. . . (Vol. 1)
9.61in x 6.69in
Support Vector Machines Based Methodology for Credit Risk Analysis
b3568-v1-ch20
813
selection of SVM and hyper-parameter optimization methods. To validate the accuracy and effectiveness of SVM-based methods, we implemented these different methods on three credit datasets. From the experiment results, we draw the following three major conclusions. Firstly, SVM-based methods generally show better or equally good classification ability when a suitable combination of feature extraction method, kernel function and hyper-parameter optimization method is chosen, making them a very effective way to carry out credit risk analysis. According to our experiment results, the adaptive Lq SVM model with Gauss kernel (ALqG-SVM) outperforms all the other models listed in this study; it can dynamically adjust the norm of the penalty function according to the characteristics and structure of the sample data. Secondly, compared with the linear and polynomial kernel functions, the Gauss kernel function has advantages in solving nonlinear classification problems: for the same penalty form, it maps the data into a high dimension to separate the classes better. Thirdly, the evolution strategy (ES) based hyper-parameter optimization method can drastically reduce the computation time while obtaining good accuracy when the parameter space is very large. Some questions remain unsolved. On the one hand, different datasets have different structures and internal characteristics, so many more datasets need to be tested in the future to verify and extend the SVM-based methods. On the other hand, how to make the model more interpretable for credit decision making needs further research. With the development of deep learning methods, several unsupervised feature learning methods have been proposed, which could further reduce subjectivity.
How to combine SVM-based methods with deep learning methods to exploit their respective advantages is another interesting point for future research.

Bibliography

Abdou, H., Pointon, J. and El-Masry, A. (2008). Neural Nets Versus Conventional Techniques in Credit Scoring in Egyptian Banking. Expert Systems with Applications, 35, 1275–1292.
Altman, E. I. and Saunders, A. (1997). Credit Risk Measurement: Developments Over the Last 20 Years. Journal of Banking & Finance, 21(11–12), 1721–1742.
Angelini, E., di Tollo, G. and Roli, A. (2008). A Neural Network Approach for Credit Risk Evaluation. The Quarterly Review of Economics and Finance, 48, 733–755.
Baesens, B., Gestel, T. V., Viaene, S., et al. (2003). Benchmarking State-of-the-Art Classification Algorithms for Credit Scoring. Journal of the Operational Research Society, 54(6), 627–635.
Beyer, H. G. and Schwefel, H. P. (2002). Evolution Strategies — A Comprehensive Introduction. Natural Computing, 1(1), 3–52. Bradley, P. S. and Mangasarian, O. L. (2000). Massive Data Discrimination via Linear Support Vector Machines. Optimization Methods and Software, 13(1), 1–10. Burges, C. (1998). A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2), 121–167. Chang, C. C. and Lin, C. J. (2011). LIBSVM: A Library for Support Vector Machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27. Chen, W. M., Ma, C. Q. and Ma, L. (2009). Mining the Customer Credit Using Hybrid Support Vector Machine Technique. Expert Systems with Applications, 36, 7611–7616. Chen, W., Pourghasemi, H. R. and Naghibi, S. A. (2018). A Comparative Study of Landslide Susceptibility Maps Produced Using Support Vector Machine with Different Kernel Functions and Entropy Data Mining Models in China. Bulletin of Engineering Geology and the Environment, 77(2), 647–664. Chen, Z. Y., Li, J. P. and Wei, L. W. (2007). A Multiple Kernel Support Vector Machine Scheme for Feature Selection and Rule Extraction from Gene Expression Data of Cancer Tissue. Artificial Intelligence in Medicine, 41(2), 161–175. Chen, Z. Y., Li, J. P., Wei, L. W., Xu, W. X. and Shi, Y. (2011). Multiple-kernel SVM Based Multiple-Task Oriented Data Mining System for Gene Expression Data Analysis. Expert Systems with Applications, 38(10), 12151–12159. Crouhy, M., Galai, D. and Mark, R. (2000). A Comparative Analysis of Current Credit Risk Models. Journal of Banking & Finance, 24(1–2), 59–117. Debnath, R., Muramatsu, M. and Takahashi, H. (2005). An Efficient Support Vector Machine Learning Method with Second-Order Cone Programming for Large-Scale Problems. Applied Intelligence, 23(3), 219–239. Demidova, L., Nikulchev, E. and Sokolova, Y. (2016). The SVM Classifier Based on the Modified Particle Swarm Optimization. arXiv preprint arXiv:1603.08296, 2016. 
Diederic, J. H., Al-Ajmi, A. and Yellowlees, P. (2007). Ex-ray: Data Mining and Mental Health. Applied Soft Computing, 7, 923–928.
Doumpos, M., Kosmidou, K., Baourakis, G., et al. (2002). Credit Risk Assessment Using a Multicriteria Hierarchical Discrimination Approach: A Comparative Analysis. European Journal of Operational Research, 138(2), 392–412.
Frohlich, H., Chapelle, O. and Scholkopf, B. (2004). Feature Selection for Support Vector Machines Using Genetic Algorithm. International Journal on Artificial Intelligence Tools, 13, 791–800.
Gao, S., Ye, Q. and Ye, N. (2011). 1-Norm Least Squares Twin Support Vector Machines. Neurocomputing, 74(17), 3590–3597.
Ghamisi, P. and Benediktsson, J. A. (2015). Feature Selection Based on Hybridization of Genetic Algorithm and Particle Swarm Optimization. IEEE Geoscience and Remote Sensing Letters, 12(2), 309–313.
Gilchrist, S. and Mojon, B. (2018). Credit Risk in the Euro Area. The Economic Journal, 128(608), 118–158.
González-Mancha, J. J., Frausto-Solís, J., Valdez, G. C., et al. (2017). Financial Time Series Forecasting Using Simulated Annealing and Support Vector Regression. International Journal of Combinatorial Optimization Problems and Informatics, 8(2), 10–18.
Grace, A. M. and Williams, S. O. (2016). Comparative Analysis of Neural Network and Fuzzy Logic Techniques in Credit Risk Evaluation. International Journal of Intelligent Information Technologies, 12(1), 47–62.
Gunn, S. R. (1998). Support Vector Machines for Classification and Regression, Faculty of Engineering, Science and Mathematics, School of Electronics and Computer Science, Technical Report.
Guo, Y., Zhou, W., Luo, C., et al. (2016). Instance-based Credit Risk Assessment for Investment Decisions in P2P Lending. European Journal of Operational Research, 249(2), 417–426.
Hira, Z. M. and Gillies, D. F. (2015). A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data. Advances in Bioinformatics, 2015.
Huang, C. L. and Wang, C. J. (2006). A GA-based Feature Selection and Parameters Optimization for Support Vector Machines. Expert Systems with Applications, 31(2), 231–240.
Jiang, H., Ching, W. K., Cheung, W. S., et al. (2017). Hadamard Kernel SVM with Applications for Breast Cancer Outcome Predictions. BMC Systems Biology, 11(7), 138.
Jindal, A., Dua, A., Kaur, K., et al. (2016). Decision Tree and SVM-based Data Analytics for Theft Detection in Smart Grid. IEEE Transactions on Industrial Informatics, 12(3), 1005–1016.
Kennedy, J. and Eberhart, R. (1995). Particle Swarm Optimization. Proceedings of the 1995 IEEE International Conference on Neural Networks, Perth, Australia, Vol. 4, 1942–1948.
Kirkpatrick, S., Gelatt, C. D. and Vecchi, M. P. (1983). Optimization by Simulated Annealing. Science, 220(4598), 671–680.
Kumar, M. A. and Gopal, M. (2009). Least Squares Twin Support Vector Machines for Pattern Classification. Expert Systems with Applications, 36, 7535–7543.
Kumar, V., Natarajan, S., Keerthana, S., et al. (2016). Credit Risk Analysis in Peer-to-Peer Lending System. IEEE International Conference on Knowledge Engineering and Applications, 2016, 193–196.
Laha, A. (2007). Building Contextual Classifiers by Integrating Fuzzy Rule Based Classification Technique and k-nn Method for Credit Scoring. Advanced Engineering Informatics, 21, 281–291.
Langone, R., Alzate, C., Ketelaere, B. D., et al. (2015). LS-SVM Based Spectral Clustering and Regression for Predicting Maintenance of Industrial Machines. Engineering Applications of Artificial Intelligence, 37, 268–278.
Li, J. P., Liu, J. L., Xu, W. X. and Shi, Y. (2004). Support Vector Machines Approach to Credit Assessment. Lecture Notes in Computer Science, 3039, 892–899.
Li, J. P., Chen, Z. Y., Wei, L. W., Xu, W. X. and Kou, G. (2007). Feature Selection via Least Squares Support Feature Machine. International Journal of Information Technology & Decision Making, 6(4), 671–686.
Li, J. P., Wei, L. W., Li, G. and Xu, W. X. (2011). An Evolution-Strategy Based Multiple Kernels Multi-Criteria Programming Approach: The Case of Credit Decision Making. Decision Support Systems, 51(2), 292–298.
Li, J. P., Li, G., Sun, D. X., et al. (2012). Evolution Strategy Based Adaptive Lq Penalty Support Vector Machines with Gauss Kernel for Credit Risk Analysis. Applied Soft Computing, 12(8), 2675–2682.
Lin, S. L. (2009). A New Two-Stage Hybrid Approach of Credit Risk in Banking Industry. Expert Systems with Applications, 36, 8333–8341.
Lin, S. W., Lee, Z. J., Chen, S. C., et al. (2008). Parameter Determination of Support Vector Machine and Feature Selection Using Simulated Annealing Approach. Applied Soft Computing, 8(4), 1505–1512.
Liu, J., Niu, D., Zhang, H., et al. (2013). Forecasting of Wind Velocity: An Improved SVM Algorithm Combined with Simulated Annealing. Journal of Central South University, 20(2), 451–456.
Liu, J. L., Li, J. P., Xu, W. X. and Shi, Y. (2011). A Weighted Lq Adaptive Least Squares Support Vector Machine Classifiers — Robust and Sparse Approximation. Expert Systems with Applications, 38(3), 2253–2259.
Handbook of Financial Econometrics,. . . (Vol. 1)
J. Li et al.
Liu, R. M., Liu, E., Yang, J., Li, M. and Wang, F. L. (2006). Optimizing the Hyper-Parameters for SVM by Combining Evolution Strategies with a Grid Search. Lecture Notes in Control and Information Sciences, 344, 712–721.
Liu, Y. F., Zhang, H. H., Park, C. and Ahn, J. (2007). Support Vector Machines with Adaptive Lq Penalty. Computational Statistics & Data Analysis, 51, 6380–6394.
Maldonado, S., Bravo, C., Lopez, J., et al. (2017). Integrated Framework for Profit-Based Feature Selection and SVM Classification in Credit Scoring. Decision Support Systems, 104, 113–121.
Maldonado, S., Montoya, R. and López, J. (2017). Embedded Heterogeneous Feature Selection for Conjoint Analysis: A SVM Approach Using L1 Penalty. Applied Intelligence, 46(4), 775–787.
Mantovani, R. G., Rossi, A. L. D., Vanschoren, J., et al. (2015). Effectiveness of Random Search in SVM Hyper-Parameter Tuning. International Joint Conference on Neural Networks, IEEE, 2015, 1–8.
Mersch, B., Glasmachers, T., Meinicke, P. and Igel, C. (2007). Evolutionary Optimization of Sequence Kernels for Detection of Bacterial Gene Starts. International Journal of Neural Systems, 17(5), 369–381.
Mitra, V., Wang, C. J. and Banerjee, S. (2007). Text Classification: A Least Square Support Vector Machine Approach. Applied Soft Computing, 7, 908–914.
Moreover. (2014). Support Vector Regression Based on Grid-Search Method for Short-Term Wind Power Forecasting. Journal of Applied Mathematics, 2014(1–4), 1–11.
Müller, K. R., Mika, S., Rätsch, G., Tsuda, K. and Schölkopf, B. (2001). An Introduction to Kernel-Based Learning Algorithms. IEEE Transactions on Neural Networks, 12(2), 181–201.
Oreski, S. and Oreski, G. (2014). Genetic Algorithm-Based Heuristic for Feature Selection in Credit Risk Assessment. Expert Systems with Applications, 41(4), 2052–2064.
Papadrakakis, M. and Lagaros, N. D. (2003). Soft Computing Methodologies for Structural Optimization. Applied Soft Computing, 3, 283–300.
Papageorgiou, E. I. and Groumpos, P. P. (2005). A New Hybrid Method Using Evolutionary Algorithms to Train Fuzzy Cognitive Maps. Applied Soft Computing, 5, 409–431.
Peng, Y., Kou, G., Shi, Y. and Chen, Z. X. (2008). A Multi-Criteria Convex Quadratic Programming Model for Credit Data Analysis. Decision Support Systems, 44, 1016–1030.
Ravale, U., Marathe, N. and Padiya, P. (2015). Feature Selection Based Hybrid Anomaly Intrusion Detection System Using K Means and RBF Kernel Function. Procedia Computer Science, 45, 428–435.
Rechenberg, I. (1973). Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog, Stuttgart, Germany.
Sánchez, V. D. (2003). Advanced Support Vector Machines and Kernel Methods. Neurocomputing, 55, 5–20.
Sands, T. M., Tayal, D., Morris, M. E., et al. (2015). Robust Stock Value Prediction Using Support Vector Machines with Particle Swarm Optimization. IEEE Congress on Evolutionary Computation, 2015, 3327–3331.
Schwefel, H. P. (1975). Evolutionsstrategie und numerische Optimierung. Dissertation, TU Berlin, Germany.
Shankar, K., Lakshmanaprabu, S. K., Gupta, D., et al. (2018). Optimal Feature-Based Multi-Kernel SVM Approach for Thyroid Disease Classification. The Journal of Supercomputing, 2018, 1–16.
Shi, Y. and Eberhart, R. C. (1999). Empirical Study of Particle Swarm Optimization. Proceedings of the 1999 Congress on Evolutionary Computation, IEEE, 3, 1945–1950.
Sun, D. X., Li, J. P. and Wei, L. W. (2008). Credit Risk Evaluation Using Adaptive Lq Penalty SVM with Gauss Kernel. Journal of Southeast University, 24, 33–36.
Suykens, J. A. K. and Vandewalle, J. (1999). Least Squares Support Vector Machine Classifiers. Neural Processing Letters, 9, 293–300.
Syarif, I., Prugel-Bennett, A. and Wills, G. (2016). SVM Parameter Optimization Using Grid Search and Genetic Algorithm to Improve Classification Performance. Telecommunication Computing Electronics and Control, 14(4), 1502–1509.
UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science. http://archive.ics.uci.edu/ml/datasets.html.
Vanitha, C. D. A., Devaraj, D. and Venkatesulu, M. (2015). Gene Expression Data Classification Using Support Vector Machine and Mutual Information-based Gene Selection. Procedia Computer Science, 47, 13–21.
Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. Springer, New York.
Vapnik, V. N. (1998). Statistical Learning Theory. Wiley, New York.
Wang, W. J., Xu, Z. B., Lu, W. Z. and Zhang, X. Y. (2003). Determination of the Spread Parameter in the Gaussian Kernel for Classification and Regression. Neurocomputing, 55, 643–663.
Wei, L. W., Chen, Z. Y., Li, J. P. and Xu, W. X. (2007). Sparse and Robust Least Squares Support Vector Machine: A Linear Programming Formulation. Proceedings of the IEEE International Conference on Grey Systems and Intelligent Services, 54–58.
Wei, L. W. (2008). Research on Data Mining Classification Model Based on the Multiple Criteria Programming and Its Application. PhD Thesis.
Wei, L. W., Chen, Z. Y. and Li, J. P. (2011). Evolution Strategies Based Adaptive Lp LS-SVM. Information Sciences, 181(14), 3000–3016.
Welsh, D. J. A. (1988). Simulated Annealing: Theory and Applications. Acta Applicandae Mathematica, 12(1), 108–111.
Wiatowski, T. and Bölcskei, H. (2018). A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction. IEEE Transactions on Information Theory, 64(3), 1845–1866.
Xia, Y., Liu, C., Li, Y. Y., et al. (2017). A Boosted Decision Tree Approach Using Bayesian Hyper-parameter Optimization for Credit Scoring. Expert Systems with Applications, 78, 225–241.
Yu, L., Wang, S. Y. and Lai, K. K. (2008). Credit Risk Assessment with a Multistage Neural Network Ensemble Learning Approach. Expert Systems with Applications, 34, 1434–1444.
Yu, L., Zhou, R., Tang, L. and Chen, R. D. (2018). A DBN-based Resampling SVM Ensemble Learning Paradigm for Credit Classification with Imbalanced Data. Applied Soft Computing, 69, 192–202.
Zhang, L., Hu, H. and Zhang, D. (2015). A Credit Risk Assessment Model Based on SVM for Small and Medium Enterprises in Supply Chain Finance. Financial Innovation, 1(1), 14.
Zhang, Y., Wang, S. and Ji, G. (2015). A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications. Mathematical Problems in Engineering, 2015.
Zhu, J., Rosset, S., Tibshirani, R., et al. (2004). 1-norm Support Vector Machines. Advances in Neural Information Processing Systems, 2004, 49–56.
Appendix 20A: Data Normalization Method

In the step of data normalization, we utilize the min–max normalization method, also known as the 0–1 normalization method, to process the credit data. For a certain attribute of the original sample data, let x_i (i = 1, 2, . . . , n) denote the ith sample value of this attribute and x*_i (i = 1, 2, . . . , n) denote the corresponding value after normalization. The formula of the min–max normalization method is

    x*_i = (x_i − min) / (max − min),

where min stands for the minimum value of the sample and max stands for the maximum value of the sample. After this normalization process, all values of this attribute are transformed into the interval [0, 1].

Appendix 20B: Detailed Procedure of SVM-Based Methodology

In this chapter, a framework of SVM-based methodology has been constructed, which combines different kinds of feature extraction methods, kernel functions and hyper-parameter optimization methods. The detailed procedure of the SVM-based methodology is as follows:

(1) For a certain dataset, we first normalize the original data using the method introduced in Appendix 20A. Then, we randomly select 80% of the whole dataset as the train set used in the process of machine learning, and the remaining 20% as the test set. In the whole procedure of a certain SVM-based method, this random partition is repeated 10 times, in order to obtain the average classification accuracy of the algorithm.

(2) Next, according to the data structure of the train set, we choose a certain SVM-based method to implement the classification task; the selection procedure is as follows:

(a) First, we determine which penalty form would be used, from the alternatives L1, L2 and Lq. For the L1 and L2 methods, we utilize the corresponding Matlab toolbox LIBSVM (Version 3.22, released on December 22, 2016), proposed by Chang and Lin in 2011 (e.g., see Chang and Lin, 2011).
As for the Lq method, we use part of the Matlab code proposed by Liu et al. in 2007 (e.g., see Liu et al., 2007).
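The min–max normalization of Appendix 20A and the repeated 80/20 random partition of step (1) can be sketched as follows (an illustrative Python/NumPy sketch, not the chapter's Matlab implementation; function names are ours):

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column (attribute) of X into [0, 1] via x* = (x - min) / (max - min)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return (X - col_min) / (col_max - col_min)

def random_80_20_split(X, y, rng):
    """Randomly assign 80% of the samples to the train set and 20% to the test set."""
    n = len(y)
    idx = rng.permutation(n)
    cut = int(0.8 * n)
    train, test = idx[:cut], idx[cut:]
    return X[train], y[train], X[test], y[test]

# Example: normalize once, then repeat the random partition 10 times.
rng = np.random.default_rng(0)
X = min_max_normalize(rng.normal(size=(100, 5)))
y = rng.integers(0, 2, size=100)
splits = [random_80_20_split(X, y, rng) for _ in range(10)]
```

Averaging the test-set accuracy over the 10 partitions then gives the figure reported for each SVM-based method.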
(b) Second, we determine which kernel function would be used, from the alternatives linear, polynomial, Gauss and sigmoid kernel functions. In practice, whenever we choose a certain kernel function, we just need to set the corresponding parameter in the Matlab code of the SVM classifier.

(c) Third, we determine which hyper-parameter optimization method would be used, from the alternatives GS, PSO, SA and ES. Each method has a different number of parameters to be set. For instance, the GS method needs to initialize the minimum and maximum value of every parameter that appears in the SVM method, as well as the step size used in the parameter search process. Moreover, the optimal hyper-parameters are chosen by the accuracy of cross-validation, so the number k of k-fold CV must also be set beforehand.

After the above three steps (a), (b) and (c), we obtain 48 different SVM-based methods, which are listed in Table 20B.1. In this chapter, only the 12 methods with the Gauss kernel function turn out to be suitable for the credit dataset; that is, the classification accuracy of the methods with the other kernel functions is significantly lower than that of the methods with the Gauss kernel function.

(3) Up to now, we have determined the specific SVM-based method. For example, ES-ALq G-SVM is the SVM-based algorithm generated from the combination of the adaptive Lq penalty form, the Gauss kernel function and the ES method. We then use this algorithm on the train set to train our classification model, in which the optimal model is constructed. Finally, we use this model on the test set of the original dataset to calculate the average classification accuracy.

For the normalized train set, the normalized test set and a certain SVM-based method, the whole procedure of the algorithm is as follows:

(1) Determine the hyper-parameters to be optimized in the training process of SVM. For different SVM-based methods, the numbers of hyper-parameters are different.
Table 20B.2 shows the corresponding hyper-parameters to be tuned. In addition, the corresponding ranges of these parameters should also be determined. In our research, we set λ ∈ [2^(−8), 2^8], σ² ∈ [2^(−8), 2^8], q ∈ {0.1, 0.2, . . . , 2}, d ∈ {1, 2, 3} and γ ∈ [2^(−8), 2^8].

(2) Apart from the hyper-parameters of the SVM-based method, there are several parameters in the different hyper-parameter optimization methods which should be determined beforehand. These parameters are listed in
Table 20B.1: The 48 kinds of different SVM-based methods.

Method           Penalty form   Kernel function   Hyper-parameter optimization method
GS-L1 L-SVM      L1             Linear            GS
PSO-L1 L-SVM     L1             Linear            PSO
SA-L1 L-SVM      L1             Linear            SA
ES-L1 L-SVM      L1             Linear            ES
GS-L2 L-SVM      L2             Linear            GS
PSO-L2 L-SVM     L2             Linear            PSO
SA-L2 L-SVM      L2             Linear            SA
ES-L2 L-SVM      L2             Linear            ES
GS-ALq L-SVM     Lq             Linear            GS
PSO-ALq L-SVM    Lq             Linear            PSO
SA-ALq L-SVM     Lq             Linear            SA
ES-ALq L-SVM     Lq             Linear            ES
GS-L1 G-SVM      L1             Gauss             GS
PSO-L1 G-SVM     L1             Gauss             PSO
SA-L1 G-SVM      L1             Gauss             SA
ES-L1 G-SVM      L1             Gauss             ES
GS-L2 G-SVM      L2             Gauss             GS
PSO-L2 G-SVM     L2             Gauss             PSO
SA-L2 G-SVM      L2             Gauss             SA
ES-L2 G-SVM      L2             Gauss             ES
GS-ALq G-SVM     Lq             Gauss             GS
PSO-ALq G-SVM    Lq             Gauss             PSO
SA-ALq G-SVM     Lq             Gauss             SA
ES-ALq G-SVM     Lq             Gauss             ES
GS-L1 P-SVM      L1             Polynomial        GS
PSO-L1 P-SVM     L1             Polynomial        PSO
SA-L1 P-SVM      L1             Polynomial        SA
ES-L1 P-SVM      L1             Polynomial        ES
GS-L2 P-SVM      L2             Polynomial        GS
PSO-L2 P-SVM     L2             Polynomial        PSO
SA-L2 P-SVM      L2             Polynomial        SA
ES-L2 P-SVM      L2             Polynomial        ES
GS-ALq P-SVM     Lq             Polynomial        GS
PSO-ALq P-SVM    Lq             Polynomial        PSO
SA-ALq P-SVM     Lq             Polynomial        SA
ES-ALq P-SVM     Lq             Polynomial        ES
GS-L1 S-SVM      L1             Sigmoid           GS
PSO-L1 S-SVM     L1             Sigmoid           PSO
SA-L1 S-SVM      L1             Sigmoid           SA
ES-L1 S-SVM      L1             Sigmoid           ES
GS-L2 S-SVM      L2             Sigmoid           GS
PSO-L2 S-SVM     L2             Sigmoid           PSO
SA-L2 S-SVM      L2             Sigmoid           SA
ES-L2 S-SVM      L2             Sigmoid           ES
GS-ALq S-SVM     Lq             Sigmoid           GS
PSO-ALq S-SVM    Lq             Sigmoid           PSO
SA-ALq S-SVM     Lq             Sigmoid           SA
ES-ALq S-SVM     Lq             Sigmoid           ES
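The 48 method names are simply the Cartesian product of the four kernel functions, three penalty forms and four hyper-parameter optimization methods; a small Python sketch (the naming convention mirrors the method names in Table 20B.1):

```python
from itertools import product

optimizers = ["GS", "PSO", "SA", "ES"]
# The Lq penalty appears as "ALq" (adaptive Lq) in the method names.
penalties = {"L1": "L1", "L2": "L2", "Lq": "ALq"}
kernels = {"Linear": "L", "Gauss": "G", "Polynomial": "P", "Sigmoid": "S"}

# Enumerate kernel-major, penalty next, optimizer innermost, as in Table 20B.1.
methods = [
    f"{opt}-{penalties[pen]} {kernels[ker]}-SVM"
    for ker, pen, opt in product(kernels, penalties, optimizers)
]
print(len(methods))  # 4 kernels x 3 penalties x 4 optimizers = 48
print(methods[0])    # GS-L1 L-SVM
```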
Table 20B.2: The hyper-parameters of different SVM-based methods.

Methods      Number of hyper-parameters   Hyper-parameters
L1 L-SVM     1                            λ
L2 L-SVM     1                            λ
ALq L-SVM    2                            λ, q
L1 G-SVM     2                            λ, σ²
L2 G-SVM     2                            λ, σ²
ALq G-SVM    3                            λ, σ², q
L1 P-SVM     2                            λ, d
L2 P-SVM     2                            λ, d
ALq P-SVM    3                            λ, q, d
L1 S-SVM     2                            λ, γ
L2 S-SVM     2                            λ, γ
ALq S-SVM    3                            λ, γ, q

Table 20B.3: The parameters in different hyper-parameter optimization methods.

Methods   Number of parameters   Parameters
GS        2                      λ, k
PSO       5                      c1, c2, ψ1, ψ2, k
SA        4                      ε, γ, K, k
ES        6                      g, ρ, μ, θ, τ, k
Table 20B.3. Here, in our research, we mainly utilize the default values of these parameters. In this chapter, the number k of k-fold CV is set to 5.

(3) Implement the corresponding hyper-parameter optimization method to find the best combination of the different parameters of SVM through 5-fold CV. For a certain SVM-based method with a certain penalty form and kernel function, we can directly choose the corresponding model in LIBSVM, which is quite user-friendly. After 5-fold CV, we choose the best parameters to implement the training process of SVM on the train set, and finally use the test set to calculate the classification accuracy.

(4) Repeat step (3) above 10 times and obtain the average classification accuracies T, T1 and T2.

Here, as an example, the detailed procedure of ES-ALq G-SVM is shown, from the original data normalization to the final classification. The implementation processes of the other SVM-based methods are similar to that of ES-ALq G-SVM. The ES-ALq G-SVM algorithm is proposed as follows:
Input: credit dataset D
Output: optimal penalty parameter q, hyper-parameters and classification results
Set: g is the number of generations in the ES method; ρ is the number of parents in the ES method; μ is the size of the parent pool in the ES method; θ is the number of offspring individuals
Initialize:
  regularization parameter p_m^(0)(1) ← λ;
  penalty parameter p_m^(0)(2) ← q;
  kernel parameter p_m^(0)(3) ← σ²;
  strategy parameter ε_t^(0); self-adaptation learning rate τ;
  fitness measure F(p_m^(0))
For each generation g
  For each offspring l from 1 to θ
    (1) Choose a parent family, ς_l ← marriage(δ_t^(g), ρ)
    (2) Recombine parameters: endogenous strategy parameters ε_l ← ε_recombination(ς_l); object parameters p_l ← p_recombination(ς_l)
    (3) Mutate parameters: strategy parameters ε̃_l ← ε_mutation(ε_l); object parameters p̃_l ← p_mutation(p_l, ε̃_l)
    (4) Let k ← 1, x̃_i ← [1, x_i^T]^T, η ← [b, 1^T]^T; initialize the coefficients b_0
    (5) Let η^(0) ← η^(k), compute η^(k+1) ← arg min_η F(η) = η^T Q(k(x, y))η + η^T L
    (6) Let k ← k + 1, repeat (5) until convergence
  end
  (7) Generate the offspring population, δ_o^(g) ← {(p_l^(g), ε_l^(g), F(p_l^(g))), l = 1, . . . , θ}
  (8) Select the new parent population (μ, θ): δ_t^(g+1) ← selection(δ_o^(g), μ)
  (9) Let g ← g + 1; IF the outputs are optimal or the termination criteria are satisfied THEN stop
end.
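For comparison, the simplest of the four tuning schemes, the grid-search branch with an L2 penalty and Gauss kernel (GS-L2 G-SVM), can be sketched with scikit-learn standing in for the authors' LIBSVM/Matlab code (an illustrative sketch on synthetic data; scikit-learn's C plays the role of the regularization parameter and gamma that of the Gauss kernel width, and the power-of-two grid mirrors the [2^(−8), 2^8] ranges above):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, train_test_split

# Toy two-class data standing in for a normalized credit dataset.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 1).astype(int)

# 80/20 partition, as in step (1) of the procedure.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Grid over powers of two; 5-fold CV selects the best (C, gamma) pair.
grid = {"C": [2.0**p for p in range(-8, 9, 2)],
        "gamma": [2.0**p for p in range(-8, 9, 2)]}
search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5)
search.fit(X_tr, y_tr)
test_accuracy = search.score(X_te, y_te)
print(search.best_params_, round(test_accuracy, 3))
```

In the chapter's procedure this fit-and-score step is repeated over 10 random partitions and the resulting accuracies are averaged.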
Chapter 21
Data Mining Applications in Accounting and Finance Context

Wikil Kwak, Yong Shi and Cheng Few Lee

Wikil Kwak
University of Nebraska at Omaha
e-mail: [email protected]

Yong Shi
University of Nebraska at Omaha
Chinese Academy of Sciences
e-mail: [email protected]

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

Contents
21.1 Bankruptcy Prediction after the Sarbanes–Oxley Act Using Financial Ratios and Current Data Mining Approach . . . 825
    21.1.1 Introduction . . . 825
    21.1.2 Background and prior research . . . 826
    21.1.3 Sample data, variables, and empirical results . . . 828
    21.1.4 Summary and conclusions . . . 834
21.2 Bankruptcy Prediction for Korean Firms after the 1997 Financial Crisis: Using a Multiple Criteria Linear Programming Data Mining Approach . . . 835
    21.2.1 Introduction . . . 835
    21.2.2 Background . . . 836
    21.2.3 Models of multiple criteria linear programming classification . . . 838
    21.2.4 Data collection and research design . . . 839
    21.2.5 Summary and conclusions . . . 844
21.3 Bankruptcy Prediction for Chinese Firms: Comparing Data Mining Tools with Logit Analysis . . . 844
    21.3.1 Introduction . . . 844
    21.3.2 Literature review . . . 845
    21.3.3 Data collection and research design . . . 848
    21.3.4 Empirical results . . . 848
    21.3.5 Summary and conclusions . . . 852
Bibliography . . . 853
Abstract

This chapter shows examples of applying several current data mining approaches and alternative models in accounting and finance contexts, such as predicting bankruptcy using US, Korean, and Chinese capital market data. Big data in the accounting and finance context is a good fit for data analytic tools such as data mining. Our previous study also empirically tested Japanese capital market data and found similar prediction rates; however, overall prediction rates depend on the country and time period studied (Mihalovic, 2016). These results are an improvement on previous bankruptcy prediction studies using traditional probit or logit analysis or multiple discriminant analysis. The recent survival model shows similar prediction rates in bankruptcy studies; however, longitudinal data are needed to use the survival model. Because of advances in computer technology, it is now easier to apply data mining approaches. In addition, current data mining methods can be applied to other accounting and finance contexts such as auditor changes, audit opinion prediction studies, and internal control weakness studies.

Our first paper shows 13 data mining approaches to predict bankruptcy after the Sarbanes–Oxley Act (SOX, 2002) implementation, using 2008–2009 US data with 13 financial ratios as well as internal control weakness, dividend payout, and market return variables. Our second paper shows an application of a multiple criteria linear programming data mining approach using Korean data. Our last paper builds bankruptcy prediction models for Chinese firm data using several data mining tools and compares them with traditional logit analysis. The analytic hierarchy process and fuzzy sets can also be applied as alternatives to data mining tools in accounting and finance studies. Natural language processing can be used as part of the artificial intelligence domain in accounting and finance in the future (Fisher et al., 2016).
Keywords Data mining • Big data • Bankruptcy • Multiple criteria linear programming data mining • China • Korea • Japan • US • Probit • Logit • Multiple discriminant analysis • Survival model • Auditor change • Audit opinion prediction • Internal control weakness • Decision tree • Bayesian net • Decision table • Analytic hierarchy process • Support vector machines • Fuzzy set • Natural language processing.
21.1 Bankruptcy Prediction after the Sarbanes–Oxley Act Using Financial Ratios and Current Data Mining Approach

21.1.1 Introduction

The development of current information technology has created huge datasets, called big data, in different areas and in various formats. This is especially true in the accounting and finance areas, where we deal with a huge amount of financial and non-financial data generated by accounting information systems (Murthy and Geerts, 2017). Personalized customer services have triggered research and applications in databases, information processing and knowledge extraction. Data mining is the process of extracting useful information and patterns from big data. The goal of business data mining is to find individual patterns that were previously unknown and that can be used to make business decisions.

In general, there are three steps involved in data mining (Bharati and Ramageri, 2010; Chen et al., 2016): exploration, pattern identification, and deployment. Exploration includes data cleaning and transformation; the important variables and their natures are determined in this step. Pattern identification is the step of modeling and model verification. Lastly, deployment applies the model to real-time data.

Artificial neural networks include a family of neural networks capable of treating nonlinear data (Chen et al., 2016). Various algorithms and techniques such as support vector machines (SVM), decision trees, and k-means are used for knowledge discovery. Recently, Lahmiri (2016) concluded from a comparative data mining study that SVM is the best method in bankruptcy prediction. These techniques can be categorized as supervised learning and unsupervised learning. Classification and regression analysis are the most common methods in supervised learning, whereas the most common method in unsupervised learning is clustering. SVMs are effective when the underlying data are typically nonlinear and nonstationary (Chen et al., 2016).
Classification is the most common data mining method; it uses a set of pre-tagged datasets to train the model for future prediction. Fraud detection and risk analysis are classic examples of applications of this technology. SVM and the decision tree are the most well-known algorithms in classification.

SVM was developed by Vladimir Vapnik and is mostly used for classification problems, even though it can be used for regression analysis as well (Cortes and Vapnik, 1995). The algorithm realizes classifications by finding the most appropriate hyper-planes that can successfully
differentiate between classes in a training dataset. The hyper-plane is chosen to maximize the distances from the closest points in each of the different classes. The learning process of SVM can be formulated as an optimization problem.

The decision tree summarizes the patterns in the format of a binary tree (Safavian and Landgrebe, 1991). Since the result of a decision tree is understandable, it is commonly used in operations management. The decision tree is built top-down from the root node. The process of constructing a decision tree is all about finding, at each node, the split that provides the highest information gain. The information gain is based on the decrease in entropy after splitting on a specific node (attribute).

Clustering is the method of identifying similar classes of objects. Clustering differs from classification because the training objects in clustering are not tagged. The k-means algorithm is the most popular algorithm in clustering and is used to partition observations into k clusters by Euclidean distance (Wagstaff et al., 2001). An iterative algorithm is designed to find clusters that minimize the within-group dispersion and maximize the between-group distance. Case-based reasoning can solve new problems and provide suggestions by recalling similar experience. It is based on the k-nearest neighbors (KNN) principle that similar inputs should have the same output (Chen et al., 2016).

Data mining has recently become popular in business applications, and its techniques are becoming more powerful in terms of overall prediction rates. Audit firms will soon introduce big data analytics into the audit process (Rose et al., 2017). If data mining tools had been applied to the Fannie Mae mortgage portfolio, defaults during the financial crisis could have been reduced (Mamonov and Benbunan-Fich, 2017). We apply the most current data mining tools in several of our studies and try to show data mining applications in the accounting and finance context.
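To make the entropy-based splitting criterion mentioned above concrete, a minimal sketch (our own illustration, not code from the chapter) of computing the information gain of a candidate split:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent_labels, split_subsets):
    """Decrease in entropy after splitting parent_labels into the given subsets."""
    n = len(parent_labels)
    weighted = sum(len(s) / n * entropy(s) for s in split_subsets)
    return entropy(parent_labels) - weighted

# A perfectly separating split recovers the full parent entropy (1 bit here).
parent = [1, 1, 0, 0]
gain = information_gain(parent, [[1, 1], [0, 0]])
print(gain)  # 1.0
```

The tree-building algorithm evaluates this quantity for each candidate attribute at a node and splits on the attribute with the highest gain.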
Our paper presents data mining applications for bankruptcy studies with a summary of our findings and future research avenues.

21.1.2 Background and prior research

Altman (1968) used multiple discriminant analysis to predict bankruptcy. Another major study is Ohlson's (1980), which used a logit model; his model does not require any assumptions about the prior probability of bankruptcy or the distribution of the predictor variables. Gupta et al. (1990) proposed linear goal programming, but this approach may not be practical because of its computational complexity. The decision tree
approach to developing bankruptcy prediction models was proposed by Sung et al. (1999). Their results show that prediction ratios are not stable under different economic conditions; therefore, economic conditions are important factors in bankruptcy prediction studies. Using 1991–1998 Korean data, the accuracy of their model is 72.4% for bankruptcy under normal conditions and 66.7% under crisis conditions.

A genetic algorithm (GA) in bankruptcy prediction modeling was used by Shin and Lee (2002). Their accuracy rate is 80.8% in both the training and holdout samples. However, this study used only one industry, so there may be an upward bias in prediction accuracy. Case-based reasoning (CBR) with weights derived by an analytic hierarchy process (AHP) for bankruptcy prediction, since AHP incorporates both financial ratios and non-financial variables into the model, was proposed by Park and Han (2002); they reported an 84.52% accuracy rate even after incorporating non-financial data. Chi and Tang (2006) used 24 variables in a logit analysis with a prediction rate of 85%. Judged against previous bankruptcy studies, any study with an overall prediction rate above 85% is a reasonable prediction study.

The most prominent bankruptcy studies are Altman's and Ohlson's, and our study uses their financial variables. In addition, internal control weaknesses, missing dividend payouts, and the stock market return rate are added to improve the prediction rate. These variables are pulled from each bankrupt firm's financial statements three years before it files for bankruptcy.

Other intelligent techniques have also been used in bankruptcy studies, such as neural networks and SVMs (Elish and Elish, 2008). Baek and Cho (2003) used a neural network approach for Korean firms to predict bankruptcy. Their approach showed prediction accuracy of 80.45% for solvent firms and 50.6% for default firms.
Min and Lee (2005) proposed SVM for bankruptcy prediction and found a holdout data classification rate of 83.06%.

Artificial neural networks and the fuzzy set approach have both been used in management fraud and internal controls research. Fanning et al. (1995) used artificial neural networks to detect management fraud and developed a discriminant function that provides a set of questions to detect management fraud. Deshmukh et al. (1997) proposed a fuzzy set approach that provides guidelines to measure and combine red flags, and it can be used to build fuzzy reasoning systems that assess the risk of management fraud. Korvin et al. (2004) proposed a pragmatic approach to assessing the risks of internal control using fuzzy set theory. Their approach is useful for the internal control of a computer-based accounting information system; however, it could be expensive and time consuming.
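As a point of reference for the logit models these studies build on, a minimal bankruptcy-style logistic regression on a couple of financial ratios can be sketched as follows (synthetic data; the variable names and coefficients are our illustration, not the chapter's dataset):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic firm-year observations: working capital / total assets and
# total debt / total assets, two ratios common in bankruptcy models.
rng = np.random.default_rng(1)
n = 400
wca_ta = rng.normal(0.1, 0.3, n)
tdebt_ta = rng.normal(0.5, 0.2, n)

# Bankruptcy is made more likely by low working capital and high leverage.
true_logit = -2.0 * wca_ta + 3.0 * (tdebt_ta - 0.5)
bankrupt = (rng.uniform(size=n) < 1 / (1 + np.exp(-true_logit))).astype(int)

X = np.column_stack([wca_ta, tdebt_ta])
model = LogisticRegression().fit(X, bankrupt)
in_sample_accuracy = model.score(X, bankrupt)
print(model.coef_, round(in_sample_accuracy, 3))
```

The fitted coefficients recover the signs built into the data (negative on working capital, positive on leverage), which is the kind of interpretability that keeps logit models a standard baseline against which the data mining approaches are compared.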
Hammersley et al. (2008) studied the stock price reaction to SOX Section 302 disclosures. They found that the information content of internal control weaknesses disclosed under Section 302 depends on characteristics of the weaknesses, such as their severity, auditability, and the vagueness of the disclosures. This study's results indicate that investors react to information about the existence of internal control weaknesses, so the results of our study may benefit investors by providing better predictions of which firms will enter bankruptcy. We use 13 of the most current data mining tools, including the decision tree approach, in this study for comparison.

21.1.3 Sample data, variables, and empirical results

In this paper, we used DirectEDGAR (2015) to identify 130 firms that filed for bankruptcy in 2008 and 2009, the years in which SOX was fully implemented on each firm's financial statements. We then identified more than twice that number of control firms with no bankruptcy filing. Our control firms were matched with our bankrupt firms using size and two-digit industry codes. Our final sample is composed of 306 firm-year observations that have the necessary financial and other data available in Form 10-K filings through DirectEDGAR and Compustat.

In selecting variables that may help predict bankruptcy from financial reporting, we included Altman's (1968) and Ohlson's (1980) variables because these variables have proven useful for bankruptcy prediction in previous studies, as discussed in Section 21.2 of this paper. In addition, we include internal control weakness, dividend payout, and stock return variables as proposed by previous bankruptcy studies (e.g., Duffie et al., 2007; Sun, 2007; Shumway, 2001). Table 21.1 shows the descriptive statistics.
As Table 21.1 shows, all variables differ between bankrupt and non-bankrupt firms except for size, funds flow from operations/total liabilities, CHIN (change in income), market value/total debt, sales/total assets, and the return variables. Bankrupt firms have higher debt-to-total-assets ratios, greater current-liability-to-current-asset ratios, more frequent losses, total-liabilities/total-assets ratios more often greater than 1, more internal control weaknesses, and more frequently missed dividends. In contrast, bankrupt firms have lower working capital to total assets and more negative ratios of net income, earnings before interest and taxes, and retained earnings to total assets.
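The t-diff statistics reported in Table 21.1 test differences in group means between bankrupt and non-bankrupt firms. A minimal sketch of such a two-sample t statistic (the pooled-variance variant is assumed here; the chapter does not state which variant was used):

```python
import math

def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic for the difference in means
    (x = bankrupt group, y = non-bankrupt group); a positive value means
    the bankrupt-group mean exceeds the non-bankrupt-group mean."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)   # sample variances
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)  # pooled variance
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))
```

Applied column by column, this reproduces the sign convention of the t-diff column: for example, a large positive value on LOSS indicates losses are far more frequent among the bankrupt firms.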
Data Mining Applications in Accounting and Finance Context
Table 21.1:
Part 1: Descriptive data.
Bankrupt = 1

Variable   N    Mean      Std. dev  Minimum    Maximum    t-diff stat
SIZE       130  3.036415  0.670017  1.066699   4.251368   1.87984
TDEBT TA   130  0.522197  0.372747  4.8E−05    2.365381   6.539735***
WCA TA     130  0.003749  0.365557  −2.35556   0.573819   −5.29695***
CL CA      130  1.325495  1.917049  0.139244   13.66695   3.459551***
NI TA      130  −0.12756  0.238387  −1.36716   0.544211   −4.98748***
FU TL      130  33.56841  347.9683  −52.72     3949.74    −0.69753
LOSS       130  0.815385  0.389486  0          1          11.59204***
OENEG      130  0.084615  0.279385  0          1          2.97523***
CHIN       130  −3.04477  27.96642  −198.828   110.2286   −1.24302
EBIT TA    130  −0.01393  0.130669  −0.92413   0.171389   −5.49235***
MKV TD     130  42.64569  410.2507  0.000499   4648.89    −1.34875
SALES TA   130  1.234936  0.857946  0.016094   4.438618   1.823911
RE TA      130  −0.59322  1.167781  −7.32708   0.469771   −5.04411***
IC         130  0.207692  0.407225  0          1          4.712419***
DIV        128  0.757813  0.430091  0          1          3.380889***
RETX       28   −0.48858  0.938442  −2.37081   1.085964   −1.83609

Non-Bankrupt = 0

Variable   N    Mean      Std. dev  Minimum    Maximum
SIZE       306  2.898546  0.767627  −0.2204    5.30122
TDEBT TA   306  0.291959  0.228544  1.31E−05   1.080334
WCA TA     306  0.185622  0.21494   −0.66999   0.824403
CL CA      306  0.728434  0.680917  0.093013   9.784977
NI TA      306  −0.00602  0.218965  −2.18817   0.588664
FU TL      306  83.02064  1119.38   −78.3581   19433.31
LOSS       306  0.313726  0.464766  0          1
OENEG      306  0.009804  0.09869   0          1
CHIN       304  0.004126  0.068398  −0.24818   0.365246
EBIT TA    306  0.071082  0.181967  −2.15954   0.888582
MKV TD     306  386.9547  4421      0.001869   75722.15
SALES TA   306  1.081045  0.667693  0.080171   3.137091
RE TA      306  −0.01265  0.91857   −11.4386   1.557583
IC         306  0.03268   0.178088  0          1
DIV        306  0.598039  0.491097  0          1
RETX       212  −0.15298  0.643907  −4.07103   1.608485

(Continued)
Table 21.1: (Continued)

Notes: *: p < 0.10; **: p < 0.05; ***: p < 0.001.
Variable Descriptions: Size = Log (Total Assets/Gross Domestic Product); TDEBT TA = Total Liabilities/Total Assets; WCA TA = Working Capital/Total Assets; CL CA = Total Current Liabilities/Total Current Assets; NI TA = Net Income/Total Assets; FU TL = Funds from Operations/Total Liabilities; LOSS = 1 if Net Income < 0, else 0.

In the MCLP classification model, each case C_i is scored against a boundary value b by a vector of variable weights X:

(a) C_i X ≤ b, C_i ∈ B (bankrupt) and C_i X > b, C_i ∈ N (non-bankrupt).

Now consider how to better separate the bankrupt and non-bankrupt firms. Let α_i be the overlapping degree with respect to C_i, and β_i be the distance from C_i to its adjusted boundary. In addition, we define α to be the maximum overlapping of the two-class boundary for all cases C_i (α_i < α) and β to be the minimum distance of all cases C_i from the adjusted boundary (β_i > β). Our goal is to minimize the sum of the α_i and maximize the sum of the β_i simultaneously. By adding α_i into (a), we have:

(b) C_i X ≤ b + α_i, C_i ∈ B and C_i X > b − α_i, C_i ∈ N.

By considering β_i as well, we can rewrite (b) as

(c) C_i X = b + α_i − β_i, C_i ∈ B and C_i X = b − α_i + β_i, C_i ∈ N.

Our two-criterion linear programming model is then stated as

(d) Minimize Σ_i α_i and Maximize Σ_i β_i
    Subject to C_i X = b + α_i − β_i, C_i ∈ B,
               C_i X = b − α_i + β_i, C_i ∈ N,

where the C_i are given, X and b are unrestricted, and α_i, β_i ≥ 0; see Shi et al. (2001, p. 429). The above MCLP model can be solved in many different ways. One method is to use the compromise solution approach, as Shi and Yu (1989) and Shi (2001) did, to reformulate model (d) by systematically identifying the best trade-off between −Σ_i α_i and Σ_i β_i. To show this graphically,
we assume the ideal value of −Σ_i α_i is α* > 0 and the ideal value of Σ_i β_i is β* > 0. Then, if −Σ_i α_i > α*, we define the regret measure as −d_α⁺ = Σ_i α_i + α*; otherwise, it is 0. If −Σ_i α_i < α*, the regret measure is defined as d_α⁻ = α* + Σ_i α_i; otherwise, it is 0. Thus, we have (i) α* + Σ_i α_i = d_α⁻ − d_α⁺, (ii) |α* + Σ_i α_i| = d_α⁻ + d_α⁺, and (iii) d_α⁻, d_α⁺ ≥ 0. Similarly, we derive β* − Σ_i β_i = d_β⁻ − d_β⁺, |β* − Σ_i β_i| = d_β⁻ + d_β⁺, and d_β⁻, d_β⁺ ≥ 0 (see Table 21.2). The MCLP model then evolves into

(e) Minimize d_α⁻ + d_α⁺ + d_β⁻ + d_β⁺
    Subject to α* + Σ_i α_i = d_α⁻ − d_α⁺,
               β* − Σ_i β_i = d_β⁻ − d_β⁺,
               C_i X = b + α_i − β_i, C_i ∈ B,
               C_i X = b − α_i + β_i, C_i ∈ N,

where C_i, α*, and β* are given, X and b are unrestricted, and α_i, β_i, d_α⁻, d_α⁺, d_β⁻, d_β⁺ ≥ 0. The data separation of the MCLP classification method is determined by solving the above problem. Two versions of actual software have been developed to implement the MCLP method on large-scale databases. The first is based on the well-known commercial SAS platform (SAS/OR, 1990); its SAS code, including a SAS linear programming procedure, solves (e) and produces the data separation. The second version is written in C++ and runs on a Linux platform (see Kou and Shi, 2002). A SAS system is currently more popular for data analysis in large firms, so the SAS version is useful for conducting data mining analysis in a SAS environment, while the Linux version follows the trend of information technology for the future.
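As an illustration, model (e) is an ordinary linear program and can be solved with any general-purpose LP solver. The sketch below uses made-up two-attribute cases and arbitrary ideal values for α* and β* (all assumptions for illustration only); SciPy's linprog stands in for the SAS/OR and C++ implementations described above.

```python
import numpy as np
from scipy.optimize import linprog

# Cases: rows are C_i; first two are bankrupt (B), last two non-bankrupt (N).
C = np.array([[1.0, 1.0], [1.0, 2.0], [3.0, 3.0], [4.0, 3.0]])
is_B = np.array([True, True, False, False])
alpha_star, beta_star = 1.0, 1.0          # ideal values (illustrative choices)

n, r = C.shape
# Variable vector: [x (r), b, alpha (n), beta (n), da-, da+, db-, db+]
nv = r + 1 + 2 * n + 4
c = np.zeros(nv)
c[-4:] = 1.0                              # minimize da- + da+ + db- + db+

A_eq, b_eq = [], []
for i in range(n):                        # C_i x - b -/+ alpha_i +/- beta_i = 0
    row = np.zeros(nv)
    row[:r] = C[i]
    row[r] = -1.0
    s = -1.0 if is_B[i] else 1.0
    row[r + 1 + i] = s                    # -alpha_i for B, +alpha_i for N
    row[r + 1 + n + i] = -s               # +beta_i for B, -beta_i for N
    A_eq.append(row)
    b_eq.append(0.0)

row = np.zeros(nv)                        # sum(alpha) - da- + da+ = -alpha*
row[r + 1: r + 1 + n] = 1.0
row[-4], row[-3] = -1.0, 1.0
A_eq.append(row)
b_eq.append(-alpha_star)

row = np.zeros(nv)                        # -sum(beta) - db- + db+ = -beta*
row[r + 1 + n: r + 1 + 2 * n] = -1.0
row[-2], row[-1] = -1.0, 1.0
A_eq.append(row)
b_eq.append(-beta_star)

bounds = [(None, None)] * (r + 1) + [(0, None)] * (2 * n + 4)
res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=bounds)
```

With separable data such as these, the optimum drives every α_i to zero and the total distance Σβ_i to β*, so the minimized regret equals α*.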
21.2.4 Data collection and research design

We collected a list of Korean firms that went bankrupt between 1997 and 2003 from public sources via the internet, along with roughly twice as many matching control firms, matched by size and two-digit industry code, to emulate a real-world bankruptcy situation. Most of our sample firms come from 1997 and 1998, right after the financial crisis in Korea (1997: 22 firms, 1998: 19 firms, 1999: 7 firms, 2000: 8 firms, 2001: 6 firms, 2002: 1 firm, and 2003: 2 firms). Almost twice as many firms filed for bankruptcy
during this time, but half of the bankrupt firms went through workout and restructuring and survived with the support of government funding; we did not include these firms in our sample. Our final sample consists of 65 bankrupt firms and 130 non-bankrupt firms with available data that were publicly traded on the Korean Stock Exchange. Financial data also come from public databases such as company annual reports found via internet search. Firm bankruptcy in Korea is rare because of the chaebols' interlocking shareholdings within group firms and internal fund transfers among member firms, which protect the reputation of the chaebol. Sometimes the government intervenes to prevent the bankruptcy of publicly traded firms by contributing public funds to protect the jobs of troubled firms' employees. This structure is very similar to the Japanese keiretsu system. The 1997 financial crisis, however, was different, and many firms went bankrupt during that time. We used Altman's (1968) five ratio variables and Ohlson's (1980) nine variables for our bankruptcy prediction study. Our model is flexible enough to add more variables without difficulty given currently available data mining tools and computer capabilities. We first run our MCLP data mining model using five variables for each year, as Altman (1968) did in his original study, so that we can compare the prediction accuracy of our study with his. We then run the MCLP model using nine variables, as Ohlson (1980) did in his study. Our goal is to show the effectiveness of our model compared with Altman's and Ohlson's models. For a detailed discussion of MCLP data mining, refer to Olson and Shi (2007). Table 21.3 presents the descriptive statistics. Panel A shows the mean, standard deviation, minimum, and maximum for each variable for the bankrupt firms. Average firm size is 1.9% of gross domestic products for bankrupt firms and 2.6% for non-bankrupt firms.
Total liabilities exceed total assets for bankrupt firms. Working capital is −1.7% of total assets for bankrupt firms, and current liabilities are 1.25 times current assets. These ratios show liquidity problems among bankrupt firms. Net income is −8.9% of total assets for bankrupt firms, and funds from operations are −1.2% of total liabilities. Changes in net income compared with the previous year's net income are 67% negative. Retained earnings are −6.2% of total assets and earnings before interest and taxes are −8%. The market value of equity is only 15% of the book value of total debt, and sales are 81% of total assets. Most of these variables show worse ratios for bankrupt firms compared with
Table 21.3: Descriptive statistics of predictor variables for Korean bankrupt and non-bankrupt sample firms between 1997 and 2003.

Panel A: Bankrupt firms

Variables  N   Mean     Std. Dev  Min      Max
Size       65  0.0019   0.0050    0.00     0.0352
TL/TA      65  1.0307   1.5402    0.0891   13.1025
WCA/TA     65  −0.0171  0.3307    −0.9217  1.4548
CL/CA      65  1.2492   0.6936    0.2389   3.6691
NI/TA      65  −0.0893  0.1480    −0.6504  0.0850
FU/TL      65  −0.0120  0.0890    −0.3590  0.3082
INTWO      65  0.8615   0.3481    0.00     1.00
OENEG      65  0.1538   0.3636    0.00     1.00
CHIN       64  −0.6743  1.4041    −5.3980  1.00
RE/TA      65  −0.0616  0.1947    −0.9425  0.1917
EBIT/TA    65  −0.0801  0.1535    −0.6503  0.2102
MKV/TD     65  0.1512   0.1538    0.0002   0.8643
SALE/TA    65  0.8137   1.1411    0.1153   9.5923

Panel B: Non-bankrupt firms

Variables  N    Mean     Std. Dev  Min      Max      t-value¹
Size       130  0.0026   0.0062    0.0001   0.0429   0.80
TL/TA      130  0.7133   0.6851    0.1982   8.2152   −1.58
WCA/TA     130  0.0463   0.3524    −3.2444  0.6624   1.23
CL/CA      130  0.9735   0.5163    0.1691   3.5770   −2.84***
NI/TA      130  0.0209   0.0474    −0.3050  0.1924   5.86***
FU/TL      130  0.0286   0.1419    −0.6446  1.0662   2.43**
INTWO      130  0.1231   0.3298    0.00     1.00     −14.21***
OENEG      130  0.0154   0.1236    0.00     1.00     −2.99***
CHIN       129  −0.1508  1.0011    −8.1049  1.00     2.68***
RE/TA      130  0.1352   0.1249    −0.0101  0.5953   7.42***
EBIT/TA    130  0.0301   0.0541    −0.2625  0.2498   5.62***
MKV/TD     130  0.1436   0.1472    0.0073   1.1594   −0.33
SALE/TA    130  1.0521   1.0416    0.1580   11.5462  1.42
Notes: ¹ t-value for testing mean differences between bankrupt and non-bankrupt firms. *: p < 0.10; **: p < 0.05; ***: p < 0.001. Variable Descriptions: Size = Total Assets/Gross Domestic Products; TL/TA = Total Liabilities/Total Assets (Ohlson (1980) ratio); (Continued)
Table 21.3: (Continued)
WCA/TA = Working Capital/Total Assets (Altman (1968) ratio and Ohlson (1980) ratio); CL/CA = Total Current Liabilities/Total Current Assets (Ohlson (1980) ratio); NI/TA = Net Income/Total Assets (Ohlson (1980) ratio); FU/TL = Funds from Operations/Total Liabilities (Ohlson (1980) ratio); INTWO = 1 if Net Income was negative for the last two years, else 0 (Ohlson (1980) variable); OENEG = 1 if Total Liabilities/Total Assets > 1, else 0; WCA TA = Working Capital Divided by Total Assets; CL CA = Total Current Liabilities/Total Current Assets; NI TA = Net Income/Total Assets; TP TL = Earnings before Interest and Taxes/Total Liabilities; RE TA = Retained Earnings/Total Assets; BV TD = Market Value of Equity/Book Value of Total Debt; SALE TA = Sales/Total Assets.
Estimate:    0.0825  0.1786  0.1842  0.0432  0.0103  0.0339
Chi-q:       26.7337 0.8994  6.9401  0.0459  0.0053  8.6995

Chi-q:       1.922  0.8448  0.1487  0.0139  0.0201  4.9643  0.7057  0.1889  4.0859  0.0007  0.0001  10.9692  10.2684
Pr > Chi-q:  0.1656 0.358   0.6997  0.906   0.8874  0.0259  0.4009  0.6638  0.0432  0.9793  0.9927  0.0009   0.0014
Panel F: Overall prediction rates using combined Altman and Ohlson variables

                 Correct  Incorrect  Total
Bankruptcy           71        323     394
Non-bankruptcy      682         44     726
Overall prediction rate: 0.6723
the above-mentioned variables are significant except for RE/TA, which may be correlated with the Sales/TA variable. The overall prediction rate of this combined model is 67.23%. Table 21.7 shows the prediction rates of the data mining models. Decision Tree, Naive Bayes, Simple Logistic, Neural Networks, and Nearest Neighbor show overall prediction rates of 65.18%, 63.66%, 68.04%, 65.89%, and 62.95%, respectively; Nearest Neighbor has the worst prediction rate. As these results show, bankruptcy prediction for Chinese firms using data mining performs similarly to traditional logit models. Overall prediction rates are lower than in other studies because of China's macro-economic factors. From our empirical results, data mining bankruptcy prediction models for Chinese firms are as effective as our traditional logit analysis. Usually, we cannot tell from a data mining model's output which factor contributed to the prediction rates, due to the 'black box'
Table 21.7: Prediction rates for data mining models. All variables are normalized with min–max normalization: x′ = (x − min)/(max − min), so all values lie in [0, 1]. The normalized data are attached.

Method            Accuracy
Decision Tree     65.1786%
Naive Bayes       63.6607%
Simple Logistic   68.0357%
Neural Networks   65.8929%
Nearest Neighbor  62.9464%
In Figure 82D.1(a), the maximum feasible value of F(W) occurs at P, where dF(W)/dW_i = 0. However, in Figure 82D.1(b), the maximum feasible value of F(W) occurs

Figure 82D.1: Value of the function F(W) as W_i changes.
July 6, 2020
15:52
2828
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch82
P. Chiou & C. F. Lee
at P′ instead of P because short selling is not allowed; then dF(W)/dW_i < 0 at the optimal weight (W_i = 0). Therefore, in general, we can obtain the optimal weights under the conditions

dF(W)/dW_i ≤ 0,
W_i dF(W)/dW_i = 0.

We can rewrite these as five Kuhn–Tucker conditions:

dF(W)/dW_i + A_i = 0,
W_i A_i = 0,
W_i ≥ 0,
A_i ≥ 0,
Σ_{i=1}^n W_i = 1.

When the maximum feasible value occurs where dF(W)/dW_i < 0, A_i is positive and W_i is equal to zero. If the maximum feasible value occurs where dF(W)/dW_i = 0, then A_i is equal to zero and W_i is positive. In the following part, we solve for the optimal weights under the Kuhn–Tucker conditions using Excel. Consider the example in Figure 82D.2 with initial weights W = (0.1, 0.2, 0.7); dF(W)/dW_i can be calculated by equation (82A.2) in Appendix 82A and the values of the excess returns and covariance matrix shown in Figure 82D.2.
Sharpe Performance Measure and Treynor Performance Measure Approach
Figure 82D.2: Solver function in Excel.
Then we use the Solver function in Excel to find the optimal weights W. Set target cell B5 equal to the value 1 (the fifth Kuhn–Tucker condition), select the range B2 to C4 as the change cells, and add the first and second Kuhn–Tucker conditions as the constraints.
Figure 82D.2: (Continued)
For the third and fourth Kuhn–Tucker conditions, press Options to open Solver Options, select Assume Non-Negative, and press OK. Then go back to Solver Parameters and press Solve.
Figure 82D.2: (Continued)
If the first solve cannot find a solution, run the Solver function again until it finds a solution, as in Figure 82D.3.
Figure 82D.3: Optimal weights by Solver function.
Appendix 82E: Portfolio Optimization with Short-Selling Constraints3

Traditional mean–variance efficient portfolios have appealing characteristics. However, they are frequently questioned by financial professionals because of their propensity for corner solutions in portfolio weights. Adding limits on portfolio weights therefore helps portfolio managers fashion realistic asset allocation strategies. The Markowitz (1952) model is based on the following assumptions concerning the behavior of investors and financial markets:

(1) Investors have single-period utility functions in which they maximize utility within the framework of diminishing marginal utility of wealth. Return is desired, but risk is to be avoided.

3
This appendix is contributed by Paul Chiou at Shippensburg University.
(2) Investors care only about the first two moments of the returns of their portfolios over the investment period.
(3) Investors can take long and short positions in assets without limits.
(4) Financial markets are frictionless; that is, there are no taxes and no transaction costs.

Suppose risky asset investments can be characterized as a vector of multivariate returns of N securities, R. The expected risk premiums and variance–covariance matrix of asset returns can be expressed as a vector μ and a positive definite matrix V, respectively. Let Ω be the set of all real vectors w that define the weights of assets such that w^T 1 = 1, where 1 is an N-vector of ones. The expected return of the portfolio is μ_p = w^T μ and the variance of the portfolio is σ_p² = w^T V w. Considering all constraints, with the objective of minimizing the portfolio's risk, the efficient frontier can then be expressed as a Lagrangian function:

min_{w,φ,η} Ξ = (1/2) w^T V w + φ(μ_p − w^T μ) + η(1 − w^T 1).   (82E.1)

As described in the previous section, the optimal portfolio weights are a function of the means, variances, and covariances of asset returns. Consider a generalized case with N assets; let A = 1^T V⁻¹ R, B = R^T V⁻¹ R, X = 1^T V⁻¹ 1, and Δ = BX − A². The solution of the above quadratic function is (see Pennacchi (2008))

w_p = (1/Δ)[(Xμ_p − A)(V⁻¹ R) + (B − Aμ_p)(V⁻¹ 1)].   (82E.2)
The short-selling constraints are now added to the Markowitz model. The inequalities that represent non-negative portfolio weights are

w_1 ≥ 0, w_2 ≥ 0, . . . , w_N ≥ 0, where w = [w_1, w_2, . . . , w_N].   (82E.3)

The solution of this constrained diversification incorporates equation (82E.3) into equation (82E.1). Although the constraints on portfolio weights (w^T 1 = 1 and equation (82E.3)) are linear, the objective function, which is to minimize σ_p, is nonlinear, as it contains squares and cross-products of portfolio weights. There is no standard package to solve this quadratic programming problem directly because of the inequality restriction on each element of w.4

4
For detailed discussion, see Elton et al. [2007, Chapters 6 and 9].
Kuhn–Tucker conditions are frequently applied to solve the above problem. Given the quadratic objective function, traditional Lagrangian techniques have difficulty finding the global optimum. When inequalities are active, they are treated as equalities and included in the Lagrangian; once they are inactive, they are left out. The advantage is that the Kuhn–Tucker conditions require the objective function gradient to be expressible as a multiplier-weighted combination of constraint gradients. The quadratic programming problem becomes more complicated as more constraints are included and can be solved by computer program.5 However, the results of such constrained optimization in asset allocation generate more feasible portfolios than the unconstrained optimization.6

A Numerical Example

Considering the three-asset case exemplified in the previous section, we have the following objective and constraint functions:

min σ_p² = Σ_{j=1}^3 Σ_{i=1}^3 W_j W_i Cov(R_j, R_i),   (82E.4)

s.t. E(R_p) = Σ_{i=1}^3 W_i E(R_i),
     Σ_{i=1}^3 W_i = 1,
     W_1 ≥ 0, W_2 ≥ 0, W_3 ≥ 0.

Replacing the variables in equation (82E.4), we can rewrite equation (82E.1) as

min Ξ = (W_1² σ_11 + W_2² σ_22 + W_3² σ_33) + 2(W_1 W_2 σ_12 + W_1 W_3 σ_13 + W_2 W_3 σ_23)
        − φ(E(R_p) − Σ_{i=1}^3 W_i E(R_i)) − η(Σ_{i=1}^3 W_i − 1)
        − λ_1 W_1 − λ_2 W_2 − λ_3 W_3,   (82E.5)
5 See Elton et al. [2007, Chapter 6] and Rardin [1998, Chapter 14].
6 See Jagannathan and Ma (2003). Chiou et al. (2009) empirically investigate the impact of weighting constraints on international diversification and suggest that including upper and lower bounds on portfolio weights enhances the feasibility of asset allocation.
where φ, η, and λ_i, i = 1, 2, 3, are Lagrange multipliers (LMs). They represent the shadow prices, or penalties, of the constraints. The corresponding complementary slackness conditions are as follows:

φ(E(R_p) − Σ_{i=1}^3 W_i E(R_i)) = 0,
η(Σ_{i=1}^3 W_i − 1) = 0,
−λ_1 W_1 = 0, −λ_2 W_2 = 0, −λ_3 W_3 = 0.   (82E.6)
The above complementary slackness conditions indicate that at a local optimum either an inequality constraint is active or its corresponding Lagrange variable equals zero. For a differentiable nonlinear program, solutions W_i, φ, η, λ_i, i = 1, 2, 3, satisfy the Kuhn–Tucker conditions if they fulfill the complementary slackness conditions, the primal constraints, and the gradient equation

∇ ∂σ_p²/∂W_i = ∇Constraint(W_1, W_2, W_3) · LM.   (82E.7)
Any combination of W_i, i = 1, 2, 3, for which there exist corresponding LMs satisfying these conditions is called a Kuhn–Tucker point. Our portfolio model has the following objective function gradient:

∇ ∂σ_p²/∂W_i = ( 0.91 W_1 + 0.0018 W_2 + 0.0008 W_3,
                 0.0018 W_1 + 0.1228 W_2 + 0.002 W_3,
                 0.0008 W_1 + 0.002 W_2 + 0.105 W_3 ),   (82E.8)

and those of the five linear constraints are as follows:

∇Constraint_1(W_1, W_2, W_3) = (0.0053, 0.0055, 0.0126),
∇Constraint_2(W_1, W_2, W_3) = (1, 1, 1),
∇Constraint_3(W_1, W_2, W_3) = (1, 0, 0),
∇Constraint_4(W_1, W_2, W_3) = (0, 1, 0),
∇Constraint_5(W_1, W_2, W_3) = (0, 0, 1).   (82E.9)
Therefore, the gradient-equation part of the Kuhn–Tucker conditions is

0.0053φ + 1η + 1λ_1 = 0.91 W_1 + 0.0018 W_2 + 0.0008 W_3,
0.0055φ + 1η + 1λ_2 = 0.0018 W_1 + 0.1228 W_2 + 0.002 W_3,
0.0126φ + 1η + 1λ_3 = 0.0008 W_1 + 0.002 W_2 + 0.105 W_3,   (82E.10)
plus the primal constraints as part of the conditions

E(R_p) = Σ_{i=1}^3 W_i E(R_i),   (82E.11)
Σ_{i=1}^3 W_i = 1,
W_1 ≥ 0, W_2 ≥ 0, W_3 ≥ 0,

and the complementary slackness conditions in equation (82E.6). Note that the weights are functions of the five LMs and are bounded by the inequality constraints. Since the solution functions for the W_i are multidimensional, we cannot show their relation on a graph. One may start from any feasible point and then search for an improving feasible direction Δ(W_1, W_2, W_3), pursued by implementing feasible, small steps of the LMs. The stop of an improving feasible search does not necessarily mean that the current Kuhn–Tucker point is the global optimum, but only that it is a local optimum. Since there is no closed-form solution function for each weight, one may need to continue the search until no improvement can be found. In our case, if

∇ ∂σ_p²/∂W_i · Δ(W_1, W_2, W_3) < 0,   (82E.12)

there is an improvement in the objective function. The time-consuming calculation of the constrained portfolio selection can be sped up by applying computer software, such as Microsoft Excel and Matlab.7
For example, the Matlab function frontcon allows users to enter a matrix that defines minimum and maximum constraints for each asset. For a more in-depth discussion, readers may refer to the documentation of Matlab's finance functions.
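The same constrained selection can be carried out with a general-purpose numerical optimizer instead of Excel or Matlab. This sketch uses SciPy's SLSQP routine with inputs read off gradients (82E.8)-(82E.9) above (treated here as assumed illustrative values, with an arbitrary target return):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative inputs: expected returns from constraint gradient (82E.9)
# and a covariance matrix consistent with objective gradient (82E.8).
mu = np.array([0.0053, 0.0055, 0.0126])
V = np.array([[0.4550, 0.0009, 0.0004],
              [0.0009, 0.0614, 0.0010],
              [0.0004, 0.0010, 0.0525]])
target = 0.008                                   # assumed target E(R_p)

res = minimize(
    lambda w: w @ V @ w,                         # portfolio variance (82E.4)
    x0=np.full(3, 1.0 / 3.0),                    # start at equal weights
    method="SLSQP",
    bounds=[(0.0, None)] * 3,                    # W_i >= 0: no short selling
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0},
                 {"type": "eq", "fun": lambda w: w @ mu - target}],
)
w = res.x
```

SLSQP handles exactly the structure of this problem: a quadratic objective with linear equality constraints and non-negativity bounds, so the returned point satisfies the Kuhn–Tucker conditions up to numerical tolerance.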
Figure 82E.1: Efficient frontiers (no constraint vs. short-selling constrained).

Table 82E.1: Minimum-variance portfolio.
E(R)(%)  σ(%)   w1(%)  w2(%)  w3(%)
0.78     13.39  38.57  28.10  33.33
the investor can expand the upper bound of the efficient frontier without limit by allocating assets to extreme long and short positions in different securities. The maximum return of the NC efficient frontier can be infinite if the investor is willing to take infinite risk. In Figure 82E.1, the SS optimal diversification is a subset of the NC portfolio. The global minimum-variance (MV) portfolio is identical for the two kinds of portfolios in this case; its information is listed in Table 82E.1. The same MV portfolio under different constraints does not always obtain when the coefficients of correlation among securities and the relative magnitudes of returns among assets change. Note that the optimal diversification strategies are sensitive to variation in the first two moments of the asset returns in the portfolio. However, the effectiveness of such extreme-weight diversification strategies is questionable due to their
Table 82E.2: Portfolios on the efficient frontier.

E(R)(%)  σ(%)    Sharpe Ratio  w1(%)    w2(%)    w3(%)
0.78     13.39   0.0333        38.57    28.10    33.33
1.04     16.87   0.0422        16.86    13.04    70.10
1.24     22.38   0.0406        −25.82   −16.90   142.72
2.63     72.96   0.0315        −100.53  −199.47  400.00
3.38     101.30  0.0300        −366.73  66.73    400.00
low mean–variance efficiency. As the portfolios on the efficient frontier in Table 82E.2 show, assuming a long-term annual risk-free interest rate of 4%, the corner-solution portfolios are less mean–variance efficient than those without negative or extremely positive weights.
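The Sharpe ratios in Table 82E.2 follow the usual excess-return-per-unit-of-risk formula. A minimal sketch (the conversion of the 4% annual risk-free rate to a monthly 4/12 percent figure is an assumption made here to match the table's units):

```python
def sharpe_ratio(exp_return, risk_free, sigma):
    """Sharpe ratio: excess return per unit of total risk (same units)."""
    return (exp_return - risk_free) / sigma

# With an assumed monthly risk-free rate of 4/12 percent, the interior
# (no-extreme-weight) portfolio dominates a heavily short corner portfolio.
rf_monthly = 4.0 / 12.0
interior = sharpe_ratio(0.78, rf_monthly, 13.39)   # MV portfolio row
corner = sharpe_ratio(2.63, rf_monthly, 72.96)     # corner-solution row
```

The comparison illustrates the chapter's point: the corner-solution portfolios earn higher raw returns only by taking disproportionately more risk, so their Sharpe ratios fall below those of the moderate-weight portfolios.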
Chapter 83
Options and Option Strategies: Theory and Empirical Results

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

Contents
83.1 Introduction   2840
83.2 The Option Market and Related Definitions   2841
     83.2.1 What is an option?   2841
     83.2.2 Types of options and their characteristics   2841
     83.2.3 Relationships between the option price and the underlying asset price   2843
     83.2.4 Additional definitions and distinguishing features   2846
     83.2.5 Types of underlying asset   2848
     83.2.6 Institutional characteristics   2849
83.3 Put–Call Parity   2850
     83.3.1 European options   2850
     83.3.2 American options   2853
     83.3.3 Futures options   2854
     83.3.4 Market application   2855
83.4 Risk–Return Characteristics of Options   2856
     83.4.1 Long call   2856
     83.4.2 Short call   2858
     83.4.3 Long put   2861
     83.4.4 Short put   2862
     83.4.5 Long straddle   2863
     83.4.6 Short straddle   2866
     83.4.7 Long vertical (bull) spread   2868
     83.4.8 Short vertical (bear) spread   2870
     83.4.9 Calendar (time) spreads   2871
83.5 Excel Approach to Analyze the Option Strategies   2873
     83.5.1 Long straddle   2874
     83.5.2 Short straddle   2874
     83.5.3 Long vertical (bull) spread   2875
     83.5.4 Short vertical (bear) spread   2877
     83.5.5 Protective put   2877
     83.5.6 Covered call   2878
     83.5.7 Collar   2881
83.6 Summary   2882
Bibliography   2882
Abstract
This chapter aims to establish a basic knowledge of options and the markets in which they are traded. It begins with the most common types of options, calls and puts, explaining their general characteristics and discussing the institutions where they are traded. In addition, the concepts relevant to the new types of options on indexes and futures are introduced. The next focus is the basic pricing relationship between puts and calls, known as put–call parity. The final study concerns how options can be used as investment tools: the theory of alternative option strategies is presented, and Excel is used to demonstrate how different option strategies can be executed.

Keywords
Put-call Parity • Long Call • Short Call • Long Put • Short Put • Long Straddle • Short Straddle • Long Vertical (Bull) Spread • Short Vertical (Bear) Spread • Calendar (Time) Spreads • Protective Put • Covered Call • Collar.
83.1 Introduction

The use of stock options for risk reduction and return enhancement has expanded at an astounding pace over the last two decades. Among the causes of this growth, two are most significant. First, the establishment of the Chicago Board Options Exchange (CBOE) in 1973 brought about the liquidity necessary for successful option trading through public listing and standardization of option contracts. The second stimulus emanated from academia: in the same year that the CBOE was established, Professors Fischer Black and Myron Scholes published a paper in which they derived a revolutionary option-pricing model. The power of their model to predict an option's fair price has since made it the industry standard.
The development of option-valuation theory shed new light on the valuation process. Previous pricing models such as the CAPM were based on very stringent assumptions, such as there being an identifiable and measurable market portfolio, as well as various imputed investor attributes, such as quadratic utility functions. Furthermore, the previous theory priced only market risk, since investors were assumed to hold well-diversified portfolios. The strength of the Black–Scholes and subsequent option-pricing models is that they rely on far fewer assumptions. In addition, the option-valuation models price total risk and do not require any assumptions concerning the direction of the underlying security's price. The growing popularity of the option concept is evidenced by its application to the valuation of a wide array of other financial instruments (such as common stock and bonds) as well as more abstract assets, including leases and real estate agreements.

In Section 83.2, we discuss the options market and related definitions. Section 83.3 discusses put–call parity, and Section 83.4 considers the risk–return characteristics of options. The Excel approach to analyzing option strategies is presented in Section 83.5, and the chapter is summarized in Section 83.6.

83.2 The Option Market and Related Definitions
This section discusses the option market and related definitions of options, which are needed to understand option valuations and option strategies.

83.2.1 What is an option?
An option is a contract conveying the right to buy or sell a designated security at a stipulated price. The contract normally expires at a predetermined time. The most important element of an option contract is that there is no obligation placed upon the purchaser: it is an "option." This attribute of an option contract distinguishes it from other financial contracts.
For instance, while the holder of an option has the opportunity to let his or her claim expire unused if so desired, futures and forward contracts obligate their parties to fulfill certain conditions.

83.2.2 Types of options and their characteristics
A call option gives its owner the right to buy the underlying asset, while a put option conveys to its holder the right to sell the underlying asset. An option is specified by five essential parts:
(1) the type (call or put);
(2) the underlying asset;
(3) the exercise price;
(4) the expiration date; and
(5) the option price.
While the most common type of underlying asset for an option is an individual stock, other underlying assets for options exist as well. These include futures contracts, foreign currencies, stock indexes, and US debt instruments. In the case of common stock options (on which this discussion is exclusively centered), the specified quantity to which the option buyer is entitled to buy or sell is 100 shares of the stock per option. The exercise price (also called the striking price) is the price stated in the option contract at which the call (put) owner can buy (sell) the underlying asset up to the expiration date, the final calendar date on which the option can be traded. Options on common stocks have expiration dates three months apart in one of three fixed cycles: (1) January/April/July/October; (2) February/May/August/November; and (3) March/June/September/December. The normal expiration date is the third Saturday of the month. (The third Friday is the last trading date for the option.) As an example, an option referred to as an “ABC June 25 call” is an option to buy 100 shares of the underlying ABC stock at $25 per share, up to its expiration date in June. Option prices are quoted on a per-share basis. Thus, a stock option that is quoted at $5 would cost $500($5 × 100 shares), plus commission and a nominal SEC fee. A common distinction among options pertains to when they can be exercised. Exercising an option is the process of carrying out the right to buy or sell the underlying asset at the stated price. American options allow the exercise of this right at any time from when the option is purchased up to the expiration date. On the other hand, European options allow their holder the right of exercise only on the expiration date itself. The distinction between an American and European option has nothing to do with the location at which they are traded. Both types are currently bought and sold in the United States. 
There are distinctions in their pricing and in the possibility of exercising them prior to expiration. Finally, when discussing options, the two parties to the contract are characterized by whether they have bought or sold the contract. The party buying
Table 83.1: Options quotes for JNJ at March 29, 2011.

Stock price at March 29, 2011 = $59.38

Call options expiring Fri. April 15, 2011
Strike  Symbol              Last    Bid    Ask    Vol   Open Int
35      JNJ110483C00035000  24.70   23.95  24.45    0      122
40      JNJ110483C00040000  17.85   18.95  19.35   33      360
45      JNJ110483C00045000  13.85   14.15  14.25   45       54
50      JNJ110483C00050000   8.75    9.20   9.35    2      213
52.5    JNJ110483C00052500   6.62    6.70   6.85    1      518
55      JNJ110483C00055000   4.25    4.20   4.30    5     1787
57.5    JNJ110483C00057500   1.84    1.89   1.94   40     4491
60      JNJ110483C00060000   0.30    0.30   0.31  399    23603
62.5    JNJ110483C00062500   0.04    0.03   0.04  232    24478
65      JNJ110483C00065000   0.02    N/A    0.02  407    12559
67.5    JNJ110483C00067500   0.02    N/A    0.02   24     8731

Put options expiring Fri. April 15, 2011
Strike  Symbol              Last    Bid    Ask    Vol   Open Int
47.5    JNJ110483P00047500   0.02    0.01   0.04    2     1983
50      JNJ110483P00050000   0.04    0.02   0.04    2     3518
52.5    JNJ110483P00052500   0.05    0.05   0.07    4     3355
55      JNJ110483P00055000   0.09    0.09   0.10    2     7993
57.5    JNJ110483P00057500   0.23    0.20   0.22   53    24464
60      JNJ110483P00060000   1.21    1.09   1.13   15    83678
62.5    JNJ110483P00062500   3.28    3.25   3.40    1    10861
65      JNJ110483P00065000   5.85    5.75   5.85    3     1038
67.5    JNJ110483P00067500   8.45    8.20   8.50    2      713
the option contract (call or put) is the option buyer (or holder), while the party selling the option is the option seller (or writer). If the writer of an option does not own the underlying asset, he or she is said to write a naked option. Table 83.1 shows a listing of publicly traded options for Johnson & Johnson (JNJ) at March 29, 2011.

83.2.3 Relationships between the option price and the underlying asset price
A call (put) option is said to be in the money if the underlying asset is selling above (below) the exercise price of the option. An at-the-money call (put) is one whose exercise price is equal to the current price of the underlying asset. A call (put) option is out of the money if the underlying asset is selling below (above) the exercise price of the option.
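These moneyness definitions are mechanical enough to express in a few lines of code. The sketch below is illustrative only; the function name and the exact-equality test for "at the money" are our own choices, not part of the text:

```python
def moneyness(asset_price, exercise_price, kind="call"):
    """Classify an option as in, at, or out of the money.

    For a call, in the money means the asset price is above the exercise
    price; for a put, the comparison is reversed (see Section 83.2.3).
    """
    if asset_price == exercise_price:
        return "at the money"
    above = asset_price > exercise_price
    if kind == "call":
        return "in the money" if above else "out of the money"
    return "out of the money" if above else "in the money"

# The text's ABC example: stock at $30 per share.
print(moneyness(30, 25, "call"))   # in the money
print(moneyness(30, 35, "call"))   # out of the money
print(moneyness(30, 35, "put"))    # in the money
```

Note that the same strike can put a call out of the money and a put in the money, which is why the classification depends on the option type.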
Suppose the ABC stock is selling at $30 per share. An ABC June 25 call option is in the money ($30 − 25 > 0), while an ABC June 35 call option is out of the money ($30 − 35 < 0). Of course, the expiration dates could be any month without changing the option's standing as in, at, or out of the money. The relationship between the price of an option and the price of the underlying asset indicates both the amount of intrinsic value and time value inherent in the option's price, as shown in equation (83.1):

Intrinsic value = Underlying asset price − Option exercise price. (83.1)

(For a put, the roles are reversed: intrinsic value is the exercise price minus the underlying asset price.)
For a call (put) option that is in the money (underlying asset price greater (less) than the exercise price), the intrinsic value is positive; for at-the-money and out-of-the-money options, the intrinsic value is zero. An option's time value is the amount by which the option's premium (or market price) exceeds its intrinsic value. For a call or put option,

Time value = Option premium − Intrinsic value, (83.2)

where intrinsic value is the maximum of zero or stock price minus exercise price. Thus an option premium, or market price, is composed of two components: intrinsic value and time value. In-the-money options are usually the most expensive because of their large intrinsic-value component. An option with an at-the-money exercise price will have only time value inherent in its market price. Deep out-of-the-money options have zero intrinsic value and little time value and consequently are the least expensive. Deep in-the-money options also have little time value; time value is greatest for at-the-money options. In addition, time value (as its name implies) is positively related to the amount of time the option has to expiration. The theoretical valuation of options focuses on determining the relevant variables that affect the time-value portion of an option premium and on deriving their relationship in option pricing. In general, the call price should be equal to or exceed the intrinsic value: C ≥ Max(S − E, 0), where C is the value of the call option, S is the current stock price, and E is the exercise price. Figure 83.1 illustrates the relationship between an option's time value and its exercise price. When the exercise price is zero, the time value of an option is zero. Although this relationship is described quite well in general by Figure 83.1, the exact relationship is somewhat ambiguous. Moreover, the
Figure 83.1: The relationship between an option's exercise price and its time value.
identification of options with a mispriced time-value portion in their total premium motivates interest in a theoretical pricing model. One more aspect of time value that is very important to discuss is the change in the amount of time value an option has as its duration shortens. As previously mentioned, options with a longer time to maturity and those near to the money have the largest time-value components. Assuming that a particular option remains near to the money as its time to maturity diminishes, the rate of decrease in its time value, or what is termed the effect of time decay, is of interest. How time decay affects an option's premium is an important question for the valuation of options and the application of option strategies. To best see an answer to this question, refer to Figure 83.2. In general, the value of call options with the same exercise price increases as time to expiration increases: C(S1, E1, T1) ≤ C(S1, E1, T2), where T1 ≤ T2 and T1 and T2 are times to expiration. Note that for the simple case in Figure 83.2, the effect of time decay is smooth up until the last month before expiration, when the time value of an option begins to decay very rapidly. Example 83.1 shows the effect of time decay.

Example 83.1. It is January 1, the price of the underlying ABC stock is $20 per share, and the time premiums are shown in Table 83.2. What is the value of the various call options?
Figure 83.2: The relationship between time value and time to maturity for a near-to-the-money option (assuming a constant price for the underlying asset).

Table 83.2: Time premiums.

Exercise Price X   January   April   July
15                 $0.50     $1.25   $3.50
20                  1.00      2.00    5.00
25                  0.50      1.25    3.50
Solution:
Call premium = Intrinsic value + Time premium
C15,Jan = Max(20 − 15, 0) + 0.50 = $5.50 per share, or $550 for one contract.
Other values are shown in Table 83.3.

83.2.4 Additional definitions and distinguishing features
Options may be specified in terms of their classes and series. A class of options refers to all call and put options on the same underlying asset. For example, all AT&T call and put options at various exercise prices and expiration months form one class. A series is a subset of a class and consists of all contracts of the same class (such as AT&T) having the same expiration date and exercise price.
Table 83.3: Value of call premiums.

Exercise Price X   January                          April                            July
15                 $5.50                            Max(20 − 15, 0) + 1.25 = $6.25   Max(20 − 15, 0) + 3.50 = $8.50
20                 Max(20 − 20, 0) + 1.00 = $1.00   Max(20 − 20, 0) + 2.00 = $2.00   Max(20 − 20, 0) + 5.00 = $5.00
25                 Max(20 − 25, 0) + 0.50 = $0.50   Max(20 − 25, 0) + 1.25 = $1.25   Max(20 − 25, 0) + 3.50 = $3.50
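Table 83.3 can be reproduced with a short script. This is a sketch; the dictionary layout and variable names below are our own, while the $20 stock price and the time premiums come from Example 83.1 and Table 83.2:

```python
# Call premium = intrinsic value + time premium, with the underlying
# ABC stock at $20 per share (Example 83.1).
stock_price = 20
time_premiums = {                # strike -> {month -> time premium}, Table 83.2
    15: {"January": 0.50, "April": 1.25, "July": 3.50},
    20: {"January": 1.00, "April": 2.00, "July": 5.00},
    25: {"January": 0.50, "April": 1.25, "July": 3.50},
}

for strike, by_month in time_premiums.items():
    for month, tp in by_month.items():
        # Intrinsic value is Max(S - X, 0) for a call.
        premium = max(stock_price - strike, 0) + tp
        print(f"X = {strike}, {month}: ${premium:.2f}")
```

Running the loop prints the nine entries of Table 83.3, e.g. $5.50 for the January 15-strike call and $8.50 for the July 15-strike call.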
When an investor either buys or sells an option (i.e., is long or short) as the initial transaction, the option exchange adds this opening transaction to what is termed the open interest for an option series. Essentially, open interest represents the number of contracts outstanding at a particular point in time. If the investor reverses the initial position with a closing transaction (i.e., sells the option if he or she originally bought it or vice versa), then the open interest for the particular option series is reduced by one. While open interest is more of a static variable, indicating the number of outstanding contracts at one point in time, volume represents a dynamic characteristic. More specifically, volume indicates the number of times a particular option is bought and sold during a particular trading day. Volume and open interest are measures of an option’s liquidity, the ease with which the option can be bought and sold in large quantities. The larger the volume and/or open interest, the more liquid the option. Again, an option holder who invokes the right to buy or sell is exercising the option. Whenever a holder exercises an option, a writer is assigned the obligation to fulfill the terms of the option contract by the exchange on which the option is traded. If a call holder exercises the right to buy, a call writer is assigned the obligation to sell. Similarly, when a put holder exercises the right to sell, a put writer is assigned the obligation to buy. The seller or writer of a call option must deliver 100 shares of the underlying stock at the specified exercise price when the option is exercised. The writer of a put option must purchase 100 shares of the underlying stock when the put option is exercised. The writer of either option receives the premium or price of the option for this legal obligation. The maximum loss an option buyer can experience is limited to the price of the option. 
However, the maximum loss from writing a naked call is unlimited; the maximum loss possible from writing a naked put is the exercise price less the original price of that put. To guarantee that the option writer can meet these obligations, the exchange clearinghouse requires margin deposits.
The payment of cash dividends affects both the price of the underlying stock and the value of an option on the stock. Normally, no adjustment is made in the terms of the option when a cash dividend is paid. However, the strike price or the number of shares may be adjusted if the underlying stock realizes a stock dividend or stock split. For example, an option on XYZ Corporation with an exercise price of $100 would be adjusted if XYZ Corporation stock split two for one. The adjustment in this case would be a change in the exercise price from $100 to $50, and the number of contracts would be doubled. In the case of a non-integer split (such as three for two), the adjustment is made to the exercise price and to the number of shares covered by the option contract. For example, if XYZ Corporation had an exercise price of $100 per share and had a three-for-two split, the option would have its exercise price adjusted to $66.66 and the number of shares increased to 150. Note that the old exercise value of the option, $10,000 ($100 × 100 shares), is maintained by the adjustment ($66.66 × 150 shares ≈ $10,000).

83.2.5 Types of underlying asset
Although most people would identify common stocks as the underlying asset for an option, a variety of other assets and financial instruments can assume the same function. In fact, options on agricultural commodities were introduced by traders in the United States as early as the mid-1800s. After a number of scandals, agricultural commodity options were banned by the government. They were later reintroduced under tighter regulations and in a more standardized, tradable form. Today, futures options on such agricultural commodities as corn, soybeans, wheat, cotton, sugar, live cattle, and live hogs are actively traded on a number of exchanges. The biggest success for options, however, has been realized by options on financial futures.
Options on the S&P 500 index futures contracts, NYSE index futures, foreign-currency futures, 30-year US Treasury bond futures, and gold futures have all realized extraordinary growth since their initial offerings back in 1982. Options on futures are very similar to options on the actual asset, except that the futures options give their holders the right (not the obligation) to buy or sell predetermined quantities of specified futures contracts at a fixed price within a predetermined period. Options on the actual asset have arisen in another form as well. While a number of options have existed for various stock-index futures contracts, options now also exist on the stock index itself. Because of the complexity of having to provide all the stocks in an index at the spot price should a call holder exercise his or her buy right, options on stock indexes are always
settled in cash. That is, should a call holder exercise his or her right to buy because of a large increase in the underlying index, that holder would be accommodated by a cash amount equal to the profit on his contract, or the current value of the option’s premium. Although the options on the S&P 100 stock index at the CBOE are the most popular among traders, numerous index options are now traded as well. These include options on the S&P 500 index, the S&P OTC 250 index, the NYSE composite and AMEX indexes (computer technology, oil and gas, and airline), the Philadelphia Exchange indexes (gold/silver), the Value Line index, and the NASDAQ 100 index.
83.2.6 Institutional characteristics
Probably the two most important factors underlying the success of options have been the standardization of contracts through the establishment of option exchanges and the trading anonymity brought about by the Options Clearing Corporation and the clearinghouses of the major futures exchanges. An important element for option trading is the interchangeability of contracts. Exchange contracts are not matched between individuals. Instead, when an investor or trader enters into an option contract, the Options Clearing Corporation (or the clearinghouse for the particular futures exchange) takes the opposite side of the transaction. So rather than having to contact a particular option writer to terminate an option position, a buyer can simply sell it back to the exchange at the current market-clearing price. This type of anonymity among option-market participants is what permits an active secondary market to operate.

The prices of futures options traded on the various futures exchanges mentioned earlier are determined by open-auction bidding, probably the purest form of laissez-faire price determination that can be seen today. With the open-auction-bidding price mechanism, there are no market makers, only a large octagonal pit filled with traders bidding among themselves to buy and sell contracts. While some traders buy and sell only for themselves, many of the participants are brokers representing large investment firms. Different sides of the pit usually represent traders who are dealing in particular expiration months. As brokers and other pit participants conduct trades, they mark down on cards what they bought or sold, how much, at what price, and from whom. These cards are then collected by members of the exchange, who record the trades and post the new prices. The prices are displayed on "scoreboards" surrounding the pit.
While stock options and options on commodities and indexes are traded in a similar fashion, one major difference prevails: the presence of market makers. Market makers are individuals who typically trade one type of option for their own account and are responsible for ensuring that a market always exists for their particular contract. In addition, some option exchanges utilize board brokers as well. These individuals are charged with maintaining the book of limit orders (orders from outside investors that are to be executed at particular prices or when the market goes up or down by a prespecified amount). Essentially, market makers and board brokers on the options exchanges share the duties performed by the specialists on the major stock exchanges. Although stocks can be bought with as little as 50% margin, no margin is allowed for buying options: the cost of the contract must be fully paid. Because options offer a high degree of leverage on the underlying asset, additional leveraging through margin is considered by regulators to be excessive. However, if more than one option contract is entered into at the same time (for instance, selling and buying two different calls), then a lower net cost is incurred, since the cost of one is partially (or wholly) offset by the sale of the other.

83.3 Put–Call Parity
This section addresses an important concept in option valuation: put–call parity. The discussion covers European options, American options, and futures options.

83.3.1 European options
As an initial step toward examining the pricing formulas for options, it is essential to discuss the relationships between the prices of put and call options on the same underlying asset. Such relationships among put and call prices are referred to as the put–call parity theorems. Stoll (1969) was the first to introduce the concept of put–call parity.
Dealing strictly with European options, he showed that the value of a call option equals the value of a portfolio composed of a long put option, its underlying stock, and a short position in a pure discount bond with face value equal to the exercise price. Before stating the basic put–call parity theorem as originally devised by Stoll, it must be assumed that the markets for options, bonds, and stocks (or any other underlying asset we choose) are frictionless.

Theorem 1. Put–call parity for European options with no dividends:

Ct,T = Pt,T + St − EBt,T, (83.3)
where Ct,T is the value of a European call option at time t that matures at time T (T > t); Pt,T is the value of a European put option at time t that matures at time T; St is the value of the underlying stock (asset) to both the call and put options at time t; E is the exercise price for both the call and put options; and Bt,T is the price at time t of a default-free bond that pays $1 with certainty at time T. If it is assumed that the risk-free rate of interest is the same for all maturities and equal to r (in essence, a flat term structure), then Bt,T = e^(−r(T−t)) under continuous compounding, or Bt,T = 1/(1 + r)^(T−t) for discrete compounding. Equation (83.3) uses the following principle. If the options are neither dominant nor dominated securities, and if the borrowing and lending rates are equal, then the return patterns of a European call and of a portfolio composed of a European put, a pure discount bond with a face value equal to the option's exercise price E, and the underlying stock (or asset) are the same.1 To understand why the put–call parity theorem holds, and to support the theorem, two additional properties of option pricing must be provided:

Property 1. At maturity (time T), the call option is worth the greater of ST − E dollars or zero dollars:

CT = Max(0, ST − E). (83.4)
As an example, suppose that the call option has an exercise price of $30. At maturity, if the stock's (asset's) price is $25, then the value of the call is the maximum of (0, 25 − 30), which of course is zero. If an option sells for less than its intrinsic value (St − E), an arbitrage opportunity will exist: investors would buy the option and short sell the stock, forcing the mispricing to correct itself. Consequently, this first property implies that a call option's value is never less than zero. An equivalent property and argument exist for the value of a put option as well.

Property 2. At maturity, the value of a put option is the greater of E − ST dollars or zero dollars:

PT = Max(0, E − ST). (83.5)

1 Any security x is dominant over any security y if the rate of return on x is equal to or greater than that of y for all states of nature and is strictly greater for at least one state. For an expanded discussion of this subject, see Merton (1973) and Smith (1976).
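Properties 1 and 2 translate directly into code. The sketch below is illustrative; the function names are our own, and the example numbers are the text's $30 exercise price and $25 terminal stock price:

```python
def call_payoff(s_T, e):
    """Equation (83.4): C_T = Max(0, S_T - E)."""
    return max(0.0, s_T - e)

def put_payoff(s_T, e):
    """Equation (83.5): P_T = Max(0, E - S_T)."""
    return max(0.0, e - s_T)

# Exercise price $30, stock at $25 at maturity:
print(call_payoff(25.0, 30.0))   # 0.0 -- the call expires worthless
print(put_payoff(25.0, 30.0))    # 5.0 -- the put finishes in the money
```

The floor at zero is what makes both payoffs non-negative, which is the content of the two properties.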
Table 83.4: Put–call parity for a European option with no dividends.

                                            Time T (Maturity)
Time t strategy                          ST > E    ST = E    ST < E
Portfolio A
 1. Buy 100 shares of the stock (St)     ST        ST        ST
 2. Buy a put (Pt, maturing at T
    with exercise price E)               0         0         E − ST
 3. Borrow EBt,T dollars                 −E        −E        −E
Portfolio A value at time T              ST − E    0         0
Portfolio B
 1. Buy a call (Ct, maturing at T
    with exercise price E)               ST − E    0         0
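The equality of the two portfolios can also be checked numerically. In this sketch the $50 exercise price and the three terminal stock prices are hypothetical inputs spanning the table's three cases; payoffs are per share, and the EBt,T borrowing is repaid as E at maturity:

```python
# Per-share payoffs at maturity T for the two portfolios of Table 83.4.
# Portfolio A: stock + put - repayment of the E*B_{t,T} loan (repaid as E).
# Portfolio B: the call alone.
E = 50.0                                 # hypothetical exercise price

for s_T in (40.0, 50.0, 60.0):           # S_T < E, S_T = E, S_T > E
    portfolio_a = s_T + max(E - s_T, 0.0) - E
    portfolio_b = max(s_T - E, 0.0)
    assert portfolio_a == portfolio_b    # the parity argument of Theorem 1
    print(s_T, portfolio_a, portfolio_b)
```

Because the two payoffs agree state by state at maturity, the two positions must cost the same at time t, which is exactly equation (83.3).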
Using the same line of reasoning and argument as for the call option, the second property also implies that the value of a put option is never less than zero. Table 83.4 provides proof of the first put–call parity theorem. Suppose at time t two portfolios are formed. Portfolio B is just a long call option on a stock with price St, an exercise price of E, and a maturity date at T. Portfolio A consists of purchasing 100 shares of the underlying stock (since stock options represent one hundred shares), purchasing (going long) one put option on the same stock with exercise price E and maturity date T, and borrowing at the risk-free rate an amount equal to the present value of the exercise price, EBt,T, to be repaid as E at maturity. (This borrowing partially finances the stock and put positions.) At maturity date T, the call option (portfolio B) has value only if ST > E, in accordance with Property 1. For portfolio A, under all three conditions the stock and maturing-loan values are the same, whereas the put option has value only if E > ST. Under all three possible outcomes for the stock price ST, it can be seen that the values of portfolios A and B are equal. Proof has thus been established for the first put–call parity theorem. Example 83.2 provides further illustration.

Example 83.2. A call option with one year to maturity and exercise price of $110 is selling for $5. Assuming discrete compounding, a risk-free rate of 10%, and a current stock price of $100, what is the value of a European put option with a strike price of $110 and one-year maturity?

Solution:
Pt,T = Ct,T + EBt,T − St
P0,1yr = $5 + $110/(1.1)^1 − $100 = $5.

83.3.2 American options
Indeed, this first put–call parity theorem holds only under the most basic conditions (i.e., no early exercise and no dividends). Jarrow and Rudd (1983) give extensive coverage of the effects of more complicated conditions on put–call parity. These authors demonstrate that the effect of known dividends is simply to reduce, by the discounted value (to time t) of the dividends, the amount of the underlying stock purchased. With stochastic dividends, the exactness of this pricing relationship breaks down and depends on the degree of certainty that can be maintained about the range of future dividends. Put–call parity for American options is also derived under various dividend conditions. Jarrow and Rudd demonstrate that as a result of the American option's early exercise feature, strict pricing relationships give way to boundary conditions dependent on the size and certainty of future dividends, as well as the level of interest rates and the size of the exercise price. To summarize, they state that for sufficiently high interest rates and/or exercise prices it may be optimal to exercise the put prior to maturity (with or without dividends). The basic put–call parity for an American option with no dividends and constant interest rates is described by the following theorem.

Theorem 2. Put–call parity for an American option with no dividends:

Pt,T + St − EBt,T > Ct,T > Pt,T + St − E. (83.6)
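The parity arithmetic of Example 83.2 is easy to verify with a few lines of code. This is a sketch of equation (83.3) rearranged for the put, using discrete compounding as in the example; the variable names are our own:

```python
# P = C + E*B - S, with C = $5, E = $110, S = $100, r = 10%, T = 1 year.
C, E, S, r, T = 5.0, 110.0, 100.0, 0.10, 1.0
B = 1.0 / (1.0 + r) ** T      # discrete-compounding discount factor B_{t,T}
P = C + E * B - S
print(round(P, 2))            # 5.0
```

The discounted exercise price ($110/1.1 = $100) exactly offsets the stock position here, so the put and the call carry the same premium.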
Increasing the generality of conditions results in increasing boundaries for the equilibrium relationship between put and call options. The beauty of these arguments stems from the fact that they require only that investors prefer more wealth to less. If more stringent assumptions are made, then the bounds can be made tighter. For an extensive derivation and explanation of these theorems, see Jarrow and Rudd (1983). Example 83.3 provides further illustration. Example 83.3. A put option with one year to maturity and an exercise price of $90 is selling for $15; the stock price is $100. Assuming discrete compounding and a risk-free rate of 10%, what are the boundaries for the price of an American call option?
Solution:
Pt,T + St − EBt,T > Ct,T > Pt,T + St − E
$15 + $100 − $90/(1.1)^1 > Ct,T > $15 + $100 − $90
$33.18 > Ct,1yr > $25.

83.3.3 Futures options
As a final demonstration of put–call parity, the analysis is extended to the case where the underlying asset is a futures contract. This chapter takes time to apply put–call parity to options on futures contracts because of the growing popularity and importance of such futures options. A futures contract obligates the party entering into it to buy or sell the underlying asset at the maturity date for a stipulated price. While the difference between European and American options still remains, the complexity of dividends can be ignored, since futures contracts do not pay dividends. Put–call parity for a European futures option (when interest rates are constant) is as follows:

Theorem 3. Put–call parity for a European futures option:

Ct,T = Pt,T + Bt,T(Ft,T − E), (83.7)

where Ft,T is the price at time t of a futures contract maturing at time T (which is the underlying asset to both the call and put options). Option-pricing Properties 1 and 2 for call and put options apply in an equivalent sense to futures options as well. However, to understand the relationship stated in equation (83.7), it must be assumed that the cost of entering a futures contract is zero. While a certain margin deposit is required, the majority of this assurance deposit can be in the form of interest-bearing securities; hence, as an approximation, a zero cost for the futures contract is not unrealistic. Again, the easiest way to prove this relationship is to follow the same path of analysis used in proving Theorem 1. Table 83.5 indicates that the argument for this theorem's proof is similar, with only a few notable exceptions. The value of the futures contract at time T (maturity) is equal to the difference between the price of the contract at time T and the price at which it was bought, or FT,T − Ft,T. This is an outcome of the fixed duration of a futures contract as opposed to the perpetual duration of common stock. Second, because no money is required to enter into the futures contract, the
July 6, 2020
15:52
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch83
Options and Option Strategies: Theory and Empirical Results
Table 83.5: Put–call parity for a European futures option.

Time t strategy                             Time T (Maturity)
                                 FT,T > E      FT,T = E      FT,T < E
Portfolio A
1. Buy a futures contract
   (Ft,T).                       FT,T − Ft,T   FT,T − Ft,T   FT,T − Ft,T
2. Buy a put (Pt,T on Ft,T
   with exercise price E and
   maturity T).                  0             0             E − FT,T
3. Lend Bt,T (Ft,T − E)
   dollars.                      Ft,T − E      Ft,T − E      Ft,T − E
Portfolio A's value at time T    FT,T − E      0             0
Portfolio B
1. Buy a call (Ct,T on Ft,T,
   with exercise price E and
   maturity T).                  FT,T − E      0             0
futures price is reduced by the exercise price, and the total is lent at the risk-free rate. (Actually, this amount is either lent or borrowed depending on the relationship between Ft,T and E at time t: if Ft,T − E < 0, the amount is borrowed at the risk-free rate.) Why are there options on spot assets as well as options on futures contracts for the spot assets? After all, at expiration the basis of a futures contract goes to zero and futures prices equal spot prices; thus, options on the spot and options on the futures are claims on the same terminal value, and it might seem their current values must be identical. Yet a look at the markets shows that options on spot assets and options on futures for the same assets sell at different prices. One explanation for this is that investors who purchase options on spot must pay a large sum of money when they exercise their options, whereas investors who exercise an option on a futures contract need only pay enough to meet the initial margin for the futures contract. Therefore, if the exercise of the option is important to an investor, that investor would prefer options on futures rather than options on spot and would be willing to pay a premium for the option on the futures, whereas the investor who has no desire to exercise the option (remember, the investor can always sell it to somebody else to realize a profit) is not willing to pay for this advantage and so finds the option on spot more attractive.

83.3.4 Market application

Put options were not listed on the CBOE until June 1977. Before that time, brokers satisfied their clients' demands for put option risk–return
C. F. Lee
characteristics by a direct application of put–call parity. By combining call options and the underlying security, brokers could construct a synthetic put. To illustrate, the put–call parity theorem is used when a futures contract is the underlying asset. Furthermore, to simulate the option broker’s circumstances on July 1, 1984, the equation is merely rearranged to yield the put’s “synthetic” value: Pt,T = Ct,T − Bt,T Ft,T + Bt,T E.
(83.8)
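In code, equation (83.8) is a one-line computation. The sketch below is ours (the function and variable names are illustrative, not from the text); it uses the S&P 500 example values quoted in this section:

```python
def synthetic_put(call, bond, futures, strike):
    """Synthetic put value from equation (83.8): P = C - B*F + B*E."""
    return call - bond * futures + bond * strike

# Example values quoted in this section (July 1, 1984, S&P 500 futures):
# C = $3.35, B = 0.9770, F = 154.85, E = 155.00
price = synthetic_put(call=3.35, bond=0.9770, futures=154.85, strike=155.00)
print(round(price, 3))  # about 3.497, versus an actual put price of $3.50
```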
So, instead of a futures contract being purchased, it is sold. Assume the following values and use the S&P 500 index futures as the underlying asset: Ct,T = $3.35; Ft,T = 154.85 (September contract); E = 155.00; and Bt,T = 0.9770 (the current price of a risk-free bond that pays $1 when the option and futures contract expire, taken as the average of bid and ask prices for T-bills from The Wall Street Journal). According to equation (83.8), the put's theoretical price is Pt,T = $3.497. The actual put price on this day (July 1, 1984) with the same exercise price and expiration month was Pt,T = $3.50. With repeated comparisons of theoretical and actual prices, it becomes clear that put–call parity is a powerful equilibrium mechanism in the market.

83.4 Risk–Return Characteristics of Options

One of the most attractive features of options is the myriad of ways in which they can be employed to achieve a particular combination of risk and return. Whether through a straight option position, in combination with the underlying asset, or in some portfolio of securities, options offer an innovative and relatively low-cost mechanism for altering and enhancing the risk–return tradeoff. In order to better grasp these potential applications, this section analyzes call and put options individually and in combination, relative to their potential profit and loss and the effects of time and market sentiment.

83.4.1 Long call

The purchase of a call option is the simplest and most familiar type of option position. The allure of calls is that they provide the investor a great deal of leverage. Potentially large percentage profits can be realized from only a
Figure 83.3: Profit profile for a long call.
modest price rise in the underlying asset. In fact, the potential profit from buying a call is unlimited. Moreover, the option purchaser has the right but not the obligation to exercise the contract. Therefore, should the price of the underlying asset decline over the life of the call, the purchaser need only let the contract expire worthless. Consequently, the risk of a long call position is limited. Figure 83.3 illustrates the profit profile of a long call position. The following summarizes the basic risk–return features of a long-call position.

Profit potential: unlimited.
Loss potential: limited (to cost of option).
Effect of time decay: negative (decreases option's value).
Market expectation: bullish.

As the profit profile indicates, the time value of a long call declines over time; consequently, an option is a wasting asset. If the underlying asset's price does not move above the exercise price of the option E by its expiration date T, the buyer of the call will lose his initial investment (the option premium). Consequently, the longer an investor holds a call, the more time value the option loses, thereby reducing the price of the option. This leads to another important point about taking on an option position. As with any other investment vehicle, the purchaser of a call expresses an opinion about the market for the underlying asset. Whereas an investor can essentially express one of three different sentiments (bullish, neutral, or bearish) about future market conditions, the long call is strictly a bullish position.
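Numerically, the long call's profit at expiration is just its intrinsic value less the premium paid. A minimal sketch (the strike and premium below are illustrative, not from the text):

```python
def long_call_profit(spot_at_expiry, strike, premium):
    """Profit at expiration for a long call: max(S - E, 0) minus the premium."""
    return max(spot_at_expiry - strike, 0.0) - premium

# Illustrative numbers: E = 100, premium = 5
for s in (90, 100, 105, 120):
    print(s, long_call_profit(s, 100, 5))
# Loss is capped at the 5-point premium; profit grows without bound as S rises.
```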
That is, the call buyer only wins if the underlying asset rises in price. However, depending on the exercise price of the call, the buyer can express differing degrees of bullishness. For instance, since out-of-the-money calls are the cheapest, a large price increase in the underlying asset will make these calls the biggest percentage gainers in value. So an investor who is extremely bullish would probably go with an out-of-the-money call, since its premium is small and its value will increase sharply along with a large increase in the market.

83.4.2 Short call

Selling a call (writing it) has risk–reward characteristics which are the inverse of the long call. However, one major distinction arises when writing calls (or puts) rather than buying them: the writer can either own the underlying asset upon which he or she is selling the option (a covered write) or simply sell the option without owning the asset (a naked write). The difference between the two is of considerable consequence to the amount of risk and return taken on by the seller. Let us first examine the profit profile and related attributes of the naked short call, displayed in Figure 83.4. When the writer of a call does not own the underlying asset, his or her potential loss is unlimited. Why? Because if the price of the underlying asset increases, the value of the call also increases for the buyer. The seller of a call is obliged to provide a designated quantity of the underlying asset at some prespecified price (the exercise price) at any time up to the maturity date of the option. So if the asset starts rising dramatically in price and
Figure 83.4: Profit profile for a short call.
the call buyer exercises his or her right, the naked-call writer must go into the market to buy the underlying asset at whatever the prevailing market price is. The naked-call writer suffers the loss of buying the asset at a price S and selling it at a price E when S > E (less the original premium collected). When common stock is the underlying asset, there is no limit to how high its price could go; thus, the naked-call writer's risk is unlimited as well. Alternatively, the naked-call writer could reverse the position by buying back the original option sold, that is, zeroing out the position; however, this too is done at a loss. The following summarizes the basic risk–return features of a naked short-call position.

Profit potential: limited (to option premium).
Loss potential: unlimited.
Effect of time decay: positive (makes buyer's position less valuable).
Market expectation: bearish to neutral.

The naked short-call position is obviously a bearish position. If the underlying asset's price moves down, the call writer keeps the entire premium received for selling the call, since the call buyer's position becomes worthless. Once again, the naked-call writer can express the degree of bearishness through the exercise price at which he or she sells the call. By selling an in-the-money call, the writer stands to collect a higher option premium. Conversely, selling an out-of-the-money call conveys only a mildly bearish to neutral expectation. If the underlying asset's price stays where it is, the value of the buyer's position, which is solely time value, will decay to zero, and the call writer will collect the entire premium (though a substantially smaller premium than for an in-the-money call). While the passing of time has a negative effect on the value of a call option for the buyer, it has a positive effect for the seller. One aspect of an option's time value is that in the last month before the option expires, its time value decays most rapidly.
The reason is that time value reflects the probability that the underlying asset's price will move up or down enough to make an option position increase in value. This probability declines at an accelerating (exponential) rate as the option approaches its maturity date. The consideration of time value, then, is a major element when investing in or hedging with options. Unless an investor is extremely bullish, it would probably be unwise to take a long position in a call in its last month before maturity. Conversely, the last month of an option's life is a preferred time to sell, since its time value can more easily and quickly be collected. Now consider the other type of short-call position, covered-call writing. Because the seller of the call owns the underlying asset in this case, the risk
is truncated. The purpose of writing a call on an underlying asset that is owned is twofold. First, writing the call decreases the risk of owning the asset. Second, writing the call can increase the overall realized return on the asset. The profit profile for a covered short call (a covered write) in Figure 83.5 provides further illustration. The following summarizes the basic risk–return features of a covered short-call position.

Profit potential: limited (exercise price − asset price + call premium).
Loss potential: limited (asset price − call premium).
Effect of time decay: positive.
Market expectation: neutral to mildly bullish.

By owning the underlying asset, the covered-call writer's loss on the asset from a price decline is decreased by the amount of the premium originally collected for selling the option. The total loss on the position is limited to the extent that the asset is one of limited liability, such as a stock, and cannot fall below zero. The maximum profit on the combined asset-and-option position is higher than if the option had been written alone, but lower than that from simply owning the asset with no short call written against it. Once the asset increases in price by a significant amount, the call buyer will very likely exercise the right to purchase the asset at the prespecified exercise price. Thus, covered-call writing is a tool or strategy for enhancing an asset's realized return while lowering its risk in a sideways market.
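The truncation of the covered write's profit profile can be checked numerically. A sketch with illustrative values of our own choosing (asset bought at 100, call written at E = 105 for a 3-point premium):

```python
def covered_call_profit(spot_at_expiry, purchase_price, strike, premium):
    """Profit at expiration for a long asset combined with a short call."""
    asset_pnl = spot_at_expiry - purchase_price
    short_call_pnl = premium - max(spot_at_expiry - strike, 0.0)
    return asset_pnl + short_call_pnl

for s in (80, 100, 105, 130):
    print(s, covered_call_profit(s, 100, 105, 3))
# Profit is capped at strike - purchase price + premium = 8,
# no matter how far the asset rises.
```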
Figure 83.5: Profit profile for a covered short call.
83.4.3 Long put

Again, the put option conveys to its purchaser the right to sell a given quantity of some asset at a prespecified price on or before its expiration date. Similar to a long call, a long put is also a highly leveraged position, but the purchaser of the put makes money only when the price of the underlying asset declines. While a call buyer has unlimited profit potential, a put buyer's profit potential is limited, since the price of the underlying asset can never drop below zero. Yet, as with the long-call position, the put buyer can never lose more than the initial investment (the option's premium). The profit profile for a long put is seen in Figure 83.6. The following summarizes the basic risk–return features of a long-put position.

Profit potential: limited (asset price must be greater than zero).
Loss potential: limited (to cost of put).
Effect of time decay: negative or positive.
Market expectation: bearish.

An interesting pricing ambiguity for this bearish investment is how the put's price is affected by time decay. With the long call there is a clear-cut relation: the effect of time decay is to diminish the value of the call. The relationship is not so clear with the long put. Although at certain prices for the underlying asset the value of the long-put position decreases with time, there exist lower asset prices for which its value will increase with time. It is the put's ambiguous relationship with time that makes its correct price difficult to ascertain.
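As with the long call, the long put's profit at expiration can be written down directly; a minimal sketch (the strike and premium are illustrative, not from the text):

```python
def long_put_profit(spot_at_expiry, strike, premium):
    """Profit at expiration for a long put: max(E - S, 0) minus the premium."""
    return max(strike - spot_at_expiry, 0.0) - premium

# Illustrative numbers: E = 100, premium = 4
for s in (70, 96, 100, 120):
    print(s, long_put_profit(s, 100, 4))
# Maximum profit (reached at S = 0) is E - premium = 96;
# maximum loss is the 4-point premium.
```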
Figure 83.6: Profit profile for a long put.
One uniquely attractive attribute of the long put is its negative relationship with the underlying asset. In terms of the capital asset pricing model, it has a negative beta (though one usually numerically larger than that of the underlying asset, due to the leverage effect). Therefore, the long put is an ideal hedging instrument for the holder of the underlying asset who wants to protect against a price decline. If the investor is wrong and the price of the asset moves up instead, the profit from the asset's price increase is only moderately diminished by the cost of the put.

83.4.4 Short put

As was true for the short-call position, put writing can be covered or uncovered (naked). The risk–return features of the uncovered (naked) short put are discussed first. For taking on the obligation to buy the underlying asset at the exercise price, the put writer receives a premium. The maximum profit for the uncovered put writer is this premium, which is received at the outset. Figure 83.7 provides further illustration. While the loss potential is limited for the uncovered put writer, it is nonetheless still very large. Thus, someone neutral on the direction of the market would sell out-of-the-money (lower exercise price) puts. A more bullish sentiment would suggest that at-the-money options be sold. The investor who is convinced that the market will go up should maximize return by selling an in-the-money put, which carries a larger premium. As with the long put, the time-decay effect is ambiguous and depends on the price of the underlying asset. The
Figure 83.7: Profit profile for an uncovered short put.
following summarizes the basic risk–return features of an uncovered short-put position.

Profit potential: limited (to put premium).
Loss potential: limited (asset price must be greater than zero).
Effect of time decay: positive or negative.
Market expectation: neutral to bullish.

Referring again to Figure 83.5 for the combined short-call and long-asset position, notice the striking resemblance of its profit profile at expiration to that for the uncovered short put. This relationship can be seen mathematically by using put–call parity. That is, the synthetic put price is PT = E + CT − ST; at expiration the value of the put equals the exercise price of the call option plus the call option's value minus the value at time T of the underlying asset. Buying (writing) a call and selling (buying) the underlying asset (or vice versa) allows an investor to achieve essentially the same risk–return combination as would be received from a long put (short put). This combination of two assets to equal the risk and return of a third is referred to as a synthetic asset (or synthetic option in this case). Synthesizing two financial instruments to resemble a third is an arbitrage process and is a central concept of finance theory. Now, a look at covered short puts is in order to round out the basics of option strategies. For margin purposes and in a theoretical sense, selling a put against a short-asset position would be the sale of a covered put. However, this sort of position has a limited profit potential, obtained if the underlying asset is anywhere below the exercise price of the put at expiration. This position also has unlimited upside risk, since the short position in the asset will accrue losses while the profit from the put sale is limited. Essentially, this position is equivalent to the uncovered or naked short call, except that the latter has less expensive transaction costs.
Moreover, because the time value for put options is generally less than that of calls, it will be advantageous to short the call instead. Strictly speaking, a short put is covered only if the investor also owns a corresponding put with exercise price equal to or greater than that of the written put. Such a position, called a spread, is discussed later in this chapter.

83.4.5 Long straddle

A straddle is a simultaneous position in both a call and a put on the same underlying asset. A long straddle involves purchasing both the call and the
Figure 83.8: Profit profile for a long straddle.
put. By combining these two seemingly opposing options, an investor can get the best risk–return combination that each offers. The profit profile for a long straddle in Figure 83.8 illustrates the nature of this synthetic asset. The following summarizes the basic risk–return features of a long-straddle position.

Profit potential: unlimited on upside, limited on downside.
Loss potential: limited (to cost of call and put premiums).
Effect of time decay: negative.
Market sentiment: bullish or bearish.

The long straddle's profit profile makes clear that its risk–reward picture is simply that of the long call overlapped by the long put, with each horizontal segment truncated (represented by the horizontal dashed lines on the bottom). An investor will profit on this type of position as long as the price of the underlying asset moves sufficiently up or down to more than cover the original cost of the option premiums. Thus, a long straddle is an effective strategy for someone expecting the volatility of the underlying asset to increase in the future. In the same light, the investor who buys a straddle expects the underlying asset's price volatility to be greater than that impounded in the option prices. Since time decay works against the value of this position, it might be unwise to purchase a straddle composed of a call and put in their last month to maturity, when their time decay is greatest. It would be possible to reduce the cost of the straddle by purchasing a high-exercise-price call and a low-exercise-price put (out-of-the-money options); however, the up or down movement in the asset's price necessary in order to profit would then be larger. Example 83.4 provides further illustration.
Example 83.4. Situation: An investor feels the stock market is going to break sharply up or down but is not sure which way. However, the investor is confident that market volatility will increase in the near future. To express this view, the investor puts on a long straddle using options on the S&P 500 index, buying both at-the-money call and put options on the September contract. The current September S&P 500 futures contract price is 155.00. Assume the position is held to expiration.

Transaction:
1. Buy 1 September 155 call at $2.00 (×$500 per point)    ($1,000)
2. Buy 1 September 155 put at $2.00                       ($1,000)
Net initial investment (position value)                   ($2,000)

Results:
1. If futures price = 150.00:
   (a) 1 September 155 call expires at $0                 ($1,000)
   (b) 1 September 155 put expires at $5.00                $2,500
   (c) Less initial cost of put                           ($1,000)
   Ending position value (net profit)                        $500
2. If futures price = 155.00:
   (a) 1 September 155 call expires at $0                 ($1,000)
   (b) 1 September 155 put expires at $0                  ($1,000)
   Ending position value (net loss)                       ($2,000)
3. If futures price = 160.00:
   (a) 1 September 155 call expires at $5.00               $2,500
   (b) 1 September 155 put expires at $0                  ($1,000)
   (c) Less initial cost of call                          ($1,000)
   Ending position value (net profit)                        $500
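Assuming, as in the example, that one S&P 500 index point is worth $500, the three scenario payoffs can be reproduced with a short sketch (the function name is ours):

```python
POINT_VALUE = 500  # dollars per S&P 500 index point, as in the example

def long_straddle_profit(futures_at_expiry, strike, call_prem, put_prem):
    """Net profit of a long call plus a long put, both held to expiration."""
    call = max(futures_at_expiry - strike, 0.0)
    put = max(strike - futures_at_expiry, 0.0)
    return (call + put - call_prem - put_prem) * POINT_VALUE

for f in (150.0, 155.0, 160.0):
    print(f, long_straddle_profit(f, 155.0, 2.0, 2.0))
# 150.0 -> 500.0, 155.0 -> -2000.0, 160.0 -> 500.0, matching results 1-3
```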
Summary:
Maximum profit potential: unlimited. If the market had continued to move below 150.00 or above 160.00, the position would have continued to increase in value.
Maximum loss potential: $2,000, the initial investment.
Breakeven points: 151.00 and 159.00, for the September S&P 500 futures contract.2

2
Breakeven points for the straddle are calculated as follows: Upside BEP = Exercise price + Initial net investment (in points) 159.00 = 155.00 + 4.00
Effect of time decay: negative, as evidenced by the loss incurred with no change in the futures price (result 2).

83.4.6 Short straddle

For the most part, the short straddle has risk–return characteristics opposite to those of the long straddle. A short straddle is a simultaneous position in both a short call and a short put on the same underlying asset. Contrary to the long-straddle position, selling a straddle can be an effective strategy when an investor expects little or no movement in the price of the underlying asset. Put differently, the investor expects the future volatility of the underlying asset's price currently impounded in the option premiums to decline. Moreover, since time decay has a positive effect on the value of this position, one appropriate time to put on a short straddle might be the last month to expiration for the combined call and put. Figure 83.9 shows the short straddle's profit profile, and Example 83.5 provides further illustration. The following summarizes the basic risk–return features of a short-straddle position.
Figure 83.9: Profit profile for a short straddle.
Downside BEP = Exercise price − Initial net investment (in points)
151.00 = 155.00 − 4.00
Profit potential: limited (to call and put premiums).
Loss potential: unlimited on upside, limited on downside.
Effect of time decay: positive.
Market expectation: neutral.

Example 83.5. Situation: An investor feels the market is overestimating price volatility at the moment and that prices are going to remain stable for some time. To express this opinion, the investor sells a straddle consisting of at-the-money call and put options on the September S&P 500 futures contract, for which the current price is 155.00. Assume the position is held to expiration.

Transaction:
1. Sell 1 September 155 call at $2.00 (×$500 per point)    $1,000
2. Sell 1 September 155 put at $2.00                       $1,000
Net initial inflow (position value)                        $2,000

Results:
1. If futures price = 150.00:
   (a) 1 September 155 call expires at $0                  $1,000
   (b) 1 September 155 put expires at $5.00               ($2,500)
   (c) Plus initial inflow from sale of put                $1,000
   Ending position value (net loss)                         ($500)
2. If futures price = 155.00:
   (a) 1 September 155 call expires at $0                  $1,000
   (b) 1 September 155 put expires at $0                   $1,000
   Ending position value (net profit)                      $2,000
3. If futures price = 160.00:
   (a) 1 September 155 call expires at $5.00              ($2,500)
   (b) 1 September 155 put expires at $0                   $1,000
   (c) Plus initial inflow from sale of call               $1,000
   Ending position value (net loss)                         ($500)
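The short straddle's scenario payoffs are the long straddle's with the sign flipped; a minimal sketch (the function name is ours; $500 per index point, as in the example):

```python
POINT_VALUE = 500  # dollars per S&P 500 index point

def short_straddle_profit(futures_at_expiry, strike, call_prem, put_prem):
    """Net profit of a short call plus a short put held to expiration:
    the premiums collected less any intrinsic value paid out."""
    call = max(futures_at_expiry - strike, 0.0)
    put = max(strike - futures_at_expiry, 0.0)
    return (call_prem + put_prem - call - put) * POINT_VALUE

for f in (150.0, 155.0, 160.0):
    print(f, short_straddle_profit(f, 155.0, 2.0, 2.0))
# 150.0 -> -500.0, 155.0 -> 2000.0, 160.0 -> -500.0, matching results 1-3
```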
Summary:
Maximum profit potential: $2,000, result 2, where the futures price does not move.
Maximum loss potential: unlimited. If the futures price had continued up over 160.00 or down below 150.00, this position would have kept losing money.
Breakeven points: 151.00 and 159.00, an eight-point range for profitability of the position.3
Effect of time decay: positive, as evidenced by result 2.

83.4.7 Long vertical (bull) spread

When dealing strictly in options, a spread is a combination of any two or more of the same type of options (two calls or two puts, for instance) on the same underlying asset. A vertical spread specifies that the options have the same maturity month. Finally, a long vertical spread designates a position in which one has bought a low-exercise-price call (or a low-exercise-price put) and sold a high-exercise-price call (or a high-exercise-price put), both maturing in the same month. A long vertical spread is also known as a bull spread because of the bullish market expectation of the investor who enters into it. Actually, the long vertical spread (or bull spread) is not a strongly bullish position, because the investor limits the profit potential by selling the high-exercise-price call (or high-exercise-price put). Rather, this is a popular position when the market is thought more likely to go up than down; the bull spread thus conveys a bit of uncertainty about future market conditions. Of course, the higher the exercise price at which the call is sold, the more bullish the position. An examination of the profit profile for the long vertical spread (see Figure 83.10) tells more about its risk–return attributes. The following summarizes the basic risk–return features of a long-vertical-spread position.
Figure 83.10: Profit profile for a long vertical spread.

3 Breakeven points for the short straddle are calculated in the same manner as for the long straddle: exercise price plus or minus the initial prices of the options.
Profit potential: limited (up to the higher exercise price).
Loss potential: limited (down to the lower exercise price).
Effect of time decay: mixed.
Market expectation: cautiously bullish.

Although profit is limited on the upside by the shorted call, the loss potential is also truncated below the lower exercise price, where both options expire worthless and the loss equals the net premium paid. There are other reasons for considering this a mildly bullish strategy. The effect of time decay is ambiguous up to the expiration or liquidation of the position. That is, if the asset price St is near the exercise price of the higher-price option EH, the position acts more like a short call and the time-decay effect is positive. Conversely, if St is near the exercise price of the lower-price option EL, the bull spread acts more like a long call and the time-decay effect is negative. Consequently, unless an investor is more than mildly bullish, it would probably be unwise to put on a bull spread with the low-exercise-price call near the current price of the asset while both options are in their last month to expiration. Example 83.6 provides further illustration.

Example 83.6. Situation: An investor is moderately bullish on the West German mark. He would like to be long but wants to reduce the cost and risk of this position in case he is wrong. To express this opinion, the investor puts on a long vertical spread by buying a lower-exercise-price call and selling a higher-exercise-price call with the same month to expiration. Assume the position is held to expiration.

Transaction:
1. Buy 1 September 0.37 call at 0.0047 (×$125,000 per point)   ($587.50)
2. Sell 1 September 0.38 call at 0.0013                         $162.50
Net initial investment (position value)                        ($425.00)

Results:
1. If futures price = 0.3700:
   (a) 1 September 0.37 call expires at 0                      ($587.50)
   (b) 1 September 0.38 call expires at 0                       $162.50
   Ending position value (net loss)                            ($425.00)
2. If futures price = 0.3800:
   (a) 1 September 0.37 call expires at 0.0100                $1,250.00
   (b) 1 September 0.38 call expires at 0                       $162.50
   (c) Less initial cost of 0.37 call                          ($587.50)
   Ending position value (net profit)                           $825.00
3.
If futures price = 0.3900:
   (a) 1 September 0.37 call expires at 0.0200                $2,500.00
   (b) 1 September 0.38 call expires at 0.0100               ($1,250.00)
   (c) Less initial premium of 0.37 call                       ($587.50)
   (d) Plus initial premium of 0.38 call                        $162.50
   Ending position value (net profit)                           $825.00
Summary:
Maximum profit potential: $825.00, result 2.
Maximum loss potential: $425.00, result 1.
Breakeven point: 0.3734.4
Effect of time decay: mixed; positive if price is at high end of range and negative if at low end.
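Example 83.6's scenario payoffs can likewise be verified; a minimal sketch (the function name is ours; the contract is assumed, as in the example, to be worth $125,000 per point):

```python
POINT_VALUE = 125_000  # dollars per point for the futures contract in the example

def bull_call_spread_profit(futures_at_expiry, low_strike, high_strike,
                            long_prem, short_prem):
    """Long a low-strike call, short a high-strike call, held to expiration."""
    long_call = max(futures_at_expiry - low_strike, 0.0) - long_prem
    short_call = short_prem - max(futures_at_expiry - high_strike, 0.0)
    return (long_call + short_call) * POINT_VALUE

for f in (0.37, 0.38, 0.39):
    print(f, round(bull_call_spread_profit(f, 0.37, 0.38, 0.0047, 0.0013), 2))
# 0.37 -> -425.0, 0.38 -> 825.0, 0.39 -> 825.0, matching results 1-3
```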
83.4.8 Short vertical (bear) spread

The short vertical spread is simply the reverse of the corresponding long position. That is, an investor buys a high-exercise-price call (or put) and sells a low-exercise-price call (or put), both with the same time left to expiration. As the more common name for this type of option position is the bear spread, it is easy to infer the market sentiment consistent with it. The profit profile for the short vertical spread is seen in Figure 83.11. As the profit profile indicates, this strategy is profitable as long as the underlying asset moves down in price. Profit is limited to a price decline in the asset down to the lower exercise price, while risk is limited on the upside by the long-call position. Given the time-decay effects shown, a mildly bearish investor might consider using options in the last month to expiration with the EL option near the money. The following summarizes the basic risk–return features of a short-vertical-spread position.

Profit potential: limited (down to EL).
Loss potential: limited (up to EH).
Effect of time decay: mixed (opposite to that of the long vertical spread).
Market sentiment: mildly bearish.
4 Breakeven point for the long vertical spread is computed as the lower exercise price plus the price of the long call minus the price of the short call (0.3734 = 0.3700 + 0.0047 − 0.0013).
Figure 83.11: Profit profile for a short vertical spread.
83.4.9 Calendar (time) spreads
A calendar spread (also called a time or horizontal spread) consists of the sale of one option and the simultaneous purchase of another option with the same exercise price but a longer term to maturity. The objective of the calendar spread is to capture the faster erosion in the time-premium portion of the shorted, nearer-term-to-maturity option. By taking a position in two options of the same type (two calls or two puts), both with the same exercise price, the investor utilizing this strategy expresses a neutral opinion on the market. In other words, the investor is interested in selling time rather than predicting the price direction of the underlying asset. Thus, a calendar spread might be considered appropriate for a sideways-moving or quiet market. However, if the underlying asset's price moves significantly up or down, the calendar spread will lose part of its original value. Figure 83.12 displays the calendar spread's profit profile and related risk–return attributes. The profit profile shows that this strategy will make money for a rather narrow range of price movement in the underlying asset. While similar in nature to the short straddle (both are neutral strategies), the calendar spread is more conservative: it has both a lower profit potential and lower (limited) risk than the short straddle. The lower potential profit results from benefiting from the time decay in only one option premium instead of two (the call and the put) for the short straddle. Moreover, taking opposite positions in the same type of option at the same exercise price adds a loss limit on each side against adverse price moves. The following
Figure 83.12: Profit profile for a neutral calendar spread.
summarizes the basic risk–return features for the profit profile of a neutral calendar-spread position. Profit potential: limited. Loss potential: limited (to original cost of position). Effect of time decay: positive (the option sold loses value faster than the option bought). Market sentiment: neutral. The calendar spread does not have to be neutral in sentiment. By diagonalizing this spread, it is possible to express an opinion on the market. For instance, by selling a near-term higher-exercise-price option and purchasing a longer-term lower-exercise-price option, the investor is taking a bullish position. Such a position is thus referred to as a bullish calendar spread. Why is it bullish? Remember that with the neutral calendar spread, we are concerned solely with benefiting from the faster time decay in the premium of the shorted near-term option. Any significant movement in price upwards, for instance, would not have been profitable because it would have slowed the time decay and increased the intrinsic value of the shorted near-term option. In fact, we would eventually lose money because the difference in premiums between the near-term and the longer-term options (the spread) would narrow as the underlying asset's price increased. However, the bullish calendar spread is much like a long vertical (or bull) spread in that a modest increase in price for the asset up to the higher exercise price will be profitable. At the same time, though, the bullish calendar spread also reaps some
of the benefits from the greater time decay in the nearer-term option’s premium. While this strategy might sound superior to the straight bull spread, it really depends on market conditions. With a bullish calendar spread, its gain from time decay will probably not be as great as that from a neutral calendar spread, nor will its bullish nature be as profitable as a straight bull spread in the event of a modest price increase for the underlying asset. The real-world application will be discussed in Section 83.5.
83.5 Excel Approach to Analyze the Option Strategies
In this section, we show how an Excel program can be used to analyze option strategies. We use data for JNJ on March 29, 2011, as presented in Table 83.6, to analyze seven option strategies. Table 83.6 gives the information published on March 29, 2011 for all options expiring in April 2011. JNJ stock closed at $59.38 on March 29, 2011.
Table 83.6: Call and put option quotes for JNJ on March 29, 2011.

Call options expiring close April 15, 2011
Strike   Symbol                 Bid      Ask
35       JNJ110483C00035000     23.95    24.45
40       JNJ110483C00040000     18.95    19.35
45       JNJ110483C00045000     14.15    14.25
50       JNJ110483C00050000     9.2      9.35
52.5     JNJ110483C00052500     6.7      6.85
55       JNJ110483C00055000     4.2      4.3
57.5     JNJ110483C00057500     1.89     1.94
60       JNJ110483C00060000     0.3      0.31
62.5     JNJ110483C00062500     0.03     0.04

Put options expiring close April 15, 2011
Strike   Symbol                 Bid      Ask
47.5     JNJ110483P00047500     0.01     0.04
50       JNJ110483P00050000     0.02     0.04
52.5     JNJ110483P00052500     0.05     0.07
55       JNJ110483P00055000     0.09     0.1
57.5     JNJ110483P00057500     0.2      0.22
60       JNJ110483P00060000     1.09     1.13
62.5     JNJ110483P00062500     3.25     3.4
65       JNJ110483P00065000     5.75     5.85
67.5     JNJ110483P00067500     8.2      8.5
83.5.1 Long straddle
Assume that an investor expects the volatility of JNJ stock to increase in the future and can use a long straddle to profit. The investor purchases a call option and a put option with the same exercise price of $60. The investor will profit on this type of position as long as the price of the underlying asset moves sufficiently up or down to more than cover the original cost of the option premiums. Let S0, ST, and X denote the stock purchase price, the stock price at the expiration time T, and the strike price, respectively. Given X = $60, a call option premium of $0.31, and a put option premium of $1.13, Table 83.7 shows the values for the long straddle at different stock prices at time T. The profit profile of the long straddle position is constructed in Figure 83.13.
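The entries in Table 83.7 can be reproduced with a short script. This is an illustrative sketch, not the book's Excel workbook; the function names are ours, and the premiums are the ask prices from Table 83.6.

```python
def call_payoff(s_t, strike):
    """Expiration payoff of a long call."""
    return max(s_t - strike, 0.0)

def put_payoff(s_t, strike):
    """Expiration payoff of a long put."""
    return max(strike - s_t, 0.0)

STRIKE = 60.0
CALL_PREMIUM = 0.31  # ask price of the April $60 call
PUT_PREMIUM = 1.13   # ask price of the April $60 put

def long_straddle_profit(s_t):
    """Profit of long call + long put at expiration, net of premiums paid."""
    payoff = call_payoff(s_t, STRIKE) + put_payoff(s_t, STRIKE)
    return payoff - CALL_PREMIUM - PUT_PREMIUM

for s_t in (45.0, 57.5, 60.0, 62.5, 70.0):
    print(f"S_T = {s_t:5.2f}  profit = {long_straddle_profit(s_t):6.2f}")
```

At S_T = $60 the profit is −$1.44, the maximum loss (the two premiums paid); the position breaks even when the stock moves $1.44 away from the $60 strike in either direction.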
Table 83.7: Value of long straddle position at option expiration.
Long a call at strike price $60.00, premium $0.31; long a put at strike price $60.00, premium $1.13.

Stock     Long call (X = $60)    Long put (X = $60)     Long straddle
price     Payoff      Profit     Payoff      Profit     Payoff      Profit
$45.00    $0.00       −$0.31     $15.00      $13.87     $15.00      $13.56
$47.50    $0.00       −$0.31     $12.50      $11.37     $12.50      $11.06
$50.00    $0.00       −$0.31     $10.00      $8.87      $10.00      $8.56
$52.50    $0.00       −$0.31     $7.50       $6.37      $7.50       $6.06
$55.00    $0.00       −$0.31     $5.00       $3.87      $5.00       $3.56
$57.50    $0.00       −$0.31     $2.50       $1.37      $2.50       $1.06
$60.00    $0.00       −$0.31     $0.00       −$1.13     $0.00       −$1.44
$62.50    $2.50       $2.19      $0.00       −$1.13     $2.50       $1.06
$65.00    $5.00       $4.69      $0.00       −$1.13     $5.00       $3.56
$67.50    $7.50       $7.19      $0.00       −$1.13     $7.50       $6.06
$70.00    $10.00      $9.69      $0.00       −$1.13     $10.00      $8.56

83.5.2 Short straddle
Contrary to the long straddle strategy, an investor will use a short straddle (a short call and a short put on JNJ stock with the same exercise price of $60) when he or she expects little or no movement in the price of JNJ stock. Given X = $60, and premiums of $0.30 for the call option and $1.09 for the put option, Table 83.8 shows the values for the short straddle at different stock prices
Figure 83.13: Profit profile for long straddle.

Table 83.8: Value of short straddle position at option expiration.
Short a call at strike price $60.00, premium $0.30; short a put at strike price $60.00, premium $1.09.

Stock     Short call (X = $60)   Short put (X = $60)    Short straddle
price     Payoff      Profit     Payoff      Profit     Payoff      Profit
$45.00    $0.00       $0.30      −$15.00     −$13.91    −$15.00     −$13.61
$47.50    $0.00       $0.30      −$12.50     −$11.41    −$12.50     −$11.11
$50.00    $0.00       $0.30      −$10.00     −$8.91     −$10.00     −$8.61
$52.50    $0.00       $0.30      −$7.50      −$6.41     −$7.50      −$6.11
$55.00    $0.00       $0.30      −$5.00      −$3.91     −$5.00      −$3.61
$57.50    $0.00       $0.30      −$2.50      −$1.41     −$2.50      −$1.11
$60.00    $0.00       $0.30      $0.00       $1.09      $0.00       $1.39
$62.50    −$2.50      −$2.20     $0.00       $1.09      −$2.50      −$1.11
$65.00    −$5.00      −$4.70     $0.00       $1.09      −$5.00      −$3.61
$67.50    −$7.50      −$7.20     $0.00       $1.09      −$7.50      −$6.11
$70.00    −$10.00     −$9.70     $0.00       $1.09      −$10.00     −$8.61
at time T. The profit profile of the short straddle position is constructed in Figure 83.14.

Figure 83.14: Profit profile for short straddle.

83.5.3 Long vertical (bull) spread
This strategy combines a long call (or put) with a low strike price and a short call (or put) with a high strike price. For example, an investor purchases a call with the exercise price of $57.50 and sells a call with the exercise price of $62.50. Given X1 = $57.50, X2 = $62.50, a long call premium of $1.94, and a short call premium of $0.03, Table 83.9 shows the values for the long vertical spread at different stock prices at time T. The profit profile of the long vertical spread is constructed in Figure 83.15.

Table 83.9: Value of long vertical spread position at option expiration.
Long a call at strike price $57.50, premium $1.94; short a call at strike price $62.50, premium $0.03.

Stock     Long call (X = $57.50)   Short call (X = $62.50)   Long vertical spread
price     Payoff      Profit       Payoff      Profit        Payoff      Profit
$45.00    $0.00       −$1.94       $0.00       $0.03         $0.00       −$1.91
$47.50    $0.00       −$1.94       $0.00       $0.03         $0.00       −$1.91
$50.00    $0.00       −$1.94       $0.00       $0.03         $0.00       −$1.91
$52.50    $0.00       −$1.94       $0.00       $0.03         $0.00       −$1.91
$55.00    $0.00       −$1.94       $0.00       $0.03         $0.00       −$1.91
$57.50    $0.00       −$1.94       $0.00       $0.03         $0.00       −$1.91
$60.00    $2.50       $0.56        $0.00       $0.03         $2.50       $0.59
$62.50    $5.00       $3.06        $0.00       $0.03         $5.00       $3.09
$65.00    $7.50       $5.56        −$2.50      −$2.47        $5.00       $3.09
$67.50    $10.00      $8.06        −$5.00      −$4.97        $5.00       $3.09
$70.00    $12.50      $10.56       −$7.50      −$7.47        $5.00       $3.09
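The spread column in Table 83.9 can be reproduced as follows (an illustrative sketch; the names are ours, and the premiums are the quoted prices from Table 83.6):

```python
def call_payoff(s_t, strike):
    """Expiration payoff of a long call."""
    return max(s_t - strike, 0.0)

LONG_STRIKE, LONG_PREMIUM = 57.5, 1.94    # buy the $57.50 call at the ask
SHORT_STRIKE, SHORT_PREMIUM = 62.5, 0.03  # write the $62.50 call at the bid

def bull_spread_profit(s_t):
    """Profit of the long vertical (bull) spread at expiration."""
    long_leg = call_payoff(s_t, LONG_STRIKE) - LONG_PREMIUM
    short_leg = SHORT_PREMIUM - call_payoff(s_t, SHORT_STRIKE)
    return long_leg + short_leg

for s_t in (55.0, 60.0, 65.0, 70.0):
    print(f"S_T = {s_t:5.2f}  profit = {bull_spread_profit(s_t):6.2f}")
```

Profit is capped at (62.50 − 57.50) − 1.94 + 0.03 = $3.09 above the upper strike and floored at −$1.91 below the lower strike, matching Table 83.9.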
Figure 83.15: Profit profile for long vertical spread.
83.5.4 Short vertical (bear) spread
Contrary to a long vertical spread, this strategy combines a long call (or put) with a high strike price and a short call (or put) with a low strike price. For example, an investor purchases a call with an exercise price of $60 and sells a call with an exercise price of $57.50. Given X1 = $60, X2 = $57.50, a long call premium of $0.31, and a short call premium of $1.89, Table 83.10 shows the values for the short vertical spread at different stock prices at time T. The profit profile of the short vertical spread is constructed in Figure 83.16.

83.5.5 Protective put
Assume that an investor wants to invest in JNJ stock on March 29, 2011, but does not wish to bear any potential loss for prices below $57.50. The investor can purchase JNJ stock and at the same time buy a put option with a strike price of $57.50. Given S0 = $59.38, X = $57.50, and a put option premium of $0.22 (the ask price), Table 83.11 shows the values for the protective put at different stock prices at time T. The profit profile of the protective put position is constructed in Figure 83.17.
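Both positions just described can be computed in a few lines (an illustrative sketch; the function names are ours, and the prices are the Table 83.6 quotes):

```python
def call_payoff(s_t, strike):
    """Expiration payoff of a long call."""
    return max(s_t - strike, 0.0)

def put_payoff(s_t, strike):
    """Expiration payoff of a long put."""
    return max(strike - s_t, 0.0)

def bear_spread_profit(s_t):
    """Short vertical spread: buy the $60 call (ask $0.31), write the $57.50 call (bid $1.89)."""
    return (call_payoff(s_t, 60.0) - 0.31) + (1.89 - call_payoff(s_t, 57.5))

def protective_put_profit(s_t, s_0=59.38):
    """Protective put: buy the stock at s_0 and the $57.50 put at the $0.22 ask."""
    return (s_t - s_0) + (put_payoff(s_t, 57.5) - 0.22)

for s_t in (45.0, 57.5, 70.0):
    print(s_t, round(bear_spread_profit(s_t), 2), round(protective_put_profit(s_t), 2))
```

The bear spread earns the net premium of $1.58 below the lower strike and loses at most $0.92 above the upper strike; the protective put's loss is floored at −$2.10 no matter how far the stock falls.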
Table 83.10: Value of short vertical spread position at option expiration.
Long a call at strike price $60.00, premium $0.31; short a call at strike price $57.50, premium $1.89.

Stock     Short call (X = $57.50)   Long call (X = $60)    Short vertical spread
price     Payoff      Profit        Payoff      Profit     Payoff      Profit
$45.00    $0.00       $1.89         $0.00       −$0.31     $0.00       $1.58
$47.50    $0.00       $1.89         $0.00       −$0.31     $0.00       $1.58
$50.00    $0.00       $1.89         $0.00       −$0.31     $0.00       $1.58
$52.50    $0.00       $1.89         $0.00       −$0.31     $0.00       $1.58
$55.00    $0.00       $1.89         $0.00       −$0.31     $0.00       $1.58
$57.50    $0.00       $1.89         $0.00       −$0.31     $0.00       $1.58
$60.00    −$2.50      −$0.61        $0.00       −$0.31     −$2.50      −$0.92
$62.50    −$5.00      −$3.11        $2.50       $2.19      −$2.50      −$0.92
$65.00    −$7.50      −$5.61        $5.00       $4.69      −$2.50      −$0.92
$67.50    −$10.00     −$8.11        $7.50       $7.19      −$2.50      −$0.92
$70.00    −$12.50     −$10.61       $10.00      $9.69      −$2.50      −$0.92

Figure 83.16: Profit profile for short vertical spread.
83.5.6 Covered call
This strategy involves investing in a stock and selling a call option on the stock at the same time. The value at the expiration of the call will be the stock value minus the value of the call. The call is “covered” because the potential obligation of delivering the stock is covered by the stock held in
Table 83.11: Value of protective put position at option expiration.
Long a put at strike price $57.50, premium $0.22; buy one share of stock at price $59.38.

Stock     One share of stock     Long put (X = $57.50)   Protective put value
price     Payoff      Profit     Payoff      Profit      Payoff      Profit
$45.00    $45.00      −$14.38    $12.50      $12.28      $57.50      −$2.10
$47.50    $47.50      −$11.88    $10.00      $9.78       $57.50      −$2.10
$50.00    $50.00      −$9.38     $7.50       $7.28       $57.50      −$2.10
$52.50    $52.50      −$6.88     $5.00       $4.78       $57.50      −$2.10
$55.00    $55.00      −$4.38     $2.50       $2.28       $57.50      −$2.10
$57.50    $57.50      −$1.88     $0.00       −$0.22      $57.50      −$2.10
$60.00    $60.00      $0.62      $0.00       −$0.22      $60.00      $0.40
$62.50    $62.50      $3.12      $0.00       −$0.22      $62.50      $2.90
$65.00    $65.00      $5.62      $0.00       −$0.22      $65.00      $5.40
$67.50    $67.50      $8.12      $0.00       −$0.22      $67.50      $7.90
$70.00    $70.00      $10.62     $0.00       −$0.22      $70.00      $10.40
Figure 83.17: Profit profile for protective put.
the portfolio. In essence, the sale of the call sells off the claim to any stock value above the strike price in return for the initial premium. Suppose a manager of a stock fund holds a share of JNJ stock on March 29, 2011 and plans to sell the JNJ stock if its price hits $62.50. Then she can write a share
of a call option with a strike price of $62.50 to establish the position. She shorts the call and collects the premium. Given that S0 = $59.38, X = $62.50, and the premium for the call option is $0.03 (the bid price), Table 83.12 shows the values for the covered call at different stock prices at time T. The profit profile of the covered call position is constructed in Figure 83.18. The profit pattern of a covered call has the same shape as that of a short put (their payoffs differ only by a constant); therefore, the covered call has frequently been used to replace shorting a put in dynamic hedging practice.

Table 83.12: Value of covered call position at option expiration.
Write a call at strike price $62.50, premium $0.03; buy one share of stock at price $59.38.

Stock     One share of stock     Written call (X = $62.50)   Covered call
price     Payoff      Profit     Payoff      Profit          Payoff      Profit
$45.00    $45.00      −$14.38    $0.00       $0.03           $45.00      −$14.35
$47.50    $47.50      −$11.88    $0.00       $0.03           $47.50      −$11.85
$50.00    $50.00      −$9.38     $0.00       $0.03           $50.00      −$9.35
$52.50    $52.50      −$6.88     $0.00       $0.03           $52.50      −$6.85
$55.00    $55.00      −$4.38     $0.00       $0.03           $55.00      −$4.35
$57.50    $57.50      −$1.88     $0.00       $0.03           $57.50      −$1.85
$60.00    $60.00      $0.62      $0.00       $0.03           $60.00      $0.65
$62.50    $62.50      $3.12      $0.00       $0.03           $62.50      $3.15
$65.00    $65.00      $5.62      −$2.50      −$2.47          $62.50      $3.15
$67.50    $67.50      $8.12      −$5.00      −$4.97          $62.50      $3.15
$70.00    $70.00      $10.62     −$7.50      −$7.47          $62.50      $3.15

Figure 83.18: Profit profile for covered call.

83.5.7 Collar
A collar combines a protective put and a short call option to bracket the value of a portfolio between two bounds. For example, an investor holds JNJ stock selling at $59.38. Buying a protective put using the put option with an exercise price of $55 places a lower bound of $55 on the value of the portfolio. At the same time, the investor can write a call option with an exercise price of $62.50. The call and the put sell at $0.03 (the bid price) and $0.10 (the ask price), respectively, making the net outlay for the two options only $0.07. Table 83.13 shows the values of the collar position at different stock prices at time T. The profit profile of the collar position is shown in Figure 83.19.
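The collar's profit column in Table 83.13 (and, by dropping the put leg, the covered call of Table 83.12) can be reproduced with this sketch (an illustration, not the book's workbook; names are ours, prices from Table 83.6):

```python
S0 = 59.38           # purchase price of JNJ stock
PUT_STRIKE = 55.0    # long put, bought at the $0.10 ask
CALL_STRIKE = 62.5   # written call, sold at the $0.03 bid
PUT_PREMIUM = 0.10
CALL_PREMIUM = 0.03

def collar_profit(s_t):
    """Stock + long put + written call, net of option premiums."""
    stock = s_t - S0
    long_put = max(PUT_STRIKE - s_t, 0.0) - PUT_PREMIUM
    written_call = CALL_PREMIUM - max(s_t - CALL_STRIKE, 0.0)
    return stock + long_put + written_call

for s_t in (45.0, 55.0, 60.0, 62.5, 70.0):
    print(f"S_T = {s_t:5.2f}  profit = {collar_profit(s_t):6.2f}")
```

The profit is bounded between −$4.45 (at or below the $55 put strike) and $3.05 (at or above the $62.50 call strike), matching Table 83.13.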
Table 83.13: Value of collar position at option expiration.
Write a call at strike price $62.50, premium $0.03; long a put at strike price $55.00, premium $0.10; buy one share of stock at price $59.38.

Stock     One share of stock     Long put (X = $55)     Written call (X = $62.50)   Collar value
price     Payoff      Profit     Payoff      Profit     Payoff      Profit          Payoff      Profit
$45.00    $45.00      −$14.38    $10.00      $9.90      $0.00       $0.03           $55.00      −$4.45
$47.50    $47.50      −$11.88    $7.50       $7.40      $0.00       $0.03           $55.00      −$4.45
$50.00    $50.00      −$9.38     $5.00       $4.90      $0.00       $0.03           $55.00      −$4.45
$52.50    $52.50      −$6.88     $2.50       $2.40      $0.00       $0.03           $55.00      −$4.45
$55.00    $55.00      −$4.38     $0.00       −$0.10     $0.00       $0.03           $55.00      −$4.45
$57.50    $57.50      −$1.88     $0.00       −$0.10     $0.00       $0.03           $57.50      −$1.95
$60.00    $60.00      $0.62      $0.00       −$0.10     $0.00       $0.03           $60.00      $0.55
$62.50    $62.50      $3.12      $0.00       −$0.10     $0.00       $0.03           $62.50      $3.05
$65.00    $65.00      $5.62      $0.00       −$0.10     −$2.50      −$2.47          $62.50      $3.05
$67.50    $67.50      $8.12      $0.00       −$0.10     −$5.00      −$4.97          $62.50      $3.05
$70.00    $70.00      $10.62     $0.00       −$0.10     −$7.50      −$7.47          $62.50      $3.05
Figure 83.19: Profit profile for collar.
83.6 Summary
This chapter has introduced some of the essential differences between the two most basic kinds of options: calls and puts. The relationship between the option's price, or premium, and that of the underlying asset was delineated. The option's value was shown to be composed of intrinsic value, that is, the underlying asset price less the exercise price, and time value. Moreover, it was demonstrated that the time value decays over time, particularly in the last month to maturity for an option. Index and futures options were studied to introduce these important financial instruments. Put–call parity theorems were developed for European, American, and futures options in order to show the basic valuation relationship between the underlying asset and its call and put options. Investment applications of options and related combinations were then discussed, along with their relevant risk–return characteristics. A thorough understanding of this chapter is essential as a basic tool for the successful study of option-valuation models. Finally, we used Excel to show how different option strategies can be executed.
Bibliography
M. Amram and N. Kulatilaka (2001). Real Options. New York: Oxford University Press.
C. Ball and W. Torous (1983). Bond Price Dynamics and Options. Journal of Financial and Quantitative Analysis, 18, 517–532.
M. Bhattacharya (1980). Empirical Properties of the Black–Scholes Formula under Ideal Conditions. Journal of Financial and Quantitative Analysis, 15, 1081–1106.
F. Black (1972). Capital Market Equilibrium with Restricted Borrowing. Journal of Business, 45, 444–455.
F. Black (1975). Fact and Fantasy in the Use of Options. Financial Analysts Journal, 31, 36–72.
F. Black and M. Scholes (1973). The Pricing of Options and Corporate Liabilities. Journal of Political Economy, 81, 637–654.
J. Bodurtha and G. Courtadon (1986). Efficiency Tests of the Foreign Currency Options Market. Journal of Finance, 41, 151–162.
R. M. Bookstaber (1981). Option Pricing and Strategies in Investing. Reading, MA: Addison-Wesley.
R. M. Bookstaber and R. Clarke (1983). Option Strategies for Institutional Investment Management. Reading, MA: Addison-Wesley.
M. Brennan and E. Schwartz (1977). The Valuation of American Put Options. Journal of Finance, 32, 449–462.
J. C. Cox, S. A. Ross, and M. Rubinstein (1979). Option Pricing: A Simplified Approach. Journal of Financial Economics, 7, 229–263.
J. C. Cox and M. Rubinstein (1985). Options Markets. Englewood Cliffs, NJ: Prentice-Hall.
W. Eckardt and S. Williams (1984). The Complete Options Indexes. Financial Analysts Journal, 40, 48–57.
J. Evnine and A. Rudd (1985). Index Options: The Early Evidence. Journal of Finance, 40, 743–756.
J. Finnerty (1978). The Chicago Board Options Exchange and Market Efficiency. Journal of Financial and Quantitative Analysis, 13, 28–38.
D. Galai and R. W. Masulis (1976). The Option Pricing Model and the Risk Factor of Stock. Journal of Financial Economics, 3, 53–81.
D. Galai, R. Geske, and S. Givots (1988). Option Markets. Reading, MA: Addison-Wesley.
G. Gastineau (1979). The Stock Options Manual. New York: McGraw-Hill.
R. Geske and K. Shastri (1985). Valuation by Approximation: A Comparison of Alternative Option Valuation Techniques. Journal of Financial and Quantitative Analysis, 20, 45–72.
J. Hull (2017). Options, Futures, and Other Derivatives, 10th ed. Upper Saddle River, NJ: Prentice Hall.
R. A. Jarrow and A. Rudd (1983). Option Pricing. Homewood, IL: Richard D. Irwin.
R. Jarrow and S. Turnbull (1999). Derivative Securities, 2nd ed. Cincinnati, OH: South-Western College Publishing.
K. T. Liaw and R. L. Moy (2000). The Irwin Guide to Stocks, Bonds, Futures, and Options. New York: McGraw-Hill.
C. F. Lee, A. C. Lee, and J. Lee (2009). Handbook of Quantitative Finance and Risk Management. New York, NY: Springer.
C. F. Lee and A. C. Lee (2006). Encyclopedia of Finance. New York, NY: Springer.
C. F. Lee, J. Finnerty, J. Lee, A. C. Lee, and D. Wort (2013). Security Analysis, Portfolio Management, and Financial Derivatives. Singapore: World Scientific.
C. F. Lee and J. C. Lee (2015). Handbook of Financial Econometrics and Statistics, Volume 1. New York, NY: Springer Reference.
J. Macbeth and L. Merville (1979). An Empirical Examination of the Black–Scholes Call Option Pricing Model. Journal of Finance, 34, 1173–1186.
R. L. McDonald (2012). Derivatives Markets, 3rd ed. Boston, MA: Addison-Wesley.
R. Merton (1973). Theory of Rational Option Pricing. Bell Journal of Economics and Management Science, 4, 141–183.
R. J. Rendleman, Jr. and B. J. Bartter (1979). Two-State Option Pricing. Journal of Finance, 34, 1093–1110.
P. Ritchken (1987). Options: Theory, Strategy and Applications. Glenview, IL: Scott, Foresman.
M. Rubinstein and H. Leland (1981). Replicating Options with Positions in Stock and Cash. Financial Analysts Journal, 37, 63–72.
S. Sears and G. Trennepohl (1982). Measuring Portfolio Risk in Options. Journal of Financial and Quantitative Analysis, 17, 391–410.
C. Smith (1976). Option Pricing: A Review. Journal of Financial Economics, 3, 3–51.
H. Stoll (1969). The Relationship Between Put and Call Option Prices. Journal of Finance, 24, 801–824.
J. F. Summa and J. W. Lubow (2001). Options on Futures. New York: John Wiley & Sons.
G. Trennepohl (1981). A Comparison of Listed Option Premium and Black–Scholes Model Prices: 1973–1979. Journal of Financial Research, 4, 11–20.
M. Weinstein (1983). Bond Systematic Risk and the Options Pricing Model. Journal of Finance, 38, 1415–1430.
W. Welch (1982). Strategies for Put and Call Option Trading. Cambridge, MA: Winthrop.
R. Whaley (1982). Valuation of American Call Options on Dividend Paying Stocks: Empirical Tests. Journal of Financial Economics, 10, 29–58.
P. G. Zhang (1998). Exotic Options: A Guide to Second Generation Options, 2nd ed. Singapore: World Scientific.
Chapter 84

Decision Tree and Microsoft Excel Approach for Option Pricing Model

Jow-Ran Chang and John Lee

Jow-Ran Chang, National Tsing Hua University, e-mail: [email protected]
John Lee, Center for PBBEF Research, e-mail: [email protected]

Contents
84.1 Introduction 2886
84.2 Call and Put Options 2887
84.3 One-Period Option Pricing Model 2889
84.4 Two-Period Option Pricing Model 2894
84.5 Using Microsoft Excel to Create the Binomial Option Trees 2897
84.6 Using Microsoft Excel to Create Binomial American Option Trees 2902
84.7 Alternative Tree Methods 2905
84.7.1 Cox, Ross and Rubinstein 2905
84.7.2 Trinomial tree 2908
84.7.3 Comparison of the option price efficiency 2911
84.7.4 Excel file 2912
84.8 Black–Scholes Option Pricing Model 2913
84.9 Relationship Between the Binomial Option Pricing Model and the Black–Scholes Option Pricing Model 2915
84.10 Decision Tree Black–Scholes Calculation 2915
84.11 Summary 2916
Bibliography 2918
Appendix 84A: Excel VBA Code — Binomial Option Pricing Model 2919
Abstract
In this chapter, we (i) use the decision-tree approach to derive the binomial option pricing model (OPM) in terms of the methods used by Rendleman and Barter (RB, 1979) and Cox et al. (CRR, 1979) and (ii) use Microsoft Excel to show how the decision-tree model converges to the Black–Scholes model as the number of periods increases to infinity. In addition, we develop a binomial tree model for American options and a trinomial tree model. The efficiency of the binomial and trinomial tree methods is also compared. In sum, this chapter shows how the binomial OPM can be converted step by step to the Black–Scholes OPM.
Keywords: Binomial option pricing model • call option • put option • one-period OPM • two-period OPM • N-period OPM • synthetic option • Excel program • Black–Scholes model • Excel VBA • European option • American option.
84.1 Introduction
The binomial option pricing model (OPM) derived by Rendleman and Barter (RB, 1979) and Cox et al. (CRR, 1979) is one of the most famous models used to price options; the Black–Scholes model (1973) is even more famous. One problem with learning the binomial OPM is that it is computationally intensive, which results in a very complicated formula to price an option. The complexity of the binomial OPM makes it a challenge to learn the model. Most books teach the binomial option model by describing the formula. This is not very effective, because it usually requires the learner to mentally keep track of many details, often to the point of information overload. There is a well-known principle in psychology that the average number of things a person can remember at one time is seven. This chapter will first demonstrate that it is possible to create large Decision Trees for the binomial pricing model using Microsoft Excel. A 10-period Decision Tree requires 2,047 call calculations and 2,047 put calculations. This chapter will also show the Decision Tree for the price of a stock and the price of a bond, each requiring 2,047 calculations. Therefore, there would be 8,188 (2,047 × 4) calculations for a complete set of 10-period Decision Trees. Second, this chapter will present the binomial option model in a less mathematical manner. By using Decision Trees, we can price call and put options without keeping track of many things at one time. Finally, this chapter will
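As a check on the tree sizes quoted above, a small sketch (the formula is the standard node count of a full binary tree; variable names are ours):

```python
# An n-period binomial decision tree drawn as a full binary tree has
# 2**(n+1) - 1 nodes, i.e., one calculation per node.
periods = 10
nodes_per_tree = 2 ** (periods + 1) - 1   # 2047 calculations per tree
trees = 4                                 # call, put, stock, and bond trees
total_calculations = trees * nodes_per_tree
print(nodes_per_tree, total_calculations)  # 2047 8188
```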
show the relationship between the binomial OPM and the Black–Scholes OPM. Section 84.2 discusses the basic concepts of call and put options. Section 84.3 demonstrates the one-period call and put OPMs. Section 84.4 presents the two-period OPM. Sections 84.5 and 84.6 show how the binomial tree can be used to evaluate European and American options, respectively. Section 84.7 compares the efficiency of binomial tree and trinomial tree methods. Section 84.8 demonstrates the use of the Black–Scholes model. Section 84.9 shows the relationship between the binomial option pricing model and the Black–Scholes OPM. Section 84.10 demonstrates how to use the Microsoft Excel workbook binomialBS OPM.xls to demonstrate the relationship between the binomial OPM and the Black–Scholes OPM. Section 84.11 summarizes the chapter. This chapter uses a Microsoft Excel workbook called binomialBS OPM.xls that contains the Visual Basic for Applications (VBA) code to create the Decision Trees for the binomial OPM. The VBA code is provided in Appendix 84A. The password for the workbook is bigsky for those who want to study the VBA code.1
84.2 Call and Put Options
A call option gives the owner the right, but not the obligation, to buy the underlying security at a specified price. The price at which the owner can buy the underlying security is called the exercise price. A call option becomes valuable when the exercise price is less than the current price of the underlying stock. For example, a call option on GE stock with an exercise price of $20 when the GE stock price is $25 is worth $5. It is worth $5 because a holder of the call option can buy the GE stock at $20 and then sell it at the prevailing price of $25 for a profit of $5. In contrast, a call option on GE stock with an exercise price of $30 when the stock price is $15 is worth $0. A put option gives the owner the right, but not the obligation, to sell the underlying security at a specified price. A put option becomes valuable when the exercise price is more than the current price of the underlying stock.
1 The Microsoft Excel workbook will be available upon request ([email protected]).
For example, a put option on GE stock with an exercise price of $20 when the GE stock price is $15 is worth $5. It is worth $5 because a holder of the put option can buy the GE stock at the prevailing price of $15 and then sell it at the put's exercise price of $20 for a profit of $5. Also, a put option on GE stock with an exercise price of $20 when the stock price is $25 is worth $0. Figures 84.1 and 84.2 are charts showing the value of call and put options on the above GE stock at varying prices.

Figure 84.1: Value of GE call option (strike price = $20).
Figure 84.2: Value of GE put option (strike price = $20).
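The two charts can be reproduced from the expiration-value formulas max(S − X, 0) for a call and max(X − S, 0) for a put. A minimal Python sketch (the function names are ours, not the book's):

```python
def call_value_at_expiration(stock_price, exercise_price):
    """Intrinsic value of a call: worth S - X when S > X, otherwise 0."""
    return max(stock_price - exercise_price, 0.0)

def put_value_at_expiration(stock_price, exercise_price):
    """Intrinsic value of a put: worth X - S when X > S, otherwise 0."""
    return max(exercise_price - stock_price, 0.0)

# The GE examples from the text:
print(call_value_at_expiration(25, 20))  # call struck at $20, stock at $25 -> 5.0
print(put_value_at_expiration(15, 20))   # put struck at $20, stock at $15 -> 5.0
print(call_value_at_expiration(15, 30))  # out-of-the-money call -> 0.0
```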
84.3 One-Period Option Pricing Model
What should be the value of these options? Let us look at a case where we are only concerned with the value of options for one period. In the next period, a stock price can either go up or go down. Consider a case where we know for certain that a GE stock with a price of $20 will either go up 5% or go down 5% in the next period, and the exercise price is $20. Figures 84.3, 84.4, and 84.5 show the Decision Trees for the GE stock price, the GE call option price, and the GE put option price, respectively. Let us first consider the issue of pricing a GE call option. Using a one-period Decision Tree, we can illustrate the price of the GE stock if it goes up 5% and the price if it goes down 5%. Since we know the possible ending values of the GE stock, we can derive the possible ending values of a call option. If the stock price increases to $21 = $20 × (1 + 5%), the price of the GE call option will then be $1 = max($21 − $20, 0). If the GE stock price decreases to $19, the value of the call option will be worth $0 = max($19 − $20, 0) because the stock price would be below the exercise price of $20. We have just discussed the possible ending value of a GE call option in period 1.
Figure 84.3: GE stock price (S0 = 20; S1u = 21, S1d = 19).
Figure 84.4: GE call option price (c0 = ?; c1u = 1, c1d = 0).
Figure 84.5: GE put option price (p0 = ?; p1u = 0, p1d = 1).
J.-R. Chang and J. Lee
But what we are really interested in is the value of the GE call option now, knowing the two possible resulting values of the GE call option. To help determine the value of a one-period GE call option, it is useful to know that it is possible to replicate the two resulting states of the value of the GE call option by buying a combination of stocks, S, and bonds, B. Following are the equations to replicate the payoffs in the two states. We will assume that the interest rate for the bond is 3%.

21S + 1.03B = 1,
19S + 1.03B = 0.

We can use simple algebra to solve for both shares of stocks and bonds, S and B. The first thing that we need to do is to rearrange the second equation as follows:

1.03B = −19S.

With the above equation, we can rewrite the first equation as

21S + (−19S) = 1,
2S = 1,
S = 0.5.

We can solve for the shares of bonds, B, by substituting the value 0.5 for the shares of stocks, S, in the first equation:

21(0.5) + 1.03B = 1,
10.5 + 1.03B = 1,
1.03B = −9.5,
B = −9.223.

Therefore, from the above simple algebraic exercise, we should at period 0 buy 0.5 shares of GE stock and borrow 9.223 at 3% to replicate the payoff of the GE call option. This means the value of a GE call option should be 0.5 × 20 − 9.223 = 0.777. The procedure used here to create a call option in terms of a bond and a stock is called a synthetic option. Further details on synthetic options can be found in Chapter 24 of Lee et al. (2013). If this were not the case, there would be arbitrage profits. For example, if the call option were sold for $3, there would be a profit of 2.223. This would result in increased selling of the GE call option. The increase in
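The algebra above is mechanical enough to check in a few lines. The following illustrative Python sketch (the variable names are ours, not the chapter's workbook) solves the two replication equations and prices the call:

```python
# Replicate the one-period GE call: 21S + 1.03B = 1 (up state), 19S + 1.03B = 0 (down state).
up_price, down_price = 21.0, 19.0
up_payoff, down_payoff = 1.0, 0.0
bond_growth = 1.03  # one dollar of bonds grows to $1.03

# Subtracting the down equation from the up equation eliminates B.
shares = (up_payoff - down_payoff) / (up_price - down_price)
# Back out B from the up-state equation.
bonds = (up_payoff - up_price * shares) / bond_growth

# The call must cost the same as the replicating portfolio at period 0.
call_value = shares * 20 + bonds
print(round(shares, 3), round(bonds, 3), round(call_value, 3))  # 0.5 -9.223 0.777
```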
the supply of GE call options would push the price of the call options down. If the call options were sold for $0.50, there would be a saving of 0.277. This saving would result in increased demand for the GE call option, which would push the price of the call option up. The equilibrium point would be $0.777. Using the above-mentioned concept and procedure, Benninga (2000) has derived a one-period call option model as

C = qu Max[uS − X, 0] + qd Max[dS − X, 0],
(84.1)
where

qu = [(1 + i) − d]/[(1 + i)(u − d)],
qd = [u − (1 + i)]/[(1 + i)(u − d)],

u = increase factor, d = down factor, and i = interest rate. If we let i = r, R = (1 + r), p = (R − d)/(u − d), 1 − p = (u − R)/(u − d), Cu = Max[uS − X, 0] and Cd = Max[dS − X, 0], then we have

C = [pCu + (1 − p)Cd]/R,
(84.2)
where Cu is the call option price after an increase and Cd is the call option price after a decrease. The following equations calculate the value of the above one-period call option, where the strike price, X, is $20 and the risk-free interest rate is 3%. We will assume that the price of the stock for any given period will either increase or decrease by 5%.

X = $20,
S = $20,
u = 1.05,
d = 0.95,
R = 1 + r = 1.03,
p = (1.03 − 0.95)/(1.05 − 0.95) = 0.8,
C = [0.8(1) + 0.2(0)]/1.03 = $0.777.
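The same $0.777 also comes directly out of equation (84.2). A small illustrative Python sketch (function name is ours):

```python
def one_period_call(S, X, u, d, R):
    # Equation (84.2): discount the risk-neutral expectation of the payoffs.
    p = (R - d) / (u - d)          # risk-neutral probability of an up move
    c_up = max(u * S - X, 0)       # Cu
    c_down = max(d * S - X, 0)     # Cd
    return (p * c_up + (1 - p) * c_down) / R

print(one_period_call(20, 20, 1.05, 0.95, 1.03))  # approximately 0.777
```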
Figure 84.6: Call option price (c0 = 0.777; c1u = 1.000, c1d = 0).
Therefore, from the above calculations, the value of the call option is $0.777. Figure 84.6 shows the resulting Decision Tree for the above call option. Like the call option, it is possible to replicate the two resulting states of the value of the put option by buying a combination of stocks, S, and bonds, B. Following are the equations to replicate the put option payoffs, including the state where the price decreases to $19:

21S + 1.03B = 0,
19S + 1.03B = 1.

We will use simple algebra to solve for both shares of stocks and bonds, S and B. The first thing we will do is to rewrite the second equation as follows:

1.03B = 1 − 19S.

The next thing to do is to substitute the above equation into the first put option equation. Doing this results in the following:

21S + 1 − 19S = 0.

The following solves for the shares of stocks, S:

2S = −1,
S = −0.5.

Now, let us solve for the shares of bonds, B, by putting the value of S into the first equation. This is shown as follows:

21(−0.5) + 1.03B = 0,
1.03B = 10.5,
B = 10.194.

From the above simple algebra exercise, we have S = −0.5 and B = 10.194. This tells us that at period 0 we should lend $10.194 at 3% and sell short 0.5 shares of stock to replicate the put option payoff for period 1, and the value of the GE put option should be 20(−0.5) + 10.194 = 0.194.
Using the same arbitrage argument that we used in the discussion of the call option, 0.194 has to be the equilibrium price of the put option. As with the call option, Benninga (2000) has derived a one-period put option model as P = qu Max[X − uS, 0] + qd Max[X − dS, 0],
(84.3)
where

qu = [(1 + i) − d]/[(1 + i)(u − d)],
qd = [u − (1 + i)]/[(1 + i)(u − d)],

u = increase factor, d = down factor, and i = interest rate. If we let i = r, R = (1 + r), p = (R − d)/(u − d), 1 − p = (u − R)/(u − d), Pu = Max[X − uS, 0] and Pd = Max[X − dS, 0], then we have

P = [pPu + (1 − p)Pd]/R,
(84.4)
where Pu is the put option price after an increase and Pd is the put option price after a decrease. The following equation calculates the value of the above one-period put option, where the strike price, X, is $20 and the risk-free interest rate is 3%:

P = [0.8(0) + 0.2(1)]/1.03 = $0.194.

From the above calculation, the resulting put option pricing Decision Tree is shown in Figure 84.7.
Figure 84.7: GE put option price (p0 = 0.194; p1u = 0, p1d = 1).
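Equation (84.4) can be checked the same way. The following illustrative Python sketch (function name is ours) also verifies the answer against the put–call parity relation of equation (84.5) below, using the call value C = 0.777 found earlier:

```python
def one_period_put(S, X, u, d, R):
    # Equation (84.4): risk-neutral expectation of the put payoffs, discounted.
    p = (R - d) / (u - d)
    p_up = max(X - u * S, 0)     # Pu
    p_down = max(X - d * S, 0)   # Pd
    return (p * p_up + (1 - p) * p_down) / R

put = one_period_put(20, 20, 1.05, 0.95, 1.03)

# Put-call parity: P = C + X/R - S, with the rounded call value C = 0.777.
parity_put = 0.777 + 20 / 1.03 - 20
print(round(put, 3), round(parity_put, 3))  # both approximately 0.194
```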
There is a relationship between the price of a put option and the price of a call option. This relationship is called the put–call parity. Equation (84.5) shows the relationship between the price of a put option and the price of a call option:

P = C + X/R − S,
(84.5)
where C is the call price, X is the strike price, R is 1 + the interest rate, and S is the stock price. The following uses the put–call parity to calculate the price of the GE put option:

P = $0.777 + $20/(1.03) − $20 = 0.777 + 19.417 − 20 = 0.194.

84.4 Two-Period Option Pricing Model

We will now look at pricing options for two periods. Figure 84.8 shows the stock price Decision Tree based on the parameters indicated in the last section. This Decision Tree was created based on the assumption that the stock price will either increase by 5% or decrease by 5% in each period. How do we price the value of a call and a put option for two periods? The highest possible value for our stock based on our assumption is $22.05. We get this value by first multiplying the stock price at period 0 by 105% to get the resulting value of $21 at period 1. We then again multiply the stock price in period 1 by 105% to get the resulting value of $22.05. In period 2, the value of a call option when the stock price is $22.05 is the stock price minus the exercise price, $22.05 − $20, or $2.05. In period 2, the value of a put option when the stock price is $22.05 is the exercise price minus the stock price, $20 − $22.05, or −$2.05. A negative value has no value to an investor, so the value of the put option would be $0.

Figure 84.8: GE stock price (period 2 values: 22.05, 19.95, 19.95, 18.05).
The lowest possible value for our stock based on our assumptions is $18.05. We get this value by first multiplying the stock price at period 0 by 95% (decreasing the value of the stock by 5%) to get the resulting value of $19.00 at period 1. We then again multiply the stock price in period 1 by 95% to get the resulting value of $18.05. In period 2, the value of a call option when the stock price is $18.05 is the stock price minus the exercise price, $18.05 − $20, or −$1.95. A negative value has no value to an investor, so the value of the call option would be $0. In period 2, the value of a put option when the stock price is $18.05 is the exercise price minus the stock price, $20 − $18.05, or $1.95. We can derive the call and put option values for the other possible values of the stock in period 2 in the same fashion. Figures 84.9 and 84.10 show the possible call and put option values for period 2. We cannot calculate the value of the call and put options in period 1 the same way as we did in period 2 because period 1 is not the ending period for the stock. In period 1, there are two possible call values: one when the stock price has increased, and one when the stock price has decreased. The call option Decision Tree shown in Figure 84.9 shows two possible values for a call option in period 1. If we just focus on the value of a call option when the
Figure 84.9: GE call option (period 2 values: 2.05, 0, 0, 0).

Figure 84.10: GE put option (period 2 values: 0, 0.05, 0.05, 1.95).
stock price increases from period 0, we will note that it looks like the Decision Tree for a call option for one period. This is shown in Figure 84.11. Using the same method as for pricing a call option for one period, the price of a call option when the stock price increases from period 0 will be $1.5922. The resulting Decision Tree is shown in Figure 84.12. In the same fashion, we can price the value of a call option when the stock price decreases. The price of a call option when the stock price decreases from period 0 is $0. The resulting Decision Tree is shown in Figure 84.13.
Figure 84.11: GE call option (up-state subtree; period 2 values: 2.05, 0).

Figure 84.12: GE call option (period-1 up-state value: 1.5922).

Figure 84.13: GE call option (period-1 down-state value: 0).
Figure 84.14: GE call option (period-0 value: 1.2367).
Figure 84.15: GE put option (period-0 value: 0.0886).
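The repeated one-period calculations behind Figures 84.11–84.15 amount to backward induction. The following illustrative Python sketch (the function and its payoff argument are ours) reproduces the period-0 values in those figures:

```python
def two_period_value(S, X, u, d, R, payoff):
    # Terminal option values after two moves: up-up, up-down (= down-up), down-down.
    p = (R - d) / (u - d)
    v_uu = payoff(S * u * u, X)
    v_ud = payoff(S * u * d, X)
    v_dd = payoff(S * d * d, X)
    # Roll back one period at a time, exactly as in the one-period model.
    v_u = (p * v_uu + (1 - p) * v_ud) / R
    v_d = (p * v_ud + (1 - p) * v_dd) / R
    return (p * v_u + (1 - p) * v_d) / R

call = two_period_value(20, 20, 1.05, 0.95, 1.03, lambda s, x: max(s - x, 0))
put = two_period_value(20, 20, 1.05, 0.95, 1.03, lambda s, x: max(x - s, 0))
print(round(call, 4), round(put, 4))  # 1.2367 0.0886, as in Figures 84.14 and 84.15
```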
In the same fashion, we can price the value of a call option in period 0. The resulting Decision Tree is shown in Figure 84.14. We can calculate the value of a put option in the same manner as we calculated the value of a call option. The Decision Tree for a put option is shown in Figure 84.15.

84.5 Using Microsoft Excel to Create the Binomial Option Trees

In the previous section, we priced the value of a call and a put option by pricing backward from the last period to the first period. This method of pricing call and put options will work for any number of periods. To price the value of a call option for two periods required seven sets of calculations. The number of calculations increases dramatically as n increases. Table 84.1 lists the number of calculations for specific numbers of periods. After two periods, it becomes very cumbersome to calculate and create the Decision Trees for a call and put option. In the previous section, we saw that the calculations were very repetitive and mechanical. To solve this problem,
Table 84.1: Number of calculations for specific numbers of periods.

Periods    Calculations
1          3
2          7
3          15
4          31
5          63
6          127
7          255
8          511
9          1023
10         2047
11         4095
12         8191
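Because the Decision Trees in this chapter do not recombine, an n-period tree is a full binary tree, so the number of required calculations grows as 2^(n+1) − 1. A quick illustrative check in Python:

```python
def tree_calculations(periods):
    # A non-recombining binomial tree has 1 + 2 + 4 + ... + 2^periods nodes,
    # and each node requires one calculation.
    return 2 ** (periods + 1) - 1

print([tree_calculations(n) for n in range(1, 13)])
# [3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095, 8191]
```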
this chapter will use Microsoft Excel to do the calculations and create the Decision Trees for the call and put options. We will also use Microsoft Excel to calculate and draw the related Decision Trees for the underlying stock and bond. To solve this repetitive and mechanical calculation of the Binomial OPM, we will look at a Microsoft Excel file called binomialBS OPM.xls. We will use this Microsoft Excel workbook to produce four Decision Trees for the GE stock that was discussed in the previous sections. The four Decision Trees are as follows:

(1) Stock Price;
(2) Call Option Price;
(3) Put Option Price;
(4) Bond Price.
This section will demonstrate how to use the binomialBS OPM.xls Excel file to create the four Decision Trees. Figure 84.16 shows the Excel file binomialBS OPM.xls after the file is opened. Pushing the button shown in Figure 84.16 will result in the dialog box shown in Figure 84.17. The dialog box shown in Figure 84.17 shows the parameters for the Binomial OPM. These parameters are changeable. The dialog box in Figure 84.17 shows the default values. Pushing the calculate button shown in Figure 84.17 will produce the four Decision Trees shown in Figures 84.18–84.21.
Figure 84.16: Excel file binomialBS OPM.xls.

Figure 84.17: Dialog box showing parameters for the Binomial OPM.
Figure 84.18: Stock price decision tree (Price = 20, Exercise = 20, U = 1.0500, D = 0.9500, N = 4, R = 0.03; 31 calculations).
Table 84.1 indicated that 31 calculations were required to create a Decision Tree that has four periods. This section showed four Decision Trees. Therefore, the Excel file did 31 × 4 = 124 calculations to create the four Decision Trees. Benninga (2000, p. 260) has defined the price of a call option in a Binomial OPM with n periods as

C = \sum_{i=0}^{n} \binom{n}{i} q_u^i q_d^{n-i} \max[S(u)^i (d)^{n-i} - X, 0],

(84.6)
Figure 84.19: Call option pricing decision tree (Price = 20, Exercise = 20, U = 1.0500, D = 0.9500, N = 4, R = 0.03; binomial call price = 2.2945).
and the price of a put option in a Binomial OPM with n periods as

P = \sum_{i=0}^{n} \binom{n}{i} q_u^i q_d^{n-i} \max[X - S(u)^i (d)^{n-i}, 0].

(84.7)

Lee et al. (2000, p. 237) have defined the price of a call option in a Binomial OPM with n periods as

C = \frac{1}{R^n} \sum_{k=0}^{n} \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k} \max[0, (u)^k (d)^{n-k} S - X].

(84.8)
Figure 84.20: Put option pricing decision tree (Price = 20, Exercise = 20, U = 1.0500, D = 0.9500, N = 4, R = 0.03; binomial put price = 0.0643).
The price of a put option in a Binomial OPM with n periods would then be defined as

P = \frac{1}{R^n} \sum_{k=0}^{n} \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k} \max[0, X - (u)^k (d)^{n-k} S].

(84.9)
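Equations (84.8) and (84.9) can be sketched in a few lines of Python (an illustrative implementation, not the chapter's VBA). With the GE parameters and n = 4, it reproduces the prices shown in Figures 84.19 and 84.20:

```python
from math import comb

def binomial_call(S, X, u, d, R, n):
    # Equation (84.8): discounted risk-neutral expectation over terminal nodes.
    p = (R - d) / (u - d)
    payoff = lambda k: max(u**k * d**(n - k) * S - X, 0)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * payoff(k)
               for k in range(n + 1)) / R**n

def binomial_put(S, X, u, d, R, n):
    # Equation (84.9): the same expectation with the put payoff.
    p = (R - d) / (u - d)
    payoff = lambda k: max(X - u**k * d**(n - k) * S, 0)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * payoff(k)
               for k in range(n + 1)) / R**n

print(round(binomial_call(20, 20, 1.05, 0.95, 1.03, 4), 4))  # 2.2945
print(round(binomial_put(20, 20, 1.05, 0.95, 1.03, 4), 4))   # 0.0643
```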
84.6 Using Microsoft Excel to Create Binomial American Option Trees An American option is an option that the holder may exercise at any time between the start date and the maturity date. Therefore, the holder of an American option faces the dilemma of deciding when to exercise the option.
Figure 84.21: Bond pricing decision tree (Price = 20, Exercise = 20, U = 1.0500, D = 0.9500, N = 4, R = 0.03; 31 calculations).
Binomial tree valuation can be adapted to include the possibility of exercise at intermediate dates, not just at the maturity date. This feature needs to be incorporated into the pricing of American options. The first step of pricing an American option is the same as for a European option. For an American put option, the second step is, at each node N, to take the maximum of (a) the difference between the strike price and the stock price at node N, which is the value of exercising immediately, and (b) the value of the European put option at node N. Figure 84.22 shows the American put option binomial tree. This American put option has the same parameters as the European put option. With the same input parameters, we can see that the value of the European put option and the value of the American put option are different. The
Figure 84.22: American put decision tree.
value of the European put option is 0.0643, while the value of the American put option is 0.229394. The circled node in the American put option binomial tree is one reason for this. At this node, the American put option has a value of 1.047502, while at the same node, the European put option has a value of 0.4650. At this node, the value of the put option is the maximum of the difference between the strike price and the stock price at this node and the value of the European put option at this node. At this node, the stock price is 18.9525 and the strike price is 20.
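The early-exercise rule just described can be sketched as a small backward-induction loop. The following illustrative Python function (ours, not the workbook's VBA) reproduces the American put value 0.229394:

```python
def american_put(S, X, u, d, R, n):
    # Terminal payoffs on a recombining tree (k = number of up moves).
    p = (R - d) / (u - d)
    values = [max(X - S * u**k * d**(n - k), 0) for k in range(n + 1)]
    for j in range(n - 1, -1, -1):
        for k in range(j + 1):
            cont = (p * values[k + 1] + (1 - p) * values[k]) / R  # European continuation value
            exercise = X - S * u**k * d**(j - k)                  # value of exercising now
            values[k] = max(cont, exercise)
    return values[0]

print(round(american_put(20, 20, 1.05, 0.95, 1.03, 4), 6))  # 0.229394
```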
Mathematically, the price of the American put option at this node is

Max(X − St, 0.4650) = Max(20 − 18.9525, 0.4650) = 1.0475.

84.7 Alternative Tree Methods

In this section, we will introduce the Cox, Ross and Rubinstein (1979) binomial tree method and the Kamrad and Ritchken (1991) trinomial tree method to price option values. Kamrad and Ritchken (1991) extend the binomial tree method to multinomial approximation models. The trinomial tree method is one of the multinomial models.

84.7.1 Cox, Ross and Rubinstein

Cox, Ross and Rubinstein (1979) (hereafter CRR) propose an alternative choice of parameters that also creates a risk-neutral valuation environment. The price multipliers, u and d, depend only on the volatility σ and on δt, not on the drift:

u = e^{σ√δt},
d = 1/u.

To offset the absence of a drift component in u and d, the probability of an up move in the CRR tree is usually greater than 0.5 to ensure that the expected value of the price increases by a factor of exp[(r − q)δt] on each step. The formula for p is

p = (e^{(r−q)δt} − d)/(u − d).

Following is the asset price tree based on the CRR binomial tree model.
We can see that the CRR tree is symmetric about its initial asset price, in this case 50. Next, we want to create the option tree in the worksheet, for example, for a call option on this asset. Let f_{i,j} denote the option value in node (i, j), where j refers to period j (j = 0, 1, 2, . . . , N) and i denotes the ith node in period j (in the binomial tree model, node numbers increase going up in the lattice, so i = 0, . . . , j). With these assumptions, the underlying asset price in node (i, j) is S u^i d^{j−i}. At expiration, we have

f_{i,N} = max[S u^i d^{N−i} − X, 0],   i = 0, 1, . . . , N.

Going backward in time (decreasing j), we get

f_{i,j} = e^{−rδt}[p f_{i+1,j+1} + (1 − p) f_{i,j+1}].

The CRR option value tree is shown as follows:
We can see that the call option value at time zero is equal to 3.244077 in cell C12. We can also write a VBA function to price the call option. Following is the function:

' Returns CRR Binomial Option Value
Function CRRBinCall(S, X, r, q, T, sigma, Nstep)
    Dim dt, erdt, ermqdt, u, d, p
    Dim i As Integer, j As Integer
    Dim vvec() As Variant
    ReDim vvec(Nstep)
    dt = T / Nstep
    erdt = Exp(r * dt)
    ermqdt = Exp((r - q) * dt)
    u = Exp(sigma * Sqr(dt))
    d = 1 / u
    p = (ermqdt - d) / (u - d)
    ' Option payoffs at expiration
    For i = 0 To Nstep
        vvec(i) = Application.Max(S * (u ^ i) * (d ^ (Nstep - i)) - X, 0)
    Next i
    ' Discounted backward induction
    For j = Nstep - 1 To 0 Step -1
        For i = 0 To j
            vvec(i) = (p * vvec(i + 1) + (1 - p) * vvec(i)) / erdt
        Next i
    Next j
    CRRBinCall = vvec(0)
End Function
Using this function with the same parameters, we can get the call option value under different numbers of steps. The result is shown as follows:

The formula in cell B12 is = CRRBinCall(B3, B4, B5, B6, B8, B7, B10). We can see that the result in B12 is equal to C12.
If the dividend yield q = 0.01 > 0, we get a lower call option price, 3.10063.
84.7.2 Trinomial tree

Because binomial tree methods are computationally expensive, Kamrad and Ritchken (1991) propose multinomial models. The new multinomial models include existing models as special cases. The more general models are shown to be computationally more efficient.
Expressed algebraically, the trinomial tree parameters are

u = e^{λσ√δt},
d = 1/u.

The formulas for the probabilities are as follows:

pu = 1/(2λ²) + (r − σ²/2)√δt/(2λσ),
pm = 1 − 1/λ²,
pd = 1 − pu − pm.

If the parameter λ is equal to 1, then the trinomial tree model reduces to a binomial tree model. Following is the underlying asset price pattern based on the trinomial tree model.

We can see that this trinomial tree model is also a symmetric tree. The middle price in each period is the same as the initial asset price, 50.
By a similar rule, we can use this tree to price a call option. First, we draw the option tree based on the trinomial underlying asset price tree. The result is shown as follows:
The call option value at time zero is 3.269028 in cell C12. In addition, we can also write a function to price a call option based on the trinomial tree model. The function is shown as follows:

' Returns Trinomial Option Value
Function TriCall(S, X, r, q, T, sigma, Nstep, lamda)
    Dim dt, erdt, ermqdt, u, d, pu, pm, pd
    Dim i As Integer, j As Integer
    Dim vvec() As Variant
    ReDim vvec(2 * Nstep)
    dt = T / Nstep
    erdt = Exp(r * dt)
    ermqdt = Exp((r - q) * dt)
    u = Exp(lamda * sigma * Sqr(dt))
    d = 1 / u
    pu = 1 / (2 * lamda ^ 2) + (r - sigma ^ 2 / 2) * Sqr(dt) / (2 * lamda * sigma)
    pm = 1 - 1 / (lamda ^ 2)
    pd = 1 - pu - pm
    ' Option payoffs at expiration (2 * Nstep + 1 terminal nodes)
    For i = 0 To 2 * Nstep
        vvec(i) = Application.Max(S * (d ^ Nstep) * (u ^ i) - X, 0)
    Next i
    ' Discounted backward induction
    For j = Nstep - 1 To 0 Step -1
        For i = 0 To 2 * j
            vvec(i) = (pu * vvec(i + 2) + pm * vvec(i + 1) + pd * vvec(i)) / erdt
        Next i
    Next j
    TriCall = vvec(0)
End Function
Similar data can be used in this function, and the same call option price at time zero can be obtained. The formula in cell B12 is = TriCall(B3, B4, B5, B6, B8, B7, B10, B9).
84.7.3 Comparison of the option price efficiency

In this section, we would like to compare the efficiency of these two methods. In the following table, we report results for different numbers of steps (1, 2, . . . , 50) under the Black–Scholes, CRR binomial tree, and trinomial tree methods. The following figure shows the result.
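This comparison can be reproduced with a short script. The sketch below is illustrative Python; since the worksheet's exact inputs appear only in the screenshots, it assumes S = X = 50, r = 3%, σ = 20%, T = 0.5, no dividends, and λ = √1.5:

```python
import math

def bs_call(S, X, r, T, sigma):
    # Black-Scholes European call; N(.) via the error function.
    N = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
    d1 = (math.log(S / X) + (r + sigma**2 / 2) * T) / (sigma * math.sqrt(T))
    return S * N(d1) - X * math.exp(-r * T) * N(d1 - sigma * math.sqrt(T))

def crr_call(S, X, r, T, sigma, n):
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt)); d = 1 / u
    p = (math.exp(r * dt) - d) / (u - d)
    v = [max(S * u**i * d**(n - i) - X, 0) for i in range(n + 1)]
    for j in range(n - 1, -1, -1):  # roll back, discounting each step
        v = [(p * v[i + 1] + (1 - p) * v[i]) * math.exp(-r * dt) for i in range(j + 1)]
    return v[0]

def tri_call(S, X, r, T, sigma, n, lam=math.sqrt(1.5)):
    dt = T / n
    u = math.exp(lam * sigma * math.sqrt(dt)); d = 1 / u
    pu = 1 / (2 * lam**2) + (r - sigma**2 / 2) * math.sqrt(dt) / (2 * lam * sigma)
    pm = 1 - 1 / lam**2
    pd = 1 - pu - pm
    v = [max(S * d**n * u**i - X, 0) for i in range(2 * n + 1)]
    for j in range(n - 1, -1, -1):
        v = [(pu * v[i + 2] + pm * v[i + 1] + pd * v[i]) * math.exp(-r * dt)
             for i in range(2 * j + 1)]
    return v[0]

for n in (5, 10, 50):
    print(n, round(crr_call(50, 50, 0.03, 0.5, 0.2, n), 4),
          round(tri_call(50, 50, 0.03, 0.5, 0.2, n), 4))
print("BS", round(bs_call(50, 50, 0.03, 0.5, 0.2), 4))
```

Both lattices approach the Black–Scholes value as the number of steps grows, which is the pattern the chapter's table and figure display.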
In order to assess the result in more depth, we plot the result as shown in the following picture.
After we increase the number of steps, we can see that the trinomial tree method converges to the Black–Scholes value more quickly than the CRR binomial tree method.

84.7.4 Excel file

This section will demonstrate how to use the Binomial Tree Model.xlsm Excel file to calculate four options. Figure 84.23 shows the Excel file Binomial Tree Model.xlsm after the file is opened. Pushing the button shown in Figure 84.23 will result in the dialog box shown in Figure 84.24.

Figure 84.23: Excel file Binomial Tree Model.xlsm.

The dialog box shown in Figure 84.24 shows the parameters for the Binomial OPM. These parameters are changeable. The dialog box shows the default values. Pushing the calculate button will produce the European call, European put, American call, and American put option values.

84.8 Black–Scholes Option Pricing Model

The most popular OPM is the Black–Scholes OPM. In this section, we will demonstrate the usage of the Black–Scholes OPM. In later sections, we will demonstrate the relationship between the Binomial OPM and the Black–Scholes pricing model. The Black–Scholes model prices European call and put options. The Black–Scholes model for a European call option is

C = SN(d1) − Xe^{−rT}N(d2),
(84.10)
Figure 84.24: Dialog box showing parameters for the Binomial OPM.
where C is the call price, S is the stock price, r is the risk-free interest rate, T is the time to maturity of the option in years, N(·) is the cumulative standard normal distribution, and σ is the stock volatility, with

d1 = [ln(S/X) + (r + σ²/2)T]/(σ√T),
d2 = d1 − σ√T.

Let us manually calculate the price of a European call option in terms of equation (84.10) with the following parameter values: S = 20, X = 20, r = 3%, T = 4, σ = 20%.

Solution:

d1 = [ln(20/20) + (0.03 + 0.2²/2)(4)]/(0.2√4) = (0.03 + 0.02)(4)/0.4 = 0.2/0.4 = 0.5,
d2 = 0.5 − 0.2√4 = 0.1,
N(d1) = 0.69146, N(d2) = 0.5398, e^{−rT} = 0.8869,
C = (20)(0.69146) − (20)(0.8869)(0.5398) = 13.8292 − 9.5749724 = 4.2542276.

The Black–Scholes put–call parity equation is

P = C − S + Xe^{−rT}.

The put option value for the stock would be

P = 4.25 − 20 + 20(0.8869) = 4.25 − 20 + 17.738 = 1.988.

84.9 Relationship Between the Binomial Option Pricing Model and the Black–Scholes Option Pricing Model

We can use either the Binomial model or the Black–Scholes model to price an option. They both should result in similar numbers. If we look at the parameters in both models, we will note that the Binomial model has an Increase Factor (U), a Decrease Factor (D), and an n-period parameter that the Black–Scholes model does not have. We also note that the Black–Scholes model has σ and T parameters that the Binomial model does not have. Benninga (2008) suggests the following translation between the Binomial and Black–Scholes parameters:

Δt = T/n,   R = e^{rΔt},   U = e^{σ√Δt},   D = e^{−σ√Δt}.
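With these translations, the numbers in Sections 84.8 and 84.10 can be checked together. The following illustrative Python sketch (ours, not the workbook's VBA) computes the Black–Scholes call and put and the 4-period binomial approximation:

```python
import math

S, X, r, sigma, T, n = 20, 20, 0.03, 0.2, 4, 4
N = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Black-Scholes call and put (Section 84.8).
d1 = (math.log(S / X) + (r + sigma**2 / 2) * T) / (sigma * math.sqrt(T))
d2 = d1 - sigma * math.sqrt(T)
bs_call = S * N(d1) - X * math.exp(-r * T) * N(d2)
bs_put = bs_call - S + X * math.exp(-r * T)  # put-call parity

# Binomial approximation with the translated parameters.
dt = T / n
R = math.exp(r * dt)
U = math.exp(sigma * math.sqrt(dt))
D = math.exp(-sigma * math.sqrt(dt))
p = (R - D) / (U - D)
binom_call = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
                 * max(S * U**k * D**(n - k) - X, 0)
                 for k in range(n + 1)) / R**n

print(round(bs_call, 3), round(bs_put, 3), round(binom_call, 3))  # 4.254 1.992 4.067
```

With only n = 4 periods, the binomial price (about 4.067) is still visibly below the Black–Scholes price, which is the gap the decision trees below illustrate.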
In the Excel program shown in Appendix 84A, we use Benninga's (2008) Increase Factor and Decrease Factor definitions. They are defined as follows:

qu = (R − d)/[R(u − d)],
qd = (u − R)/[R(u − d)],

where u = 1 + percentage of price increase, d = 1 − percentage of price decrease, and R = 1 + interest rate. The exact derivation of the Black–Scholes option pricing model and the binomial option model can be found in Lee et al. (2016). In addition, that paper also shows how the Black–Scholes model can be derived by Ito calculus.

84.10 Decision Tree Black–Scholes Calculation

We will now use the BinomialBS OPM.xls Excel file to calculate the Binomial and Black–Scholes call and put values illustrated in Section 84.5. Note that in Figure 84.25, the Binomial Black–Scholes Approximation checkbox
Figure 84.25: Dialog box showing parameters for the Binomial OPM.
is checked. Checking this box will cause the T and σ parameters to appear and will adjust the Increase Factor (u) and Decrease Factor (d) parameters. The adjustment was done as indicated in Section 84.7. Note in Figures 84.26 and 84.27 that the Binomial OPM value does not agree with the Black–Scholes OPM value. The Binomial OPM value will get very close to the Black–Scholes OPM value once the Binomial parameter n gets very large. Benninga (2008) demonstrated that the Binomial value will be close to the Black–Scholes value once the Binomial n parameter gets larger than 500. Hull (2017) has shown that the decision-tree techniques discussed in this section can also be used to evaluate American options with dividend payments. In addition, Lee et al. (2016, Chapter 25) have used examples to show how the decision-tree method can be used to evaluate both American call and put options.

84.11 Summary

This chapter demonstrated, with the aid of Microsoft Excel and Decision Trees, the Binomial Option Model in a less mathematical fashion. This chapter allows the reader to focus more on the concepts by studying the
Figure 84.26: Decision tree approximation of Black–Scholes call pricing (Price = 20, Exercise = 20, U = 1.2214, D = 0.8187, N = 4, R = 0.03; binomial call price = 4.0670, Black–Scholes call price = 4.2536).
associated Decision Trees, which were created with Microsoft Excel. This chapter also demonstrated that using Microsoft Excel releases the reader from the computational burden of the Binomial Option Model. This chapter also published the Microsoft Excel VBA code that created the Binomial Option Decision Trees. This allows those who are interested to study the many advanced Microsoft Excel VBA programming concepts that were used to create the Decision Trees. One major computer science programming concept used by the Excel VBA program in this chapter is recursive programming. Recursive programming is the idea of a procedure
J.-R. Chang and J. Lee
Put Option Pricing Decision Tree. Price = 20, Exercise = 20, U = 1.2214, D = 0.8187, N = 4, R = 0.03. Number of calculations: 31. Binomial Put Price: 1.8055; Black-Scholes Put Price: 1.9920.
[Decision-tree node values omitted; see Figure 84.27.]
Figure 84.27: Decision tree approximation of Black–Scholes put pricing.
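As a quick consistency check, the Black–Scholes put price reported in Figure 84.27 is implied by the call price in Figure 84.26 through put-call parity, the same identity the appendix's BS_PUT function uses:

```python
import math

S, X, T, r = 20.0, 20.0, 4.0, 0.03
call = 4.2536                          # Black-Scholes call price from Figure 84.26
put = call - S + X * math.exp(-r * T)  # put-call parity: P = C - S + X*exp(-rT)
print(round(put, 4))                   # 1.992, matching Figure 84.27
```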
calling itself many times; inside the procedure, there are statements that decide when it should stop calling itself. This chapter also used decision trees to demonstrate the relationship between the Binomial OPM and the Black–Scholes OPM.

Bibliography

S. Benninga (2000). Financial Modeling, Cambridge, MA: MIT Press.
S. Benninga (2008). Financial Modeling, Cambridge, MA: MIT Press.
F. Black and M. Scholes (1973). The Pricing of Options and Corporate Liabilities. Journal of Political Economy, 81, 637–654.
J. Cox, S. A. Ross, and M. Rubinstein (1979). Option Pricing: A Simplified Approach. Journal of Financial Economics, 7, 229–263.
R. T. Daigler (1994). Financial Futures and Options Markets: Concepts and Strategies, New York: Harper Collins.
J. Hull (2017). Options, Futures, and Other Derivatives, 10th ed. Upper Saddle River, NJ: Prentice Hall.
R. Jarrow and S. Turnbull (1996). Derivative Securities, Cincinnati: South-Western College Publishing.
C. F. Lee (2009). Handbook of Quantitative Finance, New York, NY: Springer.
C. F. Lee and A. C. Lee (2006). Encyclopedia of Finance, New York, NY: Springer.
C. F. Lee, J. C. Lee, and A. C. Lee (2000). Statistics for Business and Financial Economics, Singapore: World Scientific.
J. C. Lee, C. F. Lee, R. S. Wang, and T. I. Lin (2004). On the Limit Properties of Binomial and Multinomial Option Pricing Models: Review and Integration, in Advances in Quantitative Analysis of Finance and Accounting, New Series, 1. Singapore: World Scientific.
C. F. Lee, C. M. Tsai, and A. C. Lee (2019). Asset Pricing with Disequilibrium Price Adjustment: Theory and Empirical Evidence. Quantitative Finance, 13(2).
J. C. Lee (2001). Using Microsoft Excel and Decision Trees to Demonstrate the Binomial Option Pricing Model. Advances in Investment Analysis and Portfolio Management, 8, 303–329.
C. F. Lee, J. Finnerty, J. Lee, A. Lee, and D. Wort (2013). Security Analysis, Portfolio Management, and Financial Derivatives. Singapore: World Scientific.
C. F. Lee, J. Lee, J. R. Chang, and T. Tai (2016). Essentials of Excel, Excel VBA, SAS and Minitab for Statistical and Financial Analyses, New York: Springer.
C. F. Lee, Y. Chen, and J. Lee (2016). Alternative Methods to Derive Option Pricing Models: Review and Comparison. Review of Quantitative Finance and Accounting, 47, 417.
A. W. Lo and J. Wang (2000). Trading Volume: Definitions, Data Analysis, and Implications of Portfolio Theory. Review of Financial Studies, 13, 257–300.
R. L. McDonald (2012). Derivatives Markets, 3rd ed. Boston, MA: Addison Wesley.
R. J. Rendleman, Jr. and B. J. Bartter (1979). Two-State Option Pricing. Journal of Finance, 34(5), 1093–1110.
E. Wells and S. Harshbarger (1997). Microsoft Excel 97 Developer's Handbook, Redmond, WA: Microsoft Press.
J. Walkenbach (2003). Excel 2003 Power Programming with VBA, Indianapolis, IN: Wiley Publishing, Inc.
Appendix 84A: Excel VBA Code — Binomial Option Pricing Model

Part of what makes Microsoft Excel powerful is that it offers a professional programming language, VBA. This appendix shows the VBA code that generated the decision trees for the Binomial Option Pricing Model. The code resides in the form frmBinomiaOption; the procedure cmdCalculate_Click is the first procedure to run.
'/***************************************************************************
'/ Relationship Between the Binomial OPM
'/ and Black-Scholes OPM:
'/ Decision Tree and Microsoft Excel Approach
'/
'/ by John Lee
'/ [email protected]
'/ All Rights Reserved
'/***************************************************************************
Option Explicit

Dim mwbTreeWorkbook As Workbook
Dim mwsTreeWorksheet As Worksheet
Dim mwsCallTree As Worksheet
Dim mwsPutTree As Worksheet
Dim mwsBondTree As Worksheet
Dim mdblPFactor As Double
Dim mBinomialCalc As Long
Dim mCallPrice As Double    'jcl 12/8/2008
Dim mPutPrice As Double     'jcl 12/8/2008

'/**************************************************
'/Purpose: Keep track of the number of binomial calcs
'/**************************************************
Property Let BinomialCalc(l As Long)
    mBinomialCalc = l
End Property
Property Get BinomialCalc() As Long
    BinomialCalc = mBinomialCalc
End Property
Property Set TreeWorkbook(wb As Workbook)
    Set mwbTreeWorkbook = wb
End Property
Property Get TreeWorkbook() As Workbook
    Set TreeWorkbook = mwbTreeWorkbook
End Property
Property Set TreeWorksheet(ws As Worksheet)
    Set mwsTreeWorksheet = ws
End Property
Property Get TreeWorksheet() As Worksheet
    Set TreeWorksheet = mwsTreeWorksheet
End Property
Property Set CallTree(ws As Worksheet)
    Set mwsCallTree = ws
End Property
Property Get CallTree() As Worksheet
    Set CallTree = mwsCallTree
End Property
Property Set PutTree(ws As Worksheet)
    Set mwsPutTree = ws
End Property
Property Get PutTree() As Worksheet
    Set PutTree = mwsPutTree
End Property
Property Set BondTree(ws As Worksheet)
    Set mwsBondTree = ws
End Property
Property Get BondTree() As Worksheet
    Set BondTree = mwsBondTree
End Property
Property Let CallPrice(dCallPrice As Double)    '12/8/2008
    mCallPrice = dCallPrice
End Property
Property Get CallPrice() As Double
    Let CallPrice = mCallPrice
End Property
Property Let PutPrice(dPutPrice As Double)    '12/10/2008
    mPutPrice = dPutPrice
End Property
Property Get PutPrice() As Double    '12/10/2008
    Let PutPrice = mPutPrice
End Property
Property Let PFactor(r As Double)
    Dim dRate As Double
    dRate = ((1 + r) - Me.txtBinomialD) / (Me.txtBinomialU - Me.txtBinomialD)
    Let mdblPFactor = dRate
End Property
Property Get PFactor() As Double
    Let PFactor = mdblPFactor
End Property
Property Get qU() As Double
    Dim dblDeltaT As Double
    Dim dblDown As Double
    Dim dblUp As Double
    Dim dblR As Double
    dblDeltaT = Me.txtTimeT / Me.txtBinomialN
    dblR = Exp(Me.txtBinomialr * dblDeltaT)
    dblUp = Exp(Me.txtSigma * VBA.Sqr(dblDeltaT))
    dblDown = Exp(-Me.txtSigma * VBA.Sqr(dblDeltaT))
    qU = (dblR - dblDown) / (dblR * (dblUp - dblDown))
End Property
Property Get qD() As Double
    Dim dblDeltaT As Double
    Dim dblDown As Double
    Dim dblUp As Double
    Dim dblR As Double
    dblDeltaT = Me.txtTimeT / Me.txtBinomialN
    dblR = Exp(Me.txtBinomialr * dblDeltaT)
    dblUp = Exp(Me.txtSigma * VBA.Sqr(dblDeltaT))
    dblDown = Exp(-Me.txtSigma * VBA.Sqr(dblDeltaT))
    qD = (dblUp - dblR) / (dblR * (dblUp - dblDown))
End Property
Private Sub chkBinomialBSApproximation_Click()
    On Error Resume Next
    'Time and Sigma are the only Black-Scholes-specific parameters
    Me.txtTimeT.Visible = Me.chkBinomialBSApproximation
    Me.lblTimeT.Visible = Me.chkBinomialBSApproximation
    Me.txtSigma.Visible = Me.chkBinomialBSApproximation
    Me.lblSigma.Visible = Me.chkBinomialBSApproximation
    txtTimeT_Change
End Sub

Private Sub cmdCalculate_Click()
    Me.Hide
    BinomialOption
    Unload Me
End Sub

Private Sub cmdCancel_Click()
    Unload Me
End Sub
Private Sub txtBinomialN_Change()    'jcl 12/8/2008
    On Error Resume Next
    If Me.chkBinomialBSApproximation Then
        Me.txtBinomialU = Exp(Me.txtSigma * Sqr(Me.txtTimeT / Me.txtBinomialN))
        Me.txtBinomialD = Exp(-Me.txtSigma * Sqr(Me.txtTimeT / Me.txtBinomialN))
    End If
End Sub

Private Sub txtTimeT_Change()    'jcl 12/8/2008
    On Error Resume Next
    If Me.chkBinomialBSApproximation Then
        Me.txtBinomialU = Exp(Me.txtSigma * Sqr(Me.txtTimeT / Me.txtBinomialN))
        Me.txtBinomialD = Exp(-Me.txtSigma * Sqr(Me.txtTimeT / Me.txtBinomialN))
    End If
End Sub

Private Sub UserForm_Initialize()
    With Me
        .txtBinomialS = 20
        .txtBinomialX = 20
        .txtBinomialD = 0.95
        .txtBinomialU = 1.05
        .txtBinomialN = 4
        .txtBinomialr = 0.03
        .txtSigma = 0.2
        .txtTimeT = 4
        .chkBinomialBSApproximation = False
    End With
    chkBinomialBSApproximation_Click
    Me.Hide
End Sub

Sub BinomialOption()
    Dim wbTree As Workbook
    Dim wsTree As Worksheet
    Dim rColumn As Range
    Dim ws As Worksheet

    Set Me.TreeWorkbook = Workbooks.Add
    Set Me.BondTree = Me.TreeWorkbook.Worksheets.Add
    Set Me.PutTree = Me.TreeWorkbook.Worksheets.Add
    Set Me.CallTree = Me.TreeWorkbook.Worksheets.Add
    Set Me.TreeWorksheet = Me.TreeWorkbook.Worksheets.Add
    Set rColumn = Me.TreeWorksheet.Range("a1")
    With Me
        .BinomialCalc = 0
        .PFactor = Me.txtBinomialr
        .CallTree.Name = "Call Option Price"
        .PutTree.Name = "Put Option Price"
        .TreeWorksheet.Name = "Stock Price"
        .BondTree.Name = "Bond"
    End With
    DecisionTree rCell:=rColumn, nPeriod:=Me.txtBinomialN + 1, _
        dblPrice:=Me.txtBinomialS, sngU:=Me.txtBinomialU, _
        sngD:=Me.txtBinomialD
    DecitionTreeFormat
    TreeTitle wsTree:=Me.TreeWorksheet, sTitle:="Stock Price "
    TreeTitle wsTree:=Me.CallTree, sTitle:="Call Option Pricing"
    TreeTitle wsTree:=Me.PutTree, sTitle:="Put Option Pricing"
    TreeTitle wsTree:=Me.BondTree, sTitle:="Bond Pricing"
    Application.DisplayAlerts = False
    For Each ws In Me.TreeWorkbook.Worksheets
        If Left(ws.Name, 5) = "Sheet" Then
            ws.Delete
        Else
            ws.Activate
            ActiveWindow.DisplayGridlines = False
            ws.UsedRange.NumberFormat = "#,##0.0000_);(#,##0.0000)"
        End If
    Next
    Application.DisplayAlerts = True
    Me.TreeWorksheet.Activate
End Sub

Sub TreeTitle(wsTree As Worksheet, sTitle As String)
    wsTree.Range("A1:A5").EntireRow.Insert (xlShiftDown)
    With wsTree
        With .Cells(1)
            .Value = sTitle
            .Font.Size = 20
            .Font.Italic = True
        End With
        With .Cells(2, 1)
            .Value = "Decision Tree"
            .Font.Size = 16
            .Font.Italic = True
        End With
        With .Cells(3, 1)
            .Value = "Price = " & Me.txtBinomialS & _
                ",Exercise = " & Me.txtBinomialX & _
                ",U = " & Format(Me.txtBinomialU, "#,##0.0000") & _
                ",D = " & Format(Me.txtBinomialD, "#,##0.0000") & _
                ",N = " & Me.txtBinomialN & _
                ",R = " & Me.txtBinomialr
            .Font.Size = 14
        End With
        With .Cells(4, 1)
            .Value = "Number of calculations: " & Me.BinomialCalc
            .Font.Size = 14
        End With
        If wsTree Is Me.CallTree Then
            With .Cells(5, 1)
                .Value = "Binomial Call Price= " & Format(Me.CallPrice, "#,##0.0000")
                .Font.Size = 14
            End With
            If Me.chkBinomialBSApproximation Then
                wsTree.Range("A6:A7").EntireRow.Insert (xlShiftDown)
                With .Cells(6, 1)
                    .Value = "Black-Scholes Call Price= " & Format(Me.BS_Call, "#,##0.0000") _
                        & ",d1=" & Format(Me.BS_D1, "#,##0.0000") _
                        & ",d2=" & Format(Me.BS_D2, "#,##0.0000") _
                        & ",N(d1)=" & Format(WorksheetFunction.NormSDist(BS_D1), "#,##0.0000") _
                        & ",N(d2)=" & Format(WorksheetFunction.NormSDist(BS_D2), "#,##0.0000")
                    .Font.Size = 14
                End With
            End If
        ElseIf wsTree Is Me.PutTree Then
            With .Cells(5, 1)
                .Value = "Binomial Put Price: " & Format(Me.PutPrice, "#,##0.0000")
                .Font.Size = 14
            End With
            If Me.chkBinomialBSApproximation Then
                wsTree.Range("A6:A7").EntireRow.Insert (xlShiftDown)
                With .Cells(6, 1)
                    .Value = "Black-Scholes Put Price: " & Format(Me.BS_PUT, "#,##0.0000")
                    .Font.Size = 14
                End With
            End If
        End If
    End With
End Sub

Sub BondDecisionTree(rPrice As Range, arCell As Variant, iCount As Long)
    Dim rBond As Range
    Dim rPup As Range
    Dim rPDown As Range
    Set rBond = Me.BondTree.Cells(rPrice.Row, rPrice.Column)
    Set rPup = Me.BondTree.Cells(arCell(iCount - 1).Row, arCell(iCount - 1).Column)
    Set rPDown = Me.BondTree.Cells(arCell(iCount).Row, arCell(iCount).Column)
    If rPup.Column = Me.TreeWorksheet.UsedRange.Columns.Count Then
        rPup.Value = (1 + Me.txtBinomialr) ^ (rPup.Column - 1)
        rPDown.Value = rPup.Value
    End If
    With rBond
        .Value = (1 + Me.txtBinomialr) ^ (rBond.Column - 1)
        .Borders(xlBottom).LineStyle = xlContinuous
    End With
    rPDown.Borders(xlBottom).LineStyle = xlContinuous
    With rPup
        .Borders(xlBottom).LineStyle = xlContinuous
        .Offset(1, 0).Resize((rPDown.Row - rPup.Row), 1). _
            Borders(xlEdgeLeft).LineStyle = xlContinuous
    End With
End Sub

Sub PutDecisionTree(rPrice As Range, arCell As Variant, iCount As Long)
    Dim rCall As Range
    Dim rPup As Range
    Dim rPDown As Range
    Set rCall = Me.PutTree.Cells(rPrice.Row, rPrice.Column)
    Set rPup = Me.PutTree.Cells(arCell(iCount - 1).Row, arCell(iCount - 1).Column)
    Set rPDown = Me.PutTree.Cells(arCell(iCount).Row, arCell(iCount).Column)
    If rPup.Column = Me.TreeWorksheet.UsedRange.Columns.Count Then
        rPup.Value = WorksheetFunction.Max(Me.txtBinomialX - arCell(iCount - 1), 0)
        rPDown.Value = WorksheetFunction.Max(Me.txtBinomialX - arCell(iCount), 0)
    End If
    With rCall    '12/10/2008
        If Not Me.chkBinomialBSApproximation Then
            .Value = (Me.PFactor * rPup + (1 - Me.PFactor) * rPDown) / (1 + Me.txtBinomialr)
        Else
            .Value = (Me.qU * rPup) + (Me.qD * rPDown)
        End If
        Me.PutPrice = .Value    '12/8/2008
        .Borders(xlBottom).LineStyle = xlContinuous
    End With
    rPDown.Borders(xlBottom).LineStyle = xlContinuous
    With rPup
        .Borders(xlBottom).LineStyle = xlContinuous
        .Offset(1, 0).Resize((rPDown.Row - rPup.Row), 1). _
            Borders(xlEdgeLeft).LineStyle = xlContinuous
    End With
End Sub

Sub CallDecisionTree(rPrice As Range, arCell As Variant, iCount As Long)
    Dim rCall As Range
    Dim rCup As Range
    Dim rCDown As Range
    Set rCall = Me.CallTree.Cells(rPrice.Row, rPrice.Column)
    Set rCup = Me.CallTree.Cells(arCell(iCount - 1).Row, arCell(iCount - 1).Column)
    Set rCDown = Me.CallTree.Cells(arCell(iCount).Row, arCell(iCount).Column)
    If rCup.Column = Me.TreeWorksheet.UsedRange.Columns.Count Then
        With rCup
            .Value = WorksheetFunction.Max(arCell(iCount - 1) - Me.txtBinomialX, 0)
            .Borders(xlBottom).LineStyle = xlContinuous
        End With
        With rCDown
            .Value = WorksheetFunction.Max(arCell(iCount) - Me.txtBinomialX, 0)
            .Borders(xlBottom).LineStyle = xlContinuous
        End With
    End If
    With rCall
        If Not Me.chkBinomialBSApproximation Then
            .Value = (Me.PFactor * rCup + (1 - Me.PFactor) * rCDown) / (1 + Me.txtBinomialr)
        Else
            .Value = (Me.qU * rCup) + (Me.qD * rCDown)
        End If
        Me.CallPrice = .Value    '12/8/2008
        .Borders(xlBottom).LineStyle = xlContinuous
    End With
    rCup.Offset(1, 0).Resize((rCDown.Row - rCup.Row), 1). _
        Borders(xlEdgeLeft).LineStyle = xlContinuous
End Sub

Sub DecitionTreeFormat()
    Dim rTree As Range
    Dim nColumns As Integer
    Dim rLast As Range
    Dim rCell As Range
    Dim lCount As Long
    Dim lCellSize As Long
    Dim vntColumn As Variant
    Dim iCount As Long
    Dim lTimes As Long
    Dim arCell() As Range
    Dim sFormatColumn As String
    Dim rPrice As Range

    Application.StatusBar = "Formatting Tree.. "
    Set rTree = Me.TreeWorksheet.UsedRange
    nColumns = rTree.Columns.Count
    Set rLast = rTree.Columns(nColumns).EntireColumn.SpecialCells(xlCellTypeConstants, 23)
    lCellSize = rLast.Cells.Count
    For lCount = nColumns To 2 Step -1
        sFormatColumn = rLast.Parent.Columns(lCount).EntireColumn.Address
        Application.StatusBar = "Formatting column " & sFormatColumn
        ReDim vntColumn(1 To (rLast.Cells.Count / 2), 1)
        Application.StatusBar = "Assigning values to array for column " & _
            rLast.Parent.Columns(lCount).EntireColumn.Address
        vntColumn = rLast.Offset(0, -1).EntireColumn.Cells(1).Resize(rLast.Cells.Count / 2, 1)
        rLast.Offset(0, -1).EntireColumn.ClearContents
        ReDim arCell(1 To rLast.Cells.Count)
        lTimes = 1
        Application.StatusBar = "Assigning cells to arrays. Total number of cells: " & lCellSize
        For Each rCell In rLast.Cells
            Application.StatusBar = "Array to column " & sFormatColumn & " Cells " & rCell.Row
            Set arCell(lTimes) = rCell
            lTimes = lTimes + 1
        Next
        lTimes = 1
        Application.StatusBar = "Formatting leaves for column " & sFormatColumn
        For iCount = 2 To lCellSize Step 2
            Application.StatusBar = "Formatting leaves for cell " & arCell(iCount).Address
            If rLast.Cells.Count <> 2 Then
                Set rPrice = arCell(iCount).Offset(-1 * ((arCell(iCount).Row - arCell(iCount - 1).Row) / 2), -1)
                rPrice.Value = vntColumn(lTimes, 1)
            Else
                Set rPrice = arCell(iCount).Offset(-1 * ((arCell(iCount).Row - arCell(iCount - 1).Row) / 2), -1)
                rPrice.Value = vntColumn
            End If
            arCell(iCount).Borders(xlBottom).LineStyle = xlContinuous
            With arCell(iCount - 1)
                .Borders(xlBottom).LineStyle = xlContinuous
                .Offset(1, 0).Resize((arCell(iCount).Row - arCell(iCount - 1).Row), 1). _
                    Borders(xlEdgeLeft).LineStyle = xlContinuous
            End With
            lTimes = 1 + lTimes
            CallDecisionTree rPrice:=rPrice, arCell:=arCell, iCount:=iCount
            PutDecisionTree rPrice:=rPrice, arCell:=arCell, iCount:=iCount
            BondDecisionTree rPrice:=rPrice, arCell:=arCell, iCount:=iCount
        Next
        Set rLast = rTree.Columns(lCount - 1).EntireColumn.SpecialCells(xlCellTypeConstants, 23)
        lCellSize = rLast.Cells.Count
    Next    '/ outer next
    rLast.Borders(xlBottom).LineStyle = xlContinuous
    Application.StatusBar = False
End Sub

'/*********************************************************************
'/Purpose: To calculate the price value of every state of the binomial
'/         decision tree
'/*********************************************************************
Sub DecisionTree(rCell As Range, nPeriod As Integer, _
        dblPrice As Double, sngU As Single, sngD As Single)
    Dim lIteminColumn As Long
    If Not nPeriod = 1 Then
        'Do Up
        DecisionTree rCell:=rCell.Offset(0, 1), nPeriod:=nPeriod - 1, _
            dblPrice:=dblPrice * sngU, sngU:=sngU, _
            sngD:=sngD
        'Do Down
        DecisionTree rCell:=rCell.Offset(0, 1), nPeriod:=nPeriod - 1, _
            dblPrice:=dblPrice * sngD, sngU:=sngU, _
            sngD:=sngD
    End If

    lIteminColumn = WorksheetFunction.CountA(rCell.EntireColumn)

    If lIteminColumn = 0 Then
        rCell = dblPrice
    Else
        If nPeriod <> 1 Then
            rCell.EntireColumn.Cells(lIteminColumn + 1) = dblPrice
        Else
            rCell.EntireColumn.Cells(((lIteminColumn + 1) * 2) - 1) = dblPrice
            Application.StatusBar = "The number of binomial calcs are: " & Me.BinomialCalc _
                & " at cell " & rCell.EntireColumn.Cells(((lIteminColumn + 1) * 2)
                - 1).Address
        End If
    End If
    Me.BinomialCalc = Me.BinomialCalc + 1
End Sub

Function BS_D1() As Double
    Dim dblNumerator As Double
    Dim dblDenominator As Double
    On Error Resume Next
    dblNumerator = VBA.Log(Me.txtBinomialS / Me.txtBinomialX) + _
        ((Me.txtBinomialr + Me.txtSigma ^ 2 / 2) * Me.txtTimeT)
    dblDenominator = Me.txtSigma * Sqr(Me.txtTimeT)
    BS_D1 = dblNumerator / dblDenominator
End Function

Function BS_D2() As Double
    On Error Resume Next
    BS_D2 = BS_D1 - (Me.txtSigma * VBA.Sqr(Me.txtTimeT))
End Function

Function BS_Call() As Double
    BS_Call = (Me.txtBinomialS * WorksheetFunction.NormSDist(BS_D1)) _
        - Me.txtBinomialX * Exp(-Me.txtBinomialr * Me.txtTimeT) * _
        WorksheetFunction.NormSDist(BS_D2)
End Function

'Used put-call parity theorem to price put option
Function BS_PUT() As Double
    BS_PUT = BS_Call - Me.txtBinomialS + _
        (Me.txtBinomialX * Exp(-Me.txtBinomialr * Me.txtTimeT))
End Function
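For readers who do not use Excel, the recursion at the heart of the DecisionTree procedure can be sketched in a few lines of Python. This is a simplified analogue, not a translation of the VBA form; it only enumerates the terminal stock prices of the tree:

```python
def stock_leaves(price, u, d, periods):
    """Recursively enumerate the terminal stock prices of a binomial tree.

    The procedure calls itself twice (an up move and a down move) until
    the base case periods == 0 stops the recursion -- the same pattern
    the VBA DecisionTree procedure uses.
    """
    if periods == 0:
        return [price]
    return (stock_leaves(price * u, u, d, periods - 1)
            + stock_leaves(price * d, u, d, periods - 1))

# Two-period tree with the form's default factors u = 1.05, d = 0.95
leaves = stock_leaves(20.0, 1.05, 0.95, 2)
print([round(p, 4) for p in leaves])   # [22.05, 19.95, 19.95, 18.05]
```

An n-period tree generates 2^n recursive leaves, which is why the form reports the "Number of calculations" and why large n is better handled with the closed-form binomial sum.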
Chapter 85
Statistical Distributions, European Option, American Option, and Option Bounds

Cheng Few Lee

Contents
85.1 Introduction ... 2930
85.2 The Normal Distribution ... 2931
85.3 The Log-Normal Distribution ... 2932
85.4 The Log-Normal Distribution and its Relationship to the Normal Distribution ... 2934
85.5 Multivariate Normal and Log-Normal Distributions ... 2935
85.6 The Normal Distribution as an Application to the Binomial and Poisson Distribution ... 2939
85.7 Applications of the Log-Normal Distribution in Option Pricing ... 2943
85.8 The Bivariate Normal Density Function ... 2945
85.9 American Call Options ... 2948
    85.9.1 Price American call options by the bivariate normal distribution ... 2948
    85.9.2 Pricing an American call option: An example ... 2951
85.10 Price Bounds for Options ... 2954
    85.10.1 Options written on non-dividend-paying stocks ... 2954
    85.10.2 Options written on dividend-paying stocks ... 2955
85.11 Summary ... 2960

Cheng Few Lee, Rutgers University, e-mail: cfl[email protected]
Bibliography ... 2960
Appendix 85A: Microsoft Excel Program for Calculating Cumulative Bivariate Normal Density Function ... 2961
Appendix 85B: Microsoft Excel Program for Calculating the American Call Options ... 2962

Abstract

In this chapter, we first review the basic theory of the normal and log-normal distributions and their relationship; then the bivariate and multivariate normal density functions are analyzed in detail. Next, we discuss American options in terms of random dividend payment. We then use the bivariate normal density function to analyze American options with random dividend payment. Computer programs are used to show how American call options can be evaluated. Finally, pricing option bounds are analyzed in some detail.

Keywords: Normal distribution • Log-normal distribution • American option • Option bound • Multivariate normal distribution • Multivariate log-normal distribution.
85.1 Introduction

The normal (or Gaussian) distribution is the most important distribution in probability and statistics. One of the justifications for using the normal distribution is the central limit theorem, and much of statistical theory is based on the normality assumption. In finance research, however, the log-normal distribution plays a most important role. Part of the reason is that in finance we deal with random quantities that are positive in nature, so taking the natural logarithm is quite reasonable. Also, empirical data quite often support the assumption of log-normality for random quantities such as stock price movements. In Sections 85.2 and 85.3, we discuss the normal distribution and the log-normal distribution, respectively. Section 85.4 explains the relationship between the log-normal distribution and the normal distribution. In Section 85.5, the multivariate normal and log-normal distributions are introduced. Section 85.6 applies the normal distribution to the binomial distribution. Section 85.7 covers the applications of the log-normal distribution in option pricing. Section 85.8 introduces the bivariate normal distribution. In Section 85.9, we price American call options by the bivariate normal distribution and illustrate the calculation process with an example. In Section 85.10, we derive the price bounds of options written on stocks with or without dividends. In Section 85.11, we summarize our findings from the previous sections.
85.2 The Normal Distribution

A random variable X is said to be normally distributed with mean μ and variance σ² if it has the probability density function (PDF)

f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad \sigma > 0.  (85.1)

The normal distribution is symmetric around μ, which is both the mean and the mode of the distribution. It is easy to see that the PDF of Z = (X - \mu)/\sigma is

g(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}},  (85.2)

which is the PDF of the standard normal and is independent of the parameters μ and σ². The cumulative distribution function (CDF) of Z is

P(Z \le z) = N(z),  (85.3)

which has been well tabulated; a software package or system, such as S-plus, will provide the value N(z) instantly. For a discussion of some approximations for N(z), the reader is referred to Johnson and Kotz (1970). For the CDF of X, we have

P(X \le x) = P\left(\frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) = N\left(\frac{x-\mu}{\sigma}\right).  (85.4)

The normal distribution as given in equation (85.1) is very important in practice. It is useful in describing phenomena such as test scores or the heights of students in a certain school, it serves as an approximation for the binomial distribution, and it is also quite useful in studying option pricing. We next discuss some properties of the normal distribution. If X is normally distributed with mean μ and variance σ², then the moment generating function (MGF) of X is

M_X(t) = e^{\mu t + t^2\sigma^2/2},  (85.5)

which is useful in deriving the moments of X. Equation (85.5) is also useful in deriving the moments of the log-normal distribution. From equation (85.5),
it is easy to verify that

E(X) = \mu \quad \text{and} \quad \mathrm{Var}(X) = \sigma^2.

If X_1, \ldots, X_n are independent, normally distributed random variables, then any linear function of these variables is also normally distributed. In fact, if X_i is normally distributed with mean \mu_i and variance \sigma_i^2, then \sum_{i=1}^{n} a_i X_i is normally distributed with mean \sum_{i=1}^{n} a_i \mu_i and variance \sum_{i=1}^{n} a_i^2 \sigma_i^2, where the a_i's are constants. If X_1 and X_2 are independent, and each is normally distributed with mean 0 and variance \sigma^2, then (X_1^2 - X_2^2)/(X_1^2 + X_2^2) is also normally distributed.
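The closure of the normal family under linear combinations can be illustrated numerically. The sketch below uses hypothetical constants a_i, means, and standard deviations, and checks the stated mean and variance of the linear combination by simulation:

```python
import random

random.seed(1)
a      = [2.0, -1.0, 0.5]   # hypothetical constants a_i
mus    = [1.0, 0.0, 4.0]    # hypothetical means mu_i
sigmas = [1.0, 2.0, 0.5]    # hypothetical standard deviations sigma_i

n = 200_000
samples = [sum(ai * random.gauss(m, s) for ai, m, s in zip(a, mus, sigmas))
           for _ in range(n)]

mean_theory = sum(ai * m for ai, m in zip(a, mus))            # sum a_i * mu_i
var_theory  = sum(ai**2 * s**2 for ai, s in zip(a, sigmas))   # sum a_i^2 * sigma_i^2
mean_mc = sum(samples) / n
var_mc  = sum((x - mean_mc) ** 2 for x in samples) / n
print(round(mean_theory, 4), round(var_theory, 4))   # 4.0 8.0625
```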
85.3 The Log-Normal Distribution

A random variable X is said to be log-normally distributed with parameters μ and σ² if

Y = \log X  (85.6)

is normally distributed with mean μ and variance σ². It is clear that X has to be a positive random variable. This distribution is quite useful in studying the behavior of stock prices. For the log-normal distribution as described above, the PDF is

g(x) = \frac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-\frac{1}{2\sigma^2}(\log x - \mu)^2}, \quad x > 0.  (85.7)

The log-normal distribution is sometimes called the antilog-normal distribution, because it is the distribution of the random variable X, which is the antilog of the normal random variable Y. However, "log-normal" is most commonly used in the literature. When applied to economic data, especially production functions, it is often called the Cobb–Douglas distribution. We next discuss some properties of the log-normal distribution, as defined in equation (85.6). The rth moment of X is

\mu_r' = E(X^r) = E(e^{rY}) = e^{\mu r + \frac{r^2\sigma^2}{2}}.  (85.8)

It is noted that we have utilized the fact that the MGF of the normal random variable Y with mean μ and variance σ² is M_Y(t) = e^{\mu t + \frac{t^2\sigma^2}{2}}. Thus, E(e^{rY})
is simply M_Y(r), which is the right-hand side of equation (85.8). From equation (85.8), we have

E(X) = e^{\mu + \frac{\sigma^2}{2}},  (85.9)

\mathrm{Var}(X) = e^{2\mu} e^{\sigma^2}\left[e^{\sigma^2} - 1\right].  (85.10)

It is noted that the moment sequence \{\mu_r'\} does not belong only to the log-normal distribution. Thus, the log-normal distribution cannot be defined by its moments. The CDF of X is

P(X \le x) = P(\log X \le \log x) = N\left(\frac{\log x - \mu}{\sigma}\right),  (85.11)

because \log X is normally distributed with mean μ and variance σ². The distribution of X is unimodal with the mode at

\mathrm{mode}(X) = e^{\mu - \sigma^2}.  (85.12)

Let x_\alpha be the (100)α percentile of the log-normal distribution and z_\alpha the corresponding percentile of the standard normal; then

P(X \le x_\alpha) = P\left(\frac{\log X - \mu}{\sigma} \le \frac{\log x_\alpha - \mu}{\sigma}\right) = N\left(\frac{\log x_\alpha - \mu}{\sigma}\right).  (85.13)

Thus z_\alpha = \frac{\log x_\alpha - \mu}{\sigma}, implying that

x_\alpha = e^{\mu + \sigma z_\alpha}.  (85.14)

Thus, the percentile of the log-normal distribution can be obtained from the percentile of the standard normal. From equation (85.13), we also see that

\mathrm{median}(X) = e^{\mu},  (85.15)

as z_{0.5} = 0. Thus, median(X) > mode(X). Hence, the log-normal distribution is not symmetric.
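Equations (85.9), (85.12), and (85.15) imply the ordering mode(X) < median(X) < E(X), which the following sketch verifies for hypothetical parameters μ = 0.05 and σ = 0.2, together with a simulation check of the mean:

```python
import math, random

mu, sigma = 0.05, 0.20   # hypothetical log-scale parameters

mean   = math.exp(mu + sigma**2 / 2)                                       # eq. (85.9)
var    = math.exp(2 * mu) * math.exp(sigma**2) * (math.exp(sigma**2) - 1)  # eq. (85.10)
mode   = math.exp(mu - sigma**2)                                           # eq. (85.12)
median = math.exp(mu)                                                      # eq. (85.15)

# quick simulation check of the mean
random.seed(0)
n = 200_000
sim_mean = sum(random.lognormvariate(mu, sigma) for _ in range(n)) / n

print(mode < median < mean)   # True: the distribution is right-skewed
```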
85.4 The Log-Normal Distribution and its Relationship to the Normal Distribution

By comparing the PDF of the normal distribution given in equation (85.1) with the PDF of the log-normal distribution given in equation (85.7), we see that

f(x) = \frac{f(y)}{x}.  (85.16)

In addition, from equation (85.6), it is easy to see that

dx = x\,dy.  (85.17)

The CDF of the log-normal distribution can be expressed as

F(a) = \Pr(X \le a) = \Pr(\log X \le \log a) = \Pr\left(\frac{\log X - \mu}{\sigma} \le \frac{\log a - \mu}{\sigma}\right) = N(d),  (85.18)

where

d = \frac{\log a - \mu}{\sigma},  (85.19)

and N(d) is the CDF of the standard normal distribution, which can be obtained from a normal table; N(d) can also be obtained from S-plus or another software package. Alternatively, for d ≥ 0 the value of N(d) can be approximated by the following formula:

N(d) \approx 1 - a_0 e^{-\frac{d^2}{2}}\left(a_1 t + a_2 t^2 + a_3 t^3\right),  (85.20)

where

t = \frac{1}{1 + 0.33267 d},
a_0 = 0.3989423, \quad a_1 = 0.4361836, \quad a_2 = -0.1201676, \quad a_3 = 0.9372980.
(85.21)
Since for any h, E(X h ) = E(ehY ), the hth moment of X, the following moment generating function of Y , which is normally distributed with mean
μ and variance σ²:

M_Y(t) = e^{\mu t + \frac{1}{2} t^2 \sigma^2}.  (85.22)

For example,

\mu_X = E(X) = E(e^{Y}) = M_Y(1) = e^{\mu + \frac{1}{2}\sigma^2},
E(X^h) = E(e^{hY}) = M_Y(h) = e^{\mu h + \frac{1}{2} h^2 \sigma^2}.  (85.23)

Hence,

\sigma_X^2 = E(X^2) - (EX)^2 = e^{2\mu + 2\sigma^2} - e^{2\mu + \sigma^2} = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right).  (85.24)

Thus, fractional and negative moments of a log-normal distribution can be obtained from equation (85.23). The mean of a log-normal random variable can be written as

\int_0^{\infty} x f(x)\,dx = e^{\mu + \frac{\sigma^2}{2}}.  (85.25)

If the lower bound a is larger than 0, then the partial mean of X can be shown to be

\int_a^{\infty} x f(x)\,dx = \int_{\log(a)}^{\infty} e^{y} f(y)\,dy = e^{\mu + \frac{\sigma^2}{2}} N(d),  (85.26)

where

d = \frac{\mu - \log(a)}{\sigma} + \sigma.

This implies that the partial mean of a log-normal variable is the mean of X times an adjustment term, N(d).

85.5 Multivariate Normal and Log-Normal Distributions

The normal distribution with the PDF given in equation (85.1) can be extended to the p-dimensional case. Let X = (X_1, \ldots, X_p)' be a p × 1 random vector. Then we say that X \sim N_p(\mu, \Sigma) if it has the PDF

f(x) = (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\left[-\frac{1}{2}(x - \mu)' \Sigma^{-1} (x - \mu)\right].  (85.27)
In equation (85.27), μ is the mean vector and Σ is the covariance matrix which is symmetric and positive definite. The moment generating function of X is 1 (85.28) Mx (t) = E et x = et μ+ 2 t Σt , where t = (t1 , . . . , tp ) is a p × 1 vector of real values. From equation (85.28), it can be shown that E(X) = μ and Cov(X) = Σ. If C is a q × p matrix of rank q ≤ p. Then CX ∼ Nq (Cμ, CΣC ). Thus, linear transformation of a normal random vector is also a multivariate normal random vector. Let
$$X = \begin{pmatrix} X^{(1)} \\ X^{(2)} \end{pmatrix}, \quad \mu = \begin{pmatrix} \mu^{(1)} \\ \mu^{(2)} \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$
where $X^{(i)}$ and $\mu^{(i)}$ are $p_i \times 1$, $p_1 + p_2 = p$, and $\Sigma_{ij}$ is $p_i \times p_j$. Then the marginal distribution of $X^{(i)}$ is also multivariate normal with mean vector $\mu^{(i)}$ and covariance matrix $\Sigma_{ii}$, that is, $X^{(i)} \sim N_{p_i}(\mu^{(i)}, \Sigma_{ii})$. Furthermore, the conditional distribution of $X^{(1)}$ given $X^{(2)} = x^{(2)}$, where $x^{(2)}$ is a known vector, is normal with mean vector $\mu_{1\cdot 2}$ and covariance matrix $\Sigma_{11\cdot 2}$, where
$$\mu_{1\cdot 2} = \mu^{(1)} + \Sigma_{12}\Sigma_{22}^{-1}\left(x^{(2)} - \mu^{(2)}\right), \quad (85.29)$$
and
$$\Sigma_{11\cdot 2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}, \quad (85.30)$$
that is, $X^{(1)} \mid X^{(2)} = x^{(2)} \sim N_{p_1}(\mu_{1\cdot 2}, \Sigma_{11\cdot 2})$.
We next consider a bivariate version of the correlated log-normal distribution. Let
$$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} \log(X_1) \\ \log(X_2) \end{pmatrix} \sim N\left[\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}\right]. \quad (85.31)$$
The joint PDF of $X_1$ and $X_2$ can be obtained from the joint PDF of $Y_1$ and $Y_2$ by observing that $dx_1\,dx_2 = x_1 x_2\,dy_1\,dy_2$, which is an extension of equation (85.17) to the bivariate case.
Hence, the joint PDF of $X_1$ and $X_2$ is
$$g(x_1, x_2) = \frac{1}{2\pi|\Sigma|^{1/2}x_1 x_2}\exp\left\{-\tfrac{1}{2}\left[(\log x_1, \log x_2)' - \mu\right]'\Sigma^{-1}\left[(\log x_1, \log x_2)' - \mu\right]\right\}. \quad (85.32)$$
From the property of the multivariate normal distribution, we have $Y_i \sim N(\mu_i, \sigma_{ii})$. Hence, $X_i$ is log-normal with
$$E(X_i) = e^{\mu_i + \frac{\sigma_{ii}}{2}}, \quad (85.33)$$
$$\mathrm{Var}(X_i) = e^{2\mu_i}e^{\sigma_{ii}}\left[e^{\sigma_{ii}} - 1\right]. \quad (85.34)$$
Furthermore, by the property of the moment generating function for the bivariate normal distribution, we have
$$E(X_1 X_2) = E\left(e^{Y_1 + Y_2}\right) = e^{\mu_1 + \mu_2 + \frac{1}{2}(\sigma_{11} + \sigma_{22} + 2\sigma_{12})} = E(X_1)E(X_2)\exp\left(\rho\sqrt{\sigma_{11}\sigma_{22}}\right). \quad (85.35)$$
Thus, the covariance between $X_1$ and $X_2$ is
$$\mathrm{Cov}(X_1, X_2) = E(X_1 X_2) - E(X_1)E(X_2) = \exp\left[\mu_1 + \mu_2 + \tfrac{1}{2}(\sigma_{11} + \sigma_{22})\right]\left[\exp\left(\rho\sqrt{\sigma_{11}\sigma_{22}}\right) - 1\right]. \quad (85.36)$$
From the conditional normality of $Y_1$ given $Y_2 = y_2$, we also see that the conditional distribution of $X_1$ given $X_2 = x_2$ is log-normal. The extension to the $p$-variate log-normal distribution is straightforward. Let $Y = (Y_1, \ldots, Y_p)'$, where $Y_i = \log X_i$, and suppose $Y \sim N_p(\mu, \Sigma)$, where $\mu = (\mu_1, \ldots, \mu_p)'$ and $\Sigma = (\sigma_{ij})$. The joint PDF of $X_1, \ldots, X_p$ can be obtained from the following theorem.
Theorem 1. (Anderson, 2003) Let the PDF of $Y_1, \ldots, Y_p$ be $f(y_1, \ldots, y_p)$, and consider the $p$ real-valued functions
$$x_i = x_i(y_1, \ldots, y_p), \quad i = 1, \ldots, p. \quad (85.37)$$
We assume that the transformation from the $y$-space to the $x$-space is one-to-one with the inverse transformation
$$y_i = y_i(x_1, \ldots, x_p), \quad i = 1, \ldots, p. \quad (85.38)$$
Let the random variables $X_1, \ldots, X_p$ be defined by
$$X_i = x_i(Y_1, \ldots, Y_p), \quad i = 1, \ldots, p. \quad (85.39)$$
Then the PDF of $X_1, \ldots, X_p$ is
$$g(x_1, \ldots, x_p) = f\left(y_1(x_1, \ldots, x_p), \ldots, y_p(x_1, \ldots, x_p)\right)J(x_1, \ldots, x_p), \quad (85.40)$$
where $J(x_1, \ldots, x_p)$ is the Jacobian of the transformation,
$$J(x_1, \ldots, x_p) = \mathrm{mod}\begin{vmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_p} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_p}{\partial x_1} & \cdots & \frac{\partial y_p}{\partial x_p} \end{vmatrix}, \quad (85.41)$$
where "mod" means modulus or absolute value.
Applying the above theorem with $f(y_1, \ldots, y_p)$ being a $p$-variate normal and
$$J(x_1, \ldots, x_p) = \mathrm{mod}\begin{vmatrix} \frac{1}{x_1} & 0 & \cdots & 0 \\ 0 & \frac{1}{x_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{x_p} \end{vmatrix} = \prod_{i=1}^p \frac{1}{x_i}, \quad (85.42)$$
we have the following joint PDF of $X_1, \ldots, X_p$:
$$g(x_1, \ldots, x_p) = (2\pi)^{-p/2}|\Sigma|^{-1/2}\left(\prod_{i=1}^p \frac{1}{x_i}\right)\exp\left\{-\tfrac{1}{2}\left[(\log x_1, \ldots, \log x_p)' - \mu\right]'\Sigma^{-1}\left[(\log x_1, \ldots, \log x_p)' - \mu\right]\right\}. \quad (85.43)$$
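The covariance formula (85.36) can be verified by simulating a correlated bivariate log-normal pair; a rough sketch (the helper name `lognormal_cov` and the parameter values are ours):

```python
import math
import random

def lognormal_cov(mu1, mu2, s11, s22, rho):
    """Closed-form Cov(X1, X2) for (log X1, log X2) bivariate normal: eq. (85.36)."""
    return math.exp(mu1 + mu2 + 0.5 * (s11 + s22)) * (math.exp(rho * math.sqrt(s11 * s22)) - 1.0)

random.seed(1)
mu1, mu2, s11, s22, rho = 0.0, 0.2, 0.09, 0.04, 0.6

xs, ys = [], []
for _ in range(400_000):
    z1 = random.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1 - rho**2) * random.gauss(0.0, 1.0)  # correlated normals
    xs.append(math.exp(mu1 + math.sqrt(s11) * z1))                  # X1 = e^{Y1}
    ys.append(math.exp(mu2 + math.sqrt(s22) * z2))                  # X2 = e^{Y2}

mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
cov_mc = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
assert abs(cov_mc - lognormal_cov(mu1, mu2, s11, s22, rho)) < 0.005
```

Note that when $\rho = 0$ the covariance is exactly zero, even though $X_1$ and $X_2$ are each skewed log-normals.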
It is noted that when $p = 2$, equation (85.43) reduces to the bivariate case given in equation (85.32). The first two moments are
$$E(X_i) = e^{\mu_i + \frac{\sigma_{ii}}{2}}, \quad (85.44)$$
$$\mathrm{Var}(X_i) = e^{2\mu_i}e^{\sigma_{ii}}\left[e^{\sigma_{ii}} - 1\right], \quad (85.45)$$
$$\mathrm{Cov}(X_i, X_j) = \exp\left[\mu_i + \mu_j + \tfrac{1}{2}(\sigma_{ii} + \sigma_{jj})\right]\left[\exp\left(\rho_{ij}\sqrt{\sigma_{ii}\sigma_{jj}}\right) - 1\right], \quad (85.46)$$
where $\rho_{ij}$ is the correlation between $Y_i$ and $Y_j$. For more details concerning properties of the multivariate log-normal distribution, the reader is referred to Johnson and Kotz (1972).

85.6 The Normal Distribution as an Application to the Binomial and Poisson Distribution

The cumulative normal distribution function tells us the probability that a random variable $Z$ will be less than some value $x$; note in Figure 85.1 that $P(Z < x)$ is the area under the normal curve to the left of $x$. The binomial option pricing model is
$$C = S\,B(T, p', m) - X R^{-T} B(T, p, m), \quad \text{with } C = 0 \text{ for } m > T, \quad (85.47)$$
where $S$ is the current price of the stock; $T$ is the term to maturity in years; $m$ is the minimum number of upward movements in the stock price necessary for the option to terminate "in the money"; $p = \frac{R-d}{u-d}$ and $1 - p = \frac{u-R}{u-d}$; $X$ is the option exercise price (or strike price); $R = 1 + r$, where $r$ is the risk-free rate of return; $u$ is 1 plus the percentage of price increase; $d$ is 1 plus the percentage of price decrease; $p' = \frac{u}{R}p$; and $B(n, p, m) = \sum_{k=m}^{n}\binom{n}{k}p^k(1-p)^{n-k}$ is the complementary binomial distribution function.
By a form of the central limit theorem, we show in Section 85.7 that as $T \to \infty$ the option price $C$ converges to
$$C = S N(d_1) - X R^{-T} N(d_2), \quad (85.48)$$
where $C$ is the price of the call option,
$$d_1 = \frac{\log\left(\frac{S}{X r^{-t}}\right)}{\sigma\sqrt{t}} + \frac{1}{2}\sigma\sqrt{t}, \qquad d_2 = d_1 - \sigma\sqrt{t},$$
$N(d)$ is the value of the cumulative standard normal distribution, $t$ is the fixed length of calendar time to expiration, and $h$ is the elapsed time between successive stock price changes, so that $T = t/h$. If the future stock price is constant over time, then $\sigma^2 = 0$. It can be shown that both $N(d_1)$ and $N(d_2)$ are then equal to 1, and equation (85.48) becomes
$$C = S - Xe^{-rT}. \quad (85.49)$$
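The continuous-limit formula (85.48) can be evaluated directly; a minimal sketch in its Black–Scholes form (the function names `norm_cdf` and `bs_call` are ours):

```python
import math

def norm_cdf(x: float) -> float:
    """Cumulative standard normal distribution N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S: float, X: float, r: float, sigma: float, t: float) -> float:
    """European call in the form of eq. (85.48): C = S*N(d1) - X*e^{-rt}*N(d2)."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return S * norm_cdf(d1) - X * math.exp(-r * t) * norm_cdf(d2)

# As sigma -> 0 (with S > X e^{-rt}), the price approaches S - X e^{-rt}, eq. (85.49).
print(round(bs_call(92.5, 90, 0.0435, 1e-9, 0.42), 4))
print(round(92.5 - 90 * math.exp(-0.0435 * 0.42), 4))
```

The two printed numbers agree, illustrating why both $N(d_1)$ and $N(d_2)$ collapse to 1 when the stock price is certain.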
Alternatively, equations (85.48) and (85.49) can be understood in terms of the following steps:

Step 1: The future price of the stock is constant over time. Because a call option gives the option holder the right to purchase the stock at the exercise price $X$, the value of the option, $C$, is just the current price of the stock less the present value of the stock's purchase price. Mathematically, the value of the call option is
$$C = S - \frac{X}{(1+r)^T}. \quad (85.50)$$
Note that equation (85.50) assumes discrete compounding of interest, whereas equation (85.49) assumes continuous compounding of interest. To adjust equation (85.50) for continuous compounding, we substitute $e^{-rT}$ for $\frac{1}{(1+r)^T}$ to get
$$C = S - Xe^{-rT}. \quad (85.51)$$

Step 2: Assume the price of the stock fluctuates over time ($S_t$). In this case, we need to adjust equation (85.51) for the uncertainty associated with that fluctuation. We do this by using the cumulative normal distribution function. In deriving equation (85.48), we assume that $S_t$ follows a log-normal distribution, as discussed in Section 85.3. The adjustment factors $N(d_1)$ and $N(d_2)$ in the Black–Scholes option valuation model are simply adjustments made to equation (85.51) to account for the uncertainty associated with the fluctuation of the price of the stock. Equation (85.48) is a continuous option pricing model. Compare this with the binomial option pricing model given in equation (85.47), which is a discrete option pricing model. The adjustment factors $N(d_1)$ and $N(d_2)$ are cumulative normal distribution functions, whereas $B(T, p', m)$ and $B(T, p, m)$ are complementary binomial distribution functions.
We can use equation (85.48) to determine the theoretical value, as of November 29, 1991, of one of IBM's options with maturity in April 1992. In this case we have $X = \$90$, $S = \$92.50$, $\sigma = 0.2194$, $r = 0.0435$, and $T = \frac{5}{12} = 0.42$ (in years). Armed with this information, we can calculate the estimated $d_1$ and $d_2$:
$$d_1 = \frac{\ln\left(\frac{92.5}{90}\right) + \left[0.0435 + \frac{1}{2}(0.2194)^2\right](0.42)}{(0.2194)(0.42)^{1/2}} = 0.392,$$
$$d_2 = d_1 - \sigma\sqrt{t} = 0.392 - (0.2194)(0.42)^{1/2} = 0.25.$$
In equation (85.48), $N(d_1)$ and $N(d_2)$ are the probabilities that a random variable with a standard normal distribution takes on a value less than $d_1$ and a value less than $d_2$, respectively. The values for $N(d_1)$ and $N(d_2)$ can be found by using the tables in the back of the book for the standard normal distribution, which provide the probability that a variable $Z$ is between 0 and $x$ (see Figure 85.2). To find the cumulative normal distribution function, we need to add the probability that $Z$ is less than zero to the value given in the standard normal distribution table. Because the standard normal distribution is symmetric around zero, the probability that $Z$ is less than zero is 0.5, so
$$P(Z < x) = P(Z < 0) + P(0 < Z < x) = 0.5 + \text{value from table}.$$
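The table lookup $P(Z < x) = 0.5 + P(0 < Z < x)$ can be replicated with the error function; a small sketch (function names are ours):

```python
import math

def table_area(x: float) -> float:
    """P(0 < Z < x): the quantity tabulated in a standard normal table."""
    return 0.5 * math.erf(x / math.sqrt(2.0))

def norm_cdf(x: float) -> float:
    """P(Z < x) = P(Z < 0) + P(0 < Z < x) = 0.5 + table value."""
    return 0.5 + table_area(x)

print(round(norm_cdf(0.392), 4))   # N(d1) for the IBM example
print(round(norm_cdf(0.25), 4))    # N(d2)
```

The small differences from the worked example below come only from the two-decimal rounding of the printed table.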
Figure 85.2: Probability of a variable Z between 0 and x.
We can now compute the values of $N(d_1)$ and $N(d_2)$:
$$N(d_1) = P(Z < d_1) = P(Z < 0) + P(0 < Z < d_1) = P(Z < 0.392) = 0.5 + 0.1517 = 0.6517,$$
$$N(d_2) = P(Z < d_2) = P(Z < 0) + P(0 < Z < d_2) = P(Z < 0.25) = 0.5 + 0.0987 = 0.5987.$$
Then the theoretical value of the option is
$$C = (92.5)(0.6517) - \frac{(90)(0.5987)}{e^{(0.0435)(0.42)}} = 60.282 - \frac{53.883}{1.0184} = \$7.373,$$
and the actual price of the option on November 29, 1991, was \$7.75.

85.7 Applications of the Log-Normal Distribution in Option Pricing

To derive the Black–Scholes formula, it is assumed that there are no transaction costs, no margin requirements, and no taxes; that all shares are infinitely divisible; and that continuous trading can be accomplished. It is also assumed that the economy is risk neutral and that the stock price follows a log-normal distribution. Denote the current stock price by $S$ and the stock price at the end of the $j$th period by $S_j$. Then $\frac{S_j}{S_{j-1}} = \exp[K_j]$ is a random variable with a log-normal distribution, where $K_j$, the rate of return in the $j$th period, is a random variable with a normal distribution. Let $K_j$ have expected value $\mu_k$ and variance $\sigma_k^2$ for each $j$. Then $K_1 + K_2 + \cdots + K_t$ is a normal random variable with expected value $t\mu_k$ and variance $t\sigma_k^2$. Thus, we can define the expected value (mean) of $\frac{S_t}{S} = \exp[K_1 + K_2 + \cdots + K_t]$ as
$$E\left(\frac{S_t}{S}\right) = \exp\left[t\mu_k + \frac{t\sigma_k^2}{2}\right]. \quad (85.52)$$
Under the assumption of a risk-neutral investor, the expected return $E\left(\frac{S_t}{S}\right)$ is assumed to be $\exp(rt)$ (where $r$ is the riskless rate of interest). In other words,
$$\mu_k = r - \frac{\sigma_k^2}{2}. \quad (85.53)$$
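A quick simulation check of (85.52)–(85.53): with $\mu_k = r - \sigma_k^2/2$, the expected gross return over $t$ periods is $e^{rt}$. The parameter values below are ours, chosen for illustration:

```python
import math
import random

random.seed(2)
r, sigma_k, t = 0.05, 0.2, 3
mu_k = r - sigma_k**2 / 2             # eq. (85.53): risk-neutral drift

n = 300_000
total = 0.0
for _ in range(n):
    ks = sum(random.gauss(mu_k, sigma_k) for _ in range(t))   # K_1 + ... + K_t
    total += math.exp(ks)             # gross return S_t / S
mc = total / n

assert abs(mc - math.exp(r * t)) < 0.01    # E(S_t/S) = e^{rt}, via eq. (85.52)
```

Without the $-\sigma_k^2/2$ correction the simulated mean would exceed $e^{rt}$, which is exactly the Jensen's-inequality effect that (85.52) quantifies.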
In the risk-neutral framework of Cox and Ross (1976) and Rubinstein (1976), the call option price $C$ can be determined by discounting the expected value of the terminal option price by the riskless rate of interest:
$$C = \exp[-rT]\,E[\mathrm{Max}(S_T - X, 0)], \quad (85.54)$$
where $T$ is the time of expiration and $X$ is the striking price. Note that
$$\mathrm{Max}(S_T - X, 0) = \begin{cases} S\left(\dfrac{S_T}{S} - \dfrac{X}{S}\right), & \text{for } \dfrac{S_T}{S} > \dfrac{X}{S}, \\[4pt] 0, & \text{for } \dfrac{S_T}{S} \leq \dfrac{X}{S}. \end{cases}$$

85.8 The Bivariate Normal Distribution

Let $X$ and $Y$ have a bivariate normal distribution with joint PDF
$$f(X, Y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}\,e^{-q/2}, \quad (85.66)$$
where $\sigma_X > 0$, $\sigma_Y > 0$, $-1 < \rho < 1$, and
$$q = \frac{1}{1-\rho^2}\left[\left(\frac{X-\mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{X-\mu_X}{\sigma_X}\right)\left(\frac{Y-\mu_Y}{\sigma_Y}\right) + \left(\frac{Y-\mu_Y}{\sigma_Y}\right)^2\right].$$
It can be shown that the conditional mean of $Y$, given $X$, is linear in $X$ and given by
$$E(Y|X) = \mu_Y + \rho\left(\frac{\sigma_Y}{\sigma_X}\right)(X - \mu_X). \quad (85.67)$$
It is also clear that, given $X$, we can define the conditional variance of $Y$ as
$$\sigma^2(Y|X) = \sigma_Y^2(1 - \rho^2). \quad (85.68)$$
Equation (85.67) can be regarded as describing the population linear regression line. For example, if we have a bivariate normal distribution of heights of brothers and sisters, we can see that they vary together, yet there is no cause-and-effect relationship. Accordingly, a linear regression in terms of the bivariate normal distribution variable is treated as though there were a two-way relationship instead of an existing causal relationship. It should be noted that regression implies a causal relationship only under a "prediction" case.
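The regression reading of (85.67)–(85.68) can be checked by simulating correlated normals and fitting a least-squares line; a rough sketch (parameter values anticipate Example 85.1 below):

```python
import math
import random

random.seed(3)
mu_x, mu_y, s_x, s_y, rho = 550.0, 80.0, 40.0, 4.0, 0.7

xs, ys = [], []
for _ in range(200_000):
    z1 = random.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1 - rho**2) * random.gauss(0.0, 1.0)
    xs.append(mu_x + s_x * z1)
    ys.append(mu_y + s_y * z2)

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
beta = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx)**2 for x in xs)
resid_var = sum((y - my - beta * (x - mx))**2 for x, y in zip(xs, ys)) / n

assert abs(beta - rho * s_y / s_x) < 0.002            # slope of E(Y|X), eq. (85.67)
assert abs(resid_var - s_y**2 * (1 - rho**2)) < 0.1   # sigma^2(Y|X), eq. (85.68)
```

The fitted slope matches $\rho\,\sigma_Y/\sigma_X = 0.07$ and the residual variance matches $\sigma_Y^2(1-\rho^2) = 8.16$, without any causal assumption about $X$ and $Y$.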
Equation (85.66) represents a joint PDF for $X$ and $Y$. If $\rho = 0$, then equation (85.66) becomes
$$f(X, Y) = f(X)f(Y). \quad (85.69)$$
This implies that the joint PDF of $X$ and $Y$ is equal to the PDF of $X$ times the PDF of $Y$. We also know that both $X$ and $Y$ are normally distributed. Therefore, $X$ is independent of $Y$.

Example 85.1. Using a Mathematics Aptitude Test to Predict Grade in Statistics. Let $X$ and $Y$ represent scores in a mathematics aptitude test and numerical grade in elementary statistics, respectively. In addition, we assume that the parameters in equation (85.66) are
$$\mu_X = 550, \quad \sigma_X = 40, \quad \mu_Y = 80, \quad \sigma_Y = 4, \quad \rho = 0.7.$$
Substituting this information into equations (85.67) and (85.68), respectively, we obtain
$$E(Y|X) = 80 + 0.7(4/40)(X - 550) = 41.5 + 0.07X, \quad (85.70)$$
$$\sigma^2(Y|X) = (16)(1 - 0.49) = 8.16. \quad (85.71)$$
If we know nothing about the aptitude test score of a particular student (say, John), we have to use the distribution of $Y$ to predict his elementary statistics grade:
$$95\% \text{ interval} = 80 \pm (1.96)(4) = 80 \pm 7.84.$$
That is, we predict with 95% probability that John's grade will fall between 72.16 and 87.84. Alternatively, suppose we know that John's mathematics aptitude score is 650. In this case, we can use equations (85.70) and (85.71) to predict John's grade in elementary statistics:
$$E(Y|X = 650) = 41.5 + (0.07)(650) = 87, \qquad \sigma^2(Y|X) = (16)(1 - 0.49) = 8.16.$$
We can now base our interval on a normal probability distribution with a mean of 87 and a standard deviation of 2.86:
$$95\% \text{ interval} = 87 \pm (1.96)(2.86) = 87 \pm 5.61.$$
That is, we predict with 95% probability that John’s grade will fall between 92.61 and 81.39. Two things have happened to this interval. First, the center has shifted upward to take into account the fact that John’s mathematics aptitude score is above average. Second, the width of the interval has been narrowed from 87.84 − 72.16 = 15.68 grade points to 92.61 − 81.39 = 11.22 grade points. In this sense, the information about John’s mathematics aptitude score has made us less uncertain about his grade in statistics.
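The two prediction intervals in Example 85.1 can be computed directly:

```python
import math

mu_y, s_y, rho = 80.0, 4.0, 0.7
mu_x, s_x = 550.0, 40.0

# Unconditional 95% interval for Y: 80 +/- 1.96 * 4
lo, hi = mu_y - 1.96 * s_y, mu_y + 1.96 * s_y

# Conditional on X = 650, eqs. (85.70)-(85.71)
m = mu_y + rho * (s_y / s_x) * (650 - mu_x)       # 41.5 + 0.07 * 650 = 87
sd = math.sqrt(s_y**2 * (1 - rho**2))             # sqrt(8.16), about 2.86
lo2, hi2 = m - 1.96 * sd, m + 1.96 * sd

print(round(lo, 2), round(hi, 2))                # 72.16 87.84
print(round(lo2, 2), round(hi2, 2))
print(round(hi - lo, 2), round(hi2 - lo2, 2))    # the conditional interval is narrower
```

The widths confirm the narrowing from 15.68 to about 11.2 grade points described above.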
85.9 American Call Options

85.9.1 Pricing American call options by the bivariate normal distribution

An option contract which can be exercised only on the expiration date is called a European call. If the contract of a call option can be exercised at any time during the option's contract period, then this kind of call option is called an American call. When a stock pays a dividend, the American call is more complex. Following Whaley (1981), the valuation equation for an American call option with one known dividend payment can be defined as
$$C(S, T; X) = S^x\left[N_1(b_1) + N_2\left(a_1, -b_1; -\sqrt{t/T}\right)\right] - Xe^{-rT}\left[N_1(b_2)e^{r(T-t)} + N_2\left(a_2, -b_2; -\sqrt{t/T}\right)\right] + De^{-rt}N_1(b_2), \quad (85.72a)$$
where
$$a_1 = \frac{\ln\left(\frac{S^x}{X}\right) + \left(r + \frac{1}{2}\sigma^2\right)T}{\sigma\sqrt{T}}, \qquad a_2 = a_1 - \sigma\sqrt{T}, \quad (85.72b)$$
$$b_1 = \frac{\ln\left(\frac{S^x}{S_t^*}\right) + \left(r + \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}}, \qquad b_2 = b_1 - \sigma\sqrt{t}, \quad (85.72c)$$
$$S^x = S - De^{-rt}. \quad (85.73)$$
Here $S^x$ is the stock price net of the present value of the promised dividend per share ($D$), and $t$ is the time at which the dividend is to be paid.
$S_t^*$ is the ex-dividend stock price for which
$$C(S_t^*, T - t) = S_t^* + D - X, \quad (85.74)$$
and $S$, $X$, $r$, $\sigma^2$, and $T$ have been defined previously in this chapter. $N_1(b_1)$ and $N_1(b_2)$ are cumulative univariate normal distribution functions; $N_2(a, b; \rho)$ is the cumulative bivariate normal distribution function with upper integral limits $a$ and $b$ and correlation coefficient $\rho = -\sqrt{t/T}$. An American call option on a non-dividend-paying stock will never optimally be exercised prior to expiration. Therefore, if no dividend payment exists, equation (85.72) reduces to the European option pricing model with no dividend payment. We have shown how the cumulative univariate normal distribution function can be used to evaluate the European call option in previous sections of this chapter. If a common stock pays a discrete dividend during the option's life, the American call option valuation equation requires the evaluation of a cumulative bivariate normal distribution function. While there are many available approximations for the cumulative bivariate normal distribution, the approximation provided here relies on Gaussian quadratures. The approach is straightforward and efficient, and its maximum absolute error is 0.00000055.
Following equation (85.66), the probability that $X'$ is less than $a$ and that $Y'$ is less than $b$ for the standardized cumulative bivariate normal distribution is
$$P(X' < a, Y' < b) = \frac{1}{2\pi\sqrt{1-\rho^2}}\int_{-\infty}^{a}\int_{-\infty}^{b}\exp\left[-\frac{x'^2 - 2\rho x'y' + y'^2}{2(1-\rho^2)}\right]dy'\,dx',$$
where $x' = \frac{x - \mu_x}{\sigma_x}$, $y' = \frac{y - \mu_y}{\sigma_y}$, and $\rho$ is the correlation between the random variables $x'$ and $y'$.
The first step in the approximation of the bivariate normal probability $N_2(a, b; \rho)$ is
$$\phi(a, b; \rho) \approx 0.31830989\sqrt{1-\rho^2}\sum_{i=1}^{5}\sum_{j=1}^{5}w_i w_j f(x_i', x_j'), \quad (85.75)$$
where
$$f(x_i', x_j') = \exp\left[a_1(2x_i' - a_1) + b_1(2x_j' - b_1) + 2\rho(x_i' - a_1)(x_j' - b_1)\right].$$
The pairs of weights ($w$) and corresponding abscissa values ($x'$) are:

i, j    w              x'
1       0.24840615     0.10024215
2       0.39233107     0.48281397
3       0.21141885     1.0609498
4       0.033246660    1.7797294
5       0.00082485334  2.6697604

Note: This portion is based upon Appendix 13.1 of Stoll, H. R. and R. E. Whaley, Futures and Options, Cincinnati, OH: South-Western Publishing, 1993.
and the coefficients $a_1$ and $b_1$ are computed using
$$a_1 = \frac{a}{\sqrt{2(1-\rho^2)}} \quad \text{and} \quad b_1 = \frac{b}{\sqrt{2(1-\rho^2)}}.$$
The second step in the approximation involves computing the product $ab\rho$; if $ab\rho \leq 0$, compute the bivariate normal probability, $N_2(a, b; \rho)$, using the following rules:
(1) If $a \leq 0$, $b \leq 0$, and $\rho \leq 0$, then $N_2(a, b; \rho) = \phi(a, b; \rho)$;
(2) If $a \leq 0$, $b \geq 0$, and $\rho > 0$, then $N_2(a, b; \rho) = N_1(a) - \phi(a, -b; -\rho)$;
(3) If $a \geq 0$, $b \leq 0$, and $\rho > 0$, then $N_2(a, b; \rho) = N_1(b) - \phi(-a, b; -\rho)$;
(4) If $a \geq 0$, $b \geq 0$, and $\rho \leq 0$, then $N_2(a, b; \rho) = N_1(a) + N_1(b) - 1 + \phi(-a, -b; \rho)$. $\quad (85.76)$
If $ab\rho > 0$, compute the bivariate normal probability, $N_2(a, b; \rho)$, as
$$N_2(a, b; \rho) = N_2(a, 0; \rho_{ab}) + N_2(b, 0; \rho_{ba}) - \delta, \quad (85.77)$$
where the values of $N_2(\cdot)$ on the right-hand side are computed from the rules for $ab\rho \leq 0$,
$$\rho_{ab} = \frac{(\rho a - b)\,\mathrm{Sgn}(a)}{\sqrt{a^2 - 2\rho ab + b^2}}, \qquad \rho_{ba} = \frac{(\rho b - a)\,\mathrm{Sgn}(b)}{\sqrt{a^2 - 2\rho ab + b^2}}, \qquad \delta = \frac{1 - \mathrm{Sgn}(a)\,\mathrm{Sgn}(b)}{4},$$
$$\mathrm{Sgn}(x) = \begin{cases} 1, & x \geq 0, \\ -1, & x < 0, \end{cases}$$
and $N_1(d)$ is the cumulative univariate normal probability.
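The two-step approximation above can be sketched directly. The weights and abscissas are those in the table; the routine names `n1`, `phi`, and `n2` are ours:

```python
import math

W = [0.24840615, 0.39233107, 0.21141885, 0.033246660, 0.00082485334]
X = [0.10024215, 0.48281397, 1.0609498, 1.7797294, 2.6697604]

def n1(d):
    """Cumulative univariate normal probability N1(d)."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def phi(a, b, rho):
    """Gaussian-quadrature approximation, eq. (85.75); used when a<=0, b<=0, rho<=0."""
    a1 = a / math.sqrt(2.0 * (1.0 - rho**2))
    b1 = b / math.sqrt(2.0 * (1.0 - rho**2))
    s = sum(W[i] * W[j] *
            math.exp(a1 * (2*X[i] - a1) + b1 * (2*X[j] - b1)
                     + 2.0 * rho * (X[i] - a1) * (X[j] - b1))
            for i in range(5) for j in range(5))
    return 0.31830989 * math.sqrt(1.0 - rho**2) * s

def n2(a, b, rho):
    """Cumulative bivariate normal N2(a, b; rho), rules (85.76)-(85.77)."""
    if a * b * rho <= 0.0:
        if a <= 0 and b <= 0 and rho <= 0:
            return phi(a, b, rho)
        if a <= 0 and b >= 0 and rho > 0:
            return n1(a) - phi(a, -b, -rho)
        if a >= 0 and b <= 0 and rho > 0:
            return n1(b) - phi(-a, b, -rho)
        return n1(a) + n1(b) - 1.0 + phi(-a, -b, rho)   # a>=0, b>=0, rho<=0
    sgn = lambda x: 1.0 if x >= 0 else -1.0
    den = math.sqrt(a*a - 2.0*rho*a*b + b*b)
    rho_ab = (rho*a - b) * sgn(a) / den
    rho_ba = (rho*b - a) * sgn(b) / den
    delta = (1.0 - sgn(a)*sgn(b)) / 4.0
    return n2(a, 0.0, rho_ab) + n2(b, 0.0, rho_ba) - delta

# Sanity checks against the exact identity N2(0, 0; rho) = 1/4 + arcsin(rho)/(2*pi)
assert abs(n2(0.0, 0.0, 0.0) - 0.25) < 1e-5
assert abs(n2(0.0, 0.0, 0.5) - (0.25 + math.asin(0.5) / (2*math.pi))) < 1e-4
```

The recursion in the $ab\rho > 0$ branch always terminates, because the right-hand-side terms of (85.77) have a zero argument and therefore fall into the $ab\rho \leq 0$ rules.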
85.9.2 Pricing an American call option: An example

An American call option whose exercise price is \$48 has an expiration time of 90 days. Assume the risk-free rate of interest is 8% annually, the underlying price is \$50, the standard deviation of the rate of return of the stock is 20%, and the stock pays a dividend of \$2 in exactly 50 days. (a) What is the European call value? (b) Can the early exercise be predicted? (c) What is the value of the American call?

(a) The current stock price net of the present value of the promised dividend is
$$S^x = 50 - 2e^{-0.08(50/365)} = 48.0218.$$
The European call value can be calculated as
$$C = (48.0218)N(d_1) - 48e^{-0.08(90/365)}N(d_2),$$
where
$$d_1 = \frac{\ln(48.0218/48) + (0.08 + 0.5(0.20)^2)(90/365)}{0.20\sqrt{90/365}} = 0.25285,$$
$$d_2 = 0.25285 - 0.09931 = 0.15354.$$
From the standard normal table, we obtain
$$N(0.25285) = 0.5 + 0.0998 = 0.599809, \qquad N(0.15354) = 0.5 + 0.0610 = 0.561014.$$
So the European call value is
$$C = (48.0218)(0.599809) - 48(0.98047)(0.561014) = 2.40123.$$
(b) The present value of the interest income that would be earned by deferring exercise until expiration is
$$X(1 - e^{-r(T-t)}) = 48(1 - e^{-0.08(90-50)/365}) = 48(1 - 0.99127) = 0.419.$$
Since $D = 2 > 0.419$, early exercise is not precluded.
(c) The value of the American call is now calculated as
$$C = 48.0218\left[N_1(b_1) + N_2\left(a_1, -b_1; -\sqrt{50/90}\right)\right] - 48e^{-0.08(90/365)}\left[N_1(b_2)e^{0.08(40/365)} + N_2\left(a_2, -b_2; -\sqrt{50/90}\right)\right] + 2e^{-0.08(50/365)}N_1(b_2), \quad (85.78)$$
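Parts (a) and (b) of the example can be reproduced numerically (function and variable names are ours):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

S, X, r, sigma, D = 50.0, 48.0, 0.08, 0.20, 2.0
T, t = 90 / 365, 50 / 365                 # expiration and dividend dates, in years

Sx = S - D * math.exp(-r * t)             # eq. (85.73): stock net of PV(dividend)
d1 = (math.log(Sx / X) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
d2 = d1 - sigma * math.sqrt(T)
c = Sx * norm_cdf(d1) - X * math.exp(-r * T) * norm_cdf(d2)

print(round(Sx, 4))    # 48.0218
print(round(c, 3))     # 2.401, matching part (a)

# (b) Interest earned by deferring exercise, versus the dividend
deferral = X * (1 - math.exp(-r * (T - t)))
print(round(deferral, 3), D > deferral)   # dividend exceeds it: early exercise possible
```

The tiny difference from the text's 2.40123 comes from rounding the table values of $N(d_1)$ and $N(d_2)$.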
Since both $b_1$ and $b_2$ depend on the critical ex-dividend stock price $S_t^*$, which can be determined by
$$C(S_t^*, 40/365; 48) = S_t^* + 2 - 48,$$
we solve by trial and error and find that $S_t^* = 46.9641$. An Excel program used to calculate this value is presented in Table 85.1.

Table 85.1: Calculation of $S_t^*$ (critical ex-dividend stock price).

S* (critical ex-dividend stock price)          46        46.962    46.963    46.9641    46.9      47
X (exercise price of option)                   48        48        48        48         48        48
r (risk-free interest rate)                    0.08      0.08      0.08      0.08       0.08      0.08
volatility of stock                            0.2       0.2       0.2       0.2        0.2       0.2
T − t (expiration date − exercise date)        0.10959   0.10959   0.10959   0.10959    0.10959   0.10959
d1                                             −0.4773   −0.1647   −0.1644   −0.164     −0.1846   −0.1525
d2                                             −0.5435   −0.2309   −0.2306   −0.2302    −0.2508   −0.2187
D (dividend)                                   2         2         2         2          2         2
c (value of European call option to buy
  one share)                                   0.60263   0.96385   0.96362   0.9641     0.93649   0.9798
p (value of European put option to sell
  one share)                                   2.18365   1.58221   1.58164   1.58102    1.61751   1.56081
C(S*t, T − t; X) − S*t − D + X                 0.60263   0.00185   0.00062   2.3E−06    0.03649   −0.0202

The residual in the last row is essentially zero at $S_t^* = 46.9641$, confirming the trial-and-error solution. The Excel formulas behind Table 85.1 are as follows (row numbers refer to the worksheet; column C holds the inputs):

Row 3   S* (critical ex-dividend stock price)   46
Row 4   X (exercise price of option)            48
Row 5   r (risk-free interest rate)             0.08
Row 6   volatility of stock                     0.2
Row 7   T − t                                   =(90−50)/365
Row 8   d1      =(LN(C3/C4)+(C5+C6^2/2)*(C7))/(C6*SQRT(C7))
Row 9   d2      =(LN(C3/C4)+(C5−C6^2/2)*(C7))/(C6*SQRT(C7))
Row 10  D (dividend)                            2
Row 12  c       =C3*NORMSDIST(C8)−C4*EXP(−C5*C7)*NORMSDIST(C9)
Row 13  p       =C4*EXP(−C5*C7)*NORMSDIST(−C9)−C3*NORMSDIST(−C8)
Row 15  C(S*t, T−t; X) − S*t − D + X            =C12−C3−C10+C4

Note: Row and column labels refer to the cells of the Excel worksheet.
Substituting $S^x = 48.0218$, $X = 48$, and $S_t^*$ into equations (85.72b) and (85.72c), we can calculate $a_1$, $a_2$, $b_1$, and $b_2$ as follows:
$$a_1 = d_1 = 0.25285, \qquad a_2 = d_2 = 0.15354,$$
$$b_1 = \frac{\ln\left(\frac{48.0218}{46.9641}\right) + \left(0.08 + \frac{0.2^2}{2}\right)\frac{50}{365}}{(0.20)\sqrt{50/365}} = 0.4859,$$
$$b_2 = 0.48593 - 0.07402 = 0.4119.$$
In addition, we also know $\rho = -\sqrt{50/90} = -0.7454$. From the above information, we now calculate the related normal probabilities as follows:
$$N_1(b_1) = N_1(0.4859) = 0.6865, \qquad N_1(b_2) = N_1(0.4119) = 0.6598.$$
Following equation (85.77), we now calculate the values of $N_2(0.25285, -0.4859; -0.7454)$ and $N_2(0.15354, -0.4119; -0.7454)$. Since $ab\rho > 0$ for both cumulative bivariate normal distribution functions, we use $N_2(a, b; \rho) = N_2(a, 0; \rho_{ab}) + N_2(b, 0; \rho_{ba}) - \delta$. For the first one,
$$\rho_{ab} = \frac{[(-0.7454)(0.25285) + 0.4859](1)}{\sqrt{(0.25285)^2 - 2(-0.7454)(0.25285)(-0.4859) + (0.4859)^2}} = 0.87002,$$
$$\rho_{ba} = \frac{[(-0.7454)(-0.4859) - 0.25285](-1)}{\sqrt{(0.25285)^2 - 2(-0.7454)(0.25285)(-0.4859) + (0.4859)^2}} = -0.31979,$$
$$\delta = \frac{1 - (1)(-1)}{4} = \frac{1}{2}.$$
Hence,
$$N_2(0.25285, -0.4859; -0.7454) = N_2(0.25285, 0; 0.87002) + N_2(-0.4859, 0; -0.31979) - 0.5 = 0.07525.$$
Using the Microsoft Excel program presented in Appendix 85A, we obtain
$$N_2(0.15354, -0.4119; -0.7454) = 0.06862.$$
Then, substituting the related information into equation (85.78), we obtain $C = \$3.08238$; all related results are presented in Appendix 85B.

85.10 Price Bounds for Options

85.10.1 Options written on non-dividend-paying stocks

To derive the lower price bounds and the put–call parity relations for options on non-dividend-paying stocks, simply set the cost-of-carry rate, $b$, equal to the riskless rate of interest, $r$. Note that the only cost of carrying the stock is interest. The lower price bounds for the European call and put options are
$$c(S, T; X) \geq \max[0, S - Xe^{-rT}], \quad (85.79a)$$
and
$$p(S, T; X) \geq \max[0, Xe^{-rT} - S], \quad (85.79b)$$
respectively, and the lower price bounds for the American call and put options are
$$C(S, T; X) \geq \max[0, S - Xe^{-rT}], \quad (85.80a)$$
and
$$P(S, T; X) \geq \max[0, Xe^{-rT} - S], \quad (85.80b)$$
respectively. The put–call parity relation for non-dividend-paying European stock options is
$$c(S, T; X) - p(S, T; X) = S - Xe^{-rT}, \quad (85.81a)$$
and the put–call parity relation for American options on non-dividend-paying stocks is
$$S - X \leq C(S, T; X) - P(S, T; X) \leq S - Xe^{-rT}. \quad (85.81b)$$
For non-dividend-paying stock options, the American call option will not rationally be exercised early, while the American put option may be.
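Relation (85.81a) and the lower bounds (85.79a)–(85.79b) can be verified with Black–Scholes European prices; a sketch (the helper names `bs_call` and `bs_put` are ours):

```python
import math

def _ncdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, X, r, sigma, T):
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * _ncdf(d1) - X * math.exp(-r * T) * _ncdf(d2)

def bs_put(S, X, r, sigma, T):
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return X * math.exp(-r * T) * _ncdf(-d2) - S * _ncdf(-d1)

S, X, r, sigma, T = 50.0, 48.0, 0.08, 0.2, 0.25
lhs = bs_call(S, X, r, sigma, T) - bs_put(S, X, r, sigma, T)
rhs = S - X * math.exp(-r * T)
assert abs(lhs - rhs) < 1e-10          # put-call parity, eq. (85.81a)

# Both prices respect their lower bounds (85.79a)-(85.79b):
assert bs_call(S, X, r, sigma, T) >= max(0.0, S - X * math.exp(-r * T))
assert bs_put(S, X, r, sigma, T) >= max(0.0, X * math.exp(-r * T) - S)
```

Parity holds exactly for any European model that prices both options consistently, since $N(d) + N(-d) = 1$.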
85.10.2 Options written on dividend-paying stocks

If dividends are paid during the option's life, the above relations must reflect the stock's drop in value when the dividends are paid. To manage this modification, we assume that the underlying stock pays a single dividend during the option's life at a time that is known with certainty. The dividend amount is $D$ and the time to ex-dividend is $t$. If the amount and the timing of the dividend payment are known, the lower price bound for the European call option on a stock is
$$c(S, T; X) \geq \max[0, S - De^{-rt} - Xe^{-rT}]. \quad (85.82a)$$
In this relation, the current stock price is reduced by the present value of the promised dividend. Because a European-style option cannot be exercised before maturity, the call option holder has no opportunity to exercise the option while the stock is selling cum dividend. In other words, to the call option holder, the current value of the underlying stock is its observed market price less the amount that the promised dividend contributes to the current stock value, that is, $S - De^{-rt}$. To prove this pricing relation, we use the same arbitrage transactions, except we use the reduced stock price $S - De^{-rt}$ in place of $S$. The lower price bound for the European put option on a stock is
$$p(S, T; X) \geq \max[0, Xe^{-rT} - (S - De^{-rt})]. \quad (85.82b)$$
Again, the stock price is reduced by the present value of the promised dividend. Unlike the call option case, however, this serves to increase the lower price bound of the European put option. Because the put option is the right to sell the underlying stock at a fixed price, a discrete drop in the stock price, such as that induced by the payment of a dividend, serves to increase the value of the option. An arbitrage proof of this relation is straightforward when the stock price, net of the present value of the dividend is used in place of the commodity price. The lower price bounds for American stock options are slightly more complex. In the case of the American call option, for example, it may be optimal to exercise just prior to the dividend payment because the stock price falls by an amount D when the dividend is paid. The lower price bound of an American call option expiring at the ex-dividend instant would be 0 or S − Xe−rt , whichever is greater. On the other hand, it may be optimal to wait until the call option’s expiration to exercise. The lower price bound for a call option expiring normally is (85.82a). Combining the two results,
Figure 85.3: American call option price as a function of the ex-dividend stock price immediately prior to the ex-dividend instant. Early exercise may be optimal.
we get
$$C(S, T; X) \geq \max[0, S - Xe^{-rt}, S - De^{-rt} - Xe^{-rT}]. \quad (85.83a)$$
The last two terms on the right-hand side of (85.83a) provide important guidance in deciding whether to exercise the American call option early, just prior to the ex-dividend instant. The second term in the squared brackets is the present value of the early exercise proceeds of the call. If this amount is less than the lower price bound of the call that expires normally, that is, if
$$S - Xe^{-rt} \leq S - De^{-rt} - Xe^{-rT}, \quad (85.84)$$
the American call option will not be exercised just prior to the ex-dividend instant. To see why, simply rewrite (85.84) so it reads
$$D < X[1 - e^{-r(T-t)}].$$
In other words, the American call will not be exercised early if the dividend captured by exercising prior to the ex-dividend date is less than the interest implicitly earned by deferring exercise until expiration. Figure 85.3 depicts a case in which early exercise could occur at the ex-dividend instant, t. Just prior to ex-dividend, the call option may be exercised yielding proceeds St + D − X, where St is the ex-dividend stock price. An instant later, the option is left unexercised with value c(St , T − t; X), where c(•) is the European call option formula. Thus, if the ex-dividend stock price, St is above the critical ex-dividend stock price where the two functions intersect, St∗ , the option holder will choose to exercise his or her option early just prior to the ex-dividend instant. On the other hand, if St ≤ St∗ , the option holder will choose to leave her position open until the option’s expiration.
Figure 85.4: American call option price as a function of the ex-dividend stock price immediately prior to the ex-dividend instant. Early exercise will not be optimal.
Figure 85.4 depicts a case in which early exercise will not occur at the ex-dividend instant, $t$. Early exercise will not occur if the functions $S_t + D - X$ and $c(S_t, T - t; X)$ do not intersect, as is depicted in Figure 85.4. In this case, the lower boundary condition of the European call, $S_t - Xe^{-r(T-t)}$, lies above the early exercise proceeds, $S_t + D - X$, and hence the call option will not be exercised early. Stated explicitly, early exercise is not rational if
$$S_t + D - X < S_t - Xe^{-r(T-t)}.$$
This condition for no early exercise is the same as (85.84), where $S_t$ is the ex-dividend stock price and the investor is standing at the ex-dividend instant, $t$. The condition can also be written as
$$D < X[1 - e^{-r(T-t)}]. \quad (85.85)$$
In words, if the ex-dividend stock price decline — the dividend — is less than the present value of the interest income that would be earned by deferring exercise until expiration, early exercise will not occur. When condition (85.85) is met, the value of the American call is simply the value of the corresponding European call. The lower price bound of an American put option is somewhat different. In the absence of a dividend, an American put may be exercised early. In the presence of a dividend payment, however, there is a period just prior to the ex-dividend date when early exercise is suboptimal. In that period, the interest earned on the exercise proceeds of the option is less than the drop in the stock price from the payment of the dividend. If tn represents a time prior to the dividend payment at time t, early exercise is suboptimal, where (X − S)e−r(t−tn ) is less than (X − S + D). Rearranging, early exercise
page 2957
July 6, 2020
15:53
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch85
C. F. Lee
2958
will not occur between tn and t if1
ln 1 +
D X−S
. (85.86) r Early exercise will become a possibility again immediately after the dividend is paid. Overall, the lower price bound of the American put option is tn > t −
P(S, T; X) ≥ max[0, X − (S − De^{−rt})].  (85.83b)
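As a quick numerical check, conditions (85.85) and (85.86) can be evaluated directly. The sketch below reuses the numbers from the footnoted example (X = 50, S = 40, D = 1, t = 0.25, r = 0.10); the expiration date T = 0.5 is a hypothetical value added only for illustration of the call condition.

```python
import math

# Numbers from the footnoted example; T is an assumed expiration date.
X, S, D, t, r = 50.0, 40.0, 1.0, 0.25, 0.10
T = 0.50

# (85.85): the American call is not exercised early at the ex-dividend
# instant t when the dividend is below the interest on the exercise price.
call_no_early_exercise = D < X * (1.0 - math.exp(-r * (T - t)))

# (85.86): early exercise of the American put is suboptimal for
# t_n > t - ln(1 + D/(X - S))/r.
t_n_threshold = t - math.log(1.0 + D / (X - S)) / r
print(call_no_early_exercise, round(t_n_threshold, 4))
```

A negative threshold, as here, reproduces the footnote's conclusion that early exercise is precluded for the whole current dividend period.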
Put–call parity for European options on dividend-paying stocks also reflects the fact that the current stock price is deflated by the present value of the promised dividend, that is,

c(S, T; X) − p(S, T; X) = S − De^{−rt} − Xe^{−rT}.  (85.87)
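Relation (85.87) can be verified numerically; a minimal sketch, with hypothetical parameters (one dividend D paid at time t, option expiring at T), follows.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Hypothetical parameters chosen only for illustration.
S, X, r, sigma, T, t, D = 50.0, 50.0, 0.05, 0.20, 0.5, 0.25, 1.0

S_adj = S - D * math.exp(-r * t)   # stock price deflated by the dividend's PV
d1 = (math.log(S_adj / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
d2 = d1 - sigma * math.sqrt(T)
c = S_adj * norm_cdf(d1) - X * math.exp(-r * T) * norm_cdf(d2)
p = X * math.exp(-r * T) * norm_cdf(-d2) - S_adj * norm_cdf(-d1)

lhs = c - p
rhs = S - D * math.exp(-r * t) - X * math.exp(-r * T)   # right side of (85.87)
print(round(lhs, 6), round(rhs, 6))
```

The two sides agree to floating-point precision, since both European prices are computed on the dividend-adjusted stock price.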
That the presence of the dividend reduces the value of the call and increases the value of the put is again reflected here by the fact that the term on the right-hand side of (85.87) is smaller than it would be if the stock paid no dividend. Put–call parity for American options on dividend-paying stocks is represented by a pair of inequalities, that is,

S − De^{−rt} − X ≤ C(S, T; X) − P(S, T; X) ≤ S − De^{−rt} − Xe^{−rT}.  (85.88)

To prove the put–call parity relation (85.88), we consider each inequality in turn. The left-hand side condition of (85.88) can be derived by considering the values of a portfolio that consists of buying a call, selling a put, selling the stock, and lending X + De^{−rt} risklessly. Table 85.2 contains these portfolio values. In Table 85.2, it can be seen that, if all of the security positions stay open until expiration, the terminal value of the portfolio will be positive, independent of whether the terminal stock price is above or below the exercise price of the options. If the terminal stock price is above the exercise price, the call option is exercised, and the stock acquired at exercise price X is used to deliver, in part, against the short stock position. If the terminal stock price is below the exercise price, the put is exercised. The stock received in the

¹It is possible that the dividend payment is so large that early exercise prior to the dividend payment is completely precluded. For example, consider the case where X = 50, S = 40, D = 1, t = 0.25, and r = 0.10. Early exercise is precluded if tn > 0.25 − ln[1 + 1/(50 − 40)]/0.10 = −0.7031. Because the value is negative, the implication is that at every time during the current dividend period (i.e., from 0 to t) it pays the American put option holder to wait until the dividend is paid before exercising his option.
Statistical Distributions, European Option, American Option, and Option Bounds

Table 85.2: Arbitrage transactions for establishing put–call parity for American stock options: S − De^{−rt} − X ≤ C(S, T; X) − P(S, T; X).

Position | Initial value | Ex-dividend day (t) | Put exercised early (γ) | Put exercised normally (T): S̃T ≤ X | S̃T > X
Buy American call | −C | | C̃γ | 0 | S̃T − X
Sell American put | P | | −(X − S̃γ) | −(X − S̃T) | 0
Sell stock | S | −D | −S̃γ | −S̃T | −S̃T
Lend De^{−rt} | −De^{−rt} | D | | |
Lend X | −X | | Xe^{rγ} | Xe^{rT} | Xe^{rT}
Net portfolio value | −C + P + S − De^{−rt} − X | 0 | C̃γ + X(e^{rγ} − 1) | X(e^{rT} − 1) | X(e^{rT} − 1)
exercise of the put is used to cover the short stock position established at the outset. In the event the put is exercised early at time γ, the investment in the riskless bonds is more than sufficient to cover the payment of the exercise price to the put option holder, and the stock received from the exercise of the put is used to cover the stock sold when the portfolio was formed. In addition, an open call option position that may still have value remains. In other words, by forming the portfolio of securities in the proportions noted above, we have formed a portfolio that will never have a negative future value. If the future value is certain to be non-negative, the initial value must be non-positive, so the left-hand inequality of (85.88) holds. The right-hand side of (85.88) may be derived by considering the portfolio used to prove European put–call parity. Table 85.2 contains the arbitrage portfolio transactions. In this case, the terminal value of the portfolio is certain to equal zero, should the option positions stay open until that time. In the event the American call option holder decides to exercise the call option early, the portfolio holder uses his long stock position to cover his stock obligation on the exercised call and uses the exercise proceeds to retire his outstanding debt. After these actions are taken, the portfolio holder still has an open long put position and cash in the amount of X[1 − e^{−r(T−t)}]. Since the portfolio is certain to have non-negative outcomes, the initial value must be non-positive, so the right-hand inequality of (85.88) must hold. Option bounds determination is important in option pricing. Ritchken (1985) used a linear programming approach to derive option bounds, and Levy (1985) used a stochastic dominance approach. For other methods and applications of option bounds determination, see Ritchken and Kuo (1988, 1989) and Lee et al. (2002, 2019).
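The interval implied by (85.88) is straightforward to compute; a small sketch, reusing the parameter values of the Excel example in Appendix 85B:

```python
import math

# Bounds (85.88) for American options on a dividend-paying stock:
# S - D e^{-rt} - X <= C - P <= S - D e^{-rt} - X e^{-rT}.
S, X, r, T, t_div, D = 50.0, 48.0, 0.08, 0.24658, 0.13699, 2.0
pv_div = D * math.exp(-r * t_div)
lower = S - pv_div - X
upper = S - pv_div - X * math.exp(-r * T)
print(round(lower, 4), round(upper, 4))
```

Any pair of American call and put prices on this stock must have a difference C − P inside this interval, else an arbitrage of the kind tabulated above exists.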
85.11 Summary

In this chapter, we first introduced the univariate and multivariate normal distributions and the log-normal distribution, and showed how the normal distribution can be used to approximate the binomial distribution. We then used the concepts of normal and log-normal distributions to derive the Black–Scholes formula under the assumption that investors are risk-neutral. We also reviewed the basic concept of the bivariate normal density function and presented the bivariate normal CDF. The theory of the American call stock option pricing model with one dividend payment was presented, and the evaluation of stock option models without and with dividend payments was discussed. Furthermore, we provided an Excel program for evaluating the American option pricing model with one dividend payment. Finally, we discussed option bounds determination and briefly reviewed some important related literature.
Bibliography

T. W. Anderson (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. New York: Wiley-Interscience.
J. C. Cox and S. A. Ross (1976). The valuation of options for alternative stochastic processes. Journal of Financial Economics, 3, 145–166.
J. Cox, S. Ross, and M. Rubinstein (1979). Option pricing: A simplified approach. Journal of Financial Economics, 7, 229–263.
J. Hull (2017). Options, Futures, and Other Derivatives, 10th ed. Upper Saddle River, NJ: Prentice Hall.
N. L. Johnson and S. Kotz (2000). Distributions in Statistics: Continuous Multivariate Distributions, 2nd ed. New York: Wiley.
N. L. Johnson and S. Kotz (1994). Distributions in Statistics: Continuous Univariate Distributions 2, 2nd ed. New York: Wiley.
C. F. Lee, Z. Zhong, T. Tai, and H. Chuang (2019). Alternative methods for determining option bounds: A review and comparison. Chapter 24 in Handbook of Financial Econometrics, Mathematics, Statistics, and Machine Learning, edited by Cheng Few Lee and John Lee. Singapore: World Scientific.
C. F. Lee, P. Zhang, and A. C. Lee (2002). Bounds for options prices and the expected payoffs with skewness and kurtosis. Advances in Quantitative Analysis of Finance and Accounting, 10, 117–138.
C. F. Lee, J. Lee, J. R. Chang, and T. Tai (2016). Essentials of Excel, Excel VBA, SAS and Minitab for Statistical and Financial Analyses. New York: Springer.
C. F. Lee, H. Y. Chen, and J. Lee (2019). Financial Econometrics, Mathematics, Statistics: Theory, Method, and Application. New York: Springer, forthcoming.
H. Levy (1985). Upper and lower bounds of put and call values: Stochastic dominance approach. Journal of Finance, 40, 1197–1218.
R. L. McDonald (2012). Derivatives Markets, 3rd ed. Boston, MA: Addison Wesley.
P. H. Ritchken (1985). On option pricing bounds. Journal of Finance, 40, 1219–1233.
P. H. Ritchken and S. Kuo (1988). Option bounds with finite revision opportunities. Journal of Finance, 43, 301–308.
P. Ritchken and S. Kuo (1989). On stochastic dominance and decreasing absolute risk averse option pricing bounds. Management Science, 35, 51–59.
M. Rubinstein (1976). The valuation of uncertain income streams and the pricing of options. Bell Journal of Economics and Management Science, 7, 407–425.
H. R. Stoll (1969). The relationship between put and call option prices. Journal of Finance, 24, 801–824.
R. E. Whaley (1981). On the valuation of American call options on stocks with known dividends. Journal of Financial Economics, 9, 207–211.
Appendix 85A: Microsoft Excel Program for Calculating Cumulative Bivariate Normal Density Function

Option Explicit

Public Function Bivarncdf(a As Double, b As Double, rho As Double) As Double
    Dim rho_ab As Double, rho_ba As Double
    Dim delta As Double
    If (a * b * rho) <= 0 Then
        ' Boundary cases (a * b * rho <= 0): standard Drezner (1978)
        ' reductions; Application.NormSDist is Excel's NORMSDIST.
        If (a <= 0 And b <= 0 And rho <= 0) Then
            Bivarncdf = Phi(a, b, rho)
        ElseIf (a <= 0 And b >= 0 And rho >= 0) Then
            Bivarncdf = Application.NormSDist(a) - Phi(a, -b, -rho)
        ElseIf (a >= 0 And b <= 0 And rho >= 0) Then
            Bivarncdf = Application.NormSDist(b) - Phi(-a, b, -rho)
        Else
            Bivarncdf = Application.NormSDist(a) + Application.NormSDist(b) - 1 + Phi(-a, -b, rho)
        End If
    ElseIf (a * b * rho) > 0 Then
        rho_ab = ((rho * a - b) * IIf(a >= 0, 1, -1)) / Sqr(a ^ 2 - 2 * rho * a * b + b ^ 2)
        rho_ba = ((rho * b - a) * IIf(b >= 0, 1, -1)) / Sqr(a ^ 2 - 2 * rho * a * b + b ^ 2)
        delta = (1 - IIf(a >= 0, 1, -1) * IIf(b >= 0, 1, -1)) / 4
        Bivarncdf = Bivarncdf(a, 0, rho_ab) + Bivarncdf(b, 0, rho_ba) - delta
    End If
End Function

Public Function Phi(a As Double, b As Double, rho As Double) As Double
    Dim a1 As Double, b1 As Double
    Dim w(5) As Double, x(5) As Double
    Dim i As Integer, j As Integer
    Dim doublesum As Double
    a1 = a / Sqr(2 * (1 - rho ^ 2))
    b1 = b / Sqr(2 * (1 - rho ^ 2))

    w(1) = 0.24840615
    w(2) = 0.39233107
    w(3) = 0.21141885
    w(4) = 0.03324666
    w(5) = 0.00082485334

    x(1) = 0.10024215
    x(2) = 0.48281397
    x(3) = 1.0609498
    x(4) = 1.7797294
    x(5) = 2.6697604

    doublesum = 0
    For i = 1 To 5
        For j = 1 To 5
            doublesum = doublesum + w(i) * w(j) * Exp(a1 * (2 * x(i) - a1) _
                + b1 * (2 * x(j) - b1) + 2 * rho * (x(i) - a1) * (x(j) - b1))
        Next j
    Next i
    Phi = 0.31830989 * Sqr(1 - rho ^ 2) * doublesum
End Function
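For readers working outside Excel, the routine above translates directly; a Python sketch (same Drezner quadrature weights; `norm_cdf` replaces NORMSDIST) follows.

```python
import math

# Drezner (1978) quadrature weights and abscissas, as in the VBA listing.
W = [0.24840615, 0.39233107, 0.21141885, 0.03324666, 0.00082485334]
Y = [0.10024215, 0.48281397, 1.0609498, 1.7797294, 2.6697604]

def norm_cdf(x):
    # Standard normal CDF (replaces Excel's NORMSDIST).
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(a, b, rho):
    # Drezner's base approximation; valid for a <= 0, b <= 0, rho <= 0.
    a1 = a / math.sqrt(2.0 * (1.0 - rho * rho))
    b1 = b / math.sqrt(2.0 * (1.0 - rho * rho))
    s = 0.0
    for i in range(5):
        for j in range(5):
            s += W[i] * W[j] * math.exp(
                a1 * (2.0 * Y[i] - a1) + b1 * (2.0 * Y[j] - b1)
                + 2.0 * rho * (Y[i] - a1) * (Y[j] - b1))
    return 0.31830989 * math.sqrt(1.0 - rho * rho) * s

def bivarncdf(a, b, rho):
    # Cumulative bivariate normal probability N2(a, b; rho); |rho| < 1.
    if a * b * rho <= 0.0:
        if a <= 0.0 and b <= 0.0 and rho <= 0.0:
            return phi(a, b, rho)
        if a <= 0.0 and b >= 0.0 and rho >= 0.0:
            return norm_cdf(a) - phi(a, -b, -rho)
        if a >= 0.0 and b <= 0.0 and rho >= 0.0:
            return norm_cdf(b) - phi(-a, b, -rho)
        return norm_cdf(a) + norm_cdf(b) - 1.0 + phi(-a, -b, rho)
    # a * b * rho > 0: reduce to calls with one zero argument.
    denom = math.sqrt(a * a - 2.0 * rho * a * b + b * b)
    rho_ab = (rho * a - b) * math.copysign(1.0, a) / denom
    rho_ba = (rho * b - a) * math.copysign(1.0, b) / denom
    delta = (1.0 - math.copysign(1.0, a) * math.copysign(1.0, b)) / 4.0
    return bivarncdf(a, 0.0, rho_ab) + bivarncdf(b, 0.0, rho_ba) - delta

# Spot check against a value in Appendix 85B: N2(0.25285, -0.48593; -0.74536).
print(round(bivarncdf(0.25285, -0.48593, -0.74536), 5))
```

The recursion in the last branch always bottoms out, because each recursive call has b = 0 and therefore falls into the boundary cases.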
Appendix 85B: Microsoft Excel Program for Calculating the American Call Options

Number* | A | B | C*
1 | Option Pricing Calculation | |
3 | S (current stock price) = | 50 |
4 | St* (critical ex-dividend stock price) = | 46.9641 |
5 | S (current stock price − NPV of promised dividend) = | 48.0218 | =B3-B11*EXP(-B7*B10)
6 | X (exercise price of option) = | 48 |
7 | r (risk-free interest rate) = | 0.08 |
8 | σ (volatility of stock) = | 0.2 |
9 | T (expiration date) = | 0.24658 |
10 | t (exercise date) = | 0.13699 |
11 | D (dividend) = | 2 |
12 | d1 (nondividend-paying) = | 0.65933 | =(LN(B3/B6)+(B7+0.5*B8^2)*B9)/(B8*SQRT(B9))
13 | d2 (nondividend-paying) = | 0.56001 | =B12-B8*SQRT(B9)
14 | d1* (critical ex-dividend stock price) = | −0.16401 | =(LN(B4/B6)+(B7+0.5*B8^2)*(B9-B10))/(B8*SQRT(B9-B10))
15 | d2* (critical ex-dividend stock price) = | −0.23022 | =B14-B8*SQRT(B9-B10)
16 | d1 (dividend-paying) = | 0.25285 | =(LN(B5/B6)+(B7+0.5*B8^2)*B9)/(B8*SQRT(B9))
17 | d2 (dividend-paying) = | 0.15354 | =B16-B8*SQRT(B9)
(Continued)

Number* | A | B | C*
18 | a1 = | 0.25285 | =(LN((B3-B11*EXP(-B7*B10))/B6)+(B7+0.5*B8^2)*B9)/(B8*SQRT(B9))
19 | a2 = | 0.15354 | =B18-B8*SQRT(B9)
20 | b1 = | 0.48593 | =(LN((B3-B11*EXP(-B7*B10))/B4)+(B7+0.5*B8^2)*B10)/(B8*SQRT(B10))
21 | b2 = | 0.41851 | =B20-B8*SQRT(B10)
23 | C(St*, T−t; X) = | 0.9641 | =B4*NORMSDIST(B14)-B6*EXP(-B7*(B9-B10))*NORMSDIST(B15)
24 | C(St*, T−t; X) − St* − D + X = | 2.3E-06 | =B23-B4-B11+B6
26 | N1(a1) = | 0.59981 | =NORMSDIST(B18)
27 | N1(a2) = | 0.56101 | =NORMSDIST(B19)
28 | N1(b1) = | 0.68649 | =NORMSDIST(B20)
29 | N1(b2) = | 0.6598 | =NORMSDIST(B21)
30 | N1(−b1) = | 0.31351 | =NORMSDIST(-B20)
31 | N1(−b2) = | 0.3402 | =NORMSDIST(-B21)
32 | ρ = | −0.74536 | =-SQRT(B10/B9)
33 | a = a1; b = −b1 | |
34 | Φ(a,−b;−ρ) = | 0.20259 | =phi(-B20,0,-B37)
35 | Φ(−a,b;−ρ) = | 0.04084 | =phi(-B18,0,-B36)
36 | ρab = | 0.87002 | =((B32*B18-(-B20))*IF(B18>=0,1,-1))/SQRT(B18^2-2*B32*B18*-B20+(-B20)^2)
37 | ρba = | −0.38579 | =((B32*-B20-(B18))*IF(-B20>=0,1,-1))/SQRT(B18^2-2*B32*B18*-B20+(-B20)^2)
38 | N2(a,0;ρab) = | 0.45916 | =bivarncdf(B18,0,B36)
39 | N2(b,0;ρba) = | 0.11092 | =bivarncdf(-B20,0,B37)
40 | δ = | 0.5 | =(1-IF(B18>=0,1,-1)*IF(-B20>=0,1,-1))/4
41 | a = a2; b = b2 | |
42 | Φ(a,−b;−ρ) = | 0.24401 | =phi(-B21,0,-B45)
43 | Φ(−a,b;−ρ) = | 0.02757 | =phi(-B19,0,-B44)
44 | ρab = | 0.94558 | =((B32*B19-(-B21))*IF(B19>=0,1,-1))/SQRT(B19^2-2*B32*B19*-B21+(-B21)^2)
45 | ρba = | −0.48787 | =((B32*-B21-(B19))*IF(-B21>=0,1,-1))/SQRT(B19^2-2*B32*B19*-B21+(-B21)^2)
46 | N2(a,0;ρab) = | 0.47243 | =bivarncdf(B19,0,B44)
47 | N2(b,0;ρba) = | 0.09685 | =bivarncdf(-B21,0,B45)
48 | δ = | 0.5 | =(1-IF(B19>=0,1,-1)*IF(-B21>=0,1,-1))/4
50 | N2(a1,−b1;ρ) = | 0.07007 | =bivarncdf(B18,-B20,B32)
51 | N2(a2,−b2;ρ) = | 0.06862 | =bivarncdf(B19,-B21,B32)
52 | c (value of European call option to buy one share) = | 2.40123 | =B5*NORMSDIST(B16)-B6*EXP(-B7*B9)*NORMSDIST(B17)

(Continued)
(Continued)

Number* | A | B | C*
54 | p (value of European put option to sell one share) = | 1.44186 | =-B5*NORMSDIST(-B16)+B6*EXP(-B7*B9)*NORMSDIST(-B17)
55 | C (value of American call option to buy one share) = | 3.08238 | =(B3-B11*EXP(-B7*B10))*(NORMSDIST(B20)+bivarncdf(B18,-B20,-SQRT(B10/B9)))-B6*EXP(-B7*B9)*(NORMSDIST(B21)*EXP(B7*(B9-B10))+bivarncdf(B19,-B21,-SQRT(B10/B9)))+B11*EXP(-B7*B10)*NORMSDIST(B21)

Note: *This table shows the number of the row and the column in the Excel sheet.
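The worksheet's European option cells can be checked outside Excel as well; a Python sketch reproducing the dividend-adjusted price, d1 and d2, and the European call and put values (2.40123 and 1.44186):

```python
import math

def norm_cdf(x):
    # Standard normal CDF (replaces NORMSDIST).
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Inputs from rows 3-11 of the worksheet.
S, X, r, sigma, T, t, D = 50.0, 48.0, 0.08, 0.20, 0.24658, 0.13699, 2.0

S_adj = S - D * math.exp(-r * t)                        # cell B5 (48.0218)
d1 = (math.log(S_adj / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
d2 = d1 - sigma * math.sqrt(T)                          # cells B16, B17
c = S_adj * norm_cdf(d1) - X * math.exp(-r * T) * norm_cdf(d2)     # European call
p = -S_adj * norm_cdf(-d1) + X * math.exp(-r * T) * norm_cdf(-d2)  # European put
print(round(S_adj, 4), round(d1, 5), round(d2, 5), round(c, 5), round(p, 5))
```

The American call value additionally requires the bivariate normal routine of Appendix 85A.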
Chapter 86
A Comparative Static Analysis Approach to Derive Greek Letters: Theory and Applications

Cheng Few Lee and Yuanyuan Xiao

Contents
86.1 Introduction  2966
86.2 Delta (Δ)  2967
  86.2.1 Derivation of delta for different kinds of stock options  2967
  86.2.2 Application of delta (Δ)  2971
86.3 Theta (Θ)  2972
  86.3.1 Derivation of theta for different kinds of stock options  2973
  86.3.2 Application of theta (Θ)  2977
86.4 Gamma (Γ)  2977
  86.4.1 Derivation of gamma for different kinds of stock options  2978
  86.4.2 Application of gamma (Γ)  2980
86.5 Vega (ν)  2982
  86.5.1 Derivation of vega for different kinds of stock options  2982
  86.5.2 Application of vega (ν)  2985
86.6 Rho (ρ)  2986
  86.6.1 Derivation of rho for different kinds of stock options  2986
  86.6.2 Application of rho (ρ)  2988

Cheng Few Lee, Rutgers University. Yuanyuan Xiao, Rutgers University.
86.7 Derivation of Sensitivity for Stock Options with Respect to Exercise Price  2989
86.8 Relationship Between Delta, Theta, and Gamma  2991
86.9 Empirical Examples of Delta, Theta, Gamma, Vega, and Rho  2992
86.10 Summary and Concluding Remarks  2993
Bibliography  2993
Appendix 86A: Convexity and Bond Price Change  2994
Appendix 86B: Greek Letter Estimates in Terms of Johnson & Johnson Stock Price and Option Information  2997
Abstract

Based on a comparative static analysis, we first derive different kinds of Greek letters in terms of the Black–Scholes option pricing model; we then show how these Greek letters can be applied to perform hedging and risk management. The relationship between delta, theta, and gamma is also explored in detail.

Keywords: Delta (Δ) • Theta (Θ) • Gamma (Γ) • Vega (ν) • Rho (ρ) • Hedging.
86.1 Introduction

It is well known that the value of a call option is affected by the stock price per share, the exercise price per share, the contract period of the option, the risk-free rate, and the volatility of the stock return. In this chapter, we analyze these relationships mathematically. Some of these mathematical relationships are called "Greek letters" by finance professionals. Here, we specifically derive Greek letters for call (put) options on non-dividend-paying and dividend-paying stocks, and provide examples to explain the applications of these Greek letters. Sections 86.2–86.6 discuss the derivations and applications of delta, theta, gamma, vega, and rho, respectively. Section 86.7 derives the partial derivative of stock options with respect to their exercise prices. Section 86.8 describes the relationship between delta, theta, and gamma and its implication for a delta-neutral portfolio. Section 86.9 presents empirical examples, and Section 86.10 summarizes and concludes this chapter. Appendix 86A discusses the relationship between convexity and bond price change. Finally, Appendix 86B discusses Greek letter estimates in terms of Johnson & Johnson stock price and option information.
86.2 Delta (Δ)

The delta of an option, Δ, is defined as the rate of change of the option price with respect to the underlying asset price:

Δ = ∂Π/∂S,

where Π is the option price and S is the underlying asset price. We next show the derivation of delta for various kinds of stock options.

86.2.1 Derivation of delta for different kinds of stock options

From the Black–Scholes option pricing model, the price of a call option on a non-dividend-paying stock can be written as

Ct = St N(d1) − Xe^{−rτ} N(d2),  (86.1)

and the price of a put option on a non-dividend-paying stock can be written as

Pt = Xe^{−rτ} N(−d2) − St N(−d1),  (86.2)

where

d1 = [ln(St/X) + (r + σs²/2)τ] / (σs√τ),
d2 = [ln(St/X) + (r − σs²/2)τ] / (σs√τ) = d1 − σs√τ,

τ = T − t, and N(·) is the cumulative distribution function of the standard normal distribution:

N(d1) = ∫_{−∞}^{d1} f(u) du = ∫_{−∞}^{d1} (1/√(2π)) e^{−u²/2} du.

First, we calculate

N′(d1) = ∂N(d1)/∂d1 = (1/√(2π)) e^{−d1²/2},  (86.3)
N′(d2) = ∂N(d2)/∂d2 = (1/√(2π)) e^{−d2²/2}
  = (1/√(2π)) e^{−(d1 − σs√τ)²/2}
  = (1/√(2π)) e^{−d1²/2} · e^{d1σs√τ} · e^{−σs²τ/2}
  = (1/√(2π)) e^{−d1²/2} · e^{ln(St/X) + (r + σs²/2)τ} · e^{−σs²τ/2}
  = (1/√(2π)) e^{−d1²/2} · (St/X) · e^{rτ}.  (86.4)

Equations (86.3) and (86.4) will be used repeatedly in determining the following Greek letters when the underlying asset is a non-dividend-paying stock.

For a European call option on a non-dividend-paying stock, delta can be shown to be

Δ = N(d1).  (86.5)

The derivation of equation (86.5) is given as follows:

Δ = ∂Ct/∂St
  = N(d1) + St ∂N(d1)/∂St − Xe^{−rτ} ∂N(d2)/∂St
  = N(d1) + St N′(d1) ∂d1/∂St − Xe^{−rτ} N′(d2) ∂d2/∂St
  = N(d1) + St · (1/√(2π)) e^{−d1²/2} · 1/(Stσs√τ) − Xe^{−rτ} · (1/√(2π)) e^{−d1²/2} · (St/X) e^{rτ} · 1/(Stσs√τ)
  = N(d1) + [1/(σs√(2πτ))] e^{−d1²/2} − [1/(σs√(2πτ))] e^{−d1²/2}
  = N(d1),

where we use ∂d1/∂St = ∂d2/∂St = 1/(Stσs√τ).
For a European put option on a non-dividend-paying stock, delta can be shown to be

Δ = N(d1) − 1.  (86.6)

The derivation of equation (86.6) is

Δ = ∂Pt/∂St
  = Xe^{−rτ} ∂N(−d2)/∂St − N(−d1) − St ∂N(−d1)/∂St
  = Xe^{−rτ} ∂[1 − N(d2)]/∂d2 · ∂d2/∂St − [1 − N(d1)] − St ∂[1 − N(d1)]/∂d1 · ∂d1/∂St
  = −Xe^{−rτ} · (1/√(2π)) e^{−d1²/2} · (St/X) e^{rτ} · 1/(Stσs√τ) − [1 − N(d1)] + St · (1/√(2π)) e^{−d1²/2} · 1/(Stσs√τ)
  = −[1/(σs√(2πτ))] e^{−d1²/2} + N(d1) − 1 + [1/(σs√(2πτ))] e^{−d1²/2}
  = N(d1) − 1.

If the underlying asset is a dividend-paying stock providing a dividend yield at rate q, the Black–Scholes formulas for the prices of a European call option and a European put option are

Ct = St e^{−qτ} N(d1) − Xe^{−rτ} N(d2),  (86.7)

and

Pt = Xe^{−rτ} N(−d2) − St e^{−qτ} N(−d1),  (86.8)

where

d1 = [ln(St/X) + (r − q + σs²/2)τ] / (σs√τ),
d2 = [ln(St/X) + (r − q − σs²/2)τ] / (σs√τ) = d1 − σs√τ.

To simplify the derivations that follow, we calculate equations (86.9) and (86.10) in advance:

N′(d1) = ∂N(d1)/∂d1 = (1/√(2π)) e^{−d1²/2},  (86.9)
N′(d2) = ∂N(d2)/∂d2 = (1/√(2π)) e^{−d2²/2}
  = (1/√(2π)) e^{−(d1 − σs√τ)²/2}
  = (1/√(2π)) e^{−d1²/2} · e^{d1σs√τ} · e^{−σs²τ/2}
  = (1/√(2π)) e^{−d1²/2} · e^{ln(St/X) + (r − q + σs²/2)τ} · e^{−σs²τ/2}
  = (1/√(2π)) e^{−d1²/2} · (St/X) · e^{(r−q)τ}.  (86.10)

For a European call option on a dividend-paying stock, delta can be shown to be

Δ = e^{−qτ} N(d1).  (86.11)

The derivation of (86.11) is

Δ = ∂Ct/∂St
  = e^{−qτ} N(d1) + St e^{−qτ} N′(d1) ∂d1/∂St − Xe^{−rτ} N′(d2) ∂d2/∂St
  = e^{−qτ} N(d1) + St e^{−qτ} · (1/√(2π)) e^{−d1²/2} · 1/(Stσs√τ) − Xe^{−rτ} · (1/√(2π)) e^{−d1²/2} · (St/X) e^{(r−q)τ} · 1/(Stσs√τ)
  = e^{−qτ} N(d1) + e^{−qτ} [1/(σs√(2πτ))] e^{−d1²/2} − e^{−qτ} [1/(σs√(2πτ))] e^{−d1²/2}
  = e^{−qτ} N(d1).

For a European put option on a dividend-paying stock, delta can be shown to be

Δ = e^{−qτ} [N(d1) − 1].  (86.12)

The derivation of (86.12) is

Δ = ∂Pt/∂St
  = Xe^{−rτ} ∂N(−d2)/∂St − e^{−qτ} N(−d1) − St e^{−qτ} ∂N(−d1)/∂St
  = −Xe^{−rτ} N′(d2) ∂d2/∂St − e^{−qτ} [1 − N(d1)] + St e^{−qτ} N′(d1) ∂d1/∂St
  = −e^{−qτ} [1/(σs√(2πτ))] e^{−d1²/2} + e^{−qτ} [N(d1) − 1] + e^{−qτ} [1/(σs√(2πτ))] e^{−d1²/2}
  = e^{−qτ} [N(d1) − 1].
86.2.2 Application of delta (Δ)

Figure 86.1 shows the relationship between the price of a call option and the price of its underlying asset. The delta of the call option is the slope of the line at point A, which corresponds to the current price of the underlying asset. By calculating the delta ratio, a financial institution that sells an option to a client can construct a delta-neutral position to hedge the risk of changes in the underlying asset price. Suppose that the current stock price is $100, the call option price on the stock is $10, and the current delta of the call option is 0.4.
Figure 86.1: The relationship between the price of a call option and the price of its underlying asset.
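The delta formulas (86.5), (86.6), (86.11), and (86.12) can be evaluated directly; a Python sketch follows (the parameter values are hypothetical, chosen only for illustration; q = 0 recovers the non-dividend case).

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def deltas(S, X, r, q, sigma, tau):
    # Deltas per (86.11) and (86.12); q = 0 gives (86.5) and (86.6).
    d1 = (math.log(S / X) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    call_delta = math.exp(-q * tau) * norm_cdf(d1)
    put_delta = math.exp(-q * tau) * (norm_cdf(d1) - 1.0)
    return call_delta, put_delta

# Hedging a short position of 10 call contracts (1,000 shares):
call_delta, put_delta = deltas(S=100.0, X=100.0, r=0.05, q=0.0, sigma=0.25, tau=0.5)
shares_to_hold = round(call_delta * 1000)
print(round(call_delta, 4), round(put_delta, 4), shares_to_hold)
```

With q = 0 the two deltas differ by exactly one, mirroring put–call parity differentiated with respect to the stock price.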
Figure 86.2: Changes of delta hedge.
A financial institution sold 10 call options to its client, so that the client has the right to buy 1,000 shares at maturity. To construct a delta-hedged position, the financial institution should buy 0.4 × 1,000 = 400 shares of stock. If the stock price goes up by $1, the option price will go up by $0.40. In this situation, the financial institution has a $400 ($1 × 400 shares) gain in its stock position and a $400 ($0.40 × 1,000 shares) loss in its option position, so the total payoff of the financial institution is zero. On the other hand, if the stock price goes down by $1, the option price will go down by $0.40, and the total payoff of the financial institution is again zero. However, the relationship between the option price and the stock price is not linear, so delta changes as the stock price moves. If an investor wants to keep his portfolio delta-neutral, he should adjust his hedge ratio periodically; the more frequently he adjusts, the better the delta hedge. Figure 86.2 exhibits how the change in delta affects delta hedges. If the underlying stock has a price equal to $20, then an investor who uses only delta as a risk measure will consider that his or her portfolio has no risk. However, as the underlying stock price changes, either up or down, the delta changes as well, and thus he or she will have to use a different delta hedge. The delta measure can be combined with other risk measures to yield better risk measurement. We will discuss this further in the following sections.

86.3 Theta (Θ)

The theta of an option, Θ, is defined as the rate of change of the option price with respect to the passage of time:

Θ = ∂Π/∂t,

where Π is the option price and t is the passage of time.
If τ = T − t, theta (Θ) can also be defined as minus one times the rate of change of the option price with respect to time to maturity. The derivation of this transformation is straightforward:

Θ = ∂Π/∂t = (∂Π/∂τ)(∂τ/∂t) = (−1) ∂Π/∂τ,

where τ = T − t is the time to maturity. In deriving theta for the various kinds of stock options below, we use this negative derivative with respect to time to maturity.

86.3.1 Derivation of theta for different kinds of stock options

For a European call option on a non-dividend-paying stock, theta can be written as

Θ = −[Stσs/(2√τ)] N′(d1) − rX e^{−rτ} N(d2).  (86.13)
The derivation of (86.13) is

Θ = −∂Ct/∂τ
  = −St N′(d1) ∂d1/∂τ − rX e^{−rτ} N(d2) + Xe^{−rτ} N′(d2) ∂d2/∂τ
  = −St N′(d1) (∂d1/∂τ − ∂d2/∂τ) − rX e^{−rτ} N(d2)
  = −[Stσs/(2√τ)] N′(d1) − rX e^{−rτ} N(d2),

where the second step uses Xe^{−rτ} N′(d2) = St N′(d1), which follows from (86.4), and the last step uses d1 − d2 = σs√τ, so that ∂d1/∂τ − ∂d2/∂τ = σs/(2√τ).

For a European put option on a non-dividend-paying stock, theta can be shown to be

Θ = −[Stσs/(2√τ)] N′(d1) + rX e^{−rτ} N(−d2).  (86.14)

The derivation of (86.14) is

Θ = −∂Pt/∂τ
  = rX e^{−rτ} N(−d2) + Xe^{−rτ} N′(d2) ∂d2/∂τ − St N′(d1) ∂d1/∂τ
  = rX e^{−rτ} N(−d2) − St N′(d1) (∂d1/∂τ − ∂d2/∂τ)
  = rX e^{−rτ} N(−d2) − [Stσs/(2√τ)] N′(d1).

For a European call option on a dividend-paying stock, theta can be shown to be

Θ = q St e^{−qτ} N(d1) − [St e^{−qτ}σs/(2√τ)] N′(d1) − rX e^{−rτ} N(d2).  (86.15)
The derivation of (86.15) is

Θ = −∂Ct/∂τ
  = q St e^{−qτ} N(d1) − St e^{−qτ} N′(d1) ∂d1/∂τ − rX e^{−rτ} N(d2) + Xe^{−rτ} N′(d2) ∂d2/∂τ
  = q St e^{−qτ} N(d1) − St e^{−qτ} N′(d1) (∂d1/∂τ − ∂d2/∂τ) − rX e^{−rτ} N(d2)
  = q St e^{−qτ} N(d1) − [St e^{−qτ}σs/(2√τ)] N′(d1) − rX e^{−rτ} N(d2),

where Xe^{−rτ} N′(d2) = St e^{−qτ} N′(d1) follows from (86.10).

For a European put option on a dividend-paying stock, theta can be shown to be

Θ = rX e^{−rτ} N(−d2) − q St e^{−qτ} N(−d1) − [St e^{−qτ}σs/(2√τ)] N′(d1).  (86.16)

The derivation of (86.16) is

Θ = −∂Pt/∂τ
  = rX e^{−rτ} N(−d2) + Xe^{−rτ} N′(d2) ∂d2/∂τ − q St e^{−qτ} N(−d1) − St e^{−qτ} N′(d1) ∂d1/∂τ
  = rX e^{−rτ} N(−d2) − q St e^{−qτ} N(−d1) − St e^{−qτ} N′(d1) (∂d1/∂τ − ∂d2/∂τ)
  = rX e^{−rτ} N(−d2) − q St e^{−qτ} N(−d1) − [St e^{−qτ}σs/(2√τ)] N′(d1).
86.3.2 Application of theta (Θ)

The value of an option is the combination of its time value and its stock value. As time passes, the time value of the option decreases; thus, the rate of change of the option price with respect to the passage of time, theta, is usually negative. Because the passage of time is certain, we do not need to make a theta hedge against its effect. However, theta is still regarded as a useful parameter, because it serves as a proxy for gamma in a delta-neutral portfolio. We will discuss the specific details in the following sections.
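Formula (86.13) can be checked against a numerical derivative of the Black–Scholes call price; a sketch with hypothetical parameters:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    # N'(x), the standard normal density.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bs_call(S, X, r, sigma, tau):
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)

def bs_call_theta(S, X, r, sigma, tau):
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    # Equation (86.13): theta = -S sigma N'(d1)/(2 sqrt(tau)) - r X e^{-r tau} N(d2).
    return -S * sigma * norm_pdf(d1) / (2.0 * math.sqrt(tau)) \
           - r * X * math.exp(-r * tau) * norm_cdf(d2)

S, X, r, sigma, tau = 100.0, 100.0, 0.05, 0.25, 0.5
h = 1e-5
# Theta is the derivative with respect to calendar time, i.e. -dC/dtau.
numeric = -(bs_call(S, X, r, sigma, tau + h) - bs_call(S, X, r, sigma, tau - h)) / (2 * h)
print(round(bs_call_theta(S, X, r, sigma, tau), 6), round(numeric, 6))
```

The analytic value is negative, reflecting the decay of the option's time value.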
86.4 Gamma (Γ)

The gamma of an option, Γ, is defined as the rate of change of delta with respect to the underlying asset price:

Γ = ∂Δ/∂S = ∂²Π/∂S²,

where Π is the option price and S is the underlying asset price.
Because the option price is not linearly dependent on the underlying asset price, a delta-neutral hedge strategy is useful only when movements in the underlying asset price are small; once the underlying asset price moves more widely, a gamma-neutral hedge is necessary. We next show the derivation of gamma for various kinds of stock options.

86.4.1 Derivation of gamma for different kinds of stock options

For a European call option on a non-dividend-paying stock, gamma can be shown to be

Γ = [1/(Stσs√τ)] N′(d1).  (86.17)

The derivation of (86.17) is

Γ = ∂²Ct/∂St² = ∂(∂Ct/∂St)/∂St = ∂N(d1)/∂St = N′(d1) · ∂d1/∂St = N′(d1) · 1/(Stσs√τ) = [1/(Stσs√τ)] N′(d1).

For a European put option on a non-dividend-paying stock, gamma can be shown to be

Γ = [1/(Stσs√τ)] N′(d1).  (86.18)

The derivation of (86.18) is

Γ = ∂²Pt/∂St² = ∂(∂Pt/∂St)/∂St = ∂[N(d1) − 1]/∂St = N′(d1) · ∂d1/∂St = [1/(Stσs√τ)] N′(d1).
For a European call option on a dividend-paying stock, gamma can be shown to be

Γ = [e^{−qτ}/(Stσs√τ)] N′(d1).  (86.19)

The derivation of (86.19) is

Γ = ∂²Ct/∂St² = ∂(∂Ct/∂St)/∂St = ∂[e^{−qτ} N(d1)]/∂St = e^{−qτ} N′(d1) · ∂d1/∂St = [e^{−qτ}/(Stσs√τ)] N′(d1).

For a European put option on a dividend-paying stock, gamma can be shown to be

Γ = [e^{−qτ}/(Stσs√τ)] N′(d1).  (86.20)

The derivation of (86.20) is

Γ = ∂²Pt/∂St² = ∂(∂Pt/∂St)/∂St = ∂[e^{−qτ}(N(d1) − 1)]/∂St = e^{−qτ} N′(d1) · ∂d1/∂St = [e^{−qτ}/(Stσs√τ)] N′(d1).
86.4.2 Application of gamma (Γ)

One can use delta and gamma together to calculate the change in option value due to changes in the underlying stock price. This change can be approximated by the following relation:

    change in option value ≈ Δ × (change in stock price) + ½ × Γ × (change in stock price)².

From the above relation, one can observe that gamma corrects for the fact that the option value is not a linear function of the underlying stock price. This approximation comes from the Taylor series expansion near the initial stock price. If we let V be the option value, S the stock price, and S₀ the initial stock price, then the Taylor series expansion around S₀ yields

    V(S) = V(S₀) + (∂V(S₀)/∂S)(S − S₀) + (1/2!)(∂²V(S₀)/∂S²)(S − S₀)² + ⋯ + (1/n!)(∂ⁿV(S₀)/∂Sⁿ)(S − S₀)ⁿ + ⋯.

If we only consider the first three terms, the approximation is then

    V(S) − V(S₀) ≈ (∂V(S₀)/∂S)(S − S₀) + (1/2!)(∂²V(S₀)/∂S²)(S − S₀)²
                 = Δ(S − S₀) + ½Γ(S − S₀)².

For example, if a portfolio of options has a delta equal to $10,000 and a gamma equal to $5,000, the change in the portfolio value when the stock price drops from $35 to $34 is approximately

    change in portfolio value ≈ ($10,000) × ($34 − $35) + ½ × ($5,000) × ($34 − $35)² ≈ −$7,500.

The above analysis can also be applied to measure the price sensitivity of interest rate-related assets or portfolios to interest rate changes. Here, we introduce modified duration and convexity as risk measures corresponding to the above delta and gamma. Modified duration measures the percentage
change in asset or portfolio value resulting from a change in the interest rate:

    Modified Duration = −(Change in price/Price)/(Change in interest rate) = −Δ/P.

Using the modified duration,

    Change in Portfolio Value = Δ × Change in interest rate
                              = (−Duration × P) × Change in interest rate,

we can calculate the value change of the portfolio. The above relation corresponds to the previous discussion of the delta measure: we want to know how the price of the portfolio changes given a change in the interest rate. Similar to delta, modified duration only gives the first-order approximation of the change in value. In order to account for the nonlinear relation between the interest rate and portfolio value, we need a second-order approximation similar to the gamma measure before; this is the convexity measure. Convexity is the interest rate gamma divided by price,

    Convexity = Γ/P,

and this measure captures the nonlinear part of the price changes due to interest rate changes. Using modified duration and convexity together allows us to develop a first- as well as second-order approximation of the price changes, similar to the previous discussion:

    Change in Portfolio Value ≈ −Duration × P × (change in rate) + ½ × Convexity × P × (change in rate)².

As a result, (−Duration × P) and (Convexity × P) act like the delta and gamma measures, respectively, in the previous discussion. This shows that these Greek letters can also be applied to measuring risk in interest rate-related assets or portfolios. In Appendix 86A, we will discuss the relationship between bond convexity and bond price change.

Next, we discuss how to make a portfolio gamma-neutral. Suppose the gamma of a delta-neutral portfolio is Γ, the gamma of the option in this portfolio is Γ_o, and ω_o is the number of options added to the delta-neutral portfolio. Then, the gamma of this new portfolio is ω_oΓ_o + Γ.
To make a gamma-neutral portfolio, we should trade ω_o* = −Γ/Γ_o options. Because the option position changes, the new portfolio is no longer delta-neutral, so we should adjust the position in the underlying asset to restore delta-neutrality. For example, suppose the delta and gamma of a particular call option are 0.7 and 1.2, respectively, and a delta-neutral portfolio has a gamma of −2,400. To make the portfolio both delta-neutral and gamma-neutral, we should add a long position of 2,400/1.2 = 2,000 options and a short position of 2,000 × 0.7 = 1,400 shares of the underlying asset to the original portfolio.
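The two numerical examples above, the delta-gamma approximation of the portfolio value change and the gamma-neutral rebalancing, can be checked with a short script (the helper names below are ours):

```python
def delta_gamma_change(delta, gamma, dS):
    """Second-order (delta-gamma) approximation of the change in portfolio value."""
    return delta * dS + 0.5 * gamma * dS ** 2

def gamma_neutral_rebalance(port_gamma, opt_delta, opt_gamma):
    """Number of options that zeroes out gamma (w* = -Gamma/Gamma_o),
    and the share position that restores delta-neutrality afterwards."""
    n_options = -port_gamma / opt_gamma
    n_shares = -n_options * opt_delta   # offset the delta the new options add
    return n_options, n_shares

# Portfolio with delta $10,000 and gamma $5,000; stock drops from $35 to $34.
change = delta_gamma_change(10_000, 5_000, 34 - 35)
# Delta-neutral portfolio with gamma -2,400; option delta 0.7, gamma 1.2.
n_options, n_shares = gamma_neutral_rebalance(-2_400, 0.7, 1.2)
print(change)               # -7500.0
print(n_options, n_shares)  # 2000.0 -1400.0
```

The output matches the worked numbers in the text: a loss of about $7,500, a long position of 2,000 options, and a short position of 1,400 shares.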
86.5 Vega (ν)

The vega of an option, ν, is defined as the rate of change of the option price with respect to the volatility of the underlying asset:

    ν = ∂Π/∂σ,

where Π is the option price and σ is the volatility of the stock price. We next show the derivation of vega for various kinds of stock options.

86.5.1 Derivation of vega for different kinds of stock options

For a European call option on a non-dividend stock, vega can be shown as

    ν = S_t √τ · N′(d1).    (86.21)

The derivation of (86.21) is

    ν = ∂C_t/∂σ_s = S_t ∂N(d1)/∂σ_s − Xe^{−rτ} ∂N(d2)/∂σ_s
      = S_t N′(d1)·(∂d1/∂σ_s) − Xe^{−rτ} N′(d2)·(∂d2/∂σ_s).

Because N′(d2) = (S_t/X)e^{rτ} N′(d1), the second term equals S_t N′(d1)·(∂d2/∂σ_s), so

    ν = S_t N′(d1)·(∂d1/∂σ_s − ∂d2/∂σ_s)
      = S_t N′(d1)·(σ_s²τ^{3/2}/(σ_s²τ))
      = S_t √τ · N′(d1),

where

    ∂d1/∂σ_s = [σ_s²τ^{3/2} − (ln(S_t/X) + (r + σ_s²/2)τ)·τ^{1/2}]/(σ_s²τ),
    ∂d2/∂σ_s = −(ln(S_t/X) + (r + σ_s²/2)τ)·τ^{1/2}/(σ_s²τ).

For a European put option on a non-dividend stock, vega can be shown as

    ν = S_t √τ · N′(d1).    (86.22)

The derivation of (86.22) is

    ν = ∂P_t/∂σ_s = Xe^{−rτ} ∂N(−d2)/∂σ_s − S_t ∂N(−d1)/∂σ_s
      = −Xe^{−rτ} N′(d2)·(∂d2/∂σ_s) + S_t N′(d1)·(∂d1/∂σ_s)
      = S_t N′(d1)·(∂d1/∂σ_s − ∂d2/∂σ_s)
      = S_t √τ · N′(d1).

For a European call option on a dividend-paying stock, vega can be shown as

    ν = S_t e^{−qτ} √τ · N′(d1).    (86.23)
The derivation of (86.23) is

    ν = ∂C_t/∂σ_s = S_t e^{−qτ} ∂N(d1)/∂σ_s − Xe^{−rτ} ∂N(d2)/∂σ_s
      = S_t e^{−qτ} N′(d1)·(∂d1/∂σ_s) − Xe^{−rτ} N′(d2)·(∂d2/∂σ_s).

Because N′(d2) = (S_t/X)e^{(r−q)τ} N′(d1) in the dividend-paying case, the second term equals S_t e^{−qτ} N′(d1)·(∂d2/∂σ_s), so

    ν = S_t e^{−qτ} N′(d1)·(∂d1/∂σ_s − ∂d2/∂σ_s)
      = S_t e^{−qτ} N′(d1)·(σ_s²τ^{3/2}/(σ_s²τ))
      = S_t e^{−qτ} √τ · N′(d1),

where now d1 = [ln(S_t/X) + (r − q + σ_s²/2)τ]/(σ_s√τ).

For a European put option on a dividend-paying stock, vega can be shown as

    ν = S_t e^{−qτ} √τ · N′(d1).    (86.24)

The derivation of (86.24) is

    ν = ∂P_t/∂σ_s = Xe^{−rτ} ∂N(−d2)/∂σ_s − S_t e^{−qτ} ∂N(−d1)/∂σ_s
      = −Xe^{−rτ} N′(d2)·(∂d2/∂σ_s) + S_t e^{−qτ} N′(d1)·(∂d1/∂σ_s)
      = S_t e^{−qτ} N′(d1)·(∂d1/∂σ_s − ∂d2/∂σ_s)
      = S_t e^{−qτ} √τ · N′(d1).

86.5.2 Application of vega (ν)

Suppose a delta-neutral and gamma-neutral portfolio has a vega equal to ν and the vega of a particular option is ν_o. Similar to gamma, we can add a position of −ν/ν_o in the option to make a vega-neutral portfolio. To maintain delta-neutrality, we should then adjust the underlying asset position. However, when we change the option position, the new portfolio is no longer gamma-neutral. In general, a portfolio with one option cannot be gamma-neutral and vega-neutral at the same time. If we want a portfolio to be both gamma-neutral and vega-neutral, we should include at least two kinds of options on the same underlying asset in our portfolio.

For example, a delta-neutral portfolio contains option A, option B, and the underlying asset. The gamma and vega of this portfolio are −3,200 and −2,500, respectively. Option A has a delta of 0.3, gamma of 1.2, and vega of 1.5. Option B has a delta of 0.4, gamma of 1.6, and vega of 0.8. The new portfolio will be both gamma-neutral and vega-neutral when adding ω_A units of option A and ω_B units of option B to the original portfolio:

    Gamma-neutral: −3200 + 1.2ω_A + 1.6ω_B = 0.
    Vega-neutral: −2500 + 1.5ω_A + 0.8ω_B = 0.

From the two equations shown above, we obtain the solution ω_A = 1000 and ω_B = 1250. The delta of the new portfolio is 1000 × 0.3 + 1250 × 0.4 = 800.
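The gamma-neutral and vega-neutral conditions above form a 2×2 linear system in ω_A and ω_B; a minimal sketch (ours, plain Python) solves it with Cramer's rule:

```python
def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y

# Gamma-neutral: 1.2*wA + 1.6*wB = 3200; vega-neutral: 1.5*wA + 0.8*wB = 2500.
w_A, w_B = solve_2x2(1.2, 1.6, 3200.0, 1.5, 0.8, 2500.0)
new_delta = 0.3 * w_A + 0.4 * w_B  # delta introduced by the added options
print(round(w_A), round(w_B), round(new_delta))  # 1000 1250 800
```

The added options introduce a delta of 800, which is why the text then shorts 800 shares of the underlying asset to restore delta-neutrality.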
To maintain delta-neutrality, we need to short 800 shares of the underlying asset.

86.6 Rho (ρ)

The rho of an option is defined as the rate of change of the option price with respect to the interest rate:

    rho = ∂Π/∂r,

where Π is the option price and r is the interest rate. The rho of an ordinary stock call option should be positive because a higher interest rate reduces the present value of the strike price, which in turn increases the value of the call option. Similarly, the rho of an ordinary put option should be negative by the same reasoning. We next show the derivation of rho for various kinds of stock options.

86.6.1 Derivation of rho for different kinds of stock options

For a European call option on a non-dividend stock, rho can be shown as

    rho = Xτ·e^{−rτ} N(d2).    (86.25)

The derivation of (86.25) is

    rho = ∂C_t/∂r = S_t ∂N(d1)/∂r − (−τ)·X·e^{−rτ} N(d2) − Xe^{−rτ} ∂N(d2)/∂r
        = S_t N′(d1)·(∂d1/∂r) + Xτ·e^{−rτ} N(d2) − Xe^{−rτ} N′(d2)·(∂d2/∂r)
        = S_t N′(d1)·(√τ/σ_s) + Xτ·e^{−rτ} N(d2) − S_t N′(d1)·(√τ/σ_s)
        = Xτ·e^{−rτ} N(d2),

where we used ∂d1/∂r = ∂d2/∂r = √τ/σ_s and Xe^{−rτ} N′(d2) = S_t N′(d1).

For a European put option on a non-dividend stock, rho can be shown as

    rho = −Xτ·e^{−rτ} N(−d2).    (86.26)
The derivation of (86.26) is

    rho = ∂P_t/∂r = (−τ)·X·e^{−rτ} N(−d2) + Xe^{−rτ} ∂N(−d2)/∂r − S_t ∂N(−d1)/∂r
        = −Xτ·e^{−rτ}(1 − N(d2)) + Xe^{−rτ}·(∂(1 − N(d2))/∂d2)·(∂d2/∂r) − S_t·(∂(1 − N(d1))/∂d1)·(∂d1/∂r)
        = −Xτ·e^{−rτ}(1 − N(d2)) − Xe^{−rτ} N′(d2)·(√τ/σ_s) + S_t N′(d1)·(√τ/σ_s)
        = −Xτ·e^{−rτ} N(−d2),

again using Xe^{−rτ} N′(d2) = S_t N′(d1).

For a European call option on a dividend-paying stock, rho can be shown as

    rho = Xτ·e^{−rτ} N(d2).    (86.27)

The derivation of (86.27) is

    rho = ∂C_t/∂r = S_t e^{−qτ} ∂N(d1)/∂r − (−τ)·X·e^{−rτ} N(d2) − Xe^{−rτ} ∂N(d2)/∂r
        = S_t e^{−qτ} N′(d1)·(√τ/σ_s) + Xτ·e^{−rτ} N(d2) − Xe^{−rτ} N′(d2)·(√τ/σ_s)
        = Xτ·e^{−rτ} N(d2),

where now Xe^{−rτ} N′(d2) = S_t e^{−qτ} N′(d1).
For a European put option on a dividend-paying stock, rho can be shown as

    rho = −Xτ·e^{−rτ} N(−d2).    (86.28)

The derivation of (86.28) is

    rho = ∂P_t/∂r = (−τ)·X·e^{−rτ} N(−d2) + Xe^{−rτ} ∂N(−d2)/∂r − S_t e^{−qτ} ∂N(−d1)/∂r
        = −Xτ·e^{−rτ}(1 − N(d2)) − Xe^{−rτ} N′(d2)·(√τ/σ_s) + S_t e^{−qτ} N′(d1)·(√τ/σ_s)
        = −Xτ·e^{−rτ} N(−d2),

using Xe^{−rτ} N′(d2) = S_t e^{−qτ} N′(d1).

86.6.2 Application of rho (ρ)

Assume that an investor would like to see how interest rate changes affect the value of a three-month European put option she holds, given the following information. The current stock price is $65 and the strike price is $58. The interest rate and the volatility of the stock are 5% and 30% per annum, respectively. The rho of this European put option can be calculated as follows:

    Rho_put = −Xτ·e^{−rτ} N(−d2)
            = −($58)(0.25)e^{−(0.05)(0.25)} N(−(ln(65/58) + [0.05 − ½(0.3)²](0.25))/((0.3)√0.25))
            ≅ −3.168.

This calculation indicates that, given a 1% increase in the interest rate, say from 5% to 6%, the value of this European put option will decrease by approximately 0.03168 (0.01 × 3.168). This simple example can be further applied to stocks that pay dividends using the derivation results shown previously.
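The put-rho computation above can be reproduced in a few lines (ours, plain Python, non-dividend case):

```python
from math import log, sqrt, exp, erf

def N(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def rho_put(S, X, r, sigma, tau):
    """Rho of a European put on a non-dividend stock, equation (86.26)."""
    d2 = (log(S / X) + (r - 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    return -X * tau * exp(-r * tau) * N(-d2)

print(rho_put(65.0, 58.0, 0.05, 0.30, 0.25))  # about -3.168
```

Multiplying the result by the 0.01 rate change gives the approximate price impact of roughly −$0.032, as in the text.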
86.7 Derivation of Sensitivity for Stock Options with Respect to Exercise Price

For a European call option on a non-dividend stock, the sensitivity can be shown as

    ∂C_t/∂X = −e^{−rτ} N(d2).    (86.29)

The derivation of (86.29) is

    ∂C_t/∂X = S_t ∂N(d1)/∂X − e^{−rτ} N(d2) − Xe^{−rτ} ∂N(d2)/∂X
            = S_t N′(d1)·(∂d1/∂X) − e^{−rτ} N(d2) − Xe^{−rτ} N′(d2)·(∂d2/∂X)
            = S_t N′(d1)·(−1/(Xσ_s√τ)) − e^{−rτ} N(d2) − S_t N′(d1)·(−1/(Xσ_s√τ))
            = −e^{−rτ} N(d2),

where we used ∂d1/∂X = ∂d2/∂X = −1/(Xσ_s√τ) and Xe^{−rτ} N′(d2) = S_t N′(d1).

For a European put option on a non-dividend stock, the sensitivity can be shown as

    ∂P_t/∂X = e^{−rτ} N(−d2).    (86.30)

The derivation of (86.30) is

    ∂P_t/∂X = e^{−rτ} N(−d2) + Xe^{−rτ} ∂N(−d2)/∂X − S_t ∂N(−d1)/∂X
            = e^{−rτ}(1 − N(d2)) + Xe^{−rτ} N′(d2)·(1/(Xσ_s√τ)) − S_t N′(d1)·(1/(Xσ_s√τ))
            = e^{−rτ} N(−d2).

For a European call option on a dividend-paying stock, the sensitivity can be shown as

    ∂C_t/∂X = −e^{−rτ} N(d2).    (86.31)

The derivation of (86.31) is

    ∂C_t/∂X = S_t e^{−qτ} N′(d1)·(−1/(Xσ_s√τ)) − e^{−rτ} N(d2) − Xe^{−rτ} N′(d2)·(−1/(Xσ_s√τ))
            = −e^{−rτ} N(d2),

since Xe^{−rτ} N′(d2) = S_t e^{−qτ} N′(d1).

For a European put option on a dividend-paying stock, the sensitivity can be shown as

    ∂P_t/∂X = e^{−rτ} N(−d2).    (86.32)

The derivation of (86.32) is

    ∂P_t/∂X = e^{−rτ} N(−d2) + Xe^{−rτ} N′(d2)·(1/(Xσ_s√τ)) − S_t e^{−qτ} N′(d1)·(1/(Xσ_s√τ))
            = e^{−rτ} N(−d2).

86.8 Relationship Between Delta, Theta, and Gamma

So far, the discussion has introduced the derivation and application of each individual Greek letter and how they can be applied in portfolio management. In practice, the interaction or trade-off between these parameters is of concern as well. For example, recall that the Black–Scholes–Merton differential equation for a derivative on a non-dividend-paying stock can be written as

    ∂Π/∂t + rS·(∂Π/∂S) + ½σ²S²·(∂²Π/∂S²) = rΠ,

where Π is the value of the derivative security contingent on the stock price, S is the price of the stock, r is the risk-free rate, σ is the volatility of the stock price, and t denotes time. Given the earlier derivations, we can rewrite the Black–Scholes partial differential equation (PDE) as

    Θ + rSΔ + ½σ²S²Γ = rΠ.

This relation gives us the trade-off between delta, gamma, and theta. For example, suppose there are two delta-neutral (Δ = 0) portfolios, one with positive gamma (Γ > 0) and the other with negative gamma (Γ < 0), and both have a value of $1 (Π = 1). The trade-off can be written as

    Θ + ½σ²S²Γ = r.

For the first portfolio, if gamma is positive and large, then theta is negative and large in absolute value. When gamma is positive, changes in stock prices result in a higher value of the option. This means that when there is no change in stock prices, the value of the option declines as we approach the expiration date. As a result, the theta is negative. On the other hand, when gamma is negative and large, changes in stock prices result in a lower option value. This
means that when there is no stock price change, the value of the option increases as we approach the expiration date, and theta is positive. This gives us a trade-off between gamma and theta, and they can be used as proxies for each other in a delta-neutral portfolio.
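The rewritten PDE relation can be verified numerically for a European call; the sketch below is ours and assumes the standard non-dividend Black–Scholes price, delta, theta, and gamma:

```python
from math import log, sqrt, exp, erf, pi

def N(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def n(x):
    """Standard normal density N'(x)."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def call_greeks(S, X, r, sigma, tau):
    """Price, delta, theta, gamma of a European call on a non-dividend stock."""
    d1 = (log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    price = S * N(d1) - X * exp(-r * tau) * N(d2)
    delta = N(d1)
    theta = -S * sigma * n(d1) / (2.0 * sqrt(tau)) - r * X * exp(-r * tau) * N(d2)
    gamma = n(d1) / (S * sigma * sqrt(tau))
    return price, delta, theta, gamma

S, X, r, sigma, tau = 100.0, 95.0, 0.05, 0.20, 0.5
price, delta, theta, gamma = call_greeks(S, X, r, sigma, tau)
lhs = theta + r * S * delta + 0.5 * sigma ** 2 * S ** 2 * gamma
print(lhs, r * price)  # both sides of Theta + r*S*Delta + 0.5*sigma^2*S^2*Gamma = r*Pi agree
```

Up to floating-point precision the two printed values coincide, which is exactly the trade-off relation discussed above.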
86.9 Empirical Examples of Delta, Theta, Gamma, Vega, and Rho

In this section, we discuss how delta, theta, gamma, vega, and rho can be estimated in terms of Johnson & Johnson's stock price and option price information. The procedure of estimating these Greek letters is briefly described as follows:

1. Use the monthly stock rates of return of Johnson & Johnson during the period from 2013/10 to 2018/10 to calculate the annual historical variance.
2. Use the stock price of $133.87, which is the closing price on 2018/10/12.
3. Assume the strike price is $130 and the contract expires on November 16, 2018.
4. The annual risk-free rate is 3.2% in terms of the 10-year T-bond rate. By using the current stock price per share, exercise price per share, risk-free rate, contract period, and historical variance, we can estimate the theoretical values of both call and put options.
5. Use all of the above-mentioned information to calculate delta, theta, gamma, vega, and rho in accordance with equations (86.5), (86.13), (86.18), (86.21), and (86.25).

From Appendix 86B, we have the following empirical estimates:
Table 86.1: Greek letters for European call and put options for JNJ on 2018/10/12.
           European call options    European put options
    Delta       0.8090                  −0.1910
    Theta     −18.7722                 −14.6250
    Gamma       0.1153                   0.1153
    Vega       24.2348                  24.2348
    Rho         9.9236                  −2.5039
86.10 Summary and Concluding Remarks

In this chapter, we have shown the partial derivatives of the stock option price with respect to five variables. Delta (Δ), the rate of change of the option price with respect to the price of the underlying asset, is derived first. After delta is obtained, gamma (Γ) can be derived as the rate of change of delta with respect to the underlying asset price. Two other risk measures are theta (Θ) and rho (ρ); they measure the change in option value with respect to the passage of time and the interest rate, respectively. Finally, one can also measure the change in option value with respect to the volatility of the underlying asset, and this gives us vega (ν). The applications of these Greek letters in portfolio management have also been discussed. In addition, we use the Black–Scholes PDE to show the relationship between these risk measures.

In sum, risk management is one of the important topics in finance for both academics and practitioners. Given the recent credit crisis, one can observe that it is crucial to properly measure the risk related to ever more complicated financial assets. The comparative static analysis of option pricing models gives an introduction to portfolio risk management. Further discussion of comparative statics analysis can be found in Hull (2017) and McDonald (2012).

One alternative to Black–Scholes is the constant elasticity of variance (CEV) model. Cox (1975) and Cox and Ross (1976) developed the CEV model, which incorporates an observed market phenomenon that the underlying asset variance tends to fall as the asset price increases (and vice versa). Schroder (1989) derives a computational formula for the CEV model, which is generally used in the empirical computation of either option value or implied variance. The advantage of the CEV model is that it can describe the interrelationship between the stock price and its volatility; therefore, it can be useful to derive the Greek letters of the CEV option pricing model. The procedure for estimating the implied variance of the CEV model can be found in the book by Lee et al. (2016).
Bibliography

T. Bjork (1998). Arbitrage Theory in Continuous Time. New York: Oxford University Press.
P. P. Boyle and D. Emanuel (1980). Discretely Adjusted Option Hedges. Journal of Financial Economics, 8(3), 259–282.
J. C. Cox (1975). Notes on Option Pricing I: Constant Elasticity of Variance Diffusions. Stanford University, Working Paper.
J. C. Cox and S. A. Ross (1976). The Valuation of Options for Alternative Stochastic Processes. Journal of Financial Economics, 3, 145–166.
D. Duffie (2001). Dynamic Asset Pricing Theory. Princeton, NJ: Princeton University Press.
F. J. Fabozzi (2007). Fixed Income Analysis, 2nd ed. New York: Wiley.
S. Figlewski (1989). Options Arbitrage in Imperfect Markets. Journal of Finance, 44(5), 1289–1311.
D. Galai (1983). The Components of the Return from Hedging Options against Stocks. Journal of Business, 56(1), 45–54.
M. H. Hopewell and G. G. Kaufman (1973). Bond Price Volatility and Term to Maturity: A Generalized Respecification. American Economic Review, 63(4), 749–753.
J. Hull (2017). Options, Futures, and Other Derivatives, 10th ed. Upper Saddle River, NJ: Prentice Hall.
J. Hull and A. White (1987). Hedging the Risks from Writing Foreign Currency Options. Journal of International Money and Finance, 6(2), 131–152.
I. Karatzas and S. E. Shreve (2000). Brownian Motion and Stochastic Calculus. Berlin: Springer.
F. C. Klebaner (2005). Introduction to Stochastic Calculus with Applications. London: Imperial College Press.
C. F. Lee, J. Lee, J. R. Chang, and T. Tai (2016). Essentials of Excel, Excel VBA, SAS and Minitab for Statistical and Financial Analyses. New York: Springer.
R. L. McDonald (2012). Derivatives Markets, 3rd ed. Boston, MA: Addison-Wesley.
A. Saunders and M. M. Cornett (2018). Financial Markets and Institutions, 7th ed. New York: McGraw-Hill.
M. Schroder (1989). A Reduction Method Applicable to Compound Option Formulas. Management Science, 35(7), 823–827.
S. E. Shreve (2004). Stochastic Calculus for Finance II: Continuous-Time Models. New York: Springer.
B. Tuckman (2002). Fixed Income Securities: Tools for Today's Markets, 2nd ed. New York: Wiley.
Appendix 86A: Convexity and Bond Price Change

Duration appears a better measure of a bond's life than maturity because it provides a more meaningful relationship with interest-rate changes. This relationship has been expressed by Hopewell and Kaufman (1973) as

    ΔP/P = −[D/(1 + i)]·Δi = −D*·Δi,    (86A.1)

where Δ = "change in"; P = bond price; D = duration; D* = modified duration; and i = market interest rate or bond yield. The duration rule in equation (86A.1) is a good approximation for small changes in bond yield, but it is less accurate for large changes. Equation (86A.1) implies that the percentage change in bond price is linearly related to the change in yield to maturity. If this linear relationship does not hold, then, following Saunders and Cornett (2018), equation (86A.1) can be generalized as

    ΔP/P = −D*·Δi + 0.5 × Convexity × (Δi)²,    (86A.2)
where Convexity, the rate of change of the slope of the price–yield curve, is

    Convexity = (1/P) × ∂²P/∂i²
              = [1/(P(1 + i)²)] × Σ_{t=1}^{n} [CF_t/(1 + i)^t](t² + t)
              ≈ 10⁸ × (ΔP⁻/P + ΔP⁺/P),    (86A.3)

where CF_t represents either a coupon payment before maturity or the final coupon plus par value at the maturity date, ΔP⁻ is the capital loss from a one-basis-point (0.0001) increase in interest rates, and ΔP⁺ is the capital gain from a one-basis-point (0.0001) decrease in interest rates.

In equation (86A.2), the first term on the right-hand side is the same as the duration rule, equation (86A.1). The second term is the modification for convexity. Note that for a bond with positive convexity, the second term is positive, regardless of whether the yield rises or falls. The more accurate equation (86A.2), which accounts for convexity, always predicts a higher bond price than equation (86A.1). Of course, if the change in yield is small, the convexity term, which is multiplied by (Δi)² in equation (86A.2), will be extremely small and will add little to the approximation. In this case, the linear approximation given by the duration rule will be sufficiently accurate. Thus, convexity is more important as a practical matter when potential interest rate changes are large. Example 86A provides further illustration.

Example 86A. Figure 86A.1 is drawn under the assumptions that a bond with 20-year maturity and 7.5% coupon sells at an initial yield to maturity of 7.5%. Because the coupon rate equals the yield to maturity, the bond sells at par value, or $1,000. The modified duration and convexity of the bond are 10.95908 and 155.059, respectively. If the bond's yield increases from 7.5% to 8.0% (Δi = 0.005), the price of the bond actually falls to $950.9093. Based on the duration rule, the bond price falls from $1,000 to $945.2046, a decline of 5.47954%, per equation (86A.1) as follows:

    ΔP/P = −D*·Δi = −10.95908 × 0.005 = −0.0547954, or −5.47954%.

If we use equation (86A.2) instead of equation (86A.1), the bond price falls from $1,000 to $947.1428, a decline of 5.28572%, by
[Figure 86A.1: The relationship between percentage changes in bond price and changes in YTM. The figure plots the actual data, the duration rule, and the duration-with-convexity rule, with changes in yield to maturity (%) on the horizontal axis and the percentage change in bond price (%) on the vertical axis.]
Equation (86A.2):

    ΔP/P = −D*·Δi + 0.5 × Convexity × (Δi)²
         = −10.95908 × 0.005 + 0.5 × 155.059 × (0.005)²
         = −0.0528572, or −5.28572%.

The duration rule of equation (86A.1) is close to the case accounting for convexity in terms of equation (86A.2). However, if the change in yield is larger, say 3% (Δi = 0.03), the price of the bond actually falls to $753.0727 and convexity becomes an important matter in pricing the percentage change in bond price. Without accounting for convexity, the price of the bond (on the dashed line) falls from $1,000 to $671.2277, a decline of 32.8772%, based on the duration rule, equation (86A.1), as follows:

    ΔP/P = −D*·Δi = −10.95908 × 0.03 = −0.328772, or −32.8772%.

According to the duration-with-convexity rule, equation (86A.2), the percentage change in bond price is calculated as

    ΔP/P = −D*·Δi + 0.5 × Convexity × (Δi)²
         = −10.95908 × 0.03 + 0.5 × 155.059 × (0.03)²
         = −0.258996, or −25.8996%.

The bond price of $741.0042 estimated by the duration-with-convexity rule is close to the actual bond price of $753.0727, rather than the price of $671.2277
estimated by the duration rule. As the change in interest rate becomes larger, the percentage change in bond price calculated by equation (86A.1) differs significantly from that calculated by equation (86A.2). Saunders and Cornett (2018) have discussed why convexity is important in the risk management of financial institutions.

Appendix 86B: Greek Letter Estimates in Terms of Johnson & Johnson Stock Price and Option Information

In this appendix, we discuss how delta, theta, gamma, vega, and rho can be estimated in terms of Johnson & Johnson's stock price and option price information. The procedure of estimating these Greek letters is briefly described as follows:

1. Use the monthly stock rates of return of Johnson & Johnson during the period from 2013/10 to 2018/10 to calculate the annual historical variance.
2. Use the stock price of $133.87, which is the closing price on 2018/10/12.
3. Assume the strike price is $130 and the contract expires on November 16, 2018.
4. The annual risk-free rate is 3.2% in terms of the 10-year T-bond rate. By using the current stock price per share, exercise price per share, risk-free rate, contract period, and historical variance, we can estimate the theoretical values of both call and put options.
5. Use all of the above-mentioned information to calculate delta, theta, gamma, vega, and rho in accordance with equations (86.5), (86.13), (86.18), (86.21), and (86.25).

The input variables for the calculation of Greek letters are listed in Table 86B.1.

Table 86B.1:
Input variables.

    Current Price             $133.87
    Call at strike price E    $130.00      Premium (Ask Price)   $6.70
    Put at strike price E     $130.00      Premium (Ask Price)   $1.61
    Today                     2018/10/12
    Option expiration date    2018/11/16
    Time to maturity          0.0959
    Monthly Volatility        0.035318324
    Annual Volatility         0.122346265
Table 86B.2: Inputs for calculation of Greek letters.

                               European call options    European put options
    Stock Price (S = P)        133.87                   133.87
    Exercise Price (X = E)     130                      130
    Time to maturity (t)       0.0959                   0.0959
    Annual interest rate (r)   3.20%                    3.20%
    Annual volatility (Sigma)  12.23%                   12.23%
    d1                         0.8742                   0.8742
    d2                         0.8363                   0.8363
    N(d1)                      0.8090
    N(d2)                      0.7985
    N(−d1)                                              0.1910
    N(−d2)                                              0.2015
    N′(d1)                     0.5846

Table 86B.3: Greek letters of European call and put options.

               European call options    European put options
    Delta           0.8090                  −0.1910
    Theta         −18.7722                 −14.6250
    Gamma           0.1153                   0.1153
    Vega           24.2348                  24.2348
    Rho             9.9236                  −2.5039

Note that the annual volatility is annualized from the monthly volatility, which is estimated using the historical monthly returns from 2013/10 to 2018/10, and that the time to maturity is calculated by dividing the number of days from 2018/10/12 to 2018/11/16 by 365. Given the information provided above, we can calculate the variables required for the Greek letters; the inputs for the Greek letters of European call and put options are shown in Table 86B.2. Note that N′(d1) is calculated with the following formula:

    N′(d1) = exp(−d1²/2)/√(2π).

Given the information above, the Greek letters for European call and put options can be estimated, and the results are shown in Table 86B.3. The definitions and formulas of the Greek letters are given in Table 86B.4.
A Comparative Static Analysis Approach to Derive Greek Letters

Table 86B.4: Definitions and formulas of Greek letters for European call and put options.

Delta (Δ): the rate of change of the option price with respect to the price of the underlying asset.
  Call: N(d1)
  Put: N(d1) - 1

Theta (Θ): the rate of change of the option price with respect to the passage of time.
  Call: -S*sigma*N'(d1)/(2*sqrt(t)) - r*X*e^(-rt)*N(d2)
  Put:  -S*sigma*N'(d1)/(2*sqrt(t)) + r*X*e^(-rt)*N(-d2)

Gamma (Γ): the rate of change of delta with respect to the price of the underlying asset.
  Call: N'(d1)/(S*sigma*sqrt(t))
  Put:  N'(d1)/(S*sigma*sqrt(t))

Vega (ν): the rate of change of the option price with respect to the volatility of the underlying asset.
  Call: S*sqrt(t)*N'(d1)
  Put:  S*sqrt(t)*N'(d1)

Rho (ρ): the rate of change of the option price with respect to the interest rate.
  Call: X*t*e^(-rt)*N(d2)
  Put:  -X*t*e^(-rt)*N(-d2)
Chapter 87
Fundamental Analysis, Technical Analysis, and Mutual Fund Performance

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

Contents
87.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 3002
87.2 Fundamental Versus Technical Analysis . . . . . . . . . . . . . 3003
     87.2.1 Fundamental analysis . . . . . . . . . . . . . . . . . . 3003
     87.2.2 Technical analysis . . . . . . . . . . . . . . . . . . . 3010
     87.2.3 Dow theory . . . . . . . . . . . . . . . . . . . . . . . 3011
     87.2.4 The odd-lot theory . . . . . . . . . . . . . . . . . . . 3012
     87.2.5 The confidence index . . . . . . . . . . . . . . . . . . 3013
     87.2.6 Trading volume . . . . . . . . . . . . . . . . . . . . . 3013
     87.2.7 Moving average . . . . . . . . . . . . . . . . . . . . . 3013
87.3 Anomalies and Their Implications . . . . . . . . . . . . . . . . 3016
     87.3.1 Basu's findings . . . . . . . . . . . . . . . . . . . . . 3017
     87.3.2 Reinganum's findings . . . . . . . . . . . . . . . . . . 3018
     87.3.3 Banz's findings . . . . . . . . . . . . . . . . . . . . . 3019
     87.3.4 Keim's findings . . . . . . . . . . . . . . . . . . . . . 3019
     87.3.5 Additional findings . . . . . . . . . . . . . . . . . . . 3020
87.4 Security Rate-of-Return Forecasting . . . . . . . . . . . . . . 3021
     87.4.1 Regression approach . . . . . . . . . . . . . . . . . . . 3022
     87.4.2 Time-series approach . . . . . . . . . . . . . . . . . . 3023
     87.4.3 Composite forecasting . . . . . . . . . . . . . . . . . . 3028
87.5 Value Line Ranking . . . . . . . . . . . . . . . . . . . . . . . 3030
     87.5.1 Criteria of ranking . . . . . . . . . . . . . . . . . . . 3031
     87.5.2 Performance evaluation . . . . . . . . . . . . . . . . . 3031
87.6 Mutual Funds . . . . . . . . . . . . . . . . . . . . . . . . . . 3033
     87.6.1 Mutual-fund classification . . . . . . . . . . . . . . . 3033
     87.6.2 Three alternative mutual fund performance measures . . . 3036
     87.6.3 Mutual-fund manager's timing and selectivity . . . . . . 3036
87.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3051
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3052
Appendix 87A: Composite Forecasting Method . . . . . . . . . . . . . 3056
Abstract

This chapter discusses the methods and applications of fundamental analysis and technical analysis. In addition, it investigates the ranking performance of the Value Line and the timing and selectivity of mutual funds. A detailed investigation of technical versus fundamental analysis is first presented. This is followed by an analysis of regression time-series and composite methods for forecasting security rates of return. Value Line ranking methods and their performance then are discussed, leading finally into a study of the classification of mutual funds and the mutual-fund managers' timing and selectivity ability. In addition, the hedging ability is also briefly discussed. The Sharpe measure, Treynor measure, and Jensen measure are defined and analyzed. All of these topics can help improve performance in security analysis and portfolio management.

Keywords

Fundamental analysis • Technical analysis • Dow theory • Odd-lot theory • Confidence index • Trading volume • Moving average • Component analysis • ARIMA models • Composite forecasting • Sharpe performance measure • Treynor performance measure • Jensen performance measure.
87.1 Introduction

The role of security analysts and portfolio managers is to select the right stocks at the right time. They can generally use theory and methods to determine which stock (or stocks) they should buy or sell at the appropriate point. Security-analysis and portfolio-management methodologies can be classified into fundamental analysis and technical analysis. Technical analysis and fundamental analysis frameworks have provided substantial evidence of their respective abilities to explain a cross section of stock prices or to forecast future price movement. Technical information about stocks has been frequently used by securities analysts, portfolio managers, and academic researchers. Technical analysts focus primarily on short-term price return and trading volume. One of the most notable lines of research using technical information in studying stock price behavior
is momentum investment strategy. Using the past performances of stocks, Jegadeesh and Titman (1993 and 2001) provide documentation based on cumulative returns in the past three to twelve months, showing that the highest-return decile portfolio outperforms the lowest-return decile portfolio in the following three to twelve months. Further research regarding these topics can be found in Chen et al. (2016). This chapter discusses the methods and applications of fundamental analysis and technical analysis. In addition, it investigates the ranking performance of the Value Line and the timing and selectivity of mutual funds. A detailed investigation of technical versus fundamental analysis is first presented. This is followed by an analysis of regression time-series and composite methods for forecasting security rates of return. Value Line ranking methods and their performance then are discussed, leading finally into a study of the classification of mutual funds and the mutual-fund managers' timing and selectivity ability. In addition, the hedging ability is also briefly discussed. All of these topics can help improve performance in security analysis and portfolio management. In Section 87.2, we compare fundamental and technical analysis. In Section 87.3, we discuss anomalies and their implications. Security rate-of-return forecasting is explored in Section 87.4. Value Line ranking is discussed in Section 87.5, and mutual funds are discussed in Section 87.6. In Section 87.7, we summarize the chapter. Finally, in Appendix 87A, we discuss the composite forecasting method for forecasting either GDP or earnings.

87.2 Fundamental Versus Technical Analysis

This section explores the relationship between two components of security analysis and portfolio management: fundamental analysis and technical analysis.

87.2.1 Fundamental analysis

The job of a security analyst is to estimate the value of securities.
If a security’s estimated value is above its market price, the security analyst will recommend buying the stock; if the value is below the market price, the security should be sold before its price drops. Underpriced stocks are purchased until their price is bid up to equal their value; overpriced stocks are sold, driving their price down until it equals their value. There are two schools of thought as to how one determines an overpriced or underpriced security. Fundamental analysis (the fundamentalist school)
studies the fundamental facts affecting a stock's value. Fundamental analysts delve into companies' earnings, their management, earnings forecasts, the firm's competition, market conditions, and many other business and economic factors. The second school of thought determines an overpriced or underpriced security by studying the way security prices behave over time. Technical analysis concentrates almost totally on charts of security-market prices and related summary statistics of security trading. The macro approach to fundamental analysis first emphasizes the analysis of the aggregate economy and market, then industry analysis, and finally the examination of specific companies within the industry. Changes in the national economy and credit conditions are associated with changes in interest rates, capitalization rates, and multipliers. Therefore, an aggregate economic and market analysis must be done in conjunction with determining the security's appropriate multiplier. A multiplier of 10 may be appropriate for normal economic conditions, whereas a multiple of 15 may be more realistic in an environment of rapid economic growth and prosperity. All of the fundamentalist's research is based upon some valuation model. The analyst prepares his or her estimate of the intrinsic value per share at time 0, P_i0, by multiplying the ith stock's normalized earnings per share at time 0, E_i0, times the share's earnings multiplier, m_it:

P_i0 = E_i0 * m_it,  t = 0,   (87.1)

where

m_it = (d_i1/E_i0)/(k_i - g_i) = P_i0/E_i0.

The earnings multiplier P_i0/E_i0 is called the price-earnings (P/E) ratio. The ratio d_i1/E_i0 is called the dividend-payout ratio; k_i and g_i are the required rate of return and growth rate, respectively. Much of the fundamental analyst's work centers on determining the appropriate capitalization rate, or equivalently the appropriate multiplier to use in valuing a particular security's income. This encompasses the micro approach to estimating future values for the stock market. It involves using a two-step approach: (1) estimating the expected earnings for some market-indicator series (Dow-Jones Industrial Average or Standard & Poor's Industrial Index) or some stock, and (2) estimating the expected earnings multiplier for the market series or stock. The main factors that must be considered in determining the correct multiplier are (1) the risk of the security, (2) the growth rate of the dividend stream, (3) the duration of any expected growth, and (4) the dividend-payout ratio.
page 3004
July 6, 2020
15:53
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch87
Fundamental Analysis, Technical Analysis, and Mutual Fund Performance
page 3005
3005
In determining the P/E ratio to use in valuing a firm's securities, three factors must be estimated: (1) the capitalization rate, (2) the dividend growth rate, and (3) the dividend-payout ratio. Algebraically,

P_0/E = (d_1/E)/(k - g) = [d_0(1 + g)/E]/(k - g),   (87.2)

where d_1/E is the dividend-payout ratio, k is the capitalization rate, and g is the expected growth rate of dividends. Given this equation, a positive relationship is expected between the earnings multiplier and the dividend payout, and with the growth rate of dividends, all other things being equal. Alternatively, there should be a negative relationship between the earnings multiplier and the capitalization rate. Examples 87.1 and 87.2 provide further illustration.

Example 87.1. The stock of XYZ Corporation is currently paying a dividend of $1.00 per share. The firm's dividend growth rate is expected to be 10%. For firms in the same risk class as XYZ, market analysts agree that the capitalization rate is approximately 15%. Current earnings for XYZ are $2.00 per share and they are expected to grow at 10%. What are the P/E ratio and the price of XYZ shares given this information?

Solution

P/E = (d_1/E)/(k - g) = [$1.00(1 + 0.10)/$2.00(1 + 0.10)]/(0.15 - 0.10) = 0.50/0.05 = 10,
P = d_0(1 + g)/(k - g) = $1.00(1 + 0.10)/(0.15 - 0.10) = $22/share.

Example 87.2. XYZ in Example 87.1 is expected to experience an increase in growth rate from 10% to 12%. What is the impact on XYZ's P/E ratio and price?

Solution

P/E = [$1.00(1 + 0.12)/$2.00(1 + 0.12)]/(0.15 - 0.12) = 0.50/0.03 = 16.67,
P = $1.00(1 + 0.12)/(0.15 - 0.12) = $37.33.
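The two examples can be verified with a short script. This is a direct transcription of the formulas in equation (87.2), not a general valuation tool; the function names are mine.

```python
def pe_multiplier(d0, e0, g, k):
    """P/E = (d1/E1)/(k - g), using next-period dividend and earnings."""
    d1, e1 = d0 * (1 + g), e0 * (1 + g)
    return (d1 / e1) / (k - g)

def share_price(d0, g, k):
    """Constant-growth price: P = d0(1 + g)/(k - g)."""
    return d0 * (1 + g) / (k - g)

# Example 87.1: d0 = $1.00, E0 = $2.00, g = 10%, k = 15%.
print(round(pe_multiplier(1.00, 2.00, 0.10, 0.15), 2))  # 10.0
print(round(share_price(1.00, 0.10, 0.15), 2))          # 22.0  ($ per share)

# Example 87.2: growth rises to 12%.
print(round(pe_multiplier(1.00, 2.00, 0.12, 0.15), 2))  # 16.67
print(round(share_price(1.00, 0.12, 0.15), 2))          # 37.33
```

Note how sensitive both outputs are to k - g in the denominator, which is why the accuracy of the growth estimate matters so much.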
As can be seen from comparing Examples 87.1 and 87.2, a 20% increase in growth has led to a 67% increase in P/E ratio and a 69% increase in price. It is clear, therefore, that the accuracy of the analyst’s growth estimate is very important.
The capitalization rate varies with a firm's risk class and the prevailing market conditions (therefore, the necessity of the macro approach). Since future expectations are influenced by past experience, one way to estimate a firm's risk class is to examine historical data. The capitalization rate is determined by (1) the economy's risk-free rate, (2) the expected rate of price increases (annual rate of inflation), and (3) a risk premium for common stocks that reflects investor uncertainty regarding future returns. The capital asset pricing model (CAPM) suggests using the systematic risk for common stocks to determine the size of the risk premium. In theory, this measure of risk should be the beta for common stocks relative to the market portfolio for all risky assets. Since a portfolio of all risky assets does not exist, an alternative is to examine fundamental factors. These factors examine the relationship between the systematic risk (beta) for a security and various proxies for business risk and financial risk. A generally accepted measure of a firm's business risk is the coefficient of variation of the firm's operating income. Financial risk is determined by the financing decisions of the firm, or, more specifically, by the extent of financial leverage employed. The most common measures of financial risk are the debt/equity ratio and the fixed-charge coverage ratio. Studies of securities listed on the New York Stock Exchange (NYSE) have shown that their historical average-earnings capitalization rate varied directly with the security's volatility coefficient. The fundamental analyst can measure the risk of the company in recent periods, adjust these historical risk statistics for any expected changes, and then use these forecasted risk statistics to obtain capitalization rates. The growth of dividends is a function of the growth of earnings and changes in the dividend-payout ratio.
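The business-risk proxy (coefficient of variation of operating income) and the three components of the capitalization rate just described can be sketched numerically. All figures below are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(series):
    """Business-risk proxy: sample standard deviation of operating
    income divided by its mean."""
    return stdev(series) / mean(series)

# Hypothetical operating income history ($ millions).
operating_income = [120, 95, 140, 110, 135]
cv = coefficient_of_variation(operating_income)

# Capitalization rate built from its three components (illustrative values):
# risk-free rate + expected inflation + equity risk premium.
k = 0.02 + 0.025 + 0.055

print(round(cv, 3), round(k, 3))
```

A firm with a higher coefficient of variation would be assigned a larger risk premium, and hence a higher capitalization rate and a lower multiplier.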
It is usually fairly simple to estimate the growth rate in cash dividends or earnings per share. The growth rate is as important as the capitalization rate in estimating multipliers. The effects of the dividend–payout ratio are more direct than the effect of the growth rate. If other things remain constant, reducing a corporation’s dividend payout cuts its multiplier and thus its intrinsic value proportionately. For companies whose payout ratio fluctuates widely, it is necessary to estimate the corporation’s normalized earnings per share averaged over a complete business cycle. After a share’s normal earnings are estimated, all that needs be done is to divide normalized earnings per share into the corporation’s regular cash dividend per share to find the payout ratio for use in the determination of an earnings multiplier.
page 3006
July 6, 2020
15:53
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch87
Fundamental Analysis, Technical Analysis, and Mutual Fund Performance
page 3007
3007
Reilly et al. (1983) found that fundamental factors, such as the payout ratio, growth, the risk-free rate, earnings variability, the debt/equity ratio, and the failure rate of firms going bankrupt, combine to form the determinants of the aggregate stock-market earnings multiple. Using these fundamental factors as independent variables in an ordinary least-squares (OLS) regression and multiple discriminant analysis, predictions were made for the multiplier. They then used the OLS model to simulate investment strategies. If the model predicted a decline in the multiple, investment in T-bills was indicated; if the model predicted an increase in the multiple, investment in common stock was indicated. Their results strongly support the multiple-prediction model compared to buying and holding common stocks. Not only is the rate of return substantially higher, but risk as measured by the standard deviation is lower for the model stock portfolio than for the buy-and-hold portfolio. Their results indicate that on the basis of analyzing macro variables, it is possible to estimate the likely future direction of the market-earnings multiple, which in turn is a major indicator of total stock-market movements over time. For those portfolio managers who invest in stocks, using this model is superior to a buy-and-hold strategy. Technical analysts study charts of aggregate stock movements in an attempt to determine trends in the stock market that influence movements in individual stocks. It is the fundamental analysts, however, who have delved into the driving forces behind these movements. Shiller (1984) presented a demand-side theory explaining the market movement. He considered the supply of corporate stock to be fixed, at least in the short run, while investment demand for stocks fluctuates according to economic states. The decline in the demand for shares may not be accompanied by a decline in supply.
Therefore, when many investors wish to sell their shares, for whatever reason, the price of those shares must fall. Shiller proposed that the demand-side accounts for the majority of stock-market movements. A competing story that is equally attractive is the supply-side story. By this explanation, the main reason for the decline in stock prices is the decline in the expected future supply of dividends. However, in its extreme form, the theory implies that stock prices move only because of new information about future dividends. The theory is generally expressed today in conjunction with the assumption of efficient markets. Therefore, stock prices equal the present value of optimally forecasted future dividends. The third theory of stock market fluctuations rests on a “market fads” theory. According to this theory, stock prices move because people tend to be vulnerable to waves of optimism or pessimism, not because of any
economically identifiable shocks either to demand or supply. Highly publicized events or statements by influential figures have an impact on the market far beyond their true importance. The great crash of October 19, 1987, might well be thought of in this light. Shiller (1984) used the following model to test the supply-side theory:

P_t = Σ_{k=1}^{∞} E_t(D_{t+k})/(1 + r)^k,   (87.3)
where P_t is the real ex-dividend price of a share at time t; E_t(D_{t+k}) is the mathematical expectation, conditional on information at time t, of the real dividend accruing to a share at time t + k; and r is the real discount rate. Since dividends are not known to infinity, and there is roughly a century of dividends on Standard and Poor's stock, Shiller evaluated the model over historical data using

P_t = E_t(P_t*),   (87.4a)

P_t* = Σ_{k=1}^{1981-t} D_{t+k}/(1 + r)^k + P*_1981/(1 + r)^(1981-t),  t ≤ 1981.   (87.4b)
The variable P_t* is the "perfect foresight" or "ex post rational" stock price. By replacing P*_1981 with P_1981, Shiller was able to obtain an approximation P_st* (the subscript s refers to the supply-side theory) to the ex post rational price:

P_st* = Σ_{k=1}^{1981-t} D_{t+k}/(1 + r)^k + P_1981/(1 + r)^(1981-t),  t ≤ 1981.   (87.5)
By plotting P_st* along with the real Standard & Poor's price index P_t (Figure 87.1), Shiller found that the two series are quite divergent. It appears that P_st* behaves much like a simple growth trend, while P_t oscillates wildly around it; P_st* is smoothly increasing because it is a weighted moving average of dividends, and moving averages serve to smooth the series averaged. Moreover, real dividends are a fairly stable and upward-trending series. Using a basic economic theory of two-period consumption with marginal rate of substitution s_t to derive a consumption beta similar to Breeden (1979), Shiller constructed a demand-side model to explain aggregate stock-price movements. Shiller showed that if i_t is the return on stock (found by dividing the sum of capital gain and dividend by price) between t and t + 1 and if E_t[(1 + i_t)s_t] = 1 at all times, then the price is the
Figure 87.1: Real stock-price index P_t and ex post rational counterpart P_st* based on real dividends, 1889-1981.
expected value of P_t*, where P_t* is the present value of dividends discounted by marginal rates of substitution:

P_t = E_t(P_t*),   (87.4a)

P_t* = Σ_{k=1}^{∞} s_t^(k) D_{t+k},   (87.6)

where s_t^(k) is the marginal rate of substitution between C_t and C_{t+k}. The function for the marginal rate of substitution is s_t^(k) = δ^k (C_t/C_{t+k})^4; that is, s_t^(k) is proportional to the consumption ratio raised to the fourth power. This functional form embodies the concavity we expect in indifference curves — that is, the marginal rate of substitution declines as C_{t+k} rises relative to C_t. The fourth power was chosen because it makes P* roughly fit the data. The δ represents impatience, so that (if δ < 1) at a zero interest rate the person would consume more this period than in future periods. Substituting this functional form for the marginal rate of substitution into equation (87.6) and assuming that dividends are expected to follow the trend D_t = D_0(1 + g)^t with certainty:

P_dt* = C_t^4 D_0 (1 + g)^t Σ_{k=1}^{∞} [δ(1 + g)]^k C_{t+k}^(-4).   (87.7)
∗ ) means essentially This expression and the assumption that Pt = Et (Pdt that stock prices should be high when aggregate consumption is high and
Figure 87.2: Real stock-price index P_t and ex post rational counterpart P_dt* based on real consumption, 1889-1981.
low when consumption is low. The subscript d for P_dt* means "according to the demand-side theory". By plotting P_dt* along with the real price per share P_t (Figure 87.2), Shiller finds that P_dt* moves a great deal more than P_t; P_st* and P_dt* resemble each other much more than P_st* and P_t do. Unfortunately, Shiller finds that the theory seems to break down after 1950. Shiller concludes that movements over the last century in aggregate real dividends simply fail to explain the movements in aggregate stock prices. Therefore, the supply-side efficient-market theory does not look promising. On the other hand, the demand-side theory looks more promising than the supply-side theory. It predicts a business-cycle correlation for stock prices that was, until 1950, actually observed. However, since 1950, the demand-side theory has failed to explain the dramatic increase in real stock prices. Regardless of the reasons fundamentalists find for aggregate stock movements, technicians will continue to ignore the driving forces, whatever they may be, and will make their stock predictions based upon shapes seen in plots of aggregate stock movements.
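The demand-side price in equation (87.6), with the fourth-power marginal rate of substitution, can be sketched on a toy consumption and dividend path. All numbers below are hypothetical and δ is chosen arbitrarily; the point is only to show that the formula makes the price rise with current consumption, as the text states.

```python
def demand_side_price(C_t, consumption_path, dividend_path, delta=0.9):
    """Equation (87.6): P*_dt = sum_k s_t^(k) * D_{t+k}, with marginal
    rate of substitution s_t^(k) = delta**k * (C_t / C_{t+k})**4.
    consumption_path[k-1] and dividend_path[k-1] hold C_{t+k} and D_{t+k}."""
    price = 0.0
    for k, (c, d) in enumerate(zip(consumption_path, dividend_path), start=1):
        s = delta ** k * (C_t / c) ** 4   # marginal rate of substitution
        price += s * d
    return price

# Hypothetical future consumption and dividends over a three-period horizon.
C_future = [102.0, 104.0, 106.0]
D_future = [1.00, 1.05, 1.10]

p_low  = demand_side_price(100.0, C_future, D_future)   # low current consumption
p_high = demand_side_price(110.0, C_future, D_future)   # high current consumption
print(round(p_low, 4), round(p_high, 4))
assert p_high > p_low   # prices are higher when current consumption is high
```

Since C_t enters only through (C_t/C_{t+k})^4, scaling current consumption up by 10% scales the whole price by 1.1^4, which is the concavity-driven consumption effect the model relies on.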
87.2.2 Technical analysis

Technical analysts, rather than working through the large amount of fundamental information about an investment such as company earnings,
page 3010
July 6, 2020
15:53
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch87
Fundamental Analysis, Technical Analysis, and Mutual Fund Performance
page 3011
3011
competitive products, and forthcoming legislation, search for a summary of all facts by studying the way security prices behave historically. Over the past decades, technical analysts have focused their attention almost totally on charts of security market prices and related summary statistics. Technical analysis is based on the widely accepted premise that security prices are determined by the supply of and the demand for securities. Typically, technical analysts record historical financial data on charts, study these charts in an effort to find meaningful patterns, and use these patterns to predict future prices. Technical analysts believe that past patterns of market action will recur in the future and that past patterns can be used for predictive purposes. Rather than try to evaluate the intrinsic value of a security, technical analysts seek to estimate security prices; that is, they try to forecast short-run shifts in supply and demand that will affect the market price of one or more securities. Some of the tools used by chartists to measure supply and demand and to forecast security prices are the Dow theory chart, odd-lot theory, confidence index, breadth-of-market indicators, relative-strength analysis, and trading-volume data.

87.2.3 Dow theory

One of the tools used by technical analysts to measure supply and demand and forecast security prices is the Dow theory. The Dow theory is used to indicate reversals and trends in the market as a whole or in individual securities. According to the theory, there are three movements going on in the markets at all times. These movements are (1) daily fluctuations (the narrow movement from day to day), (2) secondary movements (short-run movements over two weeks to a month or more), and (3) primary trends, major movements covering at least four years in duration. The theory asserts that daily fluctuations are meaningless.
However, daily asset prices or the market average must be plotted in order to outline the primary and secondary trends. In plotting the asset prices, the Dow theorists search for price patterns indicating market tops and bottoms. Technical analysts use three basic types of charts: (1) line charts, (2) bar charts, and (3) point-and-figure charts. Bar charts have vertical bars representing each day's price movement. Each bar spans the distance from the day's highest price to the day's lowest price, with a small cross on the bar marking the closing price. In line charts, lines are used to connect successive days' prices. Patterns indicating market tops or bottoms are then searched for in these charts by technical analysts. The Wall Street Journal uses bar charts to show daily fluctuations in the Dow-Jones Average.
Point-and-figure charts are more complex than line or bar charts. These charts record significant price changes directly. They are not only used to detect reversals in a trend but are also employed to set actual price forecasts. The construction of a point-and-figure chart varies with the price level of the stock being charted. Only significant changes are posted to a point-and-figure chart. As a result, there are one-point, two-point, three-point, and five-point point-and-figure charts. To set the price target (forecasted stock price) that a stock is expected to attain, point-and-figure chartists begin by finding a congestion area. A congestion area is a horizontal band created by a series of reversals around a given price level. Congestion areas are supposed to result when supply and demand are equal. A breakout is said to have occurred when a column of price increases rises above the top of a congestion area. Breakout refers to a price rise or fall in which the price rises above or falls below the horizontal band which contained the congestion area. A penetration of the top of a congestion area is a signal for a continued price rise. Penetration of the bottom of a congestion area by a column of price declines is a bearish signal. To establish estimates of the new prices that a security should attain, point-and-figure chartists measure the horizontal width of a congestion area as they watch for a breakout. When a breakout occurs, the chartist projects the horizontal count upward or downward in the same direction as the breakout to establish the new price target.

87.2.4 The odd-lot theory

The odd-lot theory is one of several theories of contrary opinion. In essence, the theory assumes that the common man is usually wrong, and it is therefore advantageous to pursue strategies opposite to his thinking. In order to find out what the common man is doing, statistics on odd-lot trading are gathered.
Most odd-lot purchases are made by amateur investors with limited resources — that is, by the common man, who is a small, unsophisticated investor. Odd-lot trading volume is reported daily. The odd-lot statistics are broken down into the number of shares purchased, sold, and sold short. The index of odd-lot purchases less odd-lot sales is typically plotted concurrently with some market index. The odd-lotter’s net purchases are used by chartists as a leading indicator of market prices. That is, positive net purchases are presumed to forecast falls in market prices, and net selling by odd-lotters is presumed to occur at the end of a bear market.
page 3012
July 6, 2020
15:53
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch87
Fundamental Analysis, Technical Analysis, and Mutual Fund Performance
page 3013
3013
87.2.5 The confidence index

The confidence index is designed to measure how willing investors are to take a chance in the market. It is the ratio of high-grade bond yields to low-grade bond yields. This ratio stays below one, since high-grade bonds yield less than low-grade bonds. When bond investors grow more confident about the economy, they shift their holdings from high-grade to lower-grade bonds, lowering low-grade yields relative to high-grade yields and increasing the confidence index. In other words, the confidence ratio moves closer to one. Confidence-index technicians believe that the confidence index leads the stock market by two to eleven months. An upturn in the confidence index is supposed to foretell rising optimism and rising prices in the stock market. A fall in the confidence index represents the fact that low-grade bond yields are rising faster or falling more slowly than high-grade yields. This is supposed to reflect increasing risk aversion by institutional money managers who foresee an economic downturn and rising bankruptcies and defaults. Analysts who have examined the confidence index conclude that it conveys some information for security analysis.

87.2.6 Trading volume

Many technical analysts believe that it is possible to detect whether the market in general and/or certain security issues are bullish or bearish by studying the volume of trading. Volume is supposed to be a measure of the intensity of investors' emotions. If high volume occurs on days when prices move up, the overall nature of the market is considered to be bullish. If the high volume occurs on days when prices are falling, this is a bearish sign. Recently, Lo and Wang (2000) and Lee et al. (2013) have empirically shown that volume is a very important indicator for technical analysis.
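A minimal numerical sketch of the confidence index follows; the yields are hypothetical, chosen only to show the index moving toward one as the quality spread narrows.

```python
def confidence_index(high_grade_yield, low_grade_yield):
    """Ratio of high-grade to low-grade bond yields; stays below one
    because high-grade bonds yield less than low-grade bonds."""
    return high_grade_yield / low_grade_yield

# Hypothetical yields: as investors shift toward lower-grade bonds,
# low-grade yields fall, narrowing the spread and lifting the index.
before = confidence_index(0.045, 0.062)   # wide spread: low confidence
after  = confidence_index(0.045, 0.052)   # narrow spread: high confidence
print(round(before, 3), round(after, 3))
assert before < after < 1.0
```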
When the daily prices penetrate above the moving-average line, technicians interpret this penetration as a bearish signal. When the daily prices move downward through the moving average, they frequently fail to rise again for many months.
Moving-average analysts recommend buying a stock when: (1) the 200-day moving average flattens out and the stock's price rises through the moving average, (2) the price of a stock falls below a moving-average line that is rising, and (3) the price of a stock that is above the moving-average line falls but turns around and begins to rise again before it ever reaches the moving-average line. Moving-average chartists recommend selling a stock when: (1) the moving-average line flattens out and the stock's price drops downward through the moving-average line, (2) a stock's price rises above a moving-average line that is declining, and (3) a stock's price falls downward through the moving-average line and turns around to rise but then falls again before getting above the moving-average line. There are many tools for technical analysts. All the technical-analysis tools have one thing in common: they attempt to measure the supply of and demand for securities among some group of investors. Shifts in supply and demand are presumed to be gradual, not instantaneous. When changes in prices are detected, they are presumed to be the result of gradual shifts in supply and demand rather than a series of instantaneous shifts. Since these shifts are expected to continue as the price gradually reacts to news or other factors, they are used to predict further price changes. There has been a lack of published research on technical analysis. Most of the work that has been done is held privately and forms the basis of trading recommendations. Academic studies in this area have concentrated on the random-walk model, which implies that daily price changes are uncorrelated. A paper by Irwin and Uhrig (1984) combined trading-system optimization and efficient-market tests to investigate the validity of technical analysis for the futures market.
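As a concrete sketch of the crossover rules above, the following Python function (an illustrative implementation using NumPy, not the exact rules any particular chartist trades) flags the days on which the price penetrates its moving average:

```python
import numpy as np

def moving_average(prices, window):
    """Simple trailing moving average; the first window-1 entries are undefined (NaN)."""
    out = np.full(len(prices), np.nan)
    for t in range(window - 1, len(prices)):
        out[t] = prices[t - window + 1 : t + 1].mean()
    return out

def crossover_signals(prices, window=200):
    """+1 where the price crosses above the moving average (buy),
    -1 where it crosses below (sell), 0 otherwise."""
    ma = moving_average(prices, window)
    signals = np.zeros(len(prices), dtype=int)
    for t in range(window, len(prices)):
        if prices[t - 1] <= ma[t - 1] and prices[t] > ma[t]:
            signals[t] = 1
        elif prices[t - 1] >= ma[t - 1] and prices[t] < ma[t]:
            signals[t] = -1
    return signals
```

In practice the window length (200 days in the rules quoted above) is the key parameter; shorter windows give earlier but noisier signals.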
Irwin and Uhrig’s specific objectives were (1) to test whether the random-walk model is a reasonable description of daily futures-price behavior, (2) to simulate the trading of four technical systems, and (3) to deduce the market-efficiency implications of both sets of results. If the futures market is either weak-form efficient or the random-walk model describes price generation, the implication is that the expected profit of a technical trading system will be no greater than zero. Two approaches have been adopted to test futures-market efficiency. The first is statistical analysis of the random-walk model; using this approach, Taylor (1982) rejected the random walk as a description of price behavior for futures markets in the United States, United Kingdom, and Australia. The second approach is to simulate the trading of a technical system and examine the resulting profits or losses. Such simulations have been confined almost exclusively to Alexander's (1961) filter rule, and they have generally found that trading profits exist.
Fundamental Analysis, Technical Analysis, and Mutual Fund Performance
However, the significance and importance of filter-rule research are difficult to assess objectively (see Sweeny, 1988). Lee et al. (2011) use price per share, dividend per share, and shares outstanding to test the existence of a price-disequilibrium adjustment process with international index data and US equity data. They found that a disequilibrium price adjustment process does, in fact, exist in their empirical data. These results support Lo and Wang's (2000) findings that trading volume is an important factor in capital asset pricing. Irwin and Uhrig (1984) tried to improve upon earlier efforts by: (1) testing both the random-walk model and the trading systems over the same data, (2) optimizing the trading systems and then simulating the optimized systems over out-of-sample data, (3) deducting both commission and transaction costs, and (4) examining trading systems actively used by trading advisors. Their data consist of eight series of daily futures-price closes. The authors examined four technical trading systems selected as representative of the three main types of trading systems: price channels, moving averages, and momentum oscillators. The four systems examined are the Donchian system (DONCH), the moving average with a percentage price band system (MAPB), the dual moving-average crossover system (DMAC), and the directional indicator system (DI). The DONCH system is part of a family of technical systems known as price channels. The system generates a buy signal any time the daily high price is outside (greater than) the highest price in the specified time interval. A sell signal is generated any time the daily low breaks outside (lower than) the lowest price in the same interval. The system always generates a signal for the trader to take a position, long or short, in the futures market. The MAPB system belongs to a technical family derived from moving averages.
Moving averages come in many forms: simple moving averages, exponentially weighted, linearly weighted, and so on. The MAPB system employs a simple moving average with a band, based on a percentage of price, centered around it. A signal to exit a position occurs when the price recrosses the moving average. The band creates a neutral zone in which the trader is neither long nor short. The DMAC system employs logic similar to the MAPB system by seeking to find when the short-run trend rises above or below the long-term trend. The MAPB represents the short-term trend by the daily price and the long-term trend by the moving average. The DMAC uses a short-term moving average and a long-term moving average to represent the short- and long-term trends. A change in the price trend is signaled when these two moving averages cross. Specifically, a buy signal is generated when the shorter moving average
is greater than (above) the longer moving average, and a sell signal when the shorter moving average is less than (below) the longer moving average. The trader always maintains a long or short position in the futures market. The DI system is from a technical family known as momentum oscillators. Whereas the previous systems deal with the level of futures prices, oscillators deal with price changes. The logic employed by the directional-indicator system is that any trending period can be characterized as having a significant excess of either positive or negative price movements. Periods when prices are quickly moving upward will have more upward price change than downward price change, and vice versa. It is this relative price change that the DI estimates. Irwin and Uhrig (1984), using the Ljung-Box test of the autocorrelations for nonrandomness, found that the random-walk model could be rejected for futures on corn, soybeans, sugar, wheat, cocoa, and live cattle. However, the random-walk model could not be rejected for futures on copper and live hogs. The data are divided into three sample periods (1960–1981, 1960–1972, and 1973–1981) to account for the structural and policy changes that occurred in the early 1970s. Each trading system is optimized; that is, the highest-profit parameter is found. The optimum parameters are used as the basis for trading in a time period after their development. Substantial trading-system profits are evident over the 1960–1981 period for all commodities. However, these are almost exclusively concentrated in the years from 1973 to 1981. As a result, futures-market efficiency cannot be rejected for the 1960–1972 period but is rejected for the 1973–1981 period. It is felt that structural and policy changes during 1973–1981 prevented futures prices from adjusting instantaneously to new information, resulting in significant profits for technical trading systems that exploited the resulting price trends.
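A minimal sketch of the DONCH price-channel rule described above, assuming daily high and low series and an n-day channel (the parameter values below are illustrative, not those Irwin and Uhrig optimized):

```python
import numpy as np

def donchian_positions(highs, lows, n):
    """DONCH sketch: go long when today's high exceeds the highest high of the
    prior n days; go short when today's low breaks the lowest low of the prior
    n days; otherwise carry the previous position. The trader is always in the
    market once the first signal fires."""
    pos = np.zeros(len(highs), dtype=int)
    for t in range(n, len(highs)):
        if highs[t] > highs[t - n:t].max():
            pos[t] = 1           # breakout above the channel: buy
        elif lows[t] < lows[t - n:t].min():
            pos[t] = -1          # breakdown below the channel: sell
        else:
            pos[t] = pos[t - 1]  # no breakout: hold the prior position
    return pos
```

The channel length n plays the same role as the optimized parameters in Irwin and Uhrig's simulations: longer channels trade less often and catch only sustained trends.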
Technical analysis seems to have some merit when the market involved is not efficient. By studying price trends, the analyst is able to realize excess profits. For further information about technical analysis, the reader can refer to Blau (1995), Brown (2015), Douglas (2001), Nison (2001), and Smith (2015).

87.3 Anomalies and their Implications

There are many papers that discuss the anomalies of efficient markets. In this section we review four authors: Basu, Banz, Reinganum, and Keim. They deal with the anomalies centered around the P/E ratio, size effect,
and January effect. This section looks at these anomalies in the context of technical and fundamental analysis.

87.3.1 Basu's findings

Basu (1977) tries to determine empirically whether the investment performance of common stocks is related to their P/E ratios. The primary data for this study come from a database that includes the Compustat file of NYSE industrial firms, the investment-return file from the Center for Research in Security Prices (CRSP) tape, and a delisted file containing selected accounting data and investment returns for securities delisted from the NYSE. The database represents 1,400 industrial firms, all of which actually traded on the NYSE between September 1956 and August 1971. For any given year under consideration, three criteria are used in selecting sample firms: (1) the fiscal year ends on December 31, (2) the firm actually traded on the NYSE as of the beginning of the portfolio-holding period and is included in the merged tape, and (3) the relevant investment-return and financial-statement data are not missing. Beginning with 1956, the P/E ratio of every sample security is computed. The numerator of the ratio is defined as the market value of the common stock as of December 31, and the denominator as the reported annual earnings available for common stockholders. These ratios are ranked and five portfolios are formed. This procedure, repeated annually each April 1, gives 14 years (April 1957–March 1971) of return data for each of the P/E portfolios. Basu uses the performance-evaluation measures of Jensen, Sharpe, and Treynor. His results indicate that during the period April 1957–March 1971, low-P/E portfolios seem to have (on average) earned higher absolute and risk-adjusted rates of return than the high-P/E portfolios. Even after accounting for a bias in the performance measure resulting from the effect of risk, above-normal returns are found.
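Basu's ranking step can be sketched in a few lines. This is a simplified illustration of quintile formation by P/E rank, not his exact sample-selection procedure:

```python
def pe_quintiles(pe_ratios):
    """Rank securities by P/E and split them into five portfolios
    (portfolio 0 holds the lowest-P/E fifth, portfolio 4 the highest)."""
    order = sorted(range(len(pe_ratios)), key=lambda i: pe_ratios[i])
    k = len(order) // 5
    portfolios = [order[j * k:(j + 1) * k] for j in range(4)]
    portfolios.append(order[4 * k:])  # last quintile absorbs any remainder
    return portfolios
```

In Basu's design this ranking is redone every April 1 with the prior December 31 data, so each portfolio's composition changes annually.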
The results are consistent with the view that P/E-ratio information is not fully reflected in security prices in as rapid a manner as postulated by the semi-strong form of the efficient-market hypothesis (EMH). To the extent that low-P/E portfolios did earn superior returns on a risk-adjusted basis, the proposition of the price-ratio hypothesis (on the relationship between investment performance of equity securities and their P/E ratios) seems to be valid. There appear to be lags and frictions in the process of adjusting security prices to publicly available information. Therefore, publicly available
P/E ratios may possess information content and may warrant an investor's attention at the time of portfolio formation or revision.

87.3.2 Reinganum's findings

Reinganum's (1981) study documents an empirical anomaly that suggests either that the simple one-period CAPM is misspecified or that capital markets are inefficient. He forms portfolios based upon firm size or E/P ratios. The data, collected primarily from The Wall Street Journal, consist of corporate quarterly earnings and announcement dates from the fourth quarter of 1975 and the subsequent eight quarters. The sample consists of 566 NYSE and American Stock Exchange (AMEX) stocks with fiscal-year ends in December. The sample is divided into portfolios based upon standardized unexpected earnings (SUE). The high-SUE portfolio contains the 20 securities with the highest SUE, while the low-SUE portfolio consists of the 20 firms with the lowest SUE. Each 20-security portfolio is subdivided into two equal-weighted portfolios of 10 securities. One portfolio contains the 10 securities with the highest estimated betas, and the other consists of the 10 firms with the lowest estimated betas. Weights are selected for the two 10-security portfolios so that the overall 20-security portfolio has an estimated beta equal to one. Reinganum's results indicate that abnormal returns cannot be earned over the period studied by constructing portfolios on the basis of a firm's SUE. These results offer support for the assumption of market efficiency. Using the same data source, Reinganum also computed earnings/price (E/P) ratios for the firms in his sample. The E/P ratios are computed as the quarterly net income divided by the value of the common stock, where the value of the common stock is calculated with both pre-announcement and post-announcement prices.
If capital markets rapidly incorporate information into prices, then rankings based upon post-announcement prices should reflect only the equilibrium effect between E/P ratios and asset pricing. Results indicate that during 1976 and 1977, an abnormal return of about 0.1% per day on the average can be earned by forming portfolios based on E/P ratios. That is, the mean return of a high-E/P portfolio exceeds the mean return of a low-E/P portfolio by about 0.1% per day, even after adjusting for beta risk. Ignoring transaction costs, this mean spread is greater than 6% per quarter, and it persists for at least two quarters. Reinganum suggests that the evidence indicates a misspecification in CAPM rather than any informational inefficiencies.
The evidence in this study suggests that the simple one-period CAPM is misspecified. The set of factors omitted from the equilibrium pricing mechanism seems to be more closely related to firm size than to E/P ratios. According to Reinganum, the misspecification does not appear to be a market inefficiency in the sense that abnormal returns arise because of transaction costs or informational lags. Rather, the source of the misspecification seems to be risk factors omitted from the CAPM, as evidenced by the persistence of abnormal returns for at least two years.

87.3.3 Banz's findings

Banz (1981) examines the empirical relationship between the return and the total market value of NYSE common stocks. His sample includes all common stocks quoted on the NYSE for at least five years between 1926 and 1975. The securities are assigned to one of 25 portfolios containing similar numbers of securities: each security is first assigned to one of five groups on the basis of the market value of its stock, and the securities in each of those five groups are in turn assigned to one of five portfolios on the basis of their betas. Five years of data are used for the estimation of each security's beta; the next five years' data are used for the re-estimation of the portfolio betas. Stock prices and the number of shares outstanding at the end of each five-year period are used for the calculation of the market proportions. The portfolios are updated every year. The results indicate that shares of firms with large market values have had smaller risk-adjusted returns, on average, than similar small firms over a 40-year period. This size effect is not linear in proportion to market value, but is most pronounced for the smallest firms in the sample. In addition, the effect is not very stable through time. Banz argues that the P/E ratio serves as a proxy for the size of a firm and not vice versa.
He cites a study by Reinganum (1981), whose results show that the P/E effect disappears for both NYSE and AMEX stocks when Reinganum controls for size, but that there is a significant size effect even when he controls for the P/E ratio. To summarize, the size effect exists, but it is not clear why it exists. Although it has been conjectured that the effect may be due to restricted distribution of information about small firms, it has not been established that size is not just a proxy for yet another effect.

87.3.4 Keim's findings

Keim (1983) examines, month by month, the empirical relation between abnormal returns and the market value of NYSE and AMEX common stocks.
Evidence is provided that daily abnormal return distributions in January have large means relative to the remaining 11 months, and that the relation between abnormal returns and size is always negative and more pronounced in January than in any other month. In particular, nearly 50% of the average magnitude of the size effect over the period 1963–1979 is due to January abnormal returns. Further, more than 50% of the January premium is attributable to large abnormal returns during the first week of trading in the year, particularly on the first trading day. In addition, Lee et al. (1998) have used mutual fund data to study the small-firm January effect. The data for Keim's study are drawn from the CRSP daily stock files for a 17-year period, 1963–1979. The sample consists of firms listed on the NYSE and AMEX that had returns on the CRSP files during the entire calendar year under consideration.

87.3.5 Additional findings

Although several hypotheses regarding the January effect have been suggested, the more prominent are a tax-loss selling hypothesis by Branch (1977) and an information hypothesis. However, neither has been theoretically or empirically linked to the seasonal return. Since February 1984 (and prior to February 1980), each Thursday following the close of financial markets, the Federal Reserve has released an estimate of the seasonally adjusted average M-1 money supply prevailing over the week ending the Wednesday eight days earlier. (Between February 1980 and February 1984, the weekly money-supply announcement was moved from Thursday to Friday afternoon.) In the Friday-announcement period, the announcement effect could not occur until the markets opened on Monday, and other events over the weekend may have camouflaged the money-supply announcement effect.
Cornell (1983) examined the money-supply announcement effect upon various assets, including three-month Treasury bills, 30-year Treasury bonds, German marks, and the Standard & Poor's 500 stock index. Notationally, his model can be described as follows:

DAt = a0 + a1 UMt + a2 EMt + Ut,    (87.8)

where DAt is the change in the asset return; UMt is the unexpected monetary announcement; EMt is the expected monetary announcement; and Ut is the random disturbance. Cornell summarized four major hypotheses that explain why money-supply announcements affect asset prices. First, the expected-inflation
hypothesis states that the announcements alter analysts’ inflation forecasts. Second, the Keynesian hypothesis predicts that in response to an announced innovation in the money stock, analysts expect the Fed to take offsetting action. Third, the real-activity hypothesis alleges that money-supply announcements provide the market with information about future output, and thereby future money demand. Finally, the risk-premium hypothesis states that money-supply announcements alter the required real return on financial assets by providing the market with information about aggregate risk preferences and beliefs. None of the hypotheses explained the reaction of all four assets. Therefore, at best, the market is responding to money-supply announcements in an eclectic manner. Using data from January 5, 1978 to December 18, 1981, divided into two intervals to take account of the Federal Reserve’s stated change in operating procedure on October 6, 1979, Cornell finds that the market was only responsive to monetary announcements after October 6, 1979. Others, notably Urich and Wachtel (1981) have found some evidence that the announcement effect did exist prior to the change in Fed policy. In addition, Cornell finds that only the unanticipated portion of the announcement is significant, and that a definite relationship exists between the asset-price change and the unanticipated money-supply change. In summary, the existence of anomalies tends to indicate that the market is not perfectly efficient. Due to this inefficiency, technical analysts have hope for some success in capturing excess profits. Treynor and Ferguson (1985) further defended the use of technical analysis by using a Bayesian probability estimate to assess whether, in using past price data, the market has already incorporated some firm-specific information available to the investor. 
If the market has not discovered this information and past price data confirms this, the informed investor may be able to realize excess returns. Their results showed that past prices, combined with other valuable information, can be used to achieve excess returns. However, they noted that it is the firm-specific information that creates the opportunity while past prices serve to permit its exploitation.
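Equation (87.8) above is an ordinary linear regression, so Cornell-style announcement-effect estimates can be obtained by OLS. The sketch below fits the model to simulated data; the coefficient values 0.1, 0.8, and 0.0 are assumptions of the simulation, chosen so that only the unanticipated component moves the asset, in line with Cornell's finding:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
um = rng.normal(size=n)   # unexpected (surprise) component of the announcement
em = rng.normal(size=n)   # expected (anticipated) component
# Simulated asset-return changes: only the surprise matters in this setup.
da = 0.1 + 0.8 * um + 0.0 * em + 0.05 * rng.normal(size=n)

# OLS estimate of DA_t = a0 + a1 UM_t + a2 EM_t + U_t  (equation 87.8)
X = np.column_stack([np.ones(n), um, em])
coef, *_ = np.linalg.lstsq(X, da, rcond=None)
a0, a1, a2 = coef
```

In an efficient market the estimated a2 on the anticipated component should be indistinguishable from zero, which is exactly what the fitted coefficients show here.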
87.4 Security Rate-of-Return Forecasting This section discusses alternative methods to forecast security rate of return in security analysis. In order to do security-return forecasting, regression analysis, time-series analysis, and a composite approach are often utilized. In forecasting security rates of return, both the fundamental and the technical
analyst may use the regression approach. The time-series approach is more often associated with the technical school.

87.4.1 Regression approach

A regression approach captures the relationship between independent variable(s) Xt and a dependent variable Rjt in a linear format. One can choose either a linear or a log-linear model, as defined in the following equations:

Rjt = a + bXt + εt,    (87.9a)
log Rjt = a + b log Xt + ε′t,    (87.9b)

in which εt and ε′t are error terms. The choice between equations (87.9a) and (87.9b) depends on whether the related variables Xt and Rjt are normally or log-normally distributed. In order to obtain the best linear model to predict Rjt given Xt, it is necessary to find the equation that minimizes the squared error term. The error term (εt) represents the difference between the actual value of Rjt and the predicted value of Rjt. The estimated value of Rjt, denoted R̂jt, is defined as R̂jt = â + b̂Xt. In general, if Xt is the series to be forecasted and the yit are possible explanatory series, then a further example of an explanatory model is

Xt = a + b1 y1t + b2 y2t + · · · + bn ynt + et.

To forecast one step ahead, write this as

Xt+1 = a + b1 y1,t+1 + b2 y2,t+1 + · · · + bn yn,t+1 + et+1.

Another model is therefore required to provide forecasts of the yi,t+1 so that a forecast for Xt+1 can be constructed. The regression model can be classified into fixed-coefficient and time-varying-coefficient versions.

87.4.1.1 Fixed-coefficient market model

A common model found in the finance literature that is used to estimate the return on security j is the fixed-coefficient market model:

Rjt = αj + βj Rmt + εjt,    (87.10)

where Rjt is the return on security j in period t; αj is the regression intercept term; Rmt is the return on the market portfolio in period t; βj is
the estimated parametric (slope) coefficient; and εjt is the error term in period t. Using the coefficients αj and βj estimated over period t and a forecast of the return on the market for period t + 1, Rm,t+1, the return on security j can be forecasted for period t + 1:

Rj,t+1 = αj + βj Rm,t+1 + εj,t+1.    (87.11)
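A hedged sketch of estimating the fixed-coefficient market model (87.10) by OLS and forecasting with equation (87.11), using simulated returns (the true α = 0.002 and β = 1.2 are assumptions of the simulation, not empirical estimates):

```python
import numpy as np

def fit_market_model(r_j, r_m):
    """OLS estimates of alpha_j and beta_j in R_jt = alpha_j + beta_j R_mt + e_jt."""
    X = np.column_stack([np.ones(len(r_m)), r_m])
    (alpha, beta), *_ = np.linalg.lstsq(X, r_j, rcond=None)
    return float(alpha), float(beta)

def forecast_return(alpha, beta, r_m_next):
    """Point forecast of R_{j,t+1} given a market-return forecast R_{m,t+1}."""
    return alpha + beta * r_m_next

# Simulated return history with known alpha and beta.
rng = np.random.default_rng(1)
r_m = rng.normal(0.01, 0.04, size=2000)
r_j = 0.002 + 1.2 * r_m + rng.normal(0.0, 0.01, size=2000)
alpha, beta = fit_market_model(r_j, r_m)
```

Note that the quality of the security-return forecast hinges entirely on the quality of the market-return forecast Rm,t+1, which this sketch takes as given.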
87.4.1.2 Time-varying-coefficient market model

A variation of the fixed-coefficient market model, the time-varying-coefficient market model allows the coefficient to vary with time. Algebraically,

Rjt = αj + βjt Rmt + εjt,    (87.12a)
βjt = βj + γ1 X1 + γ2 X2 + · · · + γn Xn + τjt.    (87.12b)

Rosenberg and McKibben (1973) used this format to predict stock returns. In their analysis, they used historical accounting variables (X) to estimate the time-varying coefficients. Substituting equation (87.12b) into equation (87.12a) leads to a multiple-regression format, where

Rjt = αj + βj Rmt + γ1 (X1 Rmt) + γ2 (X2 Rmt) + · · · + γn (Xn Rmt) + (εjt + τjt Rmt).    (87.13)
In order to use this format to forecast Rj,t+1, not only must Rm,t+1 be forecasted but also βj,t+1, which means that forecasts of the X variables must be available. As one can see, this is a much more complex situation than forecasting with a constant beta.

87.4.2 Time-series approach

A time series is a set of observations generated sequentially in time. If the set is continuous, the time series is said to be continuous. If the set is discrete, the time series is said to be discrete. The use at time t of available observations from a time series to forecast its value at some future time t + 1 can provide a basis for economic and business planning, production planning, inventory and production control, and optimization of industrial processes. To calculate the best forecasts, it is also necessary to specify their accuracy, so that the risks associated with decisions based upon the forecasts may be calculated.
Two major approaches to time-series analysis are component analysis and sample-function analysis. Component analysis regards the time series as being composed of several influences or components that are generally taken to be trend-cycle, seasonal, and random movements. In component analysis the seasonal and trend movements are modeled in a deterministic manner. The trend might be fitted by a polynomial of a given degree and the seasonal component by a Fourier series (a trigonometric function with a given period and amplitude). Sample-function analysis regards a time series as an observed sample function representing a realization of an underlying stochastic process. Complicated parametric statistical-estimation procedures are used to determine the properties of time-series data. Since empirical results obtained from component analysis are easier to understand and interpret, henceforth this chapter concerns only component analysis.

87.4.2.1 Component analysis

Component analysis is based on the premise that seasonal fluctuations can be measured in an original series of economic data and separated from trend, cyclical, trading-day, and random fluctuations. The seasonal component reflects a long-term pattern of variation which is repeated constantly, or in an evolving fashion, from year to year. The trend-cycle component includes the long-term trend and the business cycle. The trading-day component consists of variations attributed to the composition of the calendar. The random component is composed of residual variations that reflect the effect of random or unexplained events in the time series. Decomposing a past time series and discovering the relative percentage contribution of the trend, seasonal, and random components to changes in the series provide insight to financial analysts. The trend-cycle component reflects permanent information in both short- and long-run economic time series.
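The component idea can be sketched with a naive additive decomposition. This is a deliberately simplified stand-in for the Census X-11 procedure used in the literature, not X-11 itself:

```python
import numpy as np

def decompose(series, period):
    """Naive additive decomposition into trend-cycle, seasonal, and random parts.
    Trend: centered moving average spanning one full period.
    Seasonal: average deviation from trend at each position within the period.
    Random: whatever is left over."""
    n = len(series)
    half = period // 2
    trend = np.full(n, np.nan)
    for t in range(half, n - half):
        if period % 2:  # odd period: plain centered average
            trend[t] = series[t - half:t + half + 1].mean()
        else:           # even period: average of the two flanking period-means
            trend[t] = (series[t - half:t + half].mean()
                        + series[t - half + 1:t + half + 1].mean()) / 2
    detrended = series - trend
    seasonal = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    seasonal -= seasonal.mean()          # force seasonal effects to sum to zero
    seasonal_full = np.resize(seasonal, n)
    random = series - trend - seasonal_full
    return trend, seasonal_full, random
```

The relative size of the recovered random component is exactly the quantity Gentry and Lee's analysis uses as a gauge of forecastability.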
The seasonal component is considered to represent a permanent pattern underlying the short-run time series. The random component contains the randomness that exists in the time series for both short- and long-run analysis. The higher the relative percentage contribution of the random component in a time series, the greater the uncertainty and thus the greater the probability of forecasting errors. Gentry and Lee (1987) used the decomposition method known as X-11 to measure the relative percentage contribution of the trend-cycle, seasonal, and random components to changes in the original series of income-statement variables. Their results indicated that the relative percentage contribution
of the trend-cycle, seasonal, and random components were directly affected by the length of the time period of the data. The shorter the time period, the greater the relative percentage contribution of the random component; the longer the time period, the greater the relative contribution of the trend-cycle component and the smaller that of the seasonal component. In addition, the relative percentage contribution of the components varied widely among companies for all of the income-statement variables tested. These results have serious implications both for internal management and external analysis. An industry index of the percentage contribution of the random component for each income-statement variable would provide a useful benchmark for measuring the reliability of an analyst's forecast.

87.4.2.2 ARIMA models

This section examines a class of models used in forecasting time-series data. It is best to begin with the simplest of all possible time series, a purely random series. A purely random series is sometimes referred to as white noise (a random walk, by contrast, is the cumulative sum of a white-noise series). Mathematically, such a series can be described by the following equation:

yt = at,    (87.14)

in which the series at is assumed to have a mean of zero, to be unrelated to its past values, and to have a constant variance over time. Mathematically, these assumptions can be summarized as (i) E(at) = 0, (ii) E(at at−i) = 0 for all t and i ≠ 0, and (iii) Var(at) = σa² for all t. Modifying equation (87.14) to allow the series to be concentrated around a nonzero mean δ, the series can be described as

yt = δ + at.    (87.15)
Equation (87.15) is a model that can be used to represent many different time series in economics and finance. For example, in an efficient market, a series of stock prices might be expected to fluctuate randomly around a constant mean, so the actual stock price observed in time period t would be equal to its average price plus some random shock in time period t. The question now is how to model such a series. Fortunately, a theorem known as Wold's decomposition provides the answer. Wold's decomposition proves that any stationary time series (a series is stationary if it is centered around a constant mean) can be represented as the sum of a deterministic component and a stochastic component, where the stochastic component can be generated from a weighted average of past random shocks of infinite order.
A model such as this is known as a moving average of infinite order and can be expressed by the following equation: yt = δ + Θ1 at−1 + Θ2 at−2 + · · · + Θ∞ at−∞ + at ,
(87.16)
where δ is the mean of the process; at−∞ is the random shock that occurred ∞ periods earlier; and Θ∞ is the parameter that relates the random shock ∞ periods earlier to the current value of y. The moving-average model just discussed should not be confused with the moving-average concept previously discussed in this chapter. Previously, the term moving average was used to refer to an arithmetic average of stock prices over a specified number of days. The term moving average was used because the average was continually updated to include the most recent series of data. In time-series analysis, a moving-average process refers to a series generated by a weighted average of past random shocks. Because it would be impossible to estimate a model of infinite order, in practice, it is best to specify a model of finite order. A moving-average process of order q with zero mean would be expressed as follows: yt = Θ1 at−1 + · · · + Θq at−q + at .
(87.17)
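The finite-order moving-average process in equation (87.17) can be simulated directly. The sketch below is illustrative: the Θ values, mean δ, and sample size are assumptions, not taken from the text.

```python
import random
from statistics import mean

def simulate_ma(thetas, n, delta=0.0, seed=42):
    """Simulate y_t = delta + Theta_1 a_{t-1} + ... + Theta_q a_{t-q} + a_t
    with independent standard-normal random shocks a_t."""
    rng = random.Random(seed)
    q = len(thetas)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n + q)]
    series = []
    for t in range(q, n + q):
        # Current shock plus weighted past shocks, around the mean delta.
        y = delta + shocks[t] + sum(th * shocks[t - i]
                                    for i, th in enumerate(thetas, start=1))
        series.append(y)
    return series

y = simulate_ma([0.6, 0.3], n=500, delta=10.0)
print(len(y), round(mean(y), 2))
```

Consistent with the stationarity discussion above, the simulated series fluctuates around its mean δ = 10.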
A moving-average process is not the only way to model a stationary time series. Again, consider a moving-average process of infinite order: yt = Θ1 at−1 + Θ2 at−2 + · · · + at .
(87.18)
Equation (87.18) can be rewritten in terms of the error term at as follows: at = yt − Θ1 at−1 − Θ2 at−2 − · · · .
(87.19)
Because equation (87.19) is recursive in nature, it is easy to generate an expression for at−1 as follows: at−1 = yt−1 − Θ1 at−2 − Θ2 at−3 − · · · .
(87.20)
By substituting the expression for at−1 into equation (87.19),

at = yt − Θ1 (yt−1 − Θ1 at−2 − Θ2 at−3 − · · ·) − Θ2 at−2 − Θ3 at−3 − · · ·
   = yt − Θ1 yt−1 + (Θ1² − Θ2 )at−2 + (Θ1 Θ2 − Θ3 )at−3 + · · · .
(87.21)
By generating expressions for at−2 , at−3 , . . . and substituting them into equation (87.21), at can be expressed in terms of past values of yt : at = yt + φ1 yt−1 + φ2 yt−2 + · · · .
(87.22)
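The substitution that carries equation (87.19) into equation (87.22) can be carried out numerically. This sketch (with made-up Θ values) computes the first few φ weights by inverting the moving-average polynomial, reproducing φ1 = −Θ1 and φ2 = Θ1² − Θ2:

```python
def ma_to_ar_weights(thetas, k):
    """Invert a_t = y_t - Theta_1 a_{t-1} - ... into a_t = y_t + phi_1 y_{t-1} + ...
    The phi weights are the series inverse of (1 + Theta_1 B + Theta_2 B**2 + ...)."""
    phis = [1.0]  # phi_0 = 1 by convention
    for n in range(1, k + 1):
        # phi_n = -(Theta_1 phi_{n-1} + Theta_2 phi_{n-2} + ...)
        s = sum(thetas[j - 1] * phis[n - j]
                for j in range(1, min(n, len(thetas)) + 1))
        phis.append(-s)
    return phis[1:]

# For an MA(2) with Theta_1 = 0.5, Theta_2 = 0.2:
# phi_1 = -Theta_1 and phi_2 = Theta_1**2 - Theta_2, as in the text.
phis = ma_to_ar_weights([0.5, 0.2], k=3)
print(phis)
```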
Fundamental Analysis, Technical Analysis, and Mutual Fund Performance
Rearranging equation (87.22) in terms of yt yields yt = −φ1 yt−1 − φ2 yt−2 − · · · + at ,
(87.23)
where φ1 = −Θ1 and φ2 = Θ1² − Θ2 . Equation (87.23) is known as an autoregressive process of infinite order. The term autoregressive refers to the fact that yt is expressed in terms of its own past values yt−1 , yt−2 , . . .. Again, because it is impossible to estimate a model of infinite order, an approximate model of finite order is specified. An autoregressive process of order p can be expressed as: yt = −φ1 yt−1 − · · · − φp yt−p + at .
(87.24)
Thus, a stationary time series can be expressed in two ways: (1) as a moving-average process in which the series can be represented as a weighted average of past random shocks, or (2) as an autoregressive process in which the series can be represented as a weighted average of its past values. A third possibility is that a series may involve some combination of the two processes. This process is referred to as a mixed autoregressive moving-average (ARMA) process. An ARMA process of infinite order can be expressed as: yt = Θ1 at−1 + Θ2 at−2 + · · · − φ1 yt−1 − φ2 yt−2 − · · · + at .
(87.25)
Again, an ARMA process of finite order must be specified in order to make estimation possible. An ARMA (p, q) process can be expressed as: yt = Θ1 at−1 + · · · + Θq at−q − φ1 yt−1 − · · · − φp yt−p + at .
(87.26)
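A finite-order ARMA process like equation (87.26) can be simulated recursively. The sketch below uses illustrative parameter values, and follows the book's sign convention, in which the AR terms enter with a minus sign:

```python
import random

def simulate_arma(phis, thetas, n, burn=200, seed=7):
    """Simulate equation (87.26):
    y_t = -phi_1 y_{t-1} - ... - phi_p y_{t-p}
          + Theta_1 a_{t-1} + ... + Theta_q a_{t-q} + a_t.
    Note: many software packages write the AR terms with a plus sign."""
    rng = random.Random(seed)
    ys, shocks = [], []
    for _ in range(n + burn):
        a = rng.gauss(0.0, 1.0)
        y = a
        y += sum(th * shocks[-i] for i, th in enumerate(thetas, 1) if len(shocks) >= i)
        y -= sum(ph * ys[-i] for i, ph in enumerate(phis, 1) if len(ys) >= i)
        shocks.append(a)
        ys.append(y)
    return ys[burn:]  # discard burn-in so initial conditions wash out

# ARMA(1, 1) with phi_1 = -0.7 (i.e., +0.7 y_{t-1}) and Theta_1 = 0.4.
y = simulate_arma(phis=[-0.7], thetas=[0.4], n=300)
print(len(y))
```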
So far, the discussion has focused on the estimation of stationary time series. Suppose the process of interest is not stationary. Fortunately, a nonstationary series can usually be made stationary by transforming the data in an appropriate manner. The most popular method of transforming a nonstationary series into a stationary one is differencing the series. For example, suppose the series y1 , y2 , . . . , yt is nonstationary. By differencing the series, a new series, Z1 , Z2 , . . . , Zt−1 , is created. The new series can be defined as

Z1 = y2 − y1 ,
Z2 = y3 − y2 ,
...
Zt−1 = yt − yt−1 .

If the series Zt is still nonstationary, it may be necessary to difference the series Zt again. The modeling of a series that has been differenced is referred to as an autoregressive integrated moving-average (ARIMA) process. A detailed
discussion of ARIMA modeling is beyond the scope of this book; nevertheless, a brief outline of the ARIMA modeling procedure is in order. (See Nelson (1973) or Nazem (1988) for details of the ARIMA procedure.) The ARIMA procedure involves three steps: (1) identification, (2) estimation, and (3) forecasting. The first step is to identify the appropriate model. Identification involves determining the degree of differencing necessary to make the series stationary and determining the form (ARMA or ARIMA) and order of the process. After a suitable model is identified, the parameters φ1 , . . . , φp , Θ1 , . . . , Θq need to be estimated. The final step in the ARIMA procedure is to use the model for forecasting. Oftentimes, the adequacy of the model is checked by using the model to forecast within the sample. This allows a comparison of the forecasted values with the actual values. If the model is determined to be adequate, it can be used to forecast future values of the series. Ji et al. (2015) have used the time-series analysis technique to forecast the performance of the Taiwan weighted stock index. They found the time-series technique can be relatively successful in forecasting Taiwan's weighted stock index.

87.4.3 Composite forecasting

Numerous approaches ranging from sophisticated multiple-equation regression techniques to rather naïve extrapolations or intuitive estimates are used to produce forecasts. Bessler and Brandt (1979) examined three alternative procedures for forecasting time-dependent quarterly observations on hog, cattle, and broiler prices, along with composite forecasts based on various linear combinations of these three procedures. The alternative methods for forecasting these prices are econometric models, time series (ARIMA), and expert opinion.
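The ARIMA identification–estimation–forecasting steps outlined above can be sketched in a few lines of Python. Everything here (the simulated random-walk data and the AR(1) choice for the differenced series) is an illustrative assumption, not the procedure of the studies cited:

```python
import random

def difference(series):
    """Identification step: Z_t = y_{t+1} - y_t removes a stochastic trend."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(z):
    """Estimation step: least-squares AR(1) coefficient on the differenced series."""
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    return num / den

# Simulated nonstationary series: a random walk with drift.
rng = random.Random(0)
y = [100.0]
for _ in range(400):
    y.append(y[-1] + 0.5 + rng.gauss(0.0, 1.0))

z = difference(y)
zbar = sum(z) / len(z)
phi = fit_ar1([v - zbar for v in z])

# Forecasting step: predict the next difference, then undo the differencing.
z_next = zbar + phi * (z[-1] - zbar)
y_next = y[-1] + z_next
print(round(y_next, 2))
```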
The results obtained by Bessler and Brandt for selected performance measures (mean-squared error and turning points) applied to the forecasts of each method over the period 1976 quarter I through 1979 quarter II suggest that no method consistently outperformed or was outperformed by the other two methods. In terms of mean-squared error, the forecasts based on the ARIMA processes are lowest for hog and cattle prices, while the econometric model gives the lowest mean-squared error forecasts for broiler prices. The mean forecast error is determined by averaging the signed forecast errors, that is, by taking the difference between the sum of the overpredictions and the sum of the underpredictions and dividing by the number of forecasts. A negative sign would indicate that the average
forecast series is above the mean of the actual series; a positive sign suggests an average forecast that is low. The mean absolute forecast error is simply the average of the absolute values of the forecast errors. Composite forecasts based on the forecasts of the individual methods are formed using three procedures: minimum variance, adaptive weighting, and simple average composites. The empirical results from all three composite forecasting schemes generate performance levels that are at least as good as any of the individual forecasts and usually much better. In particular, the mean-squared error of the best individual forecasting method is compared with that of the best composite for each of the three commodities. The composite forecast errors for the three commodities average 14% lower than the errors of the best individual forecasts. The econometric model is essentially based on representations of the underlying economic behavioral system for a particular commodity. These representations attempt to identify and model the relevant supply-and-demand factors that together determine market price and quantity. As an alternative to statistical models for forecasting, forecasts based upon expert opinions are available. These forecasts represent an accumulation of knowledge about the particular industry, commodity, or stock in question. In many respects, the forecasts of experts are like those of econometric- or ARIMA-model forecasting in that they incorporate much of the same information from the same data sources. Expert opinions, however, are less restrictive in structure, in that the expert can change the weights assigned to different bits of information, or can select with relative ease the sources from which to draw the data. In addition, expert forecasts are able to incorporate information that cannot, perhaps, be included in a more quantitative model in the form of data.
Recognizing that most forecasts contain some information that is not used in other forecasts, it seems possible that a combination of forecasts will quite often outperform any of the individual forecasts. Bessler and Brandt (1979) construct composite forecasts based upon various composite weighting schemes. They use various tests or measures of performance to evaluate the price forecasts of the econometric, ARIMA, expert-opinion, and composite methods. Of the single-variable measures, they use the mean-squared error, the mean forecast error, and the mean absolute forecast error. The mean-squared error provides a measure of the size of individual forecast errors relative to the actual values. Because each error is squared, large errors detract significantly from the measured performance of the method.
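The three single-variable measures just described can be written out directly. A small sketch, with errors defined as actual minus forecast (so a negative mean error means forecasts ran high on average); the numbers are made up for illustration:

```python
def mean_squared_error(actual, forecast):
    """Average of squared errors; squaring penalizes large misses heavily."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mean_forecast_error(actual, forecast):
    """Average signed error (actual - forecast); negative means forecasts ran high."""
    return sum(a - f for a, f in zip(actual, forecast)) / len(actual)

def mean_absolute_forecast_error(actual, forecast):
    """Average of absolute errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

actual = [50.0, 52.0, 51.0, 55.0]    # hypothetical prices
forecast = [51.0, 51.0, 53.0, 54.0]  # hypothetical forecasts
print(mean_squared_error(actual, forecast))            # 1.75
print(mean_forecast_error(actual, forecast))           # -0.25
print(mean_absolute_forecast_error(actual, forecast))  # 1.25
```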
Performance indicators that track the movements of actual and forecast price series are called tracking measures. Examples of tracking measures are the number of turning points missed or falsely predicted compared with those correctly forecast. Although these measures will not indicate which forecasting method most closely approximates the actual series, they are particularly useful when the forecaster is interested in knowing when a series is likely to turn upward or downward from its current pattern. Bessler and Brandt's study does not find any specific forecasting method to be universally superior in terms of the performance measures. Although the ARIMA model performs best for two of the three commodities, its performance is poorest for the third commodity in terms of the mean-squared error criterion. The composite forecasting method's mean-squared errors are lower than or nearly as low as the best of the individual methods. More important, in no case does a composite forecast generate errors that are as large as those of the worst of the individual methods. The results of the performance evaluation suggest that forecasters should seriously consider using composite forecasting techniques. The idea that alternative forecasting methods use a variety of different information sources and means of assimilating the information and generating forecasts, a variety that can be captured by a composite forecast, is not only theoretically appealing but, in Bessler and Brandt's study, somewhat empirically substantiated. Appendix 87A presents the composite forecasting method.
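Two of the three composite schemes are easy to sketch. The simple-average composite weights each method equally; the minimum-variance idea is approximated here by inverse-MSE weights, a simplification that ignores the covariances between methods' errors (the full scheme is in Appendix 87A). All numbers are hypothetical:

```python
def simple_average_composite(forecasts):
    """Equal-weight combination of each method's forecast at every horizon."""
    return [sum(fs) / len(fs) for fs in zip(*forecasts)]

def inverse_mse_composite(forecasts, past_errors):
    """Weight each method by the inverse of its historical mean-squared error.
    A simplified stand-in for the minimum-variance composite, which would
    also use the covariances between methods' errors."""
    mses = [sum(e * e for e in errs) / len(errs) for errs in past_errors]
    raw = [1.0 / m for m in mses]
    weights = [r / sum(raw) for r in raw]
    return [sum(w * f for w, f in zip(weights, fs)) for fs in zip(*forecasts)]

# Hypothetical two-quarter price forecasts from three methods.
econ = [48.0, 50.0]
arima = [49.0, 51.0]
expert = [47.0, 49.0]
past_errors = [[1.0, -1.0], [0.5, 0.5], [2.0, -2.0]]  # hypothetical in-sample errors

print(simple_average_composite([econ, arima, expert]))  # [48.0, 50.0]
print(inverse_mse_composite([econ, arima, expert], past_errors))
```

Note how the inverse-MSE composite tilts toward the method with the smallest historical errors while still blending in the others.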
87.5 Value Line Ranking

The Value Line Investment Survey is an independent weekly investment-advisory service registered with the US Securities and Exchange Commission. The weekly Value Line survey comes in three sections: (1) Ratings and Reports contains full-page reports on each of 1,700 stocks. The stocks are classified into 92 industry groups. A report on the industry precedes the reports on the stocks in it. Every week, about 130 stocks in seven or eight industries are covered on a preset sequential schedule. (2) The Summary and Indexes is a weekly alphabetical catalog of all 1,700 stocks at their most recent prices, with their current rankings for timeliness and safety. (3) Selection and Opinion gives Value Line's opinion of business prospects, the stock-market outlook, and the advisable investment strategy.
87.5.1 Criteria of ranking

By means of the two rankings, timeliness and safety, Value Line relays its expectations about the performance of individual stocks and industries. The timeliness rank runs on a scale from 1 (highest) down to 5 (lowest). The safety rank is a measure of risk avoidance. It is based mainly on the company's relative financial strength and the stock's price stability. The safety rank changes infrequently and may be taken as a forecast of risk avoidance. Safety ranks run on a scale of 1 (safest) to 5 (riskiest). The rankings are drawn almost completely from published information about the companies that are followed and are based on 10 years of history on earnings and prices. The rankings are produced primarily by a computer using the earnings and price history as input. The system tends to assign high ranks to stocks with low P/E ratios relative to historic norms and to the current P/E ratio of the market. The system also tends to assign high ranks to stocks whose quarterly earnings reports show upward momentum relative to the quarterly earnings of the market as a whole, and to stocks that have upward price momentum. These factors are chosen by running a cross-sectional regression on past data. The set of weights that seems to give the best predictive ability is then chosen. In sum, the one-year rankings are based on growth in earnings, price momentum, and the P/E ratio of each stock relative to the market and to historical standards for the stock.
The evaluation of a single stock involves (1) choosing stocks that are acceptable in terms of timeliness rankings; (2) among the stocks chosen for timeliness, picking those that are in industries also shown to be timely; (3) among the most timely stocks in the most timely industries, picking those that conform to the investor's safety constraints; and (4) among those stocks that meet the investor's timeliness and safety constraints, choosing those that meet the investor's current yield requirement.

87.5.2 Performance evaluation

In studies to determine the profitability of the investment advice given by various brokerages and investment advisors, Value Line recommendations yield a portfolio that earns a few percentage points more return per year than
could be earned by picking a large portfolio randomly. Fischer Black (1972), an advocate of the buy-and-hold strategy, tested the investment performance of the Value Line rankings using monthly data over the period commencing with April 1965. He used Jensen's time-series test of consistency of performance, calculating the return on the market at frequent intervals. A time-series regression of the excess return on the portfolio against the excess return on the market was run; the intercept of that regression shows the extra return the portfolio is able to achieve adjusted for risk. The intercept is then tested for significance. Black constructed portfolios of all the stocks in each ranking and weighted each stock equally each month. Purchases were assumed to occur at the close of the markets on Friday, which is when most subscribers receive their reports. The results of Black's tests show that the success of the rankings is very consistent over time, and thus very significant in a statistical sense. The extra return of rank 1 stocks is about 10% per year; it is about −10% per year for the rank 5 stocks. Black notes that if weekly returns and associated portfolio revisions had been used, rank 1 would have earned an extra 20% per year rather than an extra 10% per year. In sum, the use of Jensen's CAPM time-series test tends to indicate that the rankings clearly provide a profitable portfolio strategy for investors who can execute orders at low transaction costs. Even after reducing turnover activity, significant excess returns were achieved. Similar studies were performed by Holloway (1981) and Copeland and Mayers (1982). Holloway (1981) found significant performance for rank 1 firms over the 1974–1977 period. Copeland and Mayers (1982) noted that rank 1 firms outperformed rank 5 firms by 6.8% per year on a risk-adjusted basis over the 1965–1978 period for portfolios updated semi-annually. A later work by Chen et al.
(1987) using an APT framework has results that are similar to those of Copeland and Mayers (1982). Stickel (1985), using an event-study methodology, examined evidence on (1) the differential impact of the various types of rank change and (2) the speed of adjustment of individual security prices to new information. His results indicated that although Value Line rank changes have information content, the effect varies by the type of rank change. Changes from rank 2 to rank 1 have the most dramatic effect on prices. A cross-sectional analysis finds that smaller firms have a greater reaction to a rank change than larger firms. Finally, a speed-of-adjustment test suggests that individual securities with significant abnormal performance on event day 0 or +1 adjust to the information in a rank change over a multiple-day period.
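Black's Jensen-type time-series test described above reduces to an OLS regression of portfolio excess returns on market excess returns, with the intercept as the risk-adjusted extra return. A minimal sketch follows; the return numbers are fabricated for illustration, and a real test would also compute a standard error for the intercept:

```python
def jensen_regression(port_ret, mkt_ret, rf):
    """OLS of (R_p - R_f) on (R_m - R_f); returns (alpha, beta).
    The intercept alpha is the extra return adjusted for risk."""
    yp = [r - rf for r in port_ret]
    xm = [r - rf for r in mkt_ret]
    n = len(yp)
    xbar, ybar = sum(xm) / n, sum(yp) / n
    beta = (sum((x - xbar) * (y - ybar) for x, y in zip(xm, yp))
            / sum((x - xbar) ** 2 for x in xm))
    alpha = ybar - beta * xbar
    return alpha, beta

rf = 0.005
mkt = [0.02, -0.01, 0.03, 0.00, 0.015]
# A portfolio built to have alpha = 0.01 and beta = 1.2 exactly:
port = [rf + 0.01 + 1.2 * (m - rf) for m in mkt]
alpha, beta = jensen_regression(port, mkt, rf)
print(round(alpha, 4), round(beta, 4))
```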
Lee and Park (1987), using a specification-analysis approach, investigated the effect of Value Line ranking changes on the beta coefficients. Following equation (87.12), they generalized the traditional market model: Rjt = αj + βjt Rmt + εjt
(87.27a)
βjt = βj + CVjt ,
(87.27b)
where Rjt is the rate of return for the jth firm in period t; Rmt is the market rate of return in period t; βjt is the beta coefficient for the jth firm in period t; and Vjt is the Value Line ranking for the firm in period t. Substituting equation (87.27b) into (87.27a) yields Rjt = αj + βj Rmt + C(Vjt Rmt ) + εjt .
(87.28)
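Equation (87.28) is a linear regression with an interaction term, so it can be estimated by ordinary least squares. A self-contained sketch using fabricated data constructed so the true coefficients are known; a real application would use actual returns and rankings:

```python
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for the normal equations."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_timing_model(Rj, Rm, V):
    """OLS for equation (87.28): R_jt = alpha_j + beta_j R_mt + C (V_jt R_mt) + e_jt."""
    X = [[1.0, m, v * m] for m, v in zip(Rm, V)]
    k = 3
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * r for row, r in zip(X, Rj)) for a in range(k)]
    return solve(XtX, Xty)

# Fabricated data with known coefficients alpha = 0.002, beta = 0.9, C = -0.05.
Rm = [0.01, -0.02, 0.03, 0.02, -0.01, 0.015]
V = [1, 2, 3, 1, 2, 3]  # hypothetical Value Line ranks
Rj = [0.002 + 0.9 * m - 0.05 * v * m for m, v in zip(Rm, V)]

alpha, beta, C = fit_timing_model(Rj, Rm, V)
print(round(alpha, 4), round(beta, 4), round(C, 4))
```

A negative estimated C, as Lee and Park found for most firms, means a higher (numerically larger) rank lowers the firm's effective beta.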
In equation (87.28), the interaction variable Vjt Rmt can be used to test whether the Value Line ranking exhibits some market-timing effect on the jth firm's rate-of-return determination. Their empirical results, using monthly stock-return data and Value Line weekly rankings over the period July 1978 to February 1983, suggest that firms' betas are affected by the change of a Value Line ranking more than 40% of the time. Most of the estimated Cs are negative; hence, it can be concluded that an increase in rank will reduce the beta coefficient and rate of return of the firm. These studies suggest that Value Line's recommendations are better than picking stocks randomly. Such favorable studies have never been published for other investment-advisory services by unbiased outside researchers.

87.6 Mutual Funds

Mutual funds are one of the most important investments for individual investors. In this section, mutual-fund classification and mutual-fund managers' timing and selectivity abilities are discussed.

87.6.1 Mutual-fund classification

According to the Investment Company Act of 1940, mutual funds must publish a written statement of their investment objectives and make it available to their shareholders. This objective can be changed only if the majority of the shareholders consent in advance to the new objective. The investment objectives of mutual funds can be classified into four categories:
(1) Growth, (2) Income and growth, (3) Income, and (4) Income, growth, and stability (balanced fund).1 These objectives are listed in descending order of the aggressiveness with which the fund's management implies it will seek a high average rate of return and assume the corresponding risks. The balanced funds offer a complete investment program to their clients, so far as marketable securities are concerned. Their portfolios are presumably structured to include bonds and stocks in a ratio considered appropriate for an average individual investor given the return outlook for each sector and possibly a risk and volatility constraint. Generally, however, these funds have been much less popular with investors than growth funds. Growth funds are structured to include a well-diversified combination of common stocks. Basically, three reasons may be cited for their popularity. First, empirical studies of common stocks have almost invariably shown their long-term total returns to exceed those on bonds. Second, stock is generally conceded to be a better hedge against inflation risk than bonds. Third, many small investors may prefer to hold obligations of financial institutions as their major fixed-income securities because of the convenience and safety resulting from government insurance programs. Income funds are composed of a well-diversified selection of bonds. Empirical studies of long-term bond returns have indicated that a widely diversified list of medium-quality bonds has been superior to high-quality bonds. In order to obtain appropriate representation in this sector of the bond universe, which includes both corporates and municipals, a large pool of funds is required to obtain the desired degree of diversification. One should be alert to the possibility that, in order to show highly attractive yields on a competitive basis, an income fund may acquire a heavy proportion of speculative bonds on which the default risk is high.
Income-and-growth funds are composed of a combination of common stock and bonds. Whether the emphasis is on income or growth determines what percentage of bonds or common stock is in the portfolio. Money market funds are invested in money market securities, which tend to have an average maturity of around one month; for example, commercial paper, repurchase agreements, or certificates of deposit.1 A feature of money market funds is that they typically offer check-writing privileges. In addition, their net asset value is fixed at $1 per share. With the net asset value set at $1, there are no tax implications (capital gains or losses) related to the redemption of any shares.

1
Bodie et al. (2010) have classified mutual funds by investment policy into money market funds, equity funds, sector funds, bond funds, international funds, balanced funds, asset allocation and flexible funds, and index funds.

Sector funds concentrate on a particular industry. For example, Fidelity markets many different "select funds," each of which invests in a specific industry such as biotechnology, utilities, precious metals, or telecommunications. Bond funds specialize in the fixed-income sector. Within that sector, however, there is considerable room for further specialization. For example, various funds will concentrate on corporate bonds, Treasury bonds, mortgage-backed securities, or municipal (tax-free) bonds. Some municipal bond funds invest only in bonds of a particular state to satisfy the investment desires of residents of that state who wish to avoid local as well as federal taxes on interest income. Many funds also specialize by maturity, ranging from short-term to intermediate to long-term, or by the credit risk of the issuer, ranging from very safe to high-yield, or "junk," bonds. While global funds and international funds both invest in securities around the world, international funds do not invest in securities of firms located in the United States. Regional funds invest in securities in specific regions, and emerging market funds invest in companies in developing nations. Asset allocation funds are similar to balanced funds because they both hold stocks and bonds; however, asset allocation funds may dramatically vary the proportions allocated to each market in accordance with the portfolio manager's forecast of the relative performance of each sector. Hence, these funds engage in market timing and are not designed to be low-risk investment vehicles. An index fund tries to match the performance of a broad market index.
The fund buys shares in securities included in a particular index in proportion to each security's representation in that index. For example, the Vanguard 500 Index Fund is a mutual fund that replicates the composition of the Standard & Poor's 500 stock price index. Because the S&P 500 is a value-weighted index, the fund buys shares in each S&P 500 company in proportion to the market value of that company's outstanding equity. Investment in an index fund is a low-cost way for small investors to pursue a passive investment strategy, that is, to invest without engaging in security analysis. Of course, index funds can be tied to non-equity indexes as well. For example, Vanguard offers a bond index fund and a real estate index fund.
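Value weighting can be sketched in a few lines; the tickers and market capitalizations below are hypothetical, not actual index constituents:

```python
def value_weights(market_caps):
    """Portfolio weight of each stock = its market value / total market value,
    as in a value-weighted index fund."""
    total = sum(market_caps.values())
    return {name: cap / total for name, cap in market_caps.items()}

caps = {"AAA": 300e9, "BBB": 100e9, "CCC": 100e9}  # hypothetical companies
w = value_weights(caps)
print(w["AAA"])  # 0.6
```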
87.6.2 Three alternative mutual fund performance measures

87.6.2.1 Definition of performance measure

In general, to evaluate mutual fund performance, we use Sharpe's measure, Treynor's measure, or Jensen's measure. Now, we briefly describe these three measures:
1. Sharpe measure: (r̄p − r̄f )/σp . Sharpe's measure divides average portfolio excess return over the sample period by the standard deviation of returns over that period. It measures the reward to (total) volatility trade-off.
2. Treynor measure: (r̄p − r̄f )/βp . Like Sharpe's measure, Treynor's measure gives excess return per unit of risk, but it uses systematic risk instead of total risk.
3. Jensen measure (portfolio alpha): αp = r̄p − [r̄f + βp (r̄M − r̄f )]. Jensen's measure is the average return on the portfolio over and above that predicted by the CAPM, given the portfolio's beta and the average market return. Jensen's measure is the portfolio's alpha value.

87.6.2.2 Statistical distributions of Sharpe, Treynor, and Jensen measures

Chen and Lee (1981, 1986) have derived the statistical distributions of the Sharpe, Treynor, and Jensen measures. In addition, Jobson and Korkie (1981) have derived the statistical distributions of the Sharpe and Treynor measures. The statistical distribution of Sharpe's measure is very important in finance research; therefore, other researchers such as Lo (2002) have also derived the statistical distribution of Sharpe's performance measure. Chen and Lee have shown that Sharpe's performance measure follows a non-central t-distribution. Both Jobson and Korkie (1981) and Lo (2002) show that Sharpe's measure is approximately normally distributed. They found an approximate normal distribution because they did not use the appropriate statistical theorem; the non-central t is the more appropriate distribution. It is well known that the non-central t-distribution approaches the normal distribution as the sample size becomes large.
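The three measures can be computed directly from a return series. A sketch with fabricated monthly returns; the fund's beta is taken as given here, whereas in practice it would be estimated by regression:

```python
from statistics import mean, pstdev

def sharpe_measure(rp, rf):
    """Average excess return divided by the standard deviation of returns."""
    return (mean(rp) - rf) / pstdev(rp)

def treynor_measure(rp, rf, beta):
    """Average excess return per unit of systematic risk."""
    return (mean(rp) - rf) / beta

def jensen_measure(rp, rf, rm, beta):
    """alpha_p = mean(r_p) - [r_f + beta_p (mean(r_m) - r_f)]."""
    return mean(rp) - (rf + beta * (mean(rm) - rf))

rp = [0.04, 0.01, 0.03, 0.02]   # hypothetical portfolio returns
rm = [0.03, 0.00, 0.02, 0.015]  # hypothetical market returns
rf, beta = 0.005, 1.1
print(round(sharpe_measure(rp, rf), 3))
print(round(treynor_measure(rp, rf, beta), 4))
print(round(jensen_measure(rp, rf, rm, beta), 6))
```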
87.6.3 Mutual-fund manager's timing and selectivity

When faced with the problem of deriving a performance measure, there are two considerations: (1) the collective performance of the security portfolio
and (2) the relative performance of the security portfolio. Evidence about the collective performance is relevant to the EMH, and thereby to an understanding of the process of security-price determination. Evidence about the relative performance of individual mutual funds or portfolios is of obvious interest to entities with investment funds to allocate. In examining relative performance, market-timing activities as well as careful selection of individual securities are of concern. Performance evaluations originally employed a one-parameter risk–return benchmark like that developed by Jensen (1968, 1969) and refined by Black et al. (1972) and Blume and Friend (1973). Such investigations have effectively focused on the fund manager's security-selection skills, since the examined portfolios' risk levels have been assumed to be stationary through time. Fama (1972) and Jensen (1972) pointed out the empirical measurement problems involved in properly evaluating the constituents of investment performance when portfolio risk levels are non-stationary, as indicated by Chang and Lewellen (1984). Fama (1972), rather than following previous research on performance measurement by Sharpe (1966), Treynor (1965), and Jensen (1968) (where performance was evaluated in a two-dimensional framework of risk and return), looked for a finer breakdown of performance. Up to that time, the notion underlying performance measurement was a comparison of the returns on a managed portfolio relative to a naively selected portfolio with similar risk. The Sharpe–Lintner–Mossin version of the CAPM was used to obtain the benchmark return of the naively selected portfolio. To obtain the benchmark portfolio, Fama (1972) used Sharpe's (1964) method to derive the efficient portfolio and the ex ante security-market line (SML). The efficient portfolios are formed according to

Rx = xRf + (1 − x)Rm ,  x ≤ 1,  (87.29)

so that

E(Rx ) = xRf + (1 − x)E(Rm ),  (87.30)
σ(Rx ) = (1 − x)σ(Rm ),  (87.31)

in which Rm , E(Rm ), and σ(Rm ) are the one-period return, expected return, and standard deviation of return for the market portfolio m, respectively, and x is the weight associated with the risk-free asset. The ex ante SML can be defined as

E(Rj ) = Rf + [(E(Rm ) − Rf )/σ(Rm )] [Cov(Rj , Rm )/σ(Rm )],  (87.32)
in which Cov(Rj , Rm ) is the covariance between the return on asset j and the return on the market portfolio. The benchmark or naively selected portfolios are just combinations of the riskless asset Rf and the market portfolio Rm obtained with different values of the weight x. Given the ex post or realized return Rm on the market portfolio, the ex post return for the naively selected portfolio is Rx = xRf + (1 − x)Rm .
(87.33)
Moreover,

βx = Cov(Rx , Rm )/σ(Rm ) = Cov[(1 − x)Rm , Rm ]/σ(Rm ) = (1 − x)σ(Rm ) = σ(Rx ).  (87.34)
That is, for the benchmark portfolio, risk and standard deviation of return are equal. For the naively selected portfolios, equations (87.33) and (87.34) imply the following relationship between the risk βx of a portfolio and its ex post return Rx :

Rx = Rf + [(Rm − Rf )/σ(Rm )] βx .  (87.35)

That is, for the naively selected portfolios, there is a linear relationship between risk and return. In performance-evaluation models, this methodology provides a benchmark against which the returns on managed portfolios are judged. To use equation (87.35) as a benchmark for evaluating ex post portfolio returns requires estimates of the risk βp and dispersion σ(Rp ) of the managed portfolios, as well as an estimate of σ(Rm ), the dispersion of the return on the market portfolio. In order for the performance evaluation to be objective, it must be possible to obtain reliable estimates of these parameters from historical data. Evidence suggests that, at least for portfolios of 10 or more securities, βp and σ(Rp ) seem to be fairly stationary over long periods of time, and likewise for σ(Rm ). However, if market timing is to be a consideration, the problem of nonstationary βp , σ(Rp ), and σ(Rm ) must be considered. In addition, an assumption of normal return distributions is maintained, even though evidence suggests that actual return distributions conform more closely to non-normal, two-parameter stable distributions. Finally, the available empirical evidence indicates that the average returns over time on security portfolios deviate systematically from the predictions of the standard CAPM. In short, the evidence suggests that the CAPM does not provide
the best benchmark for the average return–risk tradeoffs available in the market from naively selected portfolios. Fama (1972) first introduced the concept of selectivity, defined as how well a chosen portfolio does relative to a naïve portfolio with the same level of risk. Algebraically, this measure of performance of the chosen portfolio a is Selectivity = Ra − Rx (βa ),
(87.36)
where

Ra = (Va,t+1 − Va,t)/Va,t;

Va,t and Va,t+1 are the total market values at t and t + 1 of the actual portfolio chosen at time t; and Rx(βa) is the return on the combination of the riskless asset f and the market portfolio m that makes the risk βx equal to βa, the risk of the chosen portfolio a. Selectivity is the sole measure of performance in the work of Sharpe, Treynor, and Jensen. Fama introduced the concept of overall performance, defined as the difference between the return on the chosen portfolio and the return on the riskless asset. Overall performance is in turn split into two parts, (1) selectivity and (2) risk. Algebraically,

Overall performance = Selectivity + Risk:
[Ra − Rf] = [Ra − Rx(βa)] + [Rx(βa) − Rf].  (87.37)
Figure 87.3 graphically presents the components of mutual-fund performance in equation (87.37). Jensen's measure of performance is, of course, the height of the line A A; Fama referred to this distance as the return due to selectivity. In addition, Figure 87.3 indicates the overall performance Ra − Rf and the risk Rx(βa) − Rf. The risk term measures the return earned for the decision to take on a positive amount of risk; it is determined by the level of risk chosen (the value of βa) and the SML defined in equation (87.35). If the portfolio is only a small part of an investor's holdings, this does not matter, because diversifiable risk will be diversified away in the investor's total holdings; if, on the other hand, the portfolio represents the investor's entire holdings, it does matter. The question that now arises is whether beta or the standard deviation is the appropriate measure of risk for evaluating portfolio management. If total risk is the appropriate measure, then the Sharpe measure is the appropriate measurement tool.
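Equations (87.35) and (87.37) can be verified with a short numerical sketch. The numbers below are invented for illustration, and the function names are ours, not the chapter's:

```python
def benchmark_return(r_f, r_m, sigma_m, beta_x):
    """Naive-benchmark (ex post market line) return of eq. (87.35):
    R_x = R_f + [(R_m - R_f) / sigma(R_m)] * beta_x.  Risk is measured so
    that the benchmark market portfolio has beta_m = sigma(R_m) and earns R_m."""
    return r_f + (r_m - r_f) / sigma_m * beta_x

def overall_performance(r_a, r_f, r_m, sigma_m, beta_a):
    """Fama's split of eq. (87.37): overall performance = selectivity + risk."""
    rx = benchmark_return(r_f, r_m, sigma_m, beta_a)  # R_x(beta_a)
    return {"selectivity": r_a - rx, "risk": rx - r_f}

# Sanity checks on the benchmark line: zero risk earns R_f, market risk earns R_m.
assert benchmark_return(0.03, 0.10, 0.20, 0.0) == 0.03
assert abs(benchmark_return(0.03, 0.10, 0.20, 0.20) - 0.10) < 1e-12

# The two components sum back to overall performance, Ra - Rf.
parts = overall_performance(r_a=0.12, r_f=0.03, r_m=0.10, sigma_m=0.20, beta_a=0.15)
assert abs(parts["selectivity"] + parts["risk"] - (0.12 - 0.03)) < 1e-12
```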
Figure 87.3: Overall components of mutual-fund performance.
Fama (1972) further decomposed equation (87.37) by breaking up risk into two parts: (1) total portfolio risk σ(Ra) and (2) market risk βa. Fama then showed that total portfolio risk σ(Ra) will be greater than market risk βa as long as the portfolio's returns are not perfectly correlated with the returns on the market. This can be seen by looking at the correlation coefficient ρa,m between Ra and Rm:

ρa,m = Cov(Ra, Rm)/[σ(Ra)σ(Rm)].

Multiplying both sides by σ(Ra) yields

ρa,m σ(Ra) = Cov(Ra, Rm)/σ(Rm).

Notice that the right-hand side of the equation is just the measure of market risk βa. So βa can be written as βa = ρa,m σ(Ra), and hence βa ≤ σ(Ra) whenever ρa,m ≤ 1. Because total risk σ(Ra) is greater than market risk βa, Fama was able to decompose selectivity into two parts, net selectivity and diversification.
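The identity βa = ρa,m σ(Ra) ≤ σ(Ra) behind this decomposition can be checked numerically. A small self-contained sketch with made-up return series (sample statistics are used; the helper names are ours):

```python
from math import sqrt

def stdev(xs):
    """Sample standard deviation."""
    m = sum(xs) / len(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def cov(xs, ys):
    """Sample covariance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def beta_fama(ra, rm):
    """Fama's market-risk measure beta_a = Cov(Ra, Rm) / sigma(Rm),
    which equals rho_{a,m} * sigma(Ra) and so never exceeds sigma(Ra)."""
    return cov(ra, rm) / stdev(rm)

ra = [0.02, -0.01, 0.04, 0.03, -0.02, 0.05]   # invented portfolio returns
rm = [0.01, -0.02, 0.03, 0.02, -0.01, 0.04]   # invented market returns
beta = beta_fama(ra, rm)
rho = cov(ra, rm) / (stdev(ra) * stdev(rm))
assert abs(beta - rho * stdev(ra)) < 1e-12    # beta_a = rho * sigma(Ra)
assert beta <= stdev(ra) + 1e-12              # beta_a <= sigma(Ra)
```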
Figure 87.4: Detailed components of mutual-fund performance.
In Figure 87.4, the quantity Ra − Rx[σ(Ra)] measures the extra return earned on the portfolio compared with a naïve portfolio with the same total risk; Fama called this quantity net selectivity. He called the distance Rx[σ(Ra)] − Rx(βa) diversification, so that selectivity, Ra − Rx(βa), decomposes into net selectivity plus diversification. Algebraically,

Selectivity = Net selectivity + Diversification:
[Ra − Rx(βa)] = [Ra − Rx[σ(Ra)]] + [Rx[σ(Ra)] − Rx(βa)],  (87.38a)

Net selectivity = Selectivity − Diversification:
Net selectivity = [Ra − Rx(βa)] − [Rx[σ(Ra)] − Rx(βa)] = Ra − Rx[σ(Ra)].  (87.38b)

Diversification measures the extra return that a less-than-optimally diversified portfolio must earn to justify itself. When the return on the market is greater than the return on the risk-free asset, diversification measures the additional return that would just compensate the investor for the diversifiable dispersion [σ(Ra) − βa]. However, when the return on the market is less than the return on the risk-free asset, diversification measures the return
lost from taking on diversifiable dispersion rather than choosing the naively selected portfolio with market risk and standard deviation both equal to βa, the market risk of the portfolio actually chosen. Net selectivity may be negative if a manager's selectivity was not sufficient to make up for the avoidable risk taken. In Fama's example shown in Figure 87.4 related to σ(Ra), the distance measured by diversification is larger than the distance measured by selectivity; therefore, net selectivity must be negative, according to equation (87.38b). If the investor has a target risk level βT, the part of the overall performance due to risk can be allocated to the investor and to the portfolio manager as follows:

Risk = Manager's risk + Investor's risk:
[Rx(βa) − Rf] = [Rx(βa) − Rx(βT)] + [Rx(βT) − Rf],  (87.39)
in which Rx(βT) is the return on the naively selected portfolio with the target level of market risk the investor has chosen. The manager's risk is the risk assumed by the manager in taking on a level of risk βa different from the investor's target level βT. This decomposition is indicated in Figure 87.4 related to βT. Manager's risk might in part result from a timing decision; that is, the manager might have chosen a portfolio with a higher or lower level of risk than desired by the investor because of his evaluation of economic or industry trends. Using an ex ante CAPM market line, risk can be subdivided as follows:

{Rx(βa) − Rf} = {Rx(βa) − E[Rx(βa)]} − {Rx(βT) − E[Rx(βT)]}
                + {E[Rx(βa)] − E[Rx(βT)]} + {Rx(βT) − Rf}.  (87.40)

The first term is total timing, the second is market conditions (their difference is the manager's timing), the third is the manager's expected risk, and the fourth is the investor's risk. The manager's risk of equation (87.39) is the sum of the first three terms. The manager's expected risk is the incremental expected return from the manager's decision to take on a nontarget level of risk. Market conditions measures how much the market deviated from expectations at the target level of risk. Total timing is the difference between the ex post return on the naively selected portfolio with risk βa and its ex ante expected return. When the return on the market is greater than the expected return on the market, total timing is positive (and more positive the larger the value of βa). When the return on the market is less than the expected return, total timing is negative (and more negative the larger the value of βa).
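The finer splits in equations (87.38)–(87.40) amount to a few subtractions along the realized and expected market lines. A numerical sketch with invented inputs (function and variable names are ours, not the chapter's):

```python
def decompose(r_a, r_f, beta_a, sigma_a, beta_t, rx, e_rx):
    """Sketch of Fama's finer decompositions, eqs. (87.38)-(87.40).
    rx(.) maps a risk level to the realized naive-benchmark return of
    (87.35); e_rx(.) maps it to the ex ante expected return."""
    return {
        # Selectivity split of (87.38): diversification + net selectivity.
        "diversification": rx(sigma_a) - rx(beta_a),
        "net_selectivity": r_a - rx(sigma_a),
        # Risk split of (87.40): manager's timing = total timing - market conditions.
        "managers_timing": (rx(beta_a) - e_rx(beta_a)) - (rx(beta_t) - e_rx(beta_t)),
        "managers_expected_risk": e_rx(beta_a) - e_rx(beta_t),
        "investors_risk": rx(beta_t) - r_f,
    }

r_f = 0.03
rx = lambda b: r_f + (0.10 - r_f) / 0.20 * b     # realized market line
e_rx = lambda b: r_f + (0.08 - r_f) / 0.20 * b   # ex ante expected market line
d = decompose(r_a=0.12, r_f=r_f, beta_a=0.25, sigma_a=0.30, beta_t=0.15,
              rx=rx, e_rx=e_rx)

# (87.38): diversification + net selectivity = selectivity = Ra - Rx(beta_a).
assert abs(d["diversification"] + d["net_selectivity"] - (0.12 - rx(0.25))) < 1e-12
# (87.40): the three risk terms sum to the total risk term Rx(beta_a) - Rf.
assert abs(d["managers_timing"] + d["managers_expected_risk"]
           + d["investors_risk"] - (rx(0.25) - r_f)) < 1e-12
```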
Manager’s timing is the difference between total timing and market conditions. The manager’s timing is only positive (negative) when the chosen level of market risk is above (below) the target level and return on the market is above (below) the expected return on the market. It is, therefore, a more sensitive indicator of the manager’s timing ability than total timing. At times, a target level of risk may not be relevant; if this is the case, the market portfolio may be treated as the target portfolio. That is, Manager’s timing
Risk
{Rx (βa ) − Rf } = {Rx − E[Rx (βa )]} − {Rm − E(Rm )} Total timing
Expected deviation From the market
Market conditions Market risk
+ {E[Rx (βa )] − E(Rm )} + {Rm − Rf } . (87.41) Fama was, therefore, one of the first to suggest that the return on a portfolio could be subdivided into two parts: the return from security selection and the return from the bearing of risk in attempting to predict general market-price movements. Therefore, a manger’s performance can be attributed either to skill in selecting an underpriced security or in markettiming ability. However, Fama noted his concerns about current benchmark portfolios. They included the fact that βp , σ(Rp ), and σ(Rm ) are stable for long periods of time (e.g., 10 years); but in order to capture market timing, the relevant period for evaluation must be considerably shorter. Although the observed return–risk relationships seem to be linear, the trade-off of risk for return is in general less than predicted by the standard CAPM. In short, evidence suggests that standard CAPM framework may not provide the best benchmarks for the average return–risk tradeoffs available in the market from naively selected portfolios. One of the principal applications of modern capital-market theory has been to propose a structural specification within which to measure investment performance and thereby to identify superior performers if they exist. In this structure, it is usually assumed that forecasting skills can be partitioned into two distinct components: (1) forecasts of price movements of selected individual stocks (microforecasting), and (2) forecasts of price movements of the general stock market as a whole (macroforecasting). Usually microforecasting involves the identification of individual stocks that are undervalued or overvalued relative to an index for equities. Using the CAPM as a framework, a microforecaster attempts to identify individual stocks whose expected returns lie significantly above or below the SML. The microforecaster, in essence, forecasts the nonsystematic or nonmarket-explained component of the return on individual stocks. 
Using the CAPM framework,
the random-variable return per dollar Zj(t) on security j at time t can be written algebraically as

Zj(t) = R(t) + βj[Zm(t) − R(t)] + εj(t).
(87.42)
In equation (87.42), Zm(t) is the return on the market, R(t) is the return on the riskless asset, and εj(t) is an error term whose expectation, conditional on knowing the outcome of Zm(t), is equal to its unconditional expectation (εj(t) follows a martingale process). Given such a model, a microforecaster would be interested in forecasting based on the properties of εj(t). A macroforecaster, on the other hand, attempts to identify when equities in general are undervalued or overvalued relative to other types of securities, such as fixed-income securities. Macroforecasters try to forecast when stocks will outperform bonds (using bonds as a proxy for other types of securities), that is, Zm(t) > R(t), and when bonds will outperform stocks, that is, Zm(t) < R(t); a macroforecaster thus tries to forecast Zm(t) − R(t). As a result, macroforecasts can only be used to predict differential performance among individual stocks arising from the systematic or market-explained components of their returns, {βj[Zm(t) − R(t)] + R(t)}.

Jensen (1972) developed a theoretical structure for the evaluation of the micro- and macroforecasting performance of investment managers, where the basis for the evaluation is a comparison of the ex post performance of the manager's fund with the returns on the market. In the Jensen analysis, the market timer is assumed to forecast the actual return on the market portfolio, and the forecasted return and the actual return on the market are assumed to have a joint normal distribution. Under these assumptions, a market timer's forecasting ability can be measured by the correlation between the market timer's forecast and the realized return on the market.
Jensen points out that the separate contributions of micro- and macroforecasting cannot be identified using the structure of the CAPM framework unless, for each period, the market-timing forecast, the portfolio adjustment corresponding to that forecast, and the expected return on the market are known. Grant (1977) showed that market-timing actions will affect the results of empirical tests that focus only on microforecasting skills. That is, consider the following CAPM framework:

Zj(t) − R(t) = αj + βj[Zm(t) − R(t)] + εj(t),
(87.43)
where Zj(t) is the return on security j, R(t) is the return on the risk-free asset, and αj is the expected excess return from microforecasting.
Market-timing ability will cause the regression estimate of αj to be downward-biased as a measure of microforecasting ability.

Treynor and Mazuy (1966) added a quadratic term to the previous CAPM framework to test for market-timing ability. They argued that the performance measure should not be a linear function: an investment manager who can forecast market returns will hold a greater proportion of the market portfolio when the return on the market is high and a lower proportion when it is low. Therefore, the portfolio return will be a nonlinear function of the market return.

Kon and Jen (1979) used the Quandt (1972) switching-regression technique in a CAPM framework to examine the possibility of changing levels of market-related risk over time for mutual-fund portfolios. Using a maximum-likelihood test, they found evidence that many mutual funds do make discrete changes in the level of market-related risk they choose.

Merton (1981) developed a model that is not based on the CAPM framework; from it, he was able to analyze market timing through the theoretical structure of the pattern of future returns based upon a posterior distribution of returns. Merton showed that, up to an additive noise term, the pattern of returns from an investment strategy based upon market timing will be the same as the pattern of returns from a partial protective put-option investment strategy. If this noise (which is caused by forecast error) is diversifiable, then, independent of investors' preferences, endowments, or probability beliefs, the equilibrium management fee is proportional to the price of a put option on the market portfolio. These results are obtained with no specific assumptions about the distribution of returns on the market or the way in which option prices are determined.

Henriksson and Merton's (1981) forecast model, which assumes that a manager's forecasts are observable, is as follows.
Let γ(t) be the market timer’s forecast variable where γ(t) = 1 if the forecast, made at time t – 1 for the time period t, is that Zm (t) > R(t) and γ(t) = 0 if the forecast is that Zm (t) ≤ R(t). The probabilities for γ(t) conditional upon the realized return on the market are defined by (i) P1 (t) = Pr ob[γ(t) = 0 | Zm (t) ≤ R(t)], (ii) 1 − P1 (t) = Pr ob[γ(t) = 1 | Zm (t) ≥ R(t)], (iii) P2 (t) = Pr ob[γ(t) = 1 | Zm (t) > R(t)], (iv) 1 − P2 (t) = Pr ob[γ(t) = 0 | Zm (t) < R(t)]. Therefore, P1 (t) is the conditional probability of a correct forecast given that Zm (t) ≤ R(t), and P2 (t) is the conditional probability of a correct forecast given that Zm (t) > R(t). Assuming that P1 (t) and P2 (t) do not depend upon the magnitude of | Zm (t) ≤ R(t) |, the sum of the conditional probabilities of a correct forecast, P1 (t) + P2 (t) is a sufficient statistic for the evaluation of forecasting ability.
However, if a manger’s forecasts are not observable, Henriksson and Merton consider a parametric test for the joint hypothesis of no market-timing ability and an assumed generating process for the returns or securities. They assume a pattern of equilibrium security returns that is consistent with the SML of the CAPM. They further assume that as a function of a manger’s forecast there are discretely different systematic-risk levels that depend on whether or not the return on the market portfolio is forecasted to exceed the return on riskless securities for those portfolios chosen by the portfolio manager. That is, the manager is assumed to have one target beta when predicting that Zm (t) > R(t) and another target beta when predicting that Zm (t) ≤ R(t): their parametric tests are open to the same criticisms that any performance measure is when based upon a CAPM structure. Using the parametric and nonparametric techniques presented in Henriksson and Merton (1981), Henriksson (1984) evaluated the markettiming performance of 116 open-end mutual funds using monthly data from February 1968 to June 1980. Using a weighted least-squares regression analysis with a correction for heteroscedasticity, the separate contribution from forecasting and market timing were obtained. Results show little evidence of market-timing ability. In fact, 62% of the funds had negative estimates of market-timing. Further examination of the estimates for the individual funds shows the existence of a strong negative correlation between microforecasting (selectivity) and macroforecasting (market timing). This negative correlation seems to imply that funds that earn superior returns from stock selection also seem to have negative market-timing ability and performance. These results tend to be somewhat disturbing, and the possibility of misspecification of the return-generating process must be considered. One potential source of error is the misspecification of the market portfolio. 
This results from the fact that the proxy used for the market portfolio does not include all risky assets. Another potential source of error is the omission of relevant factors, in addition to the return on the market portfolio, from the return-generating process. If an omitted factor can be identified, the return-generating process can be modified to take it into account.

Chang and Lewellen (1984) compared the performance estimates derived from Henriksson and Merton's (1981) model and the single-factor market model. To test Henriksson and Merton's model, they first divided the data into two subsets based on the sign of X(t) = Zm(t) − R(t), the market risk premium. Second, they estimated least-squares lines in each of the two market conditions for every mutual fund, subject to the requirement that the two lines share a common intercept for each fund. And finally they
tested whether the coefficient estimates for the two lines, β1∗ and β2∗, differ significantly. Chang and Lewellen's results indicate that the fit of the single-factor market model is little different from that of Henriksson and Merton's model for the mutual-fund return data examined. Of the 67 mutual funds studied, only four indicate any statistical evidence of market timing; approximately this number might be expected to emerge by chance alone. A similar conclusion applies to the evaluation of the fund managers' security-selection abilities: only five funds indicate any statistical evidence of selectivity ability, and again chance alone could produce virtually the same findings. Their findings show that not much, if any, systematic market-timing activity was undertaken by portfolio managers in the 1970s; to the extent that it was undertaken, it was often in the wrong direction. This may explain why the fit of the single-factor market model is not very different from that of the Henriksson and Merton model.

Lee and Rahman (1990, 1991) updated and improved Chang and Lewellen's results by using better econometric techniques. In addition, they estimated the risk-aversion parameter empirically. They found that mutual-fund managers do have some timing and selectivity abilities. Chang and Hung (2003) used Campbell's intertemporal version of the CAPM to test the hedging performance of mutual funds, and Chen et al. (1992) performed a cross-sectional analysis of mutual funds' market-timing and security-selection skills. Nevertheless, at the individual fund level, Henriksson and Merton's approach has the clear potential to provide a much richer insight into the nature and sources of managed-portfolio performance differentials.
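The two-regime, common-intercept regression that Chang and Lewellen estimated can be sketched with a plain OLS solve. This is an illustrative implementation with invented data, not their code: the two betas are obtained in one regression by using min(X, 0) and max(X, 0) as regressors, so the intercept is shared by construction.

```python
def dual_beta(excess_p, excess_m):
    """Common-intercept, two-regime regression in the spirit of Chang and
    Lewellen (1984):  Rp - Rf = alpha + b1*min(X, 0) + b2*max(X, 0) + e,
    with X = Zm - R.  b2 > b1 is consistent with successful market timing.
    (Plain normal-equations OLS; illustrative only, no significance tests.)"""
    X = [[1.0, min(x, 0.0), max(x, 0.0)] for x in excess_m]
    # Normal equations A w = b with A = X'X and b = X'y.
    A = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(X, excess_p)) for i in range(3)]
    # Gaussian elimination with partial pivoting, then back substitution.
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, 3):
            f = A[r][c] / A[c][c]
            A[r] = [a - f * ac for a, ac in zip(A[r], A[c])]
            b[r] -= f * b[c]
    w = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        w[r] = (b[r] - sum(A[r][j] * w[j] for j in range(r + 1, 3))) / A[r][r]
    return tuple(w)  # (alpha, beta_down, beta_up)

# Noise-free invented fund: beta 0.6 in down markets, 1.2 in up markets.
xm = [-0.05, -0.02, -0.01, 0.01, 0.02, 0.04]
xp = [0.002 + (0.6 if x <= 0 else 1.2) * x for x in xm]
alpha, b_down, b_up = dual_beta(xp, xm)
assert abs(alpha - 0.002) < 1e-9
assert abs(b_down - 0.6) < 1e-9 and abs(b_up - 1.2) < 1e-9
assert b_up > b_down  # the timing pattern the test looks for
```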
For example, of the seven funds that show significant excess-return intercepts under the single-factor market model, only one coincides with the cases for which the Henriksson–Merton regression specification gives rise to a significant intercept. Of the other six funds, four have negative and two have positive estimated intercepts according to the single-factor market model; the Henriksson–Merton model suggests that these differentials can be imputed to market-timing behavior rather than to security-selection activities. In addition, the Henriksson–Merton estimates indicate that for several additional funds a combination of significant timing and selectivity phenomena is present, but in opposite directions, resulting in statistically insignificant intercepts. In short, Chang and Lewellen's results show that although Henriksson and Merton's model is an enhancement of the CAPM that provides a more complete appraisal of the constituents of performance and can eliminate
certain biases in the estimates it provides, neither skillful market timing nor clever security selection is evident in observed mutual-fund return data; nor does their model address the general critique of the CAPM as a benchmark.

Kon and Jen (1979) point out that Jensen's assumption of stationarity of risk through time may be in direct conflict with a managed portfolio. If the level of systematic risk in a managed portfolio is adjusted substantially in either direction, a violation of the specification of the OLS model occurs. The effect is the loss of the known distributional properties of the OLS parameter estimates, which become conditional on the risk-level decisions. One possible problem is heteroscedastic disturbances, which increase the sampling variances of the OLS estimates and reduce their t-values. Kon and Jen's model assumes a sequence of discrete risk-level decisions, so that each observation of excess return (total return on a portfolio minus the risk-free rate) over the measurement interval of n observations was generated by one of N distinct regression equations. In Jensen's model the estimating equation is

R∗jt = αj + βj R∗mt + εjt,

where R∗jt = Rjt − Rft; R∗mt = Rmt − Rft; εjt is normally distributed with a mean of zero and a constant variance; αj is the performance measure; and βj is assumed stationary.
The stationarity assumption is valid only if the fund manager never engages in market timing and if the expected excess return on the market, the variance of the market given information at time t − 1, and the percentage change of the variance of the excess return of the portfolio remain constant. However, if the observations can be indexed according to risk, the Jensen performance measure conditional on the risk level chosen by the fund manager in period t can be applied. Kon and Jen's model measures the performance of the portfolio over the measurement interval relative to a naively selected portfolio with risk level βi; overall selectivity performance is the summation of the weighted αs of each subset. Their model is thus a Jensen model over N distinct risk levels. The actual number N of distinct risk levels chosen during the measurement interval is an empirical issue; the number of regression regimes must be determined for each mutual fund by statistical inference.
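The "weighted αs" idea can be sketched as follows. This is a simplified illustration with invented data and our own function names: regime membership is supplied exogenously here, whereas Kon and Jen estimate it as a latent variable by maximum likelihood in the switching regression.

```python
def jensen_ab(x, y):
    """Simple-regression OLS for the Jensen equation R*_j = a + b R*_m + e."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b  # (alpha, beta)

def regime_selectivity(excess_m, excess_p, regimes):
    """Kon-Jen-style overall selectivity: fit the Jensen equation within each
    risk regime and weight each alpha by that regime's share of observations."""
    total = 0.0
    for k in set(regimes):
        xs = [x for x, r in zip(excess_m, regimes) if r == k]
        ys = [y for y, r in zip(excess_p, regimes) if r == k]
        alpha, _ = jensen_ab(xs, ys)
        total += alpha * len(xs) / len(excess_m)
    return total

# Two regimes with betas 0.5 and 1.5 but the same alpha of 0.004 (noise-free).
xm = [-0.03, -0.01, 0.02, 0.04, -0.02, 0.01, 0.03, -0.04]
reg = [1, 1, 1, 1, 2, 2, 2, 2]
xp = [0.004 + (0.5 if r == 1 else 1.5) * x for x, r in zip(xm, reg)]
assert abs(regime_selectivity(xm, xp, reg) - 0.004) < 1e-12
```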
The change in βt may be merely a change in the target level rather than an active timing decision. Therefore, even if the target risk-level evidence indicates N > 1, the timing performance measure may not be meaningful without additional procedures to estimate Rft − Et(Rft) and Rmt − Et(Rmt). Nevertheless, the empirical methodology used implies that selectivity performance can still be estimated if changes in risk level are the result of a changing investment-opportunity set.

Kon and Jen's methodology utilizes the N-regime switching-regression model proposed by Quandt (1972), with a new identifiability condition. To ensure that the parameters are identified, the condition βN > βN−1 > · · · > β1 is imposed. The strict ordering of risk levels follows from the ordering of fund managers' forecasts of the unanticipated returns on the market portfolio, Et(Rmt) − Et(Rmt | φt−1). It is this additional prior information that identifies the model. Because Kon and Jen's maximum-likelihood estimation procedure assumes an unknown probability that the fund manager will choose regime i for generating observations, the procedure is applicable only to analyzing selectivity performance given the timing decision. In addition, it faces the problem of nonstationarity of market-level parameters. A third problem is finding a proxy for the fund's target risk level.

The Kon–Jen data consist of mutual funds with complete monthly return data from January 1960 to December 1971. The market proxy is the equal-weighted market index from CRSP, with the 30-day Treasury-bill rate as a proxy for the risk-free rate. Their simulation results provide confidence in their methodology. Tests of the model specification on a sample of 49 mutual funds indicate that for many individual funds it is more likely that the data were generated by a mixture of two or three regression equations than by a standard linear model.
The null hypothesis of risk-level stationarity was rejected for many individual funds, with the specification for each fund determined by the likelihood-ratio test. This could explain Jensen's (1968) finding of so few significant t-values in his evidence on selectivity performance: by neglecting this phenomenon and utilizing OLS, the resulting heteroscedastic disturbances increase the sampling variances and reduce the t-statistics. In addition, Jensen's (1968) frequency distribution of α̂ was negatively skewed, whereas Kon and Jen found their frequency distribution of α̂s to be approximately symmetric about zero. Moreover, if management expenses were added back to the mutual funds' rates of return, as in the Jensen study, there
would certainly be many more significantly positive performance measures. This evidence is clearly inconsistent with the EMH. It can be argued that one could expect managers to be successful in forecasting from time to time, whether by uncovering special information or by keener insight into the implications of publicly available information; in an efficient market, however, they cannot do this consistently over time. There is very little evidence that any individual fund was able to generate significantly superior performance consistently. In addition, the evidence for the EMH is based on the bias in favor of low-risk securities using the SML benchmark. Therefore, Kon and Jen's evidence is not inconsistent with the hypothesis that mutual-fund managers, individually and on average, are unable to forecast the future prices of individual securities consistently enough to recover their research expenses, management fees, and commission expenses.

Much of the empirical evidence (Chang and Lewellen, 1984; Henriksson, 1984; Kon, 1983) indicates that timing ability is rare. In addition, when timing ability is present, it is often negative, and those funds that do exhibit significant timing performance show negative performance more often than positive performance. Henriksson (1984) found a negative correlation between the measures of security selection and market timing. A number of potential explanations for these results have been suggested, including errors-in-variables bias, misspecification of the market portfolio, and use of a single-factor rather than a multifactor asset-pricing model. Jagannathan and Korajczyk (1986) suggest another explanation, which relies on the nonlinear payoff structure of options and option-like securities as well as on the specification of the proxy for the market portfolio.
They show that the portfolio strategy of buying call options exhibits positive timing performance and negative security selection even though no market forecasting or security-specific forecasting is done: the strategy behaves like successful market timing, but its return is reduced by the premium paid for the options, which shows up as negative security-selection evidence. Moreover, the market proxy is itself a portfolio of stocks that are, to a greater or lesser extent, option-like. The sign of the market-timing performance of a given mutual fund may therefore depend on whether the average stock held by the fund has more or less of an option effect than the average stock in the index. The average negative timing performance found in Kon (1983), Chang and Lewellen (1984), and Henriksson (1984) may be due to the fact that the mutual funds in the samples tend to invest in firms that are larger, better established, and less leveraged than the average firm on the NYSE.
Jagannathan and Korajczyk (1986), using the option-pricing model, found that when the proxy for the market portfolio contains option-like securities, portfolios with greater (lower) concentration in option-like securities will show positive (negative) timing performance and negative (positive) selectivity. This provides a possible explanation of previous empirical findings indicating that mutual funds have negative timing ability on average and that selectivity and timing performance are negatively correlated. If mutual funds tend to invest in higher-quality securities, then the average timing performance would be expected to be negative; negative correlation between selectivity and timing performance would likewise be expected if investments were in securities that are less option-like. However, Lehmann and Modest (1987) and Lee and Rahman (1990) have found that mutual-fund managers have positive timing ability. In addition, Chang et al. (2003) extended Lee and Rahman's (1990) model to allow for selectivity, market-timing ability, and hedging-timing ability; they also performed empirical tests for 65 US mutual funds over the period from January 1980 to September 1996.2 A possible explanation for the lack of evidence of timing ability in the aggregate is the possible use of an immunization strategy by the funds. Although timing may be an important aspect within the fund, where assets are bought and sold to maintain a fund's duration, the fund in the aggregate may not display any timing influences. If mutual funds do indeed follow an immunization strategy, this could help explain the empirical results revealing timing activity within funds but not at the aggregate level.
87.7 Summary

This chapter has employed the concepts and theory of technical and fundamental analysis to show how security analysts and portfolio managers might utilize theory, methodology, and data to outperform the market. Both Value Line ranking performance and mutual-fund managers' performance are used to support this conclusion. Overall, this chapter has discussed how to perform both fundamental and technical analysis. In addition, it has discussed how to use three alternative investment-performance measures to evaluate mutual-fund performance.

2
2. Chang et al. (2003) used Campbell's (1993) intertemporal CAPM to generalize Lee and Rahman's (1990) model to allow for testing the existence of selectivity, timing, and hedging performance for a mutual fund.
July 6, 2020
15:53
Handbook of Financial Econometrics,. . . (Vol. 3)
3052
9.61in x 6.69in
b3568-v3-ch87
C. F. Lee
Bibliography

S. Alexander (1961). Price Movements in Speculative Markets: Trends or Random Walks. Industrial Management Review, 2, 7–26.
R. W. Banz (1981). The Relationship between Return and Market Value of Common Stocks. Journal of Financial Economics, 9, 3–18.
S. Basu (1977). Investment Performance of Common Stocks in Relation to Their Price-Earnings Ratios: A Test of the Efficient Markets Hypothesis. Journal of Finance, 32, 663–682.
D. A. Bessler and J. A. Brandt (1979). Composite Forecasting of Livestock Prices: An Analysis of Combining Alternative Forecasting Methods. Department of Agricultural Economics, Agricultural Experiment Station, Station Bulletin No. 265, Purdue University, West Lafayette, Indiana.
F. Black (1972). Active and Passive Monetary Policy in a Neoclassical Model. Journal of Finance, 27, 801–814.
F. Black, M. C. Jensen, and M. Scholes (1972). The Capital Asset Pricing Model: Some Empirical Tests. In M. C. Jensen (ed.), Studies in the Theory of Capital Markets. New York: Praeger, 79–121.
W. Blau (1995). Momentum, Direction, and Divergence. New York: John Wiley & Sons.
M. E. Blume and I. Friend (1973). A New Look at the Capital Asset Pricing Model. Journal of Finance, 28, 19–34.
Z. Bodie, A. Kane, and A. Marcus (2013). Investments, 10th ed. New York: McGraw-Hill/Irwin.
D. H. Bower, R. S. Bower, and D. F. Logue (1984). Arbitrage Pricing Theory and Utility Stock Returns. Journal of Finance, 39, 1041–1054.
G. E. P. Box, G. M. Jenkins, and G. C. Reinsel (2008). Time Series Analysis: Forecasting and Control, 4th ed. New York: Wiley.
B. Branch (1977). A Tax Loss Trading Rule. Journal of Business, 50, 198–207.
D. T. Breeden (1979). An Intertemporal Asset Pricing Model with Stochastic Consumption and Investment Opportunities. Journal of Financial Economics, 7, 265–296.
J. Brown (2015). FOREX TRADING: The Basics Explained in Simple Terms. CreateSpace Independent Publishing Platform.
J. Y. Campbell (1993). Intertemporal Asset Pricing Without Consumption Data. American Economic Review, 83, 487–512.
E. C. Chang and W. G. Lewellen (1984). Market Timing and Mutual Fund Investment Performance. Journal of Business, 57, 57–72.
J. R. Chang, M. W. Hung, and C. F. Lee (2003). An Intertemporal CAPM Approach to Evaluate Mutual Fund Performance. Review of Quantitative Finance and Accounting, 20, 415–433.
C. R. Chen, C. F. Lee, S. Rahman, and A. Chen (1992). A Cross-sectional Analysis of Mutual Funds' Market Timing and Security Selection Skill. Journal of Business Finance and Accounting, 19, 659–675.
N.-F. Chen, R. Roll, and S. A. Ross (1986). Economic Forces and the Stock Market: Testing the APT and Alternative Asset Pricing Theories. Journal of Business, 59, 383–404.
N. Chen, T. E. Copeland, and D. Mayers (1987). A Comparison of Single and Multifactor Portfolio Performance Methodologies. Journal of Financial and Quantitative Analysis, 22, 401–417.
S. Chen and C. F. Lee (1982). Bayesian and Mixed Estimators of Time Varying Betas. Journal of Economics and Business, 34, 291–301.
page 3052
H. Y. Chen, C. F. Lee, and W. K. Shih (2016). Technical, Fundamental, and Combined Information for Separating Winners from Losers. Pacific-Basin Finance Journal, 39, 224–242.
S. N. Chen and C. F. Lee (1981). The Sampling Relationship between Sharpe's Performance Measure and Its Risk Proxy: Sample Size, Investment Horizon, and Market Condition. Management Science, 27(6), 607–618.
S. N. Chen and C. F. Lee (1986). The Effects of the Sample Size, the Investment Horizon and Market Conditions on the Validity of Composite Performance Measures: A Generalization. Management Science, 32(11), 1371–1520.
C. Cho, E. J. Elton, and M. J. Gruber (1984). On the Robustness of the Roll and Ross Arbitrage Pricing Theory. Journal of Financial and Quantitative Analysis, 19, 1–10.
T. E. Copeland and D. Mayers (1982). The Value Line Enigma (1965–1978): A Case Study of Performance Evaluation Issues. Journal of Financial Economics, 10, 289–322.
B. Cornell (1979). Asymmetric Information and Portfolio Performance Measurement. Journal of Financial Economics, 7, 381–390.
B. Cornell (1983). The Money Supply Announcements Puzzle: Review and Interpretation. American Economic Review, 73, 644–657.
P. J. Dhrymes, I. Friend, and N. B. Gultekin (1984). A Critical Re-examination of the Empirical Evidence on the Arbitrage Pricing Theory. Journal of Finance, 39, 323–346.
M. Douglas (2001). Trading in the Zone: Master the Market with Confidence, Discipline, and a Winning Attitude. Prentice Hall Press.
P. H. Dybvig (1985a). The Analytics of Performance Measurement Using a Security Market Line. Journal of Finance, 40, 401–416.
P. H. Dybvig (1985b). Yes, the APT is Testable. Journal of Finance, 40, 1173–1188.
P. H. Dybvig and S. A. Ross (1985). Differential Information and Performance Measurement Using a Security Market Line. Journal of Finance, 40, 383–399.
F. J. Fabozzi, C. F. Lee, and S. Rahman (1989). Errors-in-Variables, Functional Form and Mutual Fund Returns. Chicago: Mimeo.
F. J. Fabozzi, J. C. Francis, and C. F. Lee (1980). Generalized Functional Form for Mutual Fund Returns. Journal of Financial and Quantitative Analysis, 15, 1107–1120.
E. F. Fama (1972). Components of Investment Performance. Journal of Finance, 27, 551–567.
A. Gehr, Jr. (1976). Some Tests of the Arbitrage Pricing Theory. Journal of the Midwest Finance Association, 7, 91–105.
J. A. Gentry and C. F. Lee (1987). Financial Forecasting and the X-11 Model: Preliminary Evidence. In C. F. Lee (ed.), Advances in Planning and Forecasting. Greenwich, CT: JAI Press.
D. Grant (1977). Portfolio Performance and the 'Cost' of Timing Decisions. Journal of Finance, 32, 837–846.
J. B. Guerard (2010). Handbook of Portfolio Construction: Contemporary Applications of Markowitz Techniques. New York: Springer.
R. D. Henriksson (1984). Market Timing and Mutual Fund Performance: An Empirical Investigation. Journal of Business, 57, 73–96.
R. D. Henriksson and R. C. Merton (1981). On Market Timing and Investment Performance. II. Statistical Procedures for Evaluating Forecasting Skills. Journal of Business, 54, 513–533.
C. Holloway (1981). A Note on Testing an Aggressive Investment Strategy Using Value Line Ranks. Journal of Finance, 36, 711–719.
S. H. Irwin and J. W. Uhrig (1984). Do Technical Analysts Have Holes in Their Shoes? Review of Research in Futures Markets, 4, 264–277.
R. Jagannathan and R. A. Korajczyk (1986). Assessing the Market Timing Performance of Managed Portfolios. Journal of Business, 59, 217–235.
N. Jegadeesh and S. Titman (1993). Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency. Journal of Finance, 48, 65–91.
N. Jegadeesh and S. Titman (2001). Profitability of Momentum Strategies: An Evaluation of Alternative Explanations. Journal of Finance, 56, 699–720.
M. C. Jensen (1968). The Performance of Mutual Funds in the Period 1945–1964. Journal of Finance, 23, 389–416.
M. C. Jensen (1969). Risk, the Pricing of Capital Assets and the Evaluation of Investment Portfolios. Journal of Business, 42, 167–247.
M. C. Jensen (1972). Optimal Utilization of Market Forecasts and the Evaluation of Investment Performance. In G. P. Szego and K. Shell (eds.), Mathematical Methods in Investment and Finance. Amsterdam: North-Holland.
D. Y. Ji, H. Y. Chen, and C. F. Lee (2015). Forecast Performance of the Taiwan Weighted Stock Index. Review of Pacific Basin Financial Markets and Policies, 18(3), 155017-1–16.
J. D. Jobson and B. M. Korkie (1981). Performance Hypothesis Testing with the Sharpe and Treynor Measures. Journal of Finance, 36(4), 889–908.
D. B. Keim (1983). Size-Related Anomalies and Stock Return Seasonality: Further Empirical Evidence. Journal of Financial Economics, 11, 13–32.
C. D. Kirkpatrick II and J. R. Dahlquist (2015). Technical Analysis: The Complete Resource for Financial Market Technicians, 3rd ed. FT Press.
S. J. Kon (1983). The Market-Timing Performance of Mutual Fund Managers. Journal of Business, 56, 323–347.
S. J. Kon and F. C. Jen (1979). The Investment Performance of Mutual Funds: An Empirical Investigation of Timing, Selectivity, and Market Efficiency. Journal of Business, 52, 363–389.
C. F. Lee and E. Bubnys (1986). The Stability of Return, Risk and the Cost of Capital for the Electric Utility Industry. In R. E. Burns (ed.), Proceedings of the Fifth NARUC Biennial Regulatory Information Conference.
C. F. Lee and H. Park (1987). Value Line Investment Survey Rank Changes and Beta Coefficients. Financial Analysts Journal, 43, 70–72.
C. F. Lee and S. Rahman (1990). Market Timing, Selectivity and Mutual Fund Performance: An Empirical Investigation. Journal of Business, 63, 261–278.
C. F. Lee and S. Rahman (1991). New Evidence on Timing and Security Selection Skill of Mutual Fund Managers. Journal of Portfolio Management, 17, 80–83.
C. F. Lee and J. K. C. Wei (2015). Multi-factor Multi-indicator Approach to Asset Pricing Model: Theory and Empirical Evidence. In C. F. Lee, A. C. Lee, and J. C. Lee (eds.), Handbook of Financial Econometrics and Statistics. New York: Springer.
C. F. Lee, D. C. Porter, and D. G. Weaver (1998). Indirect Test of the Haugen–Lakonishok Small Firm/January Effect Hypothesis: Window Dressing versus Performance Hedging. Financial Review, 33, 177–194.
C. F. Lee, C. M. Tsai, and A. C. Lee (2013). Asset Pricing with Disequilibrium Price Adjustment: Theory and Empirical Evidence. Quantitative Finance, 13(2), 227–240.
B. N. Lehmann and D. M. Modest (1987). Mutual Fund Performance Evaluation: A Comparison of Benchmarks and Benchmark Comparisons. Journal of Finance, 42, 233–265.
J. Lintner (1965). The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets. Review of Economics and Statistics, 47, 13–37.
R. H. Litzenberger and K. Ramaswamy (1979). The Effect of Personal Taxes and Dividends on Capital Asset Prices. Journal of Financial Economics, 7, 163–195.
A. W. Lo (2002). The Statistics of Sharpe Ratios. Financial Analysts Journal, 58(4), 36–47.
A. W. Lo and J. Wang (2000). Trading Volume: Definition, Data Analysis, and Implications of Portfolio Theory. Review of Financial Studies, 13, 257–300.
D. Mayers and E. M. Rice (1979). Measuring Portfolio Performance and the Empirical Content of Asset Pricing Models. Journal of Financial Economics, 7, 3–28.
R. C. Merton (1981). On Market Timing and Investment Performance. I. An Equilibrium Theory of Value for Market Forecasts. Journal of Business, 54, 363–406.
J. Mossin (1966). Equilibrium in a Capital Asset Market. Econometrica, 34, 768–783.
S. M. Nazem (1988). Applied Time Series Analysis for Business and Economic Forecasting. New York: Marcel Dekker.
C. R. Nelson (1973). Applied Time Series Analysis for Managerial Forecasting. New York: Holden-Day.
S. Nison (2001). Japanese Candlestick Charting Techniques, 2nd ed. Prentice Hall Press.
R. E. Quandt (1972). A New Approach to Estimating Switching Regressions. Journal of the American Statistical Association, 67, 306–310.
F. K. Reilly, F. T. Griggs, and W. Wong (1983). Determinants of the Aggregate Stock Market Earnings Multiple. Journal of Portfolio Management, 10, 36–45.
M. R. Reinganum (1981). Misspecification of Capital Asset Pricing: Empirical Anomalies Based on Earnings Yields and Market Values. Journal of Financial Economics, 9, 19–46.
R. Roll (1977). A Critique of the Asset Pricing Theory's Tests; Part I: On Past and Potential Testability of the Theory. Journal of Financial Economics, 4, 129–176.
R. Roll (1978). Ambiguity When Performance is Measured by the Securities Market Line. Journal of Finance, 33, 1051–1069.
R. Roll (1979). A Reply to Mayers and Rice (1979). Journal of Financial Economics, 7, 391–400.
R. Roll and S. A. Ross (1980). An Empirical Investigation of the Arbitrage Pricing Theory. Journal of Finance, 35, 1073–1103.
B. Rosenberg and W. McKibben (1973). The Prediction of Systematic and Specific Risk in Common Stocks. Journal of Financial and Quantitative Analysis, 8, 317–333.
S. A. Ross (1976). The Arbitrage Theory of Capital Asset Pricing. Journal of Economic Theory, 13, 341–360.
J. Shanken (1982). The Arbitrage Pricing Theory: Is It Testable? Journal of Finance, 37, 1129–1140.
W. F. Sharpe (1964). Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk. Journal of Finance, 19, 425–442.
W. F. Sharpe (1966). Mutual Fund Performance. Journal of Business, 39, 119–138.
R. J. Shiller (1984). Theories of Aggregate Stock Price Movements. Journal of Portfolio Management, 10, 28–37.
C. D. Smith (2015). How to Make a Living Trading Foreign Exchange: A Guaranteed Income for Life. John Wiley & Sons.
S. E. Stickel (1985). The Effect of Value Line Investment Survey Rank Changes on Common Stock Prices. Journal of Financial Economics, 14, 121–143.
R. J. Sweeney (1988). Some New Filter Rule Tests: Methods and Results. Journal of Financial and Quantitative Analysis, 23, 285–300.
S. J. Taylor (1982). Tests of the Random Walk Hypothesis against a Price-Trend Hypothesis. Journal of Financial and Quantitative Analysis, 17, 37–61.
J. L. Treynor (1965). How to Rate Management of Investment Funds. Harvard Business Review, 43, 63–75.
J. L. Treynor and K. Mazuy (1966). Can Mutual Funds Outguess the Market? Harvard Business Review, 44, 131–136.
J. L. Treynor and R. Ferguson (1985). In Defense of Technical Analysis. Journal of Finance, 40, 757–775.
T. Ulrich and P. Wachtel (1981). Market Response to the Weekly Money Supply Announcements in the 1970s. Journal of Finance, 36, 1063–1071.
D. Wu (2007). Theory and Evidence: An Adverse-Selection Explanation of Momentum. Working paper, Massachusetts Institute of Technology.
Appendix 87A: Composite Forecasting Method

Most forecasts contain some information that is independent of that contained in other forecasts; thus, a combination of the forecasts will quite often outperform any of the individual forecasts. Nelson (1973) and Box et al. (2008) have shown that a composite forecast of unbiased forecasts is unbiased. For n individual unbiased forecasts X_i (i = 1, 2, \ldots, n) with n weights a_i, each greater than or equal to zero and all weights summing to one, the composite forecast X is given as

X = \sum_{i=1}^{n} a_i X_i, \qquad \sum_{i=1}^{n} a_i = 1, \quad a_i \geq 0.
The expected value of X is

E(X) = E\left( \sum_{i=1}^{n} a_i X_i \right) = \sum_{i=1}^{n} a_i E(X_i) = \sum_{i=1}^{n} a_i \mu_x = \mu_x,
in which \mu_x is the expected value of X_i. Therefore, the expected value of a combination of n unbiased forecasts is itself unbiased. If, however, a combination of n forecasts is formed, m of which are biased, the result is generally a biased composite forecast. Letting the expected value of the ith biased forecast be represented as E(X_i) = \mu_x + \epsilon_i, the composite bias can be represented as follows:

E(X) = \sum_{i=1}^{n} a_i E(X_i) = \sum_{i=1}^{m} a_i (\mu_x + \epsilon_i) + \sum_{i=m+1}^{n} a_i \mu_x = \mu_x + \sum_{i=1}^{m} a_i \epsilon_i.
The composite of m biased forecasts has a bias given by a combination of the individual forecast biases. This suggests that a composite containing biased forecasts can be unbiased only if \sum_{i=1}^{m} a_i \epsilon_i = 0. In particular, combining two forecasts, one with a positive bias and one with a negative bias, can, for proper choices of weights, result in an unbiased composite. However, for biased forecasts that do not balance each other, and assuming the assignment of zero weights to biased forecasts is not desired, numerous combinations of weights can be selected, each of which gives a composite that is biased.

The choice of weights can follow numerous approaches, ranging from somewhat naïve rules of thumb to more involved additive rules. One rule of thumb is that when several alternative forecasts are available but a history of performance on each is not, the user can combine all forecasts by taking their simple average. Some additive rules combine the econometric and ARIMA forecasts into a linear composite prediction of the form

A_t = B_1 (\text{Econometric})_t + B_2 (\text{ARIMA})_t + \epsilon_t, \qquad (87A.1)

where A_t is the actual value for period t, B_1 and B_2 are fixed coefficients, and \epsilon_t is the composite prediction error. Least-squares fitting of (87A.1) requires minimizing the sum of squared errors over values of B_1 and B_2 and therefore provides the minimum mean-square-error linear composite prediction for the sample period. In the case that both the econometric-model and ARIMA predictions are individually unbiased, (87A.1) can be rewritten as

A_t = B (\text{Econometric})_t + (1 - B)(\text{ARIMA})_t + \epsilon_t. \qquad (87A.2)

The least-squares estimate of B in (87A.2) is then given by

\hat{B} = \frac{\sum_{t=1}^{N} [(\text{ECM})_t - (\text{ARIMA})_t][A_t - (\text{ARIMA})_t]}{\sum_{t=1}^{N} [(\text{ECM})_t - (\text{ARIMA})_t]^2}, \qquad (87A.3)

in which (\text{ECM})_t and (\text{ARIMA})_t represent the forecasted values from the econometric model and the ARIMA model, respectively. Equation (87A.3) is seen to be the coefficient of the regression of the ARIMA prediction errors [A_t - (\text{ARIMA})_t] on the difference between the two predictions. As would seem quite reasonable, the greater the ability of the difference between the two predictions
to account for the error committed by (\text{ARIMA})_t, the larger will be the weight given to (\text{Econometric})_t.

Composite predictions may be viewed as portfolios of predictions. If the econometric model's and the ARIMA model's errors are denoted by u_{1t} and u_{2t}, respectively, then from (87A.2) the composite prediction error is seen to be

\epsilon_t = B u_{1t} + (1 - B) u_{2t}. \qquad (87A.4)

The composite error is the weighted average of the individual errors. The objective is to minimize the variance of this weighted average, given its expected value. In the case of prediction portfolios, the weighted average always has expectation zero if the individual predictions are unbiased; otherwise it may be given expectation zero by the addition of an appropriate constant. Minimizing the composite error variance over a finite sample of observations leads to the estimate of B given by

\hat{B} = \frac{s_2^2 - s_{12}}{s_1^2 + s_2^2 - 2 s_{12}}, \qquad (87A.5)

where s_1^2, s_2^2, and s_{12} are the sample variance of u_{1t}, the sample variance of u_{2t}, and the sample covariance of u_{1t} and u_{2t}, respectively. For large samples, or in the case that the variances \mathrm{Var}(u_{1t}) and \mathrm{Var}(u_{2t}) and the covariance \mathrm{Cov}(u_{1t}, u_{2t}) are known, equation (87A.5) becomes

B = \frac{\mathrm{Var}(u_{2t}) - \mathrm{Cov}(u_{1t}, u_{2t})}{\mathrm{Var}(u_{1t}) + \mathrm{Var}(u_{2t}) - 2\,\mathrm{Cov}(u_{1t}, u_{2t})}. \qquad (87A.6)
The minimum-variance weight is seen to depend on the covariance between the individual errors as well as on their respective variances. Holding the covariance constant, the larger the variance of the ARIMA error relative to that of the econometric error, the larger the weight given to the econometric prediction.
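The weight estimators (87A.3) and (87A.5) can be sketched in a few lines of code. The actual and forecast series below are invented purely for illustration; the two estimates differ slightly only because (87A.5) uses mean-centered moments.

```python
# A minimal sketch of the composite-weight estimators (87A.3) and (87A.5);
# all series are made-up numbers, not real forecasts.

def b_regression(actual, ecm, arima):
    """Least-squares weight on the econometric forecast, equation (87A.3)."""
    num = sum((e - r) * (a - r) for e, r, a in zip(ecm, arima, actual))
    den = sum((e - r) ** 2 for e, r in zip(ecm, arima))
    return num / den

def b_min_variance(u1, u2):
    """Minimum-variance weight from error moments, equation (87A.5)."""
    n = len(u1)
    m1, m2 = sum(u1) / n, sum(u2) / n
    s11 = sum((x - m1) ** 2 for x in u1) / n          # Var of econometric errors
    s22 = sum((x - m2) ** 2 for x in u2) / n          # Var of ARIMA errors
    s12 = sum((x - m1) * (y - m2) for x, y in zip(u1, u2)) / n
    return (s22 - s12) / (s11 + s22 - 2 * s12)

actual = [10.0, 11.0, 12.5, 13.0, 14.5, 15.0]
ecm    = [10.2, 10.8, 12.7, 12.8, 14.6, 15.1]   # econometric forecasts
arima  = [ 9.5, 11.5, 12.0, 13.4, 14.0, 15.6]   # ARIMA forecasts

B = b_regression(actual, ecm, arima)
u1 = [a - e for a, e in zip(actual, ecm)]        # econometric errors
u2 = [a - r for a, r in zip(actual, arima)]      # ARIMA errors
B_mv = b_min_variance(u1, u2)
composite = [B * e + (1 - B) * r for e, r in zip(ecm, arima)]
print(f"B (87A.3) = {B:.3f}, B (87A.5) = {B_mv:.3f}")
```

With these numbers the composite forecast has a much smaller sum of squared errors than either individual forecast, which is the point of the appendix.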
Chapter 88
Bond Portfolio Management, Swap Strategy, Duration, and Convexity

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

Contents
88.1 Introduction ... 3060
88.2 Bond Strategies ... 3060
  88.2.1 Riding the yield curve ... 3061
  88.2.2 Maturity-structure strategies ... 3061
  88.2.3 Swapping ... 3062
    88.2.3.1 Substitution swap ... 3063
    88.2.3.2 Intermarket-spread swap ... 3065
    88.2.3.3 Interest-rate anticipation swap ... 3067
    88.2.3.4 Pure yield-pickup swap ... 3070
88.3 Duration ... 3071
  88.3.1 Weighted-average term to maturity ... 3072
  88.3.2 WATM versus duration measure ... 3073
  88.3.3 Yield to maturity ... 3074
  88.3.4 The Macaulay model ... 3078
88.4 Convexity ... 3083
88.5 Contingent Immunization ... 3086
88.6 Bond Portfolios: A Case Study ... 3087
88.7 Summary ... 3093
Bibliography ... 3093
Appendix 88A: Procedures of Calculating Modified Duration and Convexity ... 3094
Abstract

This chapter first focuses on the bond strategies of riding the yield curve and structuring the maturity of the bond portfolio in order to generate additional return. This is followed by a discussion of bond swapping strategies. Next is an analysis of duration, the measure of a portfolio's sensitivity to changes in interest rates, with and without convexity, after which immunization is the focus. Convexity is discussed in terms of the nonlinear relationship between bond price and interest rates. Finally, a case study is presented of bond-portfolio management in the context of portfolio theory. Overall, this chapter presents how interest-rate changes affect bond prices and how maturity and duration can be used to manage portfolios.

Keywords
Bond strategies • Swapping • Substitution swap • Intermarket-spread swap • Interest-rate anticipation swap • Pure yield-pickup swap • Duration • Maturity.
88.1 Introduction

Portfolio theory has had a limited impact on the management of fixed-income portfolios. For the most part, the techniques and strategies used in managing a bond portfolio are unique. This chapter presents techniques that are commonly used in bond-portfolio management and discusses the impact that portfolio theory has had on bond-portfolio management. This chapter first focuses on the bond strategies of riding the yield curve and structuring the maturity of the bond portfolio in order to generate additional return. This is followed by a discussion of swapping. Next is an analysis of duration, or the measure of the portfolio's sensitivity to changes in interest rates, with and without convexity, after which immunization is the focus. Finally, a case study on bond-portfolio management in the context of portfolio theory is presented. In Section 88.2, we explore bond strategies. We discuss duration in Section 88.3, while Section 88.4 deals with convexity. In Section 88.5, we talk about contingent immunization. A case study on bond portfolios is presented in Section 88.6, and finally, in Section 88.7, we summarize the chapter.
88.2 Bond Strategies

How are yield curves useful for investors and analysts? The primary ways in which they are used in the market include the following: (1) to improve the forecasting of interest rates, (2) as a measure to help identify mispriced debt
securities, (3) to "ride the yield curve," and (4) to help investors manage their portfolio-maturity structures. This section concerns the third and fourth uses of yield curves.

88.2.1 Riding the yield curve

Riding the yield curve is an investment strategy designed to take advantage of yield-curve shapes that are expected to be maintained for a period of time. Given the yield-curve shape, an investor then decides whether to purchase a debt security that matures at the end of his or her time horizon, T, or to purchase a longer-term debt security that can be sold at time T. (The investor may also purchase securities that mature before T and reinvest the proceeds out to T.) If the yield curve is upward sloping and is expected to remain stable over the investor's horizon period, purchasing a longer-term security and holding on to it as the end of the horizon period approaches would lead to an increasing selling price (bond value) as the yield declines. Even though reinvestment rates (RRs) for coupons would also be declining, the total realized yield (RY) would be higher than that from direct investment in a shorter-term security that matures at the end of the holding period. Similar strategies could be illustrated for other yield-curve shapes.

The problem with successfully using such strategies is that yield curves do not remain unchanged for long periods of time; they can and do change abruptly and seemingly without warning. The best advice to those with a specific horizon period but little forecasting skill would likely be to invest in short-maturity bonds to maintain reinvestment flexibility.

88.2.2 Maturity-structure strategies

A common practice among bond-portfolio managers is to evenly space the maturities of their securities. Under the staggered-maturity plan, bonds are held to maturity, at which time the principal is reinvested in another long-term maturity instrument.
Little managerial expertise is required to maintain the portfolio, and the maturing bonds and regular interest payments provide some liquidity. An alternative to a staggered portfolio is the dumbbell strategy. Dumbbell portfolios are characterized by the inclusion of some proportion of short- and intermediate-term bonds that provide a liquidity buffer to protect a substantial investment in long-term securities. In Figure 88.1, it is apparent
Figure 88.1: Dumbbell maturity strategy.
why this is called the dumbbell strategy — the resulting graph looks like a weight lifter's dumbbell. The dumbbell portfolio divides its funds between two components. The shortest maturity is usually less than three years, and the longest maturities are more than 10 years. The portfolio is weighted at both ends of the maturity spectrum (again like a dumbbell). The logic and mechanics of the dumbbell strategy are straightforward: the short-term Treasury notes (T-notes) provide the least risk and highest liquidity, while long-term bonds provide the highest return. The best risk–return portfolio combination may very well be a combination of these extremes. Assuming an upward-sloping yield curve, no intermediate bonds will be held, since they have (1) less return than the longest-maturity bonds and (2) less liquidity and safety than the shortest T-note.

The performance of staggered and dumbbell strategies differs with respect to price fluctuations and return. During periods when interest rates are expected to increase, the return will most likely be lower for a dumbbell portfolio than for a staggered portfolio. When rates are constant or cyclical and the yield curve is upward sloping, the dumbbell is superior to the staggered portfolio in yield; nevertheless, the price fluctuation will be greater for a dumbbell than for a staggered-maturity structure.
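The arithmetic behind riding the yield curve (Section 88.2.1) can be made concrete with zero-coupon bonds on a hypothetical upward-sloping curve that is assumed to stay fixed over the holding period; all yields below are illustrative assumptions, not market data.

```python
# Hypothetical stable spot curve: maturity in years -> annually compounded yield.
curve = {1: 0.040, 2: 0.045, 3: 0.050}

def zero_price(face, years, y):
    """Price of a zero-coupon bond with annual compounding."""
    return face / (1 + y) ** years

# Strategy A: buy the 1-year zero and hold it to maturity.
ret_hold = 100 / zero_price(100, 1, curve[1]) - 1

# Strategy B (riding): buy the 3-year zero and sell it after one year,
# when it is priced as a 2-year bond at the (unchanged) lower 2-year yield.
buy_price = zero_price(100, 3, curve[3])
sell_price = zero_price(100, 2, curve[2])
ret_ride = sell_price / buy_price - 1

print(f"hold to maturity: {ret_hold:.2%}, ride the curve: {ret_ride:.2%}")
```

The extra return comes from rolling down the curve; if the curve shifts upward during the year, the ride can easily underperform, which is exactly the risk stressed in the text.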
88.2.3 Swapping

When a market is in turmoil, great distortions take place. Some bonds drop further than they should, others less. A bond swapper can improve yield and pick up a substantial capital gain. In swapping, timing is very important.
Swapping strategies generally concentrate on highly specialized trading relationships. A commonly accepted method for classifying such swaps is Homer and Leibowitz's four types: (1) the pure yield-pickup swap, (2) the interest-rate anticipation swap, (3) the intermarket-spread swap, and (4) the substitution swap. The expected return from any swap is usually based upon several motives, not just one; thus, these "types" of swaps are really just sources of return.

88.2.3.1 Substitution swap

The substitution swap is the simplest of all. The swap attempts to profit from a change in the yield spread between two nearly identical bonds. The trade is based upon a forecasted change in the yield spread between the two bonds. The forecast is generally based upon the past history of the yield-spread relationship between the two bonds, with the assumption that any aberration from the past relationship is temporary, thereby allowing profit by buying the bond with the lower (higher) yield if the spread will become wider (narrower). This trade is later reversed, leaving the investor in the original position, but with a trading profit from the relative changes in prices.

The substitution swap is simple in concept. The H-bond (the bond now held) and the P-bond (the proposed purchase) are equal in quality, coupon, and maturity. The swap is executed at a time when the bonds are mispriced relative to each other. This mispricing is expected to be corrected by the end of the workout period. Example 88.1 provides further illustration.

Example 88.1. Substitution Swap. Suppose the investor holds a 30-year Aa utility 7% coupon bond (the H-bond), currently priced at par. He is offered a swap into another 30-year Aa utility 7% coupon bond (the P-bond) at a yield to maturity (YTM) of 7.10%. Assume the workout period is one year. During this period, the prevailing RR for coupons remains unchanged at 7%. At the end of the workout period, both the H-bond and the P-bond are priced at par to yield 7%.
See the evaluation worksheet (Table 88.1) that follows. The worksheet gain of 129 basis points in realized compound yield is achieved only during the single year of the workout period. To obtain such a realized compound yield over the extended 30-year period, the investor must continue to swap an average of once a year, picking up 10 basis points with each swap, or at least averaging such a pickup on balance. At the very worst, the workout period may take the entire 30 years, at which time the realized compound-yield gain would be 4.3 basis points. This is less than the initial 10-basis-point gain in YTM because the same RR will prevail for reinvesting the coupons of both the H-bond and the P-bond.
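The worksheet arithmetic of Example 88.1 can be reproduced directly. The sketch below assumes semiannual coupons and compounding, as in Table 88.1; small rounding differences from the published figures are expected.

```python
# Reproducing the Example 88.1 worksheet (Table 88.1): 30-year 7% semiannual
# coupon bonds, a 1-year workout period, and a 7% reinvestment rate.

def bond_price(face, coupon_rate, years, ytm, freq=2):
    """Standard price of a level-coupon bond."""
    n, c, y = years * freq, face * coupon_rate / freq, ytm / freq
    return c * (1 - (1 + y) ** -n) / y + face * (1 + y) ** -n

def realized_compound_yield(cost, face=1000.0, rr=0.07):
    """One-year workout: two $35 coupons, half a year's interest at rr on the
    first coupon, and the bond priced at par at the end of the year."""
    total_accrued = 70.0 + 35.0 * rr / 2 + face      # about $1,071.23
    gain_per_dollar = (total_accrued - cost) / cost
    # Semiannually compounded annual yield implied by the one-year gain.
    return 2 * ((1 + gain_per_dollar) ** 0.5 - 1)

h_cost = 1000.0                                # H-bond held at par (7.00% YTM)
p_cost = bond_price(1000, 0.07, 30, 0.071)     # P-bond offered at 7.10% YTM
h_rcy = realized_compound_yield(h_cost)
p_rcy = realized_compound_yield(p_cost)
swap_gain_bp = (p_rcy - h_rcy) * 1e4           # about 129 basis points

print(f"P-bond price: {p_cost:.2f}")
print(f"RCY: H-bond {h_rcy:.2%}, P-bond {p_rcy:.2%}, gain {swap_gain_bp:.0f} bp")
```

The same functions can be reused to check the other rows of Tables 88.2 and 88.3 by varying the workout time and the reinvestment rate.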
Table 88.1: Evaluation worksheet for a sample substitution swap (workout time: 1 year; RR: 7%).

                                               H-bond               P-bond
                                               30-year 7s @ 7.00%   30-year 7s @ 7.10%
Original investment per bond                   $1,000.00            $987.70
Two coupons during year                        70.00                70.00
Interest on one coupon @ 7% for one-half year  1.23                 1.23
Principal value at end of year @ 7.00 YTM      1,000.00             1,000.00
Total accrued                                  1,071.23             1,071.23
Total gain                                     71.23                83.53
Gain per invested dollar                       0.07123              0.08458
Realized compound yield (%)                    7.00                 8.29
Value of swap: 129 basis points in one year
Source: Homer and Leibowitz (1972, p. 84).

Table 88.2: Effect of workout time on substitution swap: 30-year 7s swapped from 7% YTM to 7.10% YTM.

Workout time    Gain in realized compound yield
30 years        4.3 basis points/year
20 years        6.4
10 years        12.9
5 years         25.7
2 years         64.4
1 year          129.0
6 months        258.8
3 months        527.2

Source: Homer and Leibowitz (1972, p. 85).
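The worksheet arithmetic behind Tables 88.1 and 88.2 can be reproduced in a few lines. The sketch below is our own (the helper name and the semiannual-compounding convention are assumptions consistent with the worksheet); it converts each bond's total accrued value into a realized compound yield:

```python
def realized_compound_yield(invested, total_accrued, years):
    """Annual yield, compounded semiannually, that grows the invested
    amount into the total accrued value over the holding period."""
    periods = 2 * years
    return 2 * ((total_accrued / invested) ** (1 / periods) - 1)

# H-bond: $1,000.00 accrues to $1,000 principal + $70 coupons + $1.23 interest on interest
h = realized_compound_yield(1000.00, 1071.23, 1)
# P-bond: bought at $987.70, accrues to the same $1,071.23
p = realized_compound_yield(987.70, 1071.23, 1)

print(round(100 * h, 2))   # 7.0
print(round(100 * p, 2))   # 8.29
print((p - h) * 10000)     # about 128.5, which the worksheet rounds to 129 basis points
```

Applying the same function over longer holding periods reproduces the pattern of Table 88.2: the fixed price advantage is amortized over more and more years, so the annualized gain shrinks as the workout time lengthens.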
This RR benefits the bond with the lower starting yield relative to the bond with the higher starting yield; thus, it pulls the total returns together. The relative benefit of the RR to the lower-yield issue will be greater at low future RRs and less at high future rates. As the workout time is reduced, the relative gain in realized compound yield over the workout period rises dramatically, as seen in Table 88.2.

The substitution swap may not work out exactly as anticipated due to the following factors: (1) a slower workout time than anticipated, (2) adverse interim spreads, (3) adverse changes in overall rates, and (4) the P-bond's not being
Bond Portfolio Management, Swap Strategy, Duration, and Convexity
Table 88.3: Effect of major rate changes on the substitution swap: 30-year 7s swapped from 7% to 7.1%, realized compound yields — principal plus interest.

                     1-year workout                       30-year workout
RR and YTM (%)   H-Bond     P-Bond     Gain (bp)      H-Bond   P-Bond   Gain (bp)
5                34.551     36.013     146.2          5.922    5.965    4.3
6                19.791     21.161     137.0          6.445    6.488    4.3
7                7.00       8.29       129.0          7.000    7.043    4.3
8                (4.117)    (2.896)    122.1          7.584    7.627    4.3
9                (13.811)   (12.651)   116.0          8.196    8.239    4.3

Source: Homer and Leibowitz (1972, p. 87).
a true substitute. In the substitution swap, major changes in overall market yields affect the price and reinvestment components of both the H- and P-bonds. However, as these effects tend to run parallel for both bonds, the relative gain from the swap is insensitive even to major rate changes, as seen in Table 88.3.

88.2.3.2 Intermarket-spread swap

The intermarket-spread swap works on trading between sector-quality-coupon categories, based upon a forecasted change in the yield spread between two different categories. The most common forecasting method is to observe historical yield spreads at various points in the interest-rate cycle and then to adjust for current supply-and-demand effects. However, this is most difficult, as shown by the 1977–1978 period of extremely narrow spreads between AAs and US government issues. Many managers bought government issues early in 1977, expecting the spread to widen, only to have to sit on their position through the next 18 months as both inflation and government spending continued unabated.

In the intermarket-spread swap, the offered P-bond is essentially a different bond from the investor's H-bond, and the yield spread between the two bonds is largely determined by the yield spread between two segments of the bond market itself. The investor believes the "intermarket" yield spread is temporarily out of line, and the swap is executed in the hope of profiting when the discrepancy in this spread is resolved.

The intermarket-spread swap can be executed in two directions. The swap can be made into a P-bond having a greater YTM than the H-bond. This is done either for the extra yield (in the belief that the spread will not widen) or in the belief that the intermarket spread will narrow, resulting in a lower
Table 88.4: Evaluation worksheet for a sample intermarket-spread swap in a yield-pickup direction.

                                           H-Bond               P-Bond
                                           30-year 4s @ 6.50%   30-year 7s @ 7.00%
Initial YTM (%)                            6.50                 7.00
YTM at workout (%)                         6.50                 6.90
Spread narrows 10 basis points, from 50 basis points to 40 basis points.
Workout time: 1 year; RR: 7%
Original investment per bond               $671.82              $1,000.00
Two coupons during year                    40.00                70.00
Interest on one coupon @ 7% for 6 months   0.70                 1.23
Principal value at end of year             675.55               1,012.46
Total accrued                              716.25               1,083.69
Total gained                               44.43                83.69
Gain per invested dollar                   0.0661               0.0837
Realized compound yield (%)                6.508                8.200
Value of swap                              169.2 basis points in one year

Source: Homer and Leibowitz (1972, p. 90).
relative YTM for the P-bond and thus a higher relative price for the P-bond. The investor is always assured of a gain of the initial basis-point spread, less any adjustment due to the reinvestment of the higher coupons by maturity. Example 88.2 provides further illustration.

Example 88.2. Intermarket-Spread Swap. Suppose an investor holds the 30-year 4s priced at 67.18 to yield 6.50% and views the 30-year 7s at par as appropriate for an intermarket-spread swap. This investor feels that the 50-basis-point spread is excessive and anticipates a shrinkage of 10 basis points over the coming year. The price of the H-bond (4s) is kept constant for ease in computation. Table 88.4 illustrates this situation.

The 24.5-basis-point gain over 30 years (see Table 88.5) is less than the initial 50-basis-point gain because the same RR benefits the bond with the lower starting yield relative to the bond with the higher starting yield; thus, it pulls the total returns together.

The yield-give-up version of the intermarket-spread swap works against the investor over time. Therefore, when a swap involves a loss in yield, there is a high premium to be placed on achieving a favorable spread change within a relatively short workout period. Here, the investor trades the higher-YTM
Table 88.5: Effect of various spread realignments and workout times on the sample yield-pickup intermarket swap: basis-point gain (loss) in realized compound yields (annual rate).

                                    Workout time
Spread shrinkage   6 months    1 year    2 years   5 years   30 years
40                 1,083.4     539.9     273.0     114.3     24.5
30                 817.0       414.6     215.8     96.4      24.5
20                 556.2       291.1     159.1     78.8      24.5
10                 300.4       169.2     103.1     61.3      24.5
0                  49.8        49.3      47.8      44.0      24.5
(10)               (196.0)     (69.3)    (6.9)     26.8      24.5
(20)               (437.0)     (186.0)   (61.0)    9.9       24.5
(30)               (673.0)     (301.2)   (114.5)   (6.9)     24.5
(40)               (904.6)     (414.8)   (167.4)   (23.4)    24.5

Source: Homer and Leibowitz (1972, p. 91).
H-bond for a lower-YTM bond (the P-bond). The investor will profit only if the yield spread widens, leading to relatively lower P-bond prices, which would more than offset the yield loss. As an example, assume the H-bond is the 30-year 7s priced at par, and the P-bond is the 30-year 4s priced at 67.18 to yield 6.50%. The investor believes that the present 50-basis-point spread is too narrow and will widen, as shown in Table 88.6. General market moves over the short term would have little effect on the swap's value provided the spread changes as originally anticipated. The potential risk of this swap must be viewed in the context of changes in overall rate levels and the realignment of spread relationships among the many market components. As can be seen in Table 88.7, there is a high premium to be placed on achieving a favorable spread change within a relatively short workout period.

88.2.3.3 Interest-rate anticipation swap

The investor who feels that the overall level of interest rates is going to change will want to effect a swap that will net a relative gain if this happens. Most commonly, these swaps consist of shortening maturities if higher long-term yields are expected, and lengthening maturities if lower long-term yields are expected.
Table 88.6: Evaluation worksheet for a sample intermarket-spread swap with yield give-up.

                                           H-Bond            P-Bond
                                           30-year 7s @ 7%   30-year 4s @ 6.50%
Initial YTM (%)                            7                 6.5
YTM at workout (%)                         7                 6.4
Spread growth: 10 basis points
Workout time: 1 year; RR: 7%
Original investment per bond               $1,000.00         $671.82
Two coupons during year                    70.00             40.00
Interest on one coupon @ 7% for 6 months   1.23              0.70
Principal value at end of year             1,000.00          685.34
Total accrued                              1,071.23          726.04
Total gained                               71.23             54.22
Gain per invested dollar                   0.0712            0.0807
Realized compound yield (%)                7.000             7.914
Value of swap                              91.4 basis points in one year
Source: Homer and Leibowitz (1972, p. 88).

Table 88.7: Effect of various spread realignments and workout times on the sample yield-give-up intermarket swap: basis-point gain (loss) in realized compound yields (annual rate).

                                 Workout time
Spread growth   6 months    1 year    2 years   5 years   30 years
40              1,157.6     525.9     218.8     41.9      (24.5)
30              845.7       378.9     150.9     20.1      (24.5)
20              540.5       234.0     83.9      (1.5)     (24.5)
10              241.9       91.4      17.6      (22.9)    (24.5)
0               (49.8)      (49.3)    (47.8)    (44.0)    (24.5)
(10)            (335.3)     (187.7)   (112.6)   (64.9)    (24.5)
(20)            (614.9)     (324.1)   (176.4)   (85.6)    (24.5)
(30)            (888.2)     (458.4)   (239.1)   (106.0)   (24.5)
(40)            (1,155.5)   (590.8)   (302.1)   (126.3)   (24.5)

Source: Homer and Leibowitz (1972, p. 89).
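The give-up worksheet in Table 88.6 can be verified with a small pricing routine. This is a sketch, not the authors' code: the function names are ours, and semiannual compounding is assumed throughout:

```python
def bond_price(face, coupon_rate, years, ytm, freq=2):
    """Present value of the coupon annuity plus the face value."""
    c = face * coupon_rate / freq
    y = ytm / freq
    n = round(years * freq)
    return c * (1 - (1 + y) ** -n) / y + face * (1 + y) ** -n

def realized_compound_yield(invested, total_accrued, years):
    periods = 2 * years
    return 2 * ((total_accrued / invested) ** (1 / periods) - 1)

# H-bond: 30-year 7s held at par all year; accrues face + $70 coupons + $1.23
h_rcy = realized_compound_yield(1000.00, 1000.00 + 70.00 + 1.23, 1)

# P-bond: 30-year 4s bought to yield 6.50%; one year later it is a 29-year
# bond and, with the spread widening 10 basis points, it yields 6.40%
p_cost = bond_price(1000, 0.04, 30, 0.065)    # close to the table's $671.82
p_end = bond_price(1000, 0.04, 29, 0.064)     # close to the table's $685.34
p_rcy = realized_compound_yield(p_cost, p_end + 40.00 + 0.70, 1)

print(round((p_rcy - h_rcy) * 10000, 1))      # roughly 91 bp: Table 88.6's "value of swap"
```

The same two functions, with the P-bond repriced at 6.90% instead, reproduce the yield-pickup worksheet of Table 88.4.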
The decisive factor is the expected long rate: the change in the long rate will almost always be the chief determinant of the value of the swap. Maturity swaps are highly speculative. If yields do not rise or fall as expected in a short period of time, often a large penalty due to the immediate loss
in yield will occur. If the yield curve is positive, a large yield loss occurs as one moves from long to short. Conversely, if the curve is negative, moving from short to long involves little or no yield pickup and sometimes a yield loss. Time works heavily against the swapper. Nevertheless, these periods of maximum yield loss are usually the best times to make maturity swaps in both directions.

In evaluating maturity swaps, capital gains or losses will be critical over the first year or so. As time goes on, coupon income and compounded interest become more important. The concept of duration easily explains this fact, for as time increases, the present-value factor declines. Therefore, capital gains or losses are valued at the lower present-value factor. Example 88.3 provides further illustration.

Example 88.3. Interest-Rate-Anticipation Swap. Suppose an investor holds a 7% 30-year bond selling at par. He expects rates to rise from 7% to 9% within the year. Therefore, a trade is made into a 5% T-note maturing in one year and selling at par, as shown in Table 88.8.

Table 88.8: Evaluation worksheet for a sample interest-rate-anticipation swap.

                                           H-Bond             P-Bond
                                           30-year 7s @ 100   1-year 5s @ 100
Anticipated rate change: 9%
Workout time: 1 year
Original investment per bond               $1,000.00          $1,000
Two coupons during year                    70.00              50
Interest on one coupon @ 7% for 6 months   1.23               —
Principal value at end of year             748.37             1,000
Total accrued                              819.60             1,050
Total gained                               (180.4)            50
Gain per invested dollar                   (0.1804)           0.05
Realized compound yield (%)                (13.82)            5.00
Value of swap                              1,885 basis points in one year

Source: Homer and Leibowitz (1972, p. 94).

The portfolio manager who shortens maturities drastically at a large loss in yield within the long range must expect a substantial increase in yields within the short range; otherwise his long-range yield loss will exceed his short-term gain. The swapper into long at a big yield increase has time in his favor, but over the near term, he can fare badly if long yields rise
further. Finally, the swapper from short to long at a yield loss also has time against him.

88.2.3.4 Pure yield-pickup swap

In a pure yield-pickup swap, there is no expectation of market changes, but a simple attempt to increase yield. Basically, two bonds are examined to establish their difference in YTM, with a further adjustment to consider the impact of interim reinvestment of coupons at an assumed rate of return between now and the maturity date. Example 88.4 provides further illustration.

Example 88.4. Pure Yield-Pickup Swap. Suppose an investor swaps from the 30-year 4s at 67.18 to yield 6.50% into 30-year 7s at 100 to yield 7% for the sole purpose of picking up the additional 105 basis points in current income or the 50 basis points in YTM. The investor is not motivated by a judgment that the intermarket spread will shrink or that yields will rise or fall. He has no explicit concept of a workout period — he intends to hold the 7s to maturity.

To evaluate a swap of this sort, which is based on holding the P-bond to maturity, three factors must be taken into account: (1) the coupon income, (2) the interest on interest, and (3) the amortization to par. Interim market-price changes may be ignored. A simple addition of the three money flows just listed, divided by the dollars invested, will give a total return in dollars which, with the aid of a compound-interest table, yields the total realized compound yield as a percentage of each dollar invested, as shown in Table 88.9.

Table 88.9:
Evaluation worksheet for a sample pure yield-pickup swap.

                              H-Bond                 P-Bond
                              30-year 4s @ 6.50%     30-year 7s @ 7.00%
                              (one bond)             (0.67182 of one bond)
Coupon income over 30 years   $1,200.00              $1,410.82
Interest on interest at 7%    2,730.34               3,210.02
Amortization                  328.18                 0
Total return                  $4,258.52              $4,620.84
Realized compound yield (%)   6.76                   7.00
Value of swap                 24 basis points per annum at 7% RR
Source: Homer and Leibowitz (1972, p. 99).
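Table 88.9's figures follow from future-value arithmetic on the reinvested coupons. The sketch below is our own (semiannual compounding at the 7% RR is assumed, and the helper names are hypothetical):

```python
def coupon_fv(coupon_per_period, periods, rate_per_period):
    """Future value of the reinvested coupon stream (ordinary annuity)."""
    return coupon_per_period * ((1 + rate_per_period) ** periods - 1) / rate_per_period

periods, rr = 60, 0.035                        # 30 years, semiannual, 7% RR

# H-bond: one 30-year 4s, $20 semiannual coupon
h_fv = coupon_fv(20, periods, rr)
h_ioi = h_fv - 20 * periods                    # interest on interest, near 2,730.34

# P-bond: 0.67182 of a 30-year 7s, so $35 * 0.67182 per period
p_fv = coupon_fv(35 * 0.67182, periods, rr)
p_ioi = p_fv - 35 * 0.67182 * periods          # near 3,210.02

# Realized compound yields on the common $671.82 outlay, held to maturity
h_rcy = 2 * ((h_fv + 1000) / 671.82) ** (1 / periods) - 2            # near 0.0676
p_rcy = 2 * ((p_fv + 0.67182 * 1000) / 671.82) ** (1 / periods) - 2  # 0.0700
```

The difference between the two realized compound yields is the table's 24-basis-point value of the swap.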
Although the principal invested in both issues is only $671.82 per bond, over a period of 30 years the switch results in a gain of $210.82 per bond in coupon income, a gain of $479.68 per bond in interest on interest, and a loss of $328.18 per bond in capital gain. These three factors add up to a net gain of $362.32 per bond, or a net gain of 24 basis points per year.

88.3 Duration

The traditional role of bonds as an asset category has changed over the past 15 years due to surging interest rates and the resultant price volatility. The use of bond maturity to reduce interest-rate risk in bond portfolios through maturity matching has become increasingly inadequate. By the 1970s, several researchers had recognized that maturity is an incomplete measure of the life and risk of a coupon bond. In 1971, Fisher and Weil recommended a practical measurement tool that could help immunize bond portfolios against interest-rate risk, and in 1973, Hopewell and Kaufman demonstrated that it could also be used as a measure of price risk for bonds. This concept is duration, which has emerged as an important tool for the measurement and management of interest-rate risk. Bierwag, Kaufman, and Toevs (BKT) noted in 1983 that only the introduction of beta in the 1960s has generated as much interest in the investment community as has duration.

The purpose of this section is to review briefly the historical development of duration and examine its potential use by investors in alternative bond-portfolio immunization strategies, as well as to look at certain reservations that should be considered when using duration to immunize bond portfolios.

In 1938, Frederick Macaulay developed the concept of duration as part of an overall analysis of interest rates and bond prices.
He was attempting to develop a more meaningful summary measure of the life of a bond that would correlate well with changes in bond price; he arrived at a weighted average of the time to each bond payment, with the weights being the present values of each payment relative to the total present value of all the flows:

$$D = \frac{\sum_{t=0}^{n} \dfrac{t\,C_t}{(1+k_d)^t}}{\sum_{t=0}^{n} \dfrac{C_t}{(1+k_d)^t}}, \qquad (88.1)$$
where Ct is the coupon-interest payment in periods 1 through n – 1, Cn is the sum of the coupon-interest payment and the face value of the bond in period n, kd is the YTM or required rate of return of the bondholders in the market, and t is the time period in years.
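Equation (88.1) translates directly into code. The sketch below is our own (annual payment dates are assumed for simplicity, and the helper name is hypothetical); the two sample bonds are the 10-year 4% and 8% issues analyzed later in Tables 88.10 and 88.11:

```python
def macaulay_duration(cash_flows, kd):
    """Equation (88.1): present-value-weighted average time to payment.
    cash_flows: (t, C_t) pairs; kd: required yield per period."""
    pv = [(t, c / (1 + kd) ** t) for t, c in cash_flows]
    price = sum(p for _, p in pv)          # the denominator is the bond's price
    return sum(t * p for t, p in pv) / price

bond_a = [(t, 40) for t in range(1, 10)] + [(10, 1040)]   # $1,000 face, 4% coupon
bond_b = [(t, 80) for t in range(1, 10)] + [(10, 1080)]   # $1,000 face, 8% coupon

print(round(macaulay_duration(bond_a, 0.08), 2))  # 8.12
print(round(macaulay_duration(bond_b, 0.08), 2))  # 7.25
```

Note that the lower-coupon bond has the longer duration, exactly the inverse coupon relation described below.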
Thus, for the first semi-annual payment in the series between 0 and n, t would equal 0.5. The denominator of this equation is equal to the current price of the bond as estimated by the present value of all future cash flows.

The duration of a bond with a fixed maturity date declines, just as maturity does, with the passage of time. The duration of a coupon bond, however, is always shorter than its maturity. Only for pure discount (zero-coupon) bonds will duration equal maturity. Duration is also affected by the size of the coupon and its YTM, decreasing as either increases.

It is useful to place the concept of duration in the perspective of other summary measures of the timing of an asset's cash flow. Because the cash flows from bonds are specific, both as to timing and amount, analysts have derived a precise measure of the timing for bonds. The most commonly used timing measure is term to maturity (TM), the number of years prior to the final payment on the bond. The advantage of TM is that it is easily identified and measured. The disadvantage is that TM ignores interim cash flows; moreover, it ignores substantial differences in coupon rates and differences in sinking funds.
88.3.1 Weighted-average term to maturity

In an attempt to rectify the deficiency of TM, a measure that considered the interest payments and the final principal payment was constructed. The weighted-average term to maturity (WATM) computes the proportion of each individual payment as a percentage of all payments and makes this proportion the weight for the year the payment is made:

$$\text{WATM} = \frac{CF_1}{TCF}(1) + \frac{CF_2}{TCF}(2) + \cdots + \frac{CF_n}{TCF}(n), \qquad (88.2)$$
where CFt is the cash flow in year t, t is the year when the cash flow is received, n is the maturity, and TCF is the total cash flow from the bond. Example 88.5 provides further illustration.

Example 88.5. Suppose a 10-year, 4% bond will have total cash-flow payments of $1,400. Thus, the $40 payment in CF1 will have a weight of 0.02857 ($40/$1,400), each subsequent interest payment will have the same weight, and the principal in year 10 will have a weight of 0.74286 ($1,040/$1,400).
page 3072
July 6, 2020
15:54
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch88
Bond Portfolio Management, Swap Strategy, Duration, and Convexity
page 3073
3073
Therefore,

$$\text{WATM} = \frac{\$40}{\$1400}(1) + \frac{\$40}{\$1400}(2) + \frac{\$40}{\$1400}(3) + \cdots + \frac{\$40}{\$1400}(9) + \frac{\$1040}{\$1400}(10) = 8.71 \text{ years}.$$

The WATM is definitely less than the TM because it takes account of all interim cash flows in addition to the final payment. In addition, a bond with a larger coupon has a shorter WATM because a larger proportion of its total cash flows is derived from the coupon payments prior to maturity; that is, the weight on the pre-maturity coupon flows is larger for the higher-coupon bond. The weighted-average term can also be utilized to take into account sinking-fund payments, thus lowering the WATM.
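The WATM calculation of Example 88.5 can be sketched as follows (the helper name is ours):

```python
def watm(cash_flows):
    """Weighted-average term to maturity, equation (88.2): each payment's
    share of total undiscounted cash flow weights the year it arrives."""
    tcf = sum(cf for _, cf in cash_flows)
    return sum(t * cf / tcf for t, cf in cash_flows)

# 10-year, 4% annual-pay bond with $1,000 face: nine $40 coupons, then $1,040
bond = [(t, 40) for t in range(1, 10)] + [(10, 1040)]
print(round(watm(bond), 2))   # 8.71
```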
88.3.2 WATM versus duration measure

A major advantage of WATM is that it considers the timing of all cash flows from the bond, including interim and final payments. One disadvantage is that it does not consider the time value of the flows: the interest payment of the first period is valued the same as that of the last period. The duration measure is simply a weighted-average maturity, where the weights are stated in present-value terms. In the same format as the WATM, duration is

$$D = \frac{PVCF_1}{PVTCF}(1) + \frac{PVCF_2}{PVTCF}(2) + \cdots + \frac{PVCF_n}{PVTCF}(n), \qquad (88.3)$$

where PVCFt is the present value of the cash flow in year t discounted at the current yield to maturity, t is the year when the cash flow is received, n is the maturity, and PVTCF is the present value of the total cash flow from the bond discounted at the current YTM. The time in the future when a cash flow is received is weighted by the proportion that the present value of that cash flow contributes to the total present value, or price, of the bond.

Similar to the WATM, the duration of a bond is shorter than its TM because of the interim interest payments. Duration is inversely related to the coupon of the bond. The one variable that does not influence the average TM but can affect duration is the prevailing market yield. Market yield does not affect WATM because this measure does not consider the present value of flows; it does, however, affect both the numerator and the denominator of the duration measure. As a result, there is an inverse relation between a change in the market yield and a bond's duration.

Table 88.10: WATM (assuming annual interest payments).

Bond A: $1,000, 10 years, 4%
(1) Year   (2) Cash flow ($)   (3) Cash flow/TCF   (4) (1) × (3)
1          40                  0.02857             0.02857
2          40                  0.02857             0.05714
3          40                  0.02857             0.08571
4          40                  0.02857             0.11428
5          40                  0.02857             0.14285
6          40                  0.02857             0.17142
7          40                  0.02857             0.19999
8          40                  0.02857             0.22856
9          40                  0.02857             0.25713
10         1,040               0.74286             7.42860
Sum        1,400               1.00000             8.71425
WATM = 8.71 years

Bond B: $1,000, 10 years, 8%
(5) Year   (6) Cash flow ($)   (7) Cash flow/TCF   (8) (5) × (7)
1          80                  0.04444             0.04444
2          80                  0.04444             0.08888
3          80                  0.04444             0.13332
4          80                  0.04444             0.17776
5          80                  0.04444             0.22220
6          80                  0.04444             0.26664
7          80                  0.04444             0.31108
8          80                  0.04444             0.35552
9          80                  0.04444             0.39996
10         1,080               0.60000             6.00000
Sum        1,800               1.00000             7.99980
WATM = 8.00 years

Source: Reilly and Sidhu (1980, p. 60).

Tables 88.10 and 88.11, taken from Reilly and Sidhu's (1980) article, "The Many Uses of Bond Duration," illustrate the difference between the timing
measures. These two tables show the WATM and the duration, respectively. Due to the consideration of the time value of money, duration is the superior measuring technique. WATM is always longer than the duration of a bond, and the difference increases with the market rate used in the duration formula. This is consistent with the duration property of an inverse relation between duration and the market rate.

88.3.3 Yield to maturity¹

Yield to maturity is an average maturity measurement in its own way because it is calculated using the same rate to discount all payments to the bondholder; thus, it is an average of spot rates over time. For example, if the spot rate for period two, r2, is greater than that for period one, r1, the YTM on a two-year coupon bond would be between r1 and r2 — an underestimate of the

¹Malkiel (1962) and Bodie et al. (2011) have carefully discussed the sensitivity of bond prices to changes in the market interest rate, i.e., interest-rate sensitivity. They have proposed several propositions to describe this kind of relationship. It is well known that bond characteristics such as the coupon rate or yield to maturity affect interest-rate sensitivity. Therefore, the yield to maturity discussed in this section can be used to analyze the interest-rate sensitivity of bond prices.
Table 88.11: Duration (assuming 8% market yield).

Bond A
(1) Year   (2) Cash flow   (3) PV at 8%   (4) PV of flow   (5) PV as % of price   (6) (1) × (5)
1          $40             0.9259         $37.04           0.0506                 0.0506
2          40              0.8573         34.29            0.0469                 0.0938
3          40              0.7938         31.75            0.0434                 0.1302
4          40              0.7350         29.40            0.0402                 0.1608
5          40              0.6806         27.22            0.0372                 0.1860
6          40              0.6302         25.21            0.0345                 0.2070
7          40              0.5835         23.34            0.0319                 0.2233
8          40              0.5403         21.61            0.0295                 0.2360
9          40              0.5002         20.01            0.0274                 0.2466
10         1,040           0.4632         481.73           0.6585                 6.5850
Sum                                       $731.58          1.0000                 8.1193
Duration = 8.12 years

Bond B
(1) Year   (2) Cash flow   (3) PV at 8%   (4) PV of flow   (5) PV as % of price   (6) (1) × (5)
1          $80             0.9259         $74.07           0.0741                 0.0741
2          80              0.8573         68.59            0.0686                 0.1372
3          80              0.7938         63.50            0.0635                 0.1906
4          80              0.7350         58.80            0.0588                 0.2352
5          80              0.6806         54.44            0.0544                 0.2720
6          80              0.6302         50.42            0.0504                 0.3024
7          80              0.5835         46.68            0.0467                 0.3269
8          80              0.5403         43.22            0.0432                 0.3456
9          80              0.5002         40.02            0.0400                 0.3600
10         1,080           0.4632         500.26           0.5003                 5.0030
Sum                                       $1,000.00        1.0000                 7.2470
Duration = 7.25 years
two-year spot rate. Likewise, the opposite condition would overestimate the two-year spot rate. With the high volatility of interest rates that has existed in recent years, the difference can be dramatic. For example, in Britain during 1977, the 20-year spot rate of interest was approximately 20%, while the YTM on 20-year coupon bonds was only 13%. The reason for the larger discrepancy was that YTM was calculated as an average of the relatively low short-term spot rates and the relatively high long-term spot rates. Until the 1970s, when interest-rate and bond-price volatility increased dramatically, the significance of the superiority of duration over average
measurement of bond-portfolio risk was required for immunization against interest-rate risk. Studies comparing the success of portfolios immunized with a duration strategy and a maturity strategy have shown that duration outperforms maturity 75% of the time for a variety of planning periods. Duration also produces lower variances between realized and promised returns than either the maturity strategy or the naïve strategy of annually rolling over 20-year bonds.

The motivation for bond investment is to secure a fixed cash flow over a particular investment horizon. Asset and liability portfolios of financial institutions often generate future cash-flow patterns that must conform to certain solvency and profitability restrictions. For example, insurance companies and pension funds, with definite future commitments of funds, invest now so that their future cash-flow stream will match well with their future commitment stream. Bond-immunization strategies are designed to guarantee the investor in default-free and option-free bonds a rate of return that approximates the promised rate (or YTM) computed at the outset of the investment. Nevertheless, the YTM will equal the realized yield (RY) only if the interim coupon payments can be reinvested at the YTM. If interest rates change, the RY will be lower or higher than the YTM, depending upon the relationship between the bond's measured duration D and the investor's expected holding period H. Babcock (1975) devised a simple formula in which RY is computed as a weighted average of the YTM and the average RR available for coupon payments:

$$RY = \frac{D}{H}(\text{YTM}) + \left(1 - \frac{D}{H}\right)(\text{RR}). \qquad (88.4)$$

Therefore, the RY would equal the YTM only if the duration of the bond were kept equal to the time horizon of the investor. An increase in the RR would increase the return from reinvested coupons but would at the same time decrease the bond value. Only holding the bond to maturity would prevent this value reduction from affecting the RY.
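Equation (88.4) is easy to experiment with. The sketch below (our own helper, not from the source) shows that the realized yield is pinned at the promised YTM exactly when D equals H, whatever the reinvestment rate turns out to be:

```python
def realized_yield(duration, horizon, ytm, rr):
    """Babcock's approximation, equation (88.4)."""
    w = duration / horizon
    return w * ytm + (1 - w) * rr

# Duration matched to the horizon: the RR drops out entirely
print(realized_yield(7.25, 7.25, 0.08, 0.06))   # 0.08
print(realized_yield(7.25, 7.25, 0.08, 0.10))   # 0.08

# Duration half the horizon: RY is pulled halfway toward the RR
print(round(realized_yield(5.0, 10.0, 0.08, 0.06), 3))  # 0.07
```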
Therefore, the overall impact of RR increases depends upon the extent to which the income effect offsets the value reduction. On the other hand, a decrease in the RR will decrease the return from reinvested coupons, but it will increase the bond values in the market. As shown in equation (88.4), these opposite impacts on RY will exactly offset each other only when D equals H. If a more complete time spectrum of zero-coupon bonds of all types were readily available, bond-portfolio immunization would be a relatively simple process. Simply by choosing bonds with maturities equal to the length of an investor’s investment horizon, the rate of return would always be as promised
by the YTM, regardless of interest-rate changes. Beginning with Treasury-bond sales of February 15, 1985, the US Treasury has been cooperating with Wall Street firms in the stripping of coupon payments from new Treasury issues through separate registration of coupon and maturity payments, thus manufacturing a series of zero-coupon issues. However, since this applies only to Treasury issues, it is still necessary for investors to use duration models to immunize portfolios of non-Treasury bonds.

To immunize a bond portfolio, investors must match the duration of the portfolio with the length of their investment horizons. This must be accomplished within their assumed stochastic process of interest-rate movements and changes in the yield curve. If the yield curve changes during the holding period, the immunization process will break down and the RY will not equal the YTM as promised.

Since the hoped-for results of the passive duration-immunization method can be upset by interest-rate shifts, some investors may prefer to try to predict interest-rate changes and undertake an active immunization strategy. This would involve the formation of bond portfolios with durations intentionally longer or shorter than the investment-planning period. Since it is usually assumed that the market consensus about future interest rates is the most likely possibility, investors should pursue active strategies only if their forecasts of rates differ from those of the market. The investor who expects the interest rate to be higher than the market forecast should form a bond portfolio with a duration shorter than the length of the investment horizon. If successful, the income gains from reinvestment of coupons and maturing bonds will exceed the value loss on unmatured bonds: RY will exceed the promised YTM. If unsuccessful, the reinvestment return will not be enough to offset the value loss, and RY will be less than YTM.
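The duration-matching principle behind passive immunization can be illustrated numerically. In the sketch below (our own construction, using the 10-year 8% annual-pay bond whose Macaulay duration is about 7.25 years), an immediate parallel shift in rates leaves the value accumulated at the duration-matched horizon nearly unchanged, because the price effect and the reinvestment effect offset:

```python
def price(cash_flows, y):
    return sum(c / (1 + y) ** t for t, c in cash_flows)

def horizon_value(cash_flows, y, horizon):
    """Value at the horizon if rates jump to y immediately and every
    coupon is reinvested at y: today's price grown at y for H years."""
    return price(cash_flows, y) * (1 + y) ** horizon

bond = [(t, 80) for t in range(1, 10)] + [(10, 1080)]   # duration near 7.25 at 8%
h = 7.247                                               # horizon set equal to duration

base = horizon_value(bond, 0.08, h)
for shocked_rate in (0.07, 0.08, 0.09):
    v = horizon_value(bond, shocked_rate, h)
    print(round(v, 2), round(v / base - 1, 5))   # horizon values agree to within ~0.05%
```

Had the horizon been, say, 3 years or 10 years instead, the three horizon values would diverge noticeably, which is the exposure the active strategies above deliberately accept.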
If the investor expects interest rates to be lower than the market forecast, the procedure would be the reverse of that described above — that is, to purchase a bond portfolio with a duration longer than the investment horizon. If successful, the lower reinvestment returns will be more than offset by the bond-value gains. Thus, the active immunization strategy relies solely upon the ability of the investor to predict the direction of future interest-rate changes and to take full advantage of market opportunities for realized returns greater than market rates. While this provides the investor with an opportunity for substantial returns, it can also result in substantial losses. Because of this, Leibowitz and Weinberger (LW, 1981) derived a “stop loss” active immunization strategy which they called contingent immunization. In this method, the portfolio manager pursues higher returns through active management unless the value of the portfolio declines to a level that
threatens a minimum target return. At this point, the portfolio is switched into a pure immunization mode designed to provide the minimum target specified at the outset. Other researchers have developed similar strategies that rely on two portfolios — one actively and the other passively immunized. Interest changes could take place that would virtually wipe out the active portfolio, but they would not greatly affect the minimum target return of the two-portfolio combination.

88.3.4 The Macaulay model

The duration model that is still most widely used because of its simplicity is the Macaulay model. However, basic limitations affect the use of this model. First, the current and forward spot rates are assumed to be equal over a specific planning horizon; that is, the yield curve is assumed to be flat. This is also one of the basic weaknesses of the YTM measurement. Second, this model provides an accurate measure of interest-rate risk only when there is a single parallel shift in the term structure of interest rates; that is, there is only one shift within the holding period, and it does not involve a change in yield-curve shape.

Most studies surveyed seem to indicate that the Macaulay model assumptions are unrealistic and too restrictive. Cox et al. (1979) noted that it does not take into account the dynamic nature of the term structure observed in the real world, in which yield curves can and do change in shape as well as location. As a result, a number of more complex duration models have been developed to measure risk when multiple term-structure shifts can affect the shape and location of the yield curve. Because the actual underlying stochastic process governing interest-rate changes is not known, only empirical analysis can determine whether the extra complexity of these models justifies their usage. Bierwag et al.
(1982) extensively tested the Macaulay model along with four more complex duration models and found that duration-matching strategies generated realized returns consistently closer to promised yields than a maturity-matching strategy. Even more interestingly, the Macaulay measure appeared to perform as well as the more complex models. Their findings suggest that single-factor duration matching is a feasible immunization strategy that works reasonably well, even with the less complex (and thus less costly) Macaulay model. Duration appears a better measure of a bond's life than maturity because it provides a more meaningful relationship with interest-rate changes. This relationship has been expressed by Hopewell and Kaufman (1973) as

ΔP/P = −[D/(1 + i)] Δi = −D* Δi, (88.5)
where Δ = "change in", P = bond price, D = duration, D* = D/(1 + i) is the modified (adjusted) duration, and i = market interest rate or bond yield. For example, a bond with a five-year duration will decline in price by approximately 10% when the market yield increases by 2%. Note that ΔP/P on the left-hand side of equation (88.5) is the percentage change in price, and Δi is the absolute change in the yield level, not the percentage change. Other useful generalizations can be made concerning the relationships of duration to various bond characteristics, as follows:

1. The higher the coupon, the shorter the duration, because the face-value payment at maturity will represent a smaller proportional present-value contribution to the makeup of the current bond value. In other words, bonds with small coupon rates will experience larger capital gains or losses as interest rates change. Additionally, for all bonds except zero-coupon-rate bonds, as the maturity of the bond lengthens, the duration at the limit will approach (1 + YTM)/YTM. Table 88.12 shows the relationship between duration, maturity, and coupon rates for a bond with a YTM of 6%. At the limit (as maturity goes to infinity), the duration will approach 17.667 (i.e., (1 + 0.06)/0.06). As can be seen in Table 88.12, the limit is independent of the coupon rate: it is always 17.667. However, when the coupon rate is the same as or greater than the yield rate (the bond is selling at par or at a premium), duration approaches the limit directly. Conversely, for discount-priced bonds (coupon rate less than YTM), duration can increase beyond the limit and then recede to it. In the case of the bond with the 2% coupon at a maturity of 50 years, the duration is 19.452, and it moves back toward the limit as maturity keeps increasing. These relationships are shown in Figure 88.2. Regardless of coupon size, it is nearly impossible to find bonds with durations in excess of 20 years; most bonds have a limit of about 15 years.

Table 88.12: Duration, maturity, and coupon rate.

                            Coupon rate
Maturity (years)    0.02      0.04      0.06      0.08
1                   0.995     0.990     0.985     0.981
5                   4.756     4.558     4.393     4.254
10                  8.891     8.169     7.662     7.286
20                  14.981    12.98     11.904    11.232
50                  19.452    17.129    16.273    15.829
100                 17.567    17.232    17.120    17.064
∞                   17.667    17.667    17.667    17.667
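Both the limiting value (1 + YTM)/YTM and the individual entries of Table 88.12 can be checked numerically. Below is a sketch (function name illustrative), which assumes, as the one-year durations of slightly less than 1.0 imply, semiannual coupon payments with the 6% yield compounded semiannually:

```python
def macaulay_duration(coupon_rate, ytm, years, face=1000.0, freq=2):
    """Macaulay duration (in years) of a level-coupon bond, assuming
    `freq` coupon payments per year and a flat per-period yield ytm/freq."""
    c = face * coupon_rate / freq            # coupon per period
    y = ytm / freq                           # per-period yield
    n = round(years * freq)                  # number of periods
    cfs = [c] * n
    cfs[-1] += face                          # final coupon plus principal
    pvs = [cf / (1 + y) ** t for t, cf in enumerate(cfs, start=1)]
    # weighted-average time in periods, converted back to years
    return sum(t * pv for t, pv in enumerate(pvs, start=1)) / sum(pvs) / freq

limit = (1 + 0.06) / 0.06                    # 17.667, independent of the coupon
for maturity in (5, 50, 100):
    print(maturity, round(macaulay_duration(0.02, 0.06, maturity), 3))
# Matches Table 88.12's 2% coupon column: 4.756, then 19.452 (beyond
# the limit for this discount bond), then 17.567 (receding to the limit).
```

Under these assumptions the routine reproduces the table, including the overshoot-and-recede pattern for discount bonds.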
Figure 88.2: Duration and maturity for premium and discount bonds.

Table 88.13: Duration and YTM.

YTM     Duration at limit (maturity → ∞)
0.02    51
0.04    26
0.08    13.5
0.10    11
0.20    6
0.30    4.33
0.50    3
2. The higher the YTM, the shorter the duration because YTM is used as the discount rate for the bond’s cash flows and higher discount rates diminish the proportional present-value contribution of more distant payments. As has been shown, at the limit, duration is equal to (1 + YTM)/YTM; in Table 88.13, the relationship between duration and YTM is shown. 3. A typical sinking fund (one in which the bond principal is gradually retired over time) will reduce duration. Duration can be reduced by the sinking-fund or call provision. A large proportion of current bond issues do have sinking funds, and these can definitely affect a bond’s duration. An example will illustrate this fact. A 10-year, 4% bond with a sinking fund of 10% of face value per year starting at the end of the fifth year has a duration of 7.10 years as compared to the duration of 8.12 years of a similar bond without a sinking fund. Table 88.14 provides further illustration. The effect of a sinking fund on the time structure of cash flows for a bond is certain to the issuer of the bond, since the firm must make the payments: they represent a legal cash-flow requirement that will affect the firm’s cash flow. However, the sinking fund may not affect the investor since the money put into the sinking fund may not necessarily be used
Table 88.14: Duration with and without sinking funds (assuming 8% market yield).

Bond A (no sinking fund)

Year   Cash flow   PV factor   PV of cash flow   Weight    Weight × Year
1      $40         0.9259      $37.04            0.0506    0.0506
2      40          0.8573      34.29             0.0469    0.0938
3      40          0.7938      31.75             0.0434    0.1302
4      40          0.7350      29.40             0.0402    0.1608
5      40          0.6806      27.22             0.0372    0.1860
6      40          0.6302      25.21             0.0345    0.2070
7      40          0.5835      23.34             0.0319    0.2233
8      40          0.5403      21.61             0.0295    0.2360
9      40          0.5002      20.01             0.0274    0.2466
10     1,040       0.4632      481.73            0.6585    6.5850
Sum                            $731.58           1.0000    8.1193

Duration = 8.12 years

Bond A (sinking fund, 10% per year from fifth year)

Year   Cash flow   PV factor   PV of cash flow   Weight     Weight × Year
1      $40         0.9259      $37.04            0.04668    0.04668
2      40          0.8573      34.29             0.04321    0.08642
3      40          0.7938      31.75             0.04001    0.12003
4      40          0.7350      29.40             0.03705    0.14820
5      140         0.6806      95.28             0.12010    0.60050
6      140         0.6302      88.23             0.11119    0.66714
7      140         0.5835      81.69             0.10295    0.72065
8      140         0.5403      75.64             0.09533    0.76264
9      140         0.5002      70.03             0.08826    0.79434
10     540         0.4632      250.13            0.31523    3.15230
Sum                            $793.48           1.00000    7.09890

Duration = 7.10 years
Source: Reilly and Sidhu (1980, pp. 61–62).
to retire outstanding bonds. Even if it is, it is not certain that a given investor’s bonds will be called for retirement. 4. For bonds of less than 5 years to maturity, the magnitudes of duration changes are about the same as those for maturity changes. For bonds of 5–15 years maturity, changes in the magnitude of duration are considerably less than those of maturity. For bonds with more than 20 years to maturity, changes in the magnitude of duration are very small relative to changes in maturity. As can be seen in Figure 88.3, in the range of 0–5
Figure 88.3: Duration versus maturity.
years, the relationship between duration and maturity is shown approximately by a straight line with a slope of 45 degrees. In the range of 5–10 years, the slope of the line is less, indicating a smaller change in duration for a given change in maturity, and for more than 20 years, the line is almost horizontal, showing very little change in duration for a given change in maturity. 5. In contrast to a sinking fund, all bondholders will be affected if a bond is called. The duration of a callable bond will be shorter than that of a noncallable bond. When a bond is callable, the cash flow implicit in the YTM figure is subject to possible early alteration. Most corporate bonds issued today are callable, but with a period of call protection before the call option can be exercised. At the expiration of this period, the bond may be called at a specified call price, which usually involves some premium over par. To provide some measure of the return in the event that the issuer exercises the call option at some future point, the yield to call is calculated instead of the YTM. This computation is based on the assumption that the bond's cash flow is terminated at the "first call date" with redemption of principal at the specified call price. The crossover yield is defined as that yield at which the YTM is equal to the yield to call. When the price of the bond rises to some value above the call price, and the market yield declines to a value below the crossover yield, the yield to call becomes the minimum yield. At this price and yield, the firm will probably exercise a call option when it is available. When prices go below the call price, the YTM is the minimum yield. Example 88.6 provides further illustration. Example 88.6. To calculate the crossover yield for an 8%, 30-year bond selling at par with 10-year call protection, the annual return flow divided by
the average investment can be used as an approximation for the yield. The implied crossover yield is 8.46%:

Crossover yield = [80 + (1,080 − 1,000)/10] / [(1,080 + 1,000)/2] = 8.46%.

In one year's time, the bond's maturity will be 29 years, with nine years to call. If the market rate has declined to the point where the YTM of the bond is 7%, which is below the crossover yield of 8.46%, the bond will be priced at $1,123.43 and its yield to call will be 6%:

Yield to call = [80 + (1,000 − 1,123.43)/9] / [(1,080 + 1,123.43)/2] = 6%.
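The "annual return flow divided by average investment" approximation used in Example 88.6 can be written as a small helper; the function name below is illustrative:

```python
def approx_yield(annual_coupon, end_value, price, years):
    """Approximate yield: annual return flow over average investment.
    `end_value` is the redemption amount (e.g., the call price) received
    after `years`, and `price` is the amount invested today."""
    annual_gain = (end_value - price) / years
    average_investment = (end_value + price) / 2
    return (annual_coupon + annual_gain) / average_investment

# 8%, 30-year par bond with a $1,080 call price, first callable in 10 years:
print(round(approx_yield(80, 1080, 1000, 10), 4))  # 0.0846, i.e., 8.46%
```

This reproduces the 8.46% crossover yield in the example.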
If a bond-portfolio manager ignored the call option and computed the duration of this bond to maturity at a market yield of 7%, the duration would be 12.49 years. If duration were computed recognizing the call option at a price of $1,080 and using the yield to call of 6%, it would be 6.83 years. Since a majority of corporate bonds have a call option, the effect of the call option upon a bond's duration could influence the bond manager's investment decision. Therefore, the bond's duration both disregarding and recognizing the call option must be considered in an investment decision. That is, if interest rates stabilize or continue to rise, the call-option duration is of less importance than the duration disregarding the call option; if interest rates fall, the call-option duration is of more importance.

88.4 Convexity

The duration rule in equation (88.5) is a good approximation for small changes in bond yield, but it is less accurate for large changes. Equation (88.5) implies that the percentage change in bond price is linearly related to the change in yield to maturity. If this linear relationship does not hold, then, following Saunders and Cornett (2018), equation (88.5) can be generalized as

ΔP/P = −D* Δi + 0.5 × Convexity × (Δi)², (88.6)

where Convexity, the rate of change of the slope of the price-yield curve, is

Convexity = (1/P) × (∂²P/∂i²) = [1/(P × (1 + i)²)] × Σ_{t=1}^{n} [CF_t/(1 + i)^t] × (t² + t)
          ≈ 10⁸ × [(ΔP⁻/P) + (ΔP⁺/P)], (88.7)
where CF_t is the cash flow at time t as defined in equation (88.2), n is the maturity, and CF_t represents either a coupon payment before maturity or the final coupon plus par value at the maturity date. ΔP⁻ is the capital loss from a one-basis-point (0.0001) increase in interest rates and ΔP⁺ is the capital gain from a one-basis-point (0.0001) decrease in interest rates.² In equation (88.6), the first term on the right-hand side is the same as the duration rule, equation (88.5). The second term is the modification for convexity. Note that for a bond with positive convexity, the second term is positive, regardless of whether the yield rises or falls. The more accurate equation (88.6), which accounts for convexity, therefore always predicts a higher bond price than equation (88.5). Of course, if the change in yield is small, the convexity term, which is multiplied by (Δi)² in equation (88.6), will be extremely small and will add little to the approximation. In this case, the linear approximation given by the duration rule will be sufficiently accurate. Thus, as a practical matter, convexity is more important when potential interest-rate changes are large. Example 88.7 provides further illustration.

Example 88.7. Figure 88.4 is drawn under the assumption that a bond with a 20-year maturity and a 7.5% coupon sells at an initial yield to maturity of 7.5%. Because the coupon rate equals the yield to maturity, the bond sells at par value, or $1,000. The modified duration and convexity of the bond are 10.95908 and 155.059, calculated by equation (88.1) and the approximation formula in equation (88.7), respectively. The detailed calculations of 10.95908 and 155.059 can be found in Appendix 88A.
Figure 88.4: The relationship between percentage changes in bond price and changes in YTM (actual data, duration rule, and duration-with-convexity rule).

² The approximation of convexity follows Financial Institutions Management: A Risk Management Approach by Saunders and Cornett, 7th ed., 2010.
If the bond's yield increases from 7.5% to 8.0% (Δi = 0.005), the price of the bond actually falls to $950.9093. Based on the duration rule, the bond price falls from $1,000 to $945.2046, a decline of 5.47954%, by equation (88.5) as follows:

ΔP/P = −D* Δi = −10.95908 × 0.005 = −0.0547954, or −5.47954%.

If we use equation (88.6) instead of equation (88.5), the bond price falls from $1,000 to $947.1428, a decline of 5.28572%:

ΔP/P = −D* Δi + 0.5 × Convexity × (Δi)²
     = −10.95908 × 0.005 + 0.5 × 155.059 × (0.005)²
     = −0.0528572, or −5.28572%.

For this small yield change, the duration rule by itself is close to the duration-with-convexity rule of equation (88.6). However, if the change in yield is larger, say 3% (Δi = 0.03), the price of the bond actually falls to $753.0727, and convexity becomes important in approximating the percentage change in bond price. Without accounting for convexity, the predicted price (the dashed line in Figure 88.4) falls from $1,000 to $671.2277, a decline of 32.8772%, based on the duration rule, equation (88.5):

ΔP/P = −D* Δi = −10.95908 × 0.03 = −0.328772, or −32.8772%.

According to the duration-with-convexity rule, equation (88.6), the percentage change in bond price is

ΔP/P = −D* Δi + 0.5 × Convexity × (Δi)²
     = −10.95908 × 0.03 + 0.5 × 155.059 × (0.03)²
     = −0.258996, or −25.8996%.

The bond price of $741.0042 estimated by the duration-with-convexity rule is much closer to the actual bond price of $753.0727 than the price of $671.2277 estimated by the duration rule. As the change in interest rates becomes larger, the percentage change in bond price calculated by equation (88.5) differs significantly from that calculated by equation (88.6). Saunders and Cornett (2018) have discussed why convexity is important in the risk management of financial institutions.
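Example 88.7 can be verified numerically. The sketch below prices the bond exactly, recomputes the duration figure used in the text, obtains convexity from the one-basis-point approximation in equation (88.7), and compares the duration rule with the duration-with-convexity rule (all values are recomputed here rather than taken from Appendix 88A):

```python
def bond_price(coupon, face, n, i):
    """Exact price of an annual-coupon bond at yield i."""
    return sum(coupon / (1 + i) ** t for t in range(1, n + 1)) + face / (1 + i) ** n

coupon, face, n, i0 = 75.0, 1000.0, 20, 0.075     # the bond of Example 88.7
p0 = bond_price(coupon, face, n, i0)              # par: 1000.00

# Duration figure used in the text (10.95908)
pvs = [coupon / (1 + i0) ** t for t in range(1, n + 1)]
pvs[-1] += face / (1 + i0) ** n
duration = sum(t * pv for t, pv in enumerate(pvs, start=1)) / p0

# Convexity from the one-basis-point approximation, equation (88.7)
h = 0.0001
convexity = 1e8 * ((bond_price(coupon, face, n, i0 - h) - p0) / p0    # gain, dP+
                   + (bond_price(coupon, face, n, i0 + h) - p0) / p0) # loss, dP-
# about 155.06, as reported in the text

# Large yield change: a 3% rise
di = 0.03
duration_only = -duration * di                               # -0.328772
with_convexity = duration_only + 0.5 * convexity * di ** 2   # about -0.2590
actual = bond_price(coupon, face, n, i0 + di) / p0 - 1       # -0.246927
print(duration, convexity, duration_only, with_convexity, actual)
```

As in the text, the convexity-adjusted estimate is far closer to the actual price change than the duration rule alone.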
88.5 Contingent Immunization

Contingent immunization allows a bond-portfolio manager to pursue the highest yields available through active strategies while relying on the techniques of bond immunization to assure that the portfolio will achieve a given minimal return over the investment horizon. Using this strategy, the portfolio manager attempts to earn returns in excess of the immunized return, but at the same time attempts to constrain or control losses that may result from poor forecasts of interest-rate movements. Risk control is the major objective of contingent immunization. At the inception of the strategy, the manager determines the degree of risk he or she is willing to accept. If a sequence of interest-rate movements causes the portfolio to approach the predetermined risk level, the manager alters the portfolio's duration to completely immunize it from any further risk. On the other hand, if the interest-rate movements provide additional returns, the portfolio manager does nothing. The difference between the minimal, or floor, rate of return and the rate of return on the market is called the cushion spread. Equation (88.8) shows the relationship among the market rate of return R_m, the cushion spread C, and the floor rate of return R_FL:

R_FL = R_m − C. (88.8)
Interest-rate movements favorable to the bond-portfolio manager's position will enlarge the spread (that is, R_m goes up while the portfolio is long bonds), thereby increasing the realized return. Adverse interest-rate movements will reduce the cushion spread up to the point that R_FL = R_m. At this point, the portfolio manager will immunize the portfolio, which will ensure that the realized return equals R_FL. Figure 88.5 shows a graphical presentation of contingent immunization. The realized change in the market rate of return is shown on the horizontal axis, and the potential rate of return for the portfolio is shown on the vertical axis. The potential return is a function of the market return, the floor return, and the cushion. If the interest-rate change is +2%, the portfolio manager shifts to an immunization strategy that locks in R_FL at 10% for the planning horizon. Regardless of interest-rate movements thereafter, the portfolio will realize a return of 10%. If interest rates were to go down, the portfolio would earn a return in excess of R_FL because of the manager's ability to hold a portfolio with a duration larger than the investment horizon.
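A minimal sketch of the floor-return logic in equation (88.8) follows; the 12% market rate and 2% cushion are illustrative assumptions, chosen to echo the 10% floor used with Figure 88.5:

```python
def floor_return(market_rate, cushion):
    """Floor rate of return, equation (88.8): R_FL = R_m - C."""
    return market_rate - cushion

def must_immunize(potential_return, r_fl):
    """Switch to a pure immunization mode once the portfolio's
    potential return has fallen to the floor."""
    return potential_return <= r_fl

r_fl = floor_return(0.12, 0.02)        # a 10% floor
print(must_immunize(0.11, r_fl))       # False: keep managing actively
print(must_immunize(0.095, r_fl))      # True: lock in the floor return
```

This captures the "let your winners run, cut your losses" trigger: active management continues only while the potential return stays above the floor.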
Figure 88.5: Contingent immunization.
The contingent-immunization approach allows the portfolio manager the ability to follow the active investment strategies discussed earlier in this chapter. At the same time, the manager is able to protect a minimum return by using a duration-based immunization strategy. There is an old saying in the market: "Let your winners run and cut your losses." This is exactly the philosophy behind the contingent-immunization strategy.

88.6 Bond Portfolios: A Case Study

Ron Allen is the President of Merchant's Bank and Trust Co.³ Mr. Allen has been the CEO of the $80-million community bank for less than a year. The bank has always been successful and, in years past, provided its stockholders with a slightly better return than its peers. However, the return on the bank's investment portfolio had suffered somewhat in the early 1980s because of its
This case study is based on Altmix (1984).
inability to cope with the dramatic rise and continued fluctuation of interest rates. Now, Allen wants the bank to become more of an investment manager than simply a buyer and seller of bonds. It is Allen's belief that by using duration theory, he can achieve a bond portfolio that is not as sensitive to changes in rates and yet will still produce a better-than-average return.

Duration is the weighted-average number of years until an initial cash investment is recovered, with the weight expressed as the relative present value of each payment of interest and principal. For example, suppose a bank bought a 5-year, $1,000 bond with a 10% coupon at par. If the going rate of interest remains at 10%, what would be the duration of the bond? The bond would produce cash income of $100.00 per year for five years before maturing. To calculate the weighted present value, it is necessary to find the present value of each payment and multiply it by the number of periods until the payment is received. In the first year, the bank will receive $100.00 in coupon income. Discounted at 10%, the present value of the payment is $90.91 (see Table 88.15). Column 3, the present-value interest factor (PVIF), is obtained by using the formula 1/(1 + i)^n, in which n is the number of compounding periods and i is the interest rate. The present value of the second year's coupon income of $100 will be $82.64; when this value is multiplied by two, that is, the number of years until receipt, the weighted present value of $165.28 is obtained. The present value of the third-year cash flow is $75.13; the weighted present value is $225.39. For the fourth year, the present value is $68.30 and the weighted present value is $273.20. For the last year, it is necessary to calculate both the present value of the $100 coupon and that of the maturing $1,000 bond.

Table 88.15: Weighted present value.

(1)      (2)         (3)           (4) = (2) × (3)   (5) = (1) × (4)
Year     Coupons     1/(1 + i)^n   Unweighted PV     Weighted PV
1        100.00      0.9091        90.91             90.91
2        100.00      0.8264        82.64             165.28
3        100.00      0.7513        75.13             225.39
4        100.00      0.6830        68.30             273.20
5        1,100.00    0.6211        683.01            3,415.05
Sum      1,500.00                  1,000.00          4,169.83

4,169.83 ÷ 1,000.00 = 4.17 years duration

The last-year present value of $683.01, when multiplied by five, equals $3,415.05. As presented in Table 88.15, the cumulative weighted present values total $4,169.83. When this total is divided by the unweighted present value of the bond, $1,000, we have the duration of the bond, 4.1698 years. Thus, the initial duration of a 5-year, 10% bond selling at par is 4.17 years. (This example assumes that no costs are incurred in the buying and selling of bonds and in reinvesting the funds received as bond coupons are paid.)

The basic problem that faces Mr. Allen, as it does any bond-portfolio manager, is obtaining a given rate of return to satisfy the yield requirements at a specific date, that is, over the investment horizon. If market rates never changed between the purchase and the maturity of a bond, it would be possible to acquire a bond that would guarantee the required return over the portfolio manager's investment horizon. However, the term structure of interest rates is dynamic, and market rates are constantly changing. Because of the changes in the term structure, bond-portfolio managers are faced with interest-rate risk. Interest-rate risk is the combination of two risks: price risk and coupon-reinvestment risk.

Price risk occurs if interest rates change before the target date and the bond is sold prior to maturity. At that time, the market price will differ from the value at the time of purchase. If rates increase after the purchase date, the price at which the bond is sold will be below what had been anticipated. If rates decline, the realized price will be above what had been expected. Increases in interest rates will reduce the market value of a bond below its par value, but they will increase the return from the reinvestment of the coupon interest payments. Conversely, decreases in interest rates will increase the market value of a bond above its par value but decrease the return on the reinvestment of the coupons.
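The weighted-present-value calculation of Table 88.15 is easy to mechanize; a sketch (function name illustrative):

```python
def weighted_pv_duration(cash_flows, rate):
    """Duration as computed in Table 88.15: the sum of the weighted
    present values divided by the unweighted present value (price)."""
    pvs = [cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1)]
    return sum(t * pv for t, pv in enumerate(pvs, start=1)) / sum(pvs)

# 5-year, $1,000 bond with a 10% annual coupon, bought at par with rates at 10%:
print(round(weighted_pv_duration([100, 100, 100, 100, 1100], 0.10), 2))  # 4.17
```

This reproduces the 4.17-year duration derived in the table.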
In order for a bond to be protected from changes in interest rates after purchase, the price risk and the coupon-reinvestment risk must offset each other. Coupon-reinvestment risk is of equal importance: the expected yield is calculated by assuming that all coupon cash flows are reinvested at the yield that exists at the time of purchase. If rates begin to fall, it becomes impossible to reinvest the coupons at a rate high enough to produce the anticipated yield. Obviously, if rates increase, the coupon cash flows will be reinvested at higher rates and produce a return above expectations. Duration is the time period at which the price risk and coupon-reinvestment risk of a bond are of equal magnitude but opposite in direction. The result is that the expected yield is realized.
Table 88.16: Comparison of the maturity strategy and the duration strategy for a 5-year bond (RR = reinvestment rate).

Maturity strategy (4-year, 10.5% bond held to maturity; the $1,000 principal is received with the final coupon in year 4)

Year   Cash flow    RR (%)   Value
1      105.00       10.5     105.00
2      105.00       10.5     221.03
3      105.00       8.0      343.71
4      105.00       8.0      1,476.01
Sum                          2,145.75

Duration strategy (5-year, 10.5% bond sold at the end of year 4)

Year   Cash flow    RR (%)   Value
1      105.00       10.5     105.00
2      105.00       10.5     221.03
3      105.00       8.0      343.71
4      1,125.10*    8.0      1,496.31
Sum                          2,166.05

Expected ending wealth per $1,000 invested is $1,491.00 (wealth ratio 1.4909).
Note: *The bond could be sold at its market value of $1,125.12, which is the value for a 10.5% bond with one year to maturity priced to yield 8%.
The duration strategy may protect the bank's bond portfolio from these unexpected changes in yield. Merchant's Bank set its desired holding period at 4 years in order to protect the portfolio. This means that the duration of the bond portfolio should equal 4 years. To give the portfolio a 4-year duration, the weighted-average duration is set at the desired length, and future cash flows are reinvested so that the portfolio's duration remains equal to the remaining investment horizon. An example of the effect of attempting to protect a portfolio by matching the investment horizon and the duration of a bond portfolio is contained in Table 88.16 using a single bond. Merchant's Bank has an investment horizon of 4 years. The current YTM for a four-year bond is 10.50%. Therefore, the ending wealth ratio should be 1.4909 (i.e., (1.105)^4). This assumes a completely protected portfolio. Two investment strategies are computed in Table 88.16. The first is the maturity strategy, where the term to maturity is set at four years. The second is the duration strategy, where the duration is four years. In this case, the bank would acquire a 5-year, 10.5% bond that has a duration of 4.13 years, assuming a 10.5% YTM. For this example, it is assumed that there is a single interest-rate change at the end of year two, with the market yield going from 10.5% to 8% through the fourth year.
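The year-by-year values in Table 88.16 can be reproduced by compounding each year's balance at that year's reinvestment rate and adding the new cash flow. A sketch using the table's cash flows (the maturity strategy's year-4 cash flow is taken as the $105 coupon plus the $1,000 principal, and the duration strategy's $1,125.10 bundles the final coupon and sale of the bond):

```python
def accumulate(cash_flows, rates):
    """Accumulated value path: the prior balance grows at the year's
    reinvestment rate, then the year's cash flow is added (Table 88.16)."""
    value, path = 0.0, []
    for cf, r in zip(cash_flows, rates):
        value = value * (1 + r) + cf
        path.append(value)
    return path

rates = [0.105, 0.105, 0.08, 0.08]   # single shift from 10.5% to 8% after year 2

maturity = accumulate([105, 105, 105, 1105], rates)     # 4-year bond held to maturity
duration = accumulate([105, 105, 105, 1125.10], rates)  # 5-year bond sold in year 4

print(maturity)  # ends near 1,476: short of the 1,491 target
print(duration)  # ends near 1,496: above the target
```

The small differences from the printed table come only from rounding; the duration strategy ends above the target wealth while the maturity strategy falls short, as the chapter discusses next.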
As Table 88.16 shows, the ending wealth for the maturity strategy fell short of the desired ending wealth on this particular bond. This is due to the interest-rate change in year two, which results in a lower reinvestment rate (RR). In the maturity strategy, the price risk of the bond itself is eliminated because the bond matures in the fourth year. However, the duration strategy actually produced an ending wealth above what was expected. This is because the return that was lost in reinvestment was offset by the increase in the value of the bond in its fourth year. In this case, a premium would be paid in the secondary market for a bond with a coupon 2.5% above the market rate. The fact that a premium would be paid for this five-year bond at the end of four years is an important factor in the effectiveness of the duration concept.

There is a direct relationship between the duration of a bond and the price volatility of the bond for given changes in the market rates of interest. This relationship can be expressed in the formula

BPC = −D* × r,

where BPC is the percentage change in the price of the bond, D* is the adjusted duration of the bond in years, equal to D/(1 + market yield), and r is the change in the market yield in basis points divided by 100 (e.g., a 50-basis-point decline would be −0.5). Using the values from Table 88.16, the percentage change in the price of the five-year bond can be calculated. The duration is 4.13 years and interest rates range from 8% to 10.5%:

D* = 4.13/1.105 = 3.738,
BPC = −3.738 × (100/100) = −3.738 × 1 = −3.738.

In this example, the price of the bond should decline by about 3.7% for every 100-basis-point increase in market rates. The accuracy of the formula may vary depending on the length of duration. However, the important point is the relationship between duration and interest-rate risk: the longer the duration of a bond, the greater the price volatility of the bond for changes in interest rates.
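The price-volatility formula can be sketched as follows (the helper name is illustrative):

```python
def bond_price_change_pct(duration_years, market_yield, basis_points):
    """BPC = -D* x r, with D* = D/(1 + market yield) and r the yield
    change in basis points divided by 100."""
    d_star = duration_years / (1 + market_yield)
    return -d_star * (basis_points / 100)

# The 5-year bond of Table 88.16: duration 4.13 years at a 10.5% market
# yield, for a 100-basis-point rise in rates:
print(round(bond_price_change_pct(4.13, 0.105, 100), 3))  # -3.738 (percent)
```

This reproduces the roughly 3.7% price decline per 100-basis-point rate increase computed in the text.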
Developing an interest-rate forecast is essential in any bond-portfolio manager’s program. To aid them in the development of this forecast, Merchant’s Bank has secured the services of two well-known investment firms. Both firms publish monthly forecasts that will be used along with forecasts
provided by Mr. Allen's staff. Investment Firm A predicts that short-term interest rates, over 0–6 months, will average between 9.5% and 11%, and long-term rates will fluctuate between 11% and 12%. Firm B has a similar forecast for short-term rates, but Firm B predicts a downward-sloping yield curve with long-term interest rates in the 9–10% range. Mr. Allen's problem here is that the firms' forecasts of long-term interest rates move in opposite directions. If interest rates increase, the bank's bond portfolio will not perform well unless it is protected using duration. However, if interest rates fall, the return provided by using duration will not be as great as the maturity strategy would provide. In this case, Mr. Allen may miss an opportunity to earn above-average returns. Mr. Allen now must develop an interest-rate forecast for his Board of Directors. At the same time, he will attempt to show them that duration can be an effective management tool. Merchant's Bank and Trust Co. has $750,000 in bonds maturing from the existing portfolio. Table 88.17 is a list of bonds Mr. Allen is considering for purchase. His decisions, based on the future of interest rates, will be closely monitored by the Board of Directors. It is important that Mr. Allen consider the advantages and disadvantages of using duration as a bond-portfolio management tool.
Table 88.17: Bonds being considered for purchase by Merchant's Bank and Trust Co.

Amount*     Name           Rate (%)   Maturity (years)
$150,000    Agency A       10.90      3
$200,000    Government 1   10.85      3
$100,000    Government 2   11.00      4
$100,000    Agency B       11.10      4
$150,000    Agency C       11.25      4
$100,000    Government 3   11.25      5
$200,000    Agency D       11.35      5
$250,000    Agency E       11.40      5
$100,000    Agency F       11.30      5
$200,000    Government 4   11.25      6
$150,000    Agency G       11.70      6
$200,000    Government 5   12.50      10

*All bonds are purchased at par and are assumed to have an annual coupon.
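One way Mr. Allen might weigh the candidates is to compute each bond's duration and the dollar-weighted duration of the whole list. Since the table's footnote says every bond is bought at par with an annual coupon, the YTM equals the coupon rate and the closed-form duration of a par bond applies. A sketch (the portfolio-level calculation is an illustration, not part of the case as written):

```python
def par_bond_duration(ytm, years):
    """Macaulay duration of an annual-coupon bond selling at par:
    D = [(1 + y)/y] * (1 - (1 + y)**-n)."""
    return (1 + ytm) / ytm * (1 - (1 + ytm) ** -years)

# (amount, rate, maturity in years) for the candidates in Table 88.17
bonds = [(150000, 0.1090, 3), (200000, 0.1085, 3), (100000, 0.1100, 4),
         (100000, 0.1110, 4), (150000, 0.1125, 4), (100000, 0.1125, 5),
         (200000, 0.1135, 5), (250000, 0.1140, 5), (100000, 0.1130, 5),
         (200000, 0.1125, 6), (150000, 0.1170, 6), (200000, 0.1250, 10)]

total = sum(amount for amount, _, _ in bonds)
portfolio_duration = sum(amount * par_bond_duration(r, m)
                         for amount, r, m in bonds) / total
print(round(portfolio_duration, 2))  # close to, but slightly above, 4 years
```

Buying the whole list would therefore come close to matching the bank's 4-year investment horizon, which is the kind of comparison the Board would expect Mr. Allen to make.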
88.7 Summary

The management of a fixed-income portfolio involves techniques and strategies that are unique to the specific area of bonds. This chapter has discussed riding the yield curve, swaps, and duration as three techniques that are familiar to all managers of fixed-income portfolios. We also discussed how the basic concepts of duration and maturity can be used in portfolio management, and the issue of convexity in bond-portfolio management has been analyzed in detail. A comparison of these techniques was also presented as a case study. Overall, this chapter has related bond-valuation theory to bond-portfolio theory and has developed bond-portfolio management strategies. In Appendix 88A, we have explicitly shown how convexity can be calculated.
Bibliography

R. A. Altmix (1984). Duration, Bond Portfolio Protection: Case Study. Unpublished manuscript.
G. O. Bierwag (1987). Duration Analysis: Managing Interest Rate Risk. Cambridge, MA: Ballinger Publishing Co.
G. O. Bierwag (1983). Immunization Strategies for Funding Multiple Liabilities. Journal of Financial and Quantitative Analysis, 18, 113–123.
G. O. Bierwag, G. G. Kaufman, and D. Khang (1978). Duration and Bond Portfolio Analysis: An Overview. Journal of Financial and Quantitative Analysis, 13, 671–681.
G. O. Bierwag, G. G. Kaufman, and A. Toevs (1982). Single-Factor Duration Models in a Discrete General Equilibrium Framework. Journal of Finance, 37, 325–338.
Z. Bodie, A. Kane, and A. J. Marcus (2017). Investments, 11th ed., McGraw-Hill.
R. Bookstaber (1985). The Complete Investment Book. Glenview, IL: Scott, Foresman and Co.
J. A. Boquist, G. Racette, and G. G. Schlarbaum (1975). Duration and Risk Assessment for Bonds and Common Stocks. Journal of Finance, 30, 1360–1365.
J. Chua (1984). A Closed Form Formula for Calculating Bond Duration. Financial Analysts Journal, 40, 76–78.
J. Cox, J. E. Ingersoll, and S. A. Ross (1979). Duration and Measurement of Basis Risk. Journal of Business, 52, 51–61.
L. Fisher and R. Weil (1971). Coping with the Risk of Interest Rate Fluctuations: Returns to Bondholders from Naïve and Optimal Strategies. Journal of Business, 44, 408–431.
H. Fong and F. Fabozzi (1985). Fixed Income Portfolio Management. Homewood, IL: Dow Jones-Irwin.
F. Fabozzi (2012). The Handbook of Fixed Income Securities, 8th ed., McGraw-Hill.
F. Fabozzi (2015). Bond Markets, Analysis, and Strategies, 9th ed., Pearson.
G. Hawawini (1982). Bond Duration and Immunization: Early Developments and Recent Contributions. New York and London: Garland Publishing.
C. Hessel and L. Huffman (1981). The Effect of Taxation on Immunization Rules and Duration Estimation. Journal of Finance, 36, 1127–1142.
S. Homer and M. L. Leibowitz (1972). Inside the Yield Book. New York: Prentice-Hall and New York Institute of Finance.
M. Hopewell and G. G. Kaufman (1973). Bond Price Volatility and Terms to Maturity: A Generalized Respecification. American Economic Review, 63, 749–753.
J. Ingersoll, J. Skelton, and R. Weil (1978). Duration: Forty Years Later. Journal of Financial and Quantitative Analysis, 13, 627–650.
L. M. John, L. M. Donald, E. P. Jerald, and W. M. Dennis (2007). Managing Investment Portfolios: A Dynamic Process, 3rd ed. CFA Institute.
R. Lanstein and W. Sharpe (1978). Duration and Security Risk. Journal of Financial and Quantitative Analysis, 13, 653–668.
C. F. Lee, J. Finnerty, J. Lee, A. Lee, and D. Wort (2013). Security Analysis and Portfolio Management, and Financial Derivatives. Singapore: World Scientific.
M. L. Leibowitz (1982). Contingent Immunization, Part I: Risk Control Procedures. Financial Analysts Journal, 38, 17–32.
M. L. Leibowitz (1983). Contingent Immunization, Part II: Problem Cases. Financial Analysts Journal, 39, 35–50.
M. L. Leibowitz and A. Weinberger (1981). The Uses of Contingent Immunization. Journal of Portfolio Management, 8, 51–55.
B. Malkiel (1962). Expectations, Bond Prices, and the Term Structure of Interest Rates. Quarterly Journal of Economics, 76, 197–218.
F. Macaulay (1938). Some Theoretical Problems Suggested by the Movement of Interest Rates, Bond Yields, and Stock Prices in the US Since 1865. New York: National Bureau of Economic Research.
R. McEnally (1977). Duration as a Practical Tool in Bond Management. Journal of Portfolio Management, 3, 53–57.
F. K. Reilly and R. S. Sidhu (1980). Many Uses of Bond Duration. Financial Analysts Journal, 36, 58–72.
F. Reilly, K. C. Brown, and S. Leeds (2019). Investment Analysis and Portfolio Management, 11th ed. Cengage Learning.
A. Saunders and M. M. Cornett (2018). Financial Markets and Institutions, 7th ed. McGraw-Hill.
A. Saunders and M. M. Cornett (2010). Financial Institutions Management, 7th ed. McGraw-Hill.
R. Weil (1973). Macaulay's Duration: An Appreciation. Journal of Business, 46, 589–592.
J. Yawitz (1977). The Relative Importance of Duration and Yield Volatility on Bond Price Volatility. Journal of Money, Credit and Banking, 9, 97–102.
Appendix 88A: Procedures of Calculating Modified Duration and Convexity

In this appendix, we show how convexity can be calculated. Tables 88A.1–88A.3 show the procedures of calculating convexity for bond maturities of 20, 15, and 25 years, respectively. From these tables, we find that the convexities are 155.059, 109.635, and 195.560 for maturities of 20, 15, and 25 years, which implies that convexity increases as maturity increases. In addition, we calculate the percentage change in bond price implied by the change in interest rates and the duration.
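Before turning to the tables, the whole procedure can be sketched in a few lines of Python. This is our illustration of the appendix arithmetic, not code from the handbook; the function names are ours, and the convexity routine implements the approximation Convexity ≈ 10^8 × (ΔP−/P + ΔP+/P) used in the tables below.

```python
# Sketch of the Appendix 88A procedure for an annual-pay coupon bond.
# Illustrative code (not from the handbook); names are ours.

def bond_price(face, coupon_rate, y, n):
    """Discounted coupons plus discounted face value."""
    c = face * coupon_rate
    return sum(c / (1 + y) ** t for t in range(1, n + 1)) + face / (1 + y) ** n

def macaulay_duration(face, coupon_rate, y, n):
    """Present-value-weighted average time to the bond's cash flows."""
    c = face * coupon_rate
    p = bond_price(face, coupon_rate, y, n)
    pv = [(c + (face if t == n else 0.0)) / (1 + y) ** t
          for t in range(1, n + 1)]
    return sum(t * v for t, v in zip(range(1, n + 1), pv)) / p

def convexity(face, coupon_rate, y, n, bp=0.0001):
    """10^8 x (dP-/P + dP+/P) for a one-basis-point yield shift."""
    p = bond_price(face, coupon_rate, y, n)
    dp_minus = bond_price(face, coupon_rate, y + bp, n) - p  # loss from +1 bp
    dp_plus = bond_price(face, coupon_rate, y - bp, n) - p   # gain from -1 bp
    return (dp_minus / p + dp_plus / p) / bp ** 2            # 1/bp^2 = 10^8
```

For the 7.5% coupon bond at a 7.5% yield, this sketch reproduces the table values to rounding: a Macaulay duration of about 10.959 at 20 years, and convexities of roughly 109.6, 155.1, and 195.6 at maturities of 15, 20, and 25 years.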
Table 88A.1: Convexity for 20 years.

A bond with 20-year maturity and 7.5% coupon sells at an initial yield to maturity of 7.5%. Since the coupon rate equals the yield to maturity, the bond sells at par value, i.e., $1,000.

D = [Σ(t=1..n) t × Ct/(1 + kd)^t] / [Σ(t=1..n) Ct/(1 + kd)^t]

Convexity ≈ 10^8 × [ΔP−/P + ΔP+/P]

Coupon Rate: 7.50%   Coupon Payment: $75   Discount Rate: 7.50%   Bond Price: $1,000.00

Period  Cash Flow  PV Factor (1+kd)^t  PV of Cash Flow  Weight  Duration (t × Weight)
  1     $75        1.0750              $69.77           0.0698  0.0698
  2     $75        1.1556              $64.90           0.0649  0.1298
  3     $75        1.2423              $60.37           0.0604  0.1811
  4     $75        1.3355              $56.16           0.0562  0.2246
  5     $75        1.4356              $52.24           0.0522  0.2612
  6     $75        1.5433              $48.60           0.0486  0.2916
  7     $75        1.6590              $45.21           0.0452  0.3164
  8     $75        1.7835              $42.05           0.0421  0.3364
  9     $75        1.9172              $39.12           0.0391  0.3521
 10     $75        2.0610              $36.39           0.0364  0.3639
 11     $75        2.2156              $33.85           0.0339  0.3724
 12     $75        2.3818              $31.49           0.0315  0.3779
 13     $75        2.5604              $29.29           0.0293  0.3808
 14     $75        2.7524              $27.25           0.0272  0.3815
 15     $75        2.9589              $25.35           0.0253  0.3802
 16     $75        3.1808              $23.58           0.0236  0.3773
 17     $75        3.4194              $21.93           0.0219  0.3729
 18     $75        3.6758              $20.40           0.0204  0.3673
 19     $75        3.9515              $18.98           0.0190  0.3606
 20     $1,075     4.2479              $253.07          0.2531  5.0614
Total                                  $1,000.00        1.0000  10.95908

Coupon Rate   Bond Price   Duration
7.49%         $1,001.02    10.96356
7.50%         $1,000.00    10.95908
7.51%         $998.98      10.9546
8.00%         $950.91

Delta P−: ($1.0187)   Delta P+: $1.0202
Delta P−: the capital loss from a one-basis-point (0.0001) increase in interest rates.
Delta P+: the capital gain from a one-basis-point (0.0001) decrease in interest rates.

Convexity: 155.059

Percentage change in bond price when the interest rate rises to 8%:
Linearity: ΔP/P = −D × Δi/(1 + i) = −0.050972457
Convexity: ΔP/P = −D × Δi/(1 + i) + 0.5 × Convexity × (Δi)^2 = −0.049034219
Actual percentage change in bond price: −0.049090737
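The linear and convexity-adjusted approximations at the bottom of Table 88A.1 can be checked directly. The constants below are taken from the table; the sketch is ours, not the handbook's code:

```python
# Check of Table 88A.1: price change when the yield rises from 7.5% to 8.0%.
D, CX = 10.95908, 155.059   # Macaulay duration and convexity from the table
y, dy = 0.075, 0.005        # initial yield and yield change

linear = -D * dy / (1 + y)              # duration-only (straight-line) estimate
adjusted = linear + 0.5 * CX * dy ** 2  # adds the convexity correction
actual = (950.91 - 1000.00) / 1000.00   # exact repricing, from the table
```

The convexity correction moves the estimate from about −5.10% to about −4.90%, much closer to the actual −4.91% change.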
Table 88A.2: Convexity for 15 years.

A bond with 15-year maturity and 7.5% coupon sells at an initial yield to maturity of 7.5%. Since the coupon rate equals the yield to maturity, the bond sells at par value, i.e., $1,000.

D = [Σ(t=1..n) t × Ct/(1 + kd)^t] / [Σ(t=1..n) Ct/(1 + kd)^t]

Convexity ≈ 10^8 × [ΔP−/P + ΔP+/P]

Coupon Rate: 7.50%   Coupon Payment: $75   Discount Rate: 7.50%   Bond Price: $1,000.00

Period  Cash Flow  PV Factor (1+kd)^t  PV of Cash Flow  Weight  Duration (t × Weight)
  1     $75        1.0750              $69.77           0.0698  0.0698
  2     $75        1.1556              $64.90           0.0649  0.1298
  3     $75        1.2423              $60.37           0.0604  0.1811
  4     $75        1.3355              $56.16           0.0562  0.2246
  5     $75        1.4356              $52.24           0.0522  0.2612
  6     $75        1.5433              $48.60           0.0486  0.2916
  7     $75        1.6590              $45.21           0.0452  0.3164
  8     $75        1.7835              $42.05           0.0421  0.3364
  9     $75        1.9172              $39.12           0.0391  0.3521
 10     $75        2.0610              $36.39           0.0364  0.3639
 11     $75        2.2156              $33.85           0.0339  0.3724
 12     $75        2.3818              $31.49           0.0315  0.3779
 13     $75        2.5604              $29.29           0.0293  0.3808
 14     $75        2.7524              $27.25           0.0272  0.3815
 15     $1,075     2.9589              $363.31          0.3633  5.4497
Total                                  $1,000.00        1.0000  9.4892

Coupon Rate   Bond Price   Duration
7.49%         $1,000.88    9.4917
7.50%         $1,000.00    9.4892
7.51%         $999.12      9.4866

Delta P−: ($0.8822)   Delta P+: $0.8833
Delta P−: the capital loss from a one-basis-point (0.0001) increase in interest rates.
Delta P+: the capital gain from a one-basis-point (0.0001) decrease in interest rates.

Convexity: 109.635
Table 88A.3: Convexity for 25 years.

A bond with 25-year maturity and 7.5% coupon sells at par value, i.e., $1,000, when the yield to maturity is 7.5%. The cash flows below are discounted at 7.49%, at which the bond sells for $1,001.12.

D = [Σ(t=1..n) t × Ct/(1 + kd)^t] / [Σ(t=1..n) Ct/(1 + kd)^t]

Convexity ≈ 10^8 × [ΔP−/P + ΔP+/P]

Coupon Rate: 7.50%   Coupon Payment: $75   Discount Rate: 7.49%   Bond Price: $1,001.12

Period  Cash Flow  PV Factor (1+kd)^t  PV of Cash Flow  Weight  Duration (t × Weight)
  1     $75        1.0749              $69.77           0.0697  0.0697
  2     $75        1.1554              $64.91           0.0648  0.1297
  3     $75        1.2420              $60.39           0.0603  0.1810
  4     $75        1.3350              $56.18           0.0561  0.2245
  5     $75        1.4350              $52.27           0.0522  0.2610
  6     $75        1.5424              $48.62           0.0486  0.2914
  7     $75        1.6580              $45.24           0.0452  0.3163
  8     $75        1.7822              $42.08           0.0420  0.3363
  9     $75        1.9156              $39.15           0.0391  0.3520
 10     $75        2.0591              $36.42           0.0364  0.3638
 11     $75        2.2133              $33.89           0.0338  0.3723
 12     $75        2.3791              $31.52           0.0315  0.3779
 13     $75        2.5573              $29.33           0.0293  0.3808
 14     $75        2.7489              $27.28           0.0273  0.3816
 15     $75        2.9548              $25.38           0.0254  0.3803
 16     $75        3.1761              $23.61           0.0236  0.3774
 17     $75        3.4139              $21.97           0.0219  0.3731
 18     $75        3.6697              $20.44           0.0204  0.3675
 19     $75        3.9445              $19.01           0.0190  0.3609
 20     $75        4.2400              $17.69           0.0177  0.3534
 21     $75        4.5575              $16.46           0.0164  0.3452
 22     $75        4.8989              $15.31           0.0153  0.3364
 23     $75        5.2658              $14.24           0.0142  0.3272
 24     $75        5.6602              $13.25           0.0132  0.3177
 25     $1,075     6.0842              $176.69          0.1765  4.4123
Total                                  $1,001.12        1.0000  11.9895

Coupon Rate   Bond Price   Duration
7.49%         $1,001.12    11.9895
7.50%         $1,000.00    11.9830
7.51%         $998.89      11.9764

Delta P−: ($1.1137)   Delta P+: $1.1157
Delta P−: the capital loss from a one-basis-point (0.0001) increase in interest rates.
Delta P+: the capital gain from a one-basis-point (0.0001) decrease in interest rates.

Convexity: 195.560
Table 88A.4: Properties of convexity.

1. Convexity increases with bond maturity:
   Example A: N = 6,  R = 8%, C = 8%, D = 5,     CX = 28
   Example B: N = 18, R = 8%, C = 8%, D = 10.12, CX = 130

2. Convexity varies with coupon:
   Example A: N = 6, R = 8%, C = 8%, D = 5,    CX = 28
   Example B: N = 6, R = 8%, C = 0%, D = 6,    CX = 36
   Example C: N = ∞, R = 8%, C = 8%, D = 13.5, CX = 312

3. For the same duration, zero-coupon bonds are less convex than coupon bonds:
   Example A: N = 6, R = 8%, C = 8%, D = 5, CX = 28
   Example B: N = 5, R = 8%, C = 0%, D = 5, CX = 25.72
We now illustrate various properties of convexity. Let N = time to maturity, R = yield to maturity, C = annual coupon, D = duration, and CX = convexity. Part 1 of Table 88A.4 shows that as the bond's maturity (N) increases, so does its convexity (CX). As a result, long-term bonds have more convexity (a desirable property) than short-term bonds. This property is similar to that possessed by duration. Part 2 of Table 88A.4 shows that, for the same maturity, zero-coupon bonds have more convexity than coupon bonds, while Part 3 shows that, for the same duration, zero-coupon bonds are less convex than coupon bonds. Further discussion of the interpretations and applications can be found in Saunders and Cornett (2018).
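These properties can be verified numerically. The sketch below is our illustrative code (not the handbook's); it computes CX as the second derivative of price with respect to yield, divided by price, which is the measure the table's CX values correspond to:

```python
# Illustrative check of Table 88A.4 (our code, not the handbook's).
# CX = (d2P/dy2) / P = [sum over t of t(t+1) CF_t / (1+y)^(t+2)] / P.

def cx(n, y, coupon_rate, face=100.0):
    flows = [coupon_rate * face] * n
    flows[-1] += face   # face value repaid with the final coupon
    p = sum(cf / (1 + y) ** t for t, cf in enumerate(flows, start=1))
    d2 = sum(t * (t + 1) * cf / (1 + y) ** (t + 2)
             for t, cf in enumerate(flows, start=1))
    return d2 / p
```

Under these assumptions the sketch reproduces the table's examples: CX ≈ 28 for the 6-year 8% coupon bond, CX ≈ 36 for the 6-year zero, and CX ≈ 25.72 for the 5-year zero that shares the coupon bond's duration of 5.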
Chapter 89

Synthetic Options, Portfolio Insurance, and Contingent Immunization

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

Contents
89.1 Introduction
89.2 Basic Concepts of Portfolio Insurance
89.3 Strategies and Implementation of Portfolio Insurance
  89.3.1 Stop-loss orders
  89.3.2 Portfolio insurance with listed put options
  89.3.3 Portfolio insurance with synthetic options
  89.3.4 Portfolio insurance with dynamic hedging
89.4 Comparison of Alternative Portfolio-Insurance Strategies
  89.4.1 Synthetic options
  89.4.2 Listed put options
  89.4.3 Dynamic hedging and listed put options
89.5 Impact of Portfolio Insurance on the Stock Market and Pricing of Equities
  89.5.1 Regulation and the Brady report
89.6 Empirical Studies of Portfolio Insurance
  89.6.1 Leland (1985)
  89.6.2 Asay and Edelsburg (1986)
  89.6.3 Eizman (1986)
  89.6.4 Rendleman and McEnally (1987)
  89.6.5 Garcia and Gould (1987)
  89.6.6 Zhu and Kavee (1988)
  89.6.7 Perold and Sharpe (1988)
  89.6.8 Rendleman and O'Brien (1990)
  89.6.9 Loria, Pham, and Sim (1991)
  89.6.10 Do and Faff (2004)
  89.6.11 Cesari and Cremonini (2003)
  89.6.12 Herold, Maurer, and Purschaker (2005)
  89.6.13 Hamidi, Jurczenko, and Maillet (2007)
  89.6.14 Ho, Cadle, and Theobald (2008)
89.7 Contingent Immunization and Bond Portfolio Management
89.8 Summary
Bibliography
Abstract

This chapter discusses how futures, options, and futures options can be used in portfolio insurance (dynamic hedging). Four alternative portfolio insurance strategies are discussed in this chapter: (i) stop-loss orders, (ii) portfolio insurance with listed put options, (iii) portfolio insurance with synthetic options, and (iv) portfolio insurance with dynamic hedging. In addition, the techniques of combining stocks and futures to derive synthetic options are explored in detail. Finally, important literature related to portfolio insurance is also reviewed.

Keywords: Synthetic option • Stop-loss orders • Put options • Dynamic hedging • Tail wag the dog • Stock index futures.
89.1 Introduction This chapter discusses how futures, options, and futures options can be used in portfolio insurance (dynamic hedging). In addition, the techniques of combining stocks and futures to derive synthetic options are explored. Portfolio insurance is a strategy that may allow portfolio managers and investors to limit downside risk while maintaining upside potential. In this context, the word insurance is somewhat misleading: portfolio insurance is not a true form of insurance, where the insured pays a premium to someone who accepts the risk of some adverse event. Rather, portfolio insurance is an asset-allocation or hedging strategy that allows the investor to alter the amount of risk he or she is willing to accept by giving up some return. This chapter looks at the basic concept of portfolio insurance, the various alternative methods available to the portfolio manager to hedge the portfolio, the
impact of portfolio insurance on the stock market, the pricing of equity securities, and market regulation, and, finally, empirical studies of portfolio insurance. Section 89.2 discusses the basic concepts of portfolio insurance, and Section 89.3 goes further into detail and discusses strategies and implementation of portfolio insurance. In Section 89.4, we compare alternative portfolio-insurance strategies. The impact of portfolio insurance on the stock market and the pricing of equities is explored in Section 89.5. Empirical studies of portfolio insurance are presented in Section 89.6. Section 89.7 discusses contingent immunization for bond portfolio management. Finally, Section 89.8 summarizes the chapter.

89.2 Basic Concepts of Portfolio Insurance

Portfolio insurance refers to any strategy that protects the value of a portfolio of assets. It can be used for stocks, bonds, or real assets. If the value of the asset declines, the insurance or hedge will increase in value to help offset the decline in the price of the hedged assets. If the price of the asset increases, the insured portfolio will also increase in value, though by less than the asset itself. Table 89.1 illustrates how portfolio insurance works. In this example, the underlying asset is purchased for $95 and $5 is spent on portfolio insurance. The minimum amount that the insured investor can realize is $95, but the uninsured portfolio can fall in value to a low of $75 if the market falls. If the value of the asset increases, the value of the insured portfolio will increase, but at a smaller rate. Figure 89.1 illustrates the profit and loss of the insured and uninsured portfolios. Rubinstein (1985) stated that the portfolio shown in Figure 89.1 has the three properties of an insured portfolio:

(1) The loss is limited to a prescribed level.
(2) The rate of return on the insured portfolio will be a predictable percentage of the rate of return on the uninsured portfolio.
(3) The investments of the portfolio are restricted to a market index and cash, the expected return on the market index is above the expected return from holding cash, and the insurance is fairly priced. This implies that the uninsured portfolio has a higher expected return than the insured portfolio.

Portfolio insurance allows market participants to alter the return distribution to fit investors' needs and preferences for risk. Figure 89.2 shows the effect of insurance on the expected returns of a portfolio. Notice that the
Table 89.1: Mechanics of portfolio insurance: An example.

Initial investment:                            $100
Cost of portfolio insurance:                   −$5
Amount of investment going toward securities:  $95
(Amount invested = $100)

Value of portfolio   Return on uninsured   Value of insured   Net return on insured
at year end ($)      portfolio (%)         portfolio ($)      portfolio (%)
 75                  −25                    95                −5
 80                  −20                    95                −5
 85                  −15                    95                −5
 90                  −10                    95                −5
 95                   −5                    95                −5
100                    0                    95                −5
105                    5                   100                 0
110                   10                   105                 5
115                   15                   110                10
120                   20                   115                15
125                   25                   120                20
130                   30                   125                25
expected return of the uninsured portfolio is greater than the expected return of the insured portfolio, in line with the third property listed above. It should be noted that, in an efficient market with fairly priced insurance, investors will use the insurance until the expected returns on the insured and uninsured portfolios are the same. The uninsured portfolio has greater upside potential as well as greater downside risk, whereas the insured portfolio limits the downside loss to the cost of the hedge. The upside potential of the insured portfolio is always below that of the uninsured portfolio; the cost of the insurance is the lower return on the insured portfolio should prices increase. While some investors would prefer the greater upside potential that the uninsured portfolio offers, risk-averse investors would prefer the limited-risk characteristics that the hedged portfolio offers. In general, portfolio insurance can be thought of as holding two portfolios. The first portfolio can be viewed as the safe or riskless portfolio, with value equal to the level of protection desired. This level is called the floor and is the lowest value the portfolio can have. For certain strategies, the floor can be held constant or allowed to change over time as market conditions or needs
Figure 89.1: Gains and losses of insured and uninsured portfolios: An example.
Figure 89.2: Expected returns on insured and uninsured portfolios.
Figure 89.3: Components of an insured portfolio valued at $1,000.
change. The second portfolio consists of the difference between the total value of the portfolio and the floor, commonly called the portfolio cushion. These assets consist of a leveraged position in risky assets. To insure the portfolio, the cushion should be managed so as never to fall below zero in value, because of the limited-liability property of common stock. Figure 89.3 shows the relationship between the total value of the portfolio, the cushion, and the floor. The actual investment or allocation of the portfolio funds between risky and risk-free assets is determined by changing market conditions or the changing requirements of the portfolio manager. A simple example of changing the mix between risky and risk-free assets in response to market changes offers the opportunity to demonstrate the dynamic nature of portfolio insurance. As shown in Figure 89.3, half the current portfolio is invested in risky assets and half in risk-free assets. The exposure at this point is $500. The cushion is $200. If this is a reasonable relationship that the portfolio manager wishes to maintain, the relationship between the two, defined as the multiple, can be calculated as follows:

Multiple: m = e/c = Exposure/Cushion, (89.1)

m = 500/200 = 2.5. (89.2)

As the market for the risky assets changes, the exposure and the cushion value change, causing a change in the multiple. Given a predetermined trigger or change in multiple, the portfolio manager can trade to restore the balance between the cushion and the exposure. If, for example, the market value of the risky assets rises by 20%, the value of the cushion increases to $300 and the value of the risky assets rises to $600. The total value of the portfolio
Figure 89.4: Components of an insured portfolio rebalanced after a market rise.
is the value of the risky assets plus the value of the risk-free assets ($500 + $600, or $1,100). The value of the floor remains at $800, so that the cushion goes to $300 ($1,100 − $800). The multiple has fallen to

m = e/c = 600/300 = 2. (89.3)
If the target multiple is 2.5, an adjustment must be made by the portfolio manager, who must sell some of the risk-free assets and purchase risky assets until the multiple is 2.5. Hence, the manager needs to rebalance the portfolio so that $750 is invested in risky assets and $350 is invested in risk-free assets. This mix will restore the multiple to the desired level of 2.5:

m = e/c = 750/300 = 2.5. (89.4)
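The rebalancing arithmetic in this example reduces to one rule: set the risky exposure equal to the multiple times the cushion. A minimal sketch (our illustration of the constant-proportion rule, not code from the chapter):

```python
# One CPPI rebalancing step: exposure = multiple * cushion.

def rebalance(total_value, floor, multiple):
    """Return (risky exposure, risk-free holding) after rebalancing."""
    cushion = max(total_value - floor, 0.0)  # cushion must not fall below zero
    exposure = multiple * cushion
    return exposure, total_value - exposure

# Text example: floor $800, target multiple 2.5.
after_rise = rebalance(1100, 800, 2.5)  # portfolio worth $1,100 after a 20% rise
after_fall = rebalance(975, 800, 2.5)   # worth $975 after the subsequent fall
```

After the rise, the rule shifts $150 of risk-free assets into risky assets (to $750); after the fall, it sells risky assets down to $437.50, matching the figures worked through in the text.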
Figure 89.4 shows the portfolio after rebalancing. Increasing the portfolio's risky assets as the market rises allows the manager to participate in the bull market. As long as the market continues its rise, the manager continues to shift assets to the risky-asset portfolio to participate in the market gain. However, when the market turns bearish and begins to go down, the manager needs to sell off risky assets and invest in risk-free assets. For example, if the market declined by 16 2/3% back to its original level at the beginning of this example, the value of the risky assets would be $625 ($750 × 0.8333). At this new level, the multiple would be

m = e/c = 625/175 = 3.57. (89.5)
Figure 89.5: Components of an insured portfolio rebalanced after a market fall.
The target multiple of 2.5 is below the actual multiple of 3.57, so the portfolio manager must sell some of the risky assets and place the proceeds into risk-free assets. The total value of the portfolio has fallen to $975 ($625 + $350). The value of the cushion has fallen to $175 ($975 − $800). In order to have a multiple of 2.5, the risky assets have to be reduced to

m = e/c = e/175 = 2.5, so e = $437.50. (89.6)
Hence, $187.50 of the risky assets must be sold and this amount must be invested in the risk-free assets. The position of the insured portfolio after this rebalancing is shown in Figure 89.5. As the market falls, the portfolio manager sells off risky assets and invests the proceeds in risk-free assets, thereby reducing the exposure to a falling market. In general, this strategy can best be described as “run with your winners and cut your losses.” Underlying this discussion are the assumptions that the rise and fall of the market take place over a time interval long enough for the portfolio manager to rebalance the position, and that the market has sufficient liquidity to absorb the value of the risky assets. This may not always be the case, however. October 19, 1987, commonly called Black Monday, and October 26, 1987, commonly called Blue Monday, as well as October 13, 1989, are examples of a rapid fall in prices in a very illiquid market for risky assets. This general discussion of portfolio insurance is based on an article by Perold (1986). The simplicity of this approach makes it an ideal way to explain the general way of accomplishing portfolio insurance or portfolio hedging. Perold has called this constant-proportion portfolio insurance (CPPI). As has been seen, it involves holding the risk-free asset in an amount equal to the level of protection desired (floor), plus holding the remainder of
the portfolio in a risky asset. The multiple of the risky asset to the cushion is then held in a constant proportion. As will be seen, there are a number of ways of insuring or hedging a portfolio to keep the multiple constant. These are discussed in Section 89.3.

89.3 Strategies and Implementation of Portfolio Insurance

Portfolio insurance allows the investor to participate in the appreciation of value of a risky portfolio while limiting the potential losses of the portfolio. This is very similar to the features available from an investment in options. There are four basic strategies for implementing a portfolio-insurance program: (1) the use of stop-loss orders, (2) the purchase of exchange-traded put options, (3) the creation of synthetic put options, and (4) dynamic hedging using futures contracts. All of these strategies are detailed below.

89.3.1 Stop-loss orders

A stop-loss order is a conditional market order to sell portfolio stock if the value of the stock drops to a given level. For example, if you held an index portfolio when the market index is at $100 and you expected the market to rise, you could limit your downside risk by placing a stop-loss order at $95. If the market fell to $95, your stop-loss order would become a market sell order, and the portfolio would be sold at the prevailing market price. The use of a stop-loss order does not guarantee that you would get exactly $95; you could get more or less. Still, you would begin to liquidate your position at a predetermined level. As you can see, the placement of stop-loss orders is a kind of crude portfolio insurance in that it allows you to make money if the market goes up and cuts your losses (approximately) to a predetermined level should the market fall.
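A toy simulation makes the mechanics, and their main weakness, concrete. The numbers are hypothetical, and the sketch assumes the order fills exactly at the stop level, which real market orders do not guarantee:

```python
# Toy stop-loss simulation (hypothetical paths and fill assumption).

def final_value(index_path, stop):
    """Terminal value of a stopped portfolio: cash after the stop triggers."""
    for level in index_path:
        if level <= stop:
            return stop              # sold at the stop; held as cash thereafter
    return index_path[-1]            # never stopped; tracks the index

dip_and_rebound = [100, 94, 90, 105]  # falls through the stop, then recovers
steady_rise = [100, 101, 103, 105]    # same terminal level, no dip
```

Both hypothetical paths end at the same index level, yet the stopped-out portfolio finishes at 95 while the other finishes at 105: the outcome depends on the path taken, not only on where the market ends up.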
As pointed out by Rubinstein (1985), the major problem with using stop-loss orders to approximate portfolio insurance is the path dependence of this technique; that is, if the market falls, the stop-loss order is executed, and the portfolio is sold, the portfolio manager needs to decide when to get back into the market. The worst thing that could happen would be for the market to rebound immediately after the execution of the stop-loss order and the sale of the portfolio. In such a case, the portfolio would consist of 100% cash. Because of this cash position, the portfolio would not benefit from the increase in stock prices as the market rises. On the other hand, if the market were to continue to fall after the execution of the stop-loss order, the portfolio manager with 100% cash would be in a position enhanced by
the portfolio insurance, that is, 100% cash in a falling market. Hence, the success of the portfolio-insurance strategy is dependent on the subsequent market movement. Ideally, portfolio insurance should work regardless of the subsequent movement of the market; thus, it should be path independent. Clearly, the stop-loss-order strategy is path dependent, and it is therefore not a very useful form of portfolio insurance.

89.3.2 Portfolio insurance with listed put options

Put options can be purchased on a stock index on various exchanges and used for creating an insured portfolio. Table 89.2 lists the indexes and exchanges on which options are available. If the market falls, the drop in the value of the portfolio will be offset by the gain in value of the put option. On the other hand, if the market increases in value, the portfolio will increase in value but the premium paid for the put option will be lost. For example, a portfolio manager with a $100-million portfolio purchases 4,000 Major Market Index (MMI) put options with an exercise price of $250 at a cost of $500 per option. If the market value of the portfolio declines by $10 million by the expiration date of the options and the MMI drops to $225, the portfolio manager can exercise the puts to offset the losses on the stock portfolio. The portfolio manager delivers the 4,000 put options and receives $10 million. Since the puts cost $2 million, the net gain on the puts is $8 million. The $8 million gain on the puts offsets some of the $10 million loss on the portfolio, for a net loss of $2 million. If the portfolio increases in value by $20 million and the MMI closes above $250, the net gain on the portfolio will be

Table 89.2: Listed options on market indexes.

Chicago Mercantile Exchange: S&P 500 Index
Chicago Board Options Exchange: S&P 100 Index, S&P 500 Index
American Stock Exchange: Major Market Index, Computer Technology Index, Oil Index, Institutional Index
New York Stock Exchange: NYSE Composite Index, NYSE Beta Index
Philadelphia Stock Exchange: Utilities Index, Value Line Index, National OTC Index
Pacific Exchange: Financial News Composite Index
Figure 89.6: Gains and losses of insured and uninsured portfolios.
$20 million from the increase in equity value minus the $2 million premium paid for the options, a net gain of $18 million. The portfolio returns for this example are depicted in Figure 89.6. A perfect hedge is usually not possible because the correlation between the market index and the portfolio may not be perfect. This is called the tracking problem. The greater the correlation between the portfolio and the index, the more effective the hedge; the lower the correlation, the less effective the hedge as a portfolio-insurance strategy. In the example, it was assumed that the beta coefficient between the portfolio and the MMI was equal to one. If the beta were in fact not equal to one, more or fewer put contracts would be needed to insure the portfolio: for a beta less than one, fewer puts would be needed, and for a beta greater than one, more puts would be needed. For example, if the portfolio had a beta of 1.20, 20% more put options would be needed to hedge the portfolio. The price change of the market index and the value of the portfolio, as well as the movement of the price of the option on the index, are all
related. The success of a portfolio-insurance strategy depends on the correct determination of the hedge ratio. The hedge ratio is the ratio of the face value of the option contracts used to hedge the portfolio to the face value of the portfolio:

Hedge ratio = (Number of contracts × Face value of contract) / Face value of portfolio.

In discussing the use of put options for portfolio insurance, two characteristics were assumed about the nature of the options used:

(1) They are available with long maturities, or with maturities that match the portfolio manager's investment horizon.
(2) They are exercisable only at maturity; that is, they are European-type options.

Unfortunately, all the index options listed in Table 89.2 are American options. American options can be exercised at any time before expiration and hence have higher values to certain investors, because American options have all the advantages of European options plus the privilege of early exercise. First, then, the portfolio manager who uses American options for portfolio insurance finds that he or she is paying for the early-exercise privilege when in fact he or she does not really need it. Second, listed options are not protected against normal cash-dividend payments. When a firm's stock goes ex-dividend, the price of the stock is expected to fall by the amount of the dividend payment. This expected fall in the price of the stock is not offset by any changes in the option contract; hence, the market price of the option will be affected by the ex-dividend behavior of the stock's price. As a result, options can be used to insure the capital-appreciation component of the stock return but not the dividend component. Third, all listed options have a maximum maturity of nine months, with most of the trading taking place in the near contracts with maturities of three months or less. Given these problems, the usefulness of index put options for portfolio insurance is somewhat questionable.
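The MMI example above follows directly from this hedge ratio. In the sketch below the dollar figures come from the text; the contract multiplier of 100 (one contract covers 100 times the index, so a $250 strike corresponds to a $25,000 face value) is our assumption, chosen so that the numbers tie out:

```python
# Sizing and payoff of the MMI put hedge (beta assumed equal to one).
portfolio_value = 100_000_000
strike, multiplier, premium_per_put, beta = 250, 100, 500, 1.0

# Hedge ratio of one: face value of contracts = beta x face value of portfolio.
n_puts = beta * portfolio_value / (strike * multiplier)

index_at_expiration = 225
payoff = n_puts * multiplier * max(strike - index_at_expiration, 0)
net_gain = payoff - n_puts * premium_per_put
```

A portfolio beta of 1.20 would scale n_puts to 4,800, the "20% more put options" mentioned in the text.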
89.3.3 Portfolio insurance with synthetic options

Rubinstein and Leland (1981) suggest a strategy that replicates the returns on a call option by continuously adjusting a portfolio consisting of stock and a risk-free asset (T-bills, cash). This is called a synthetic call-option strategy; it involves increasing the investment in stock by borrowing when the value of stocks is increasing, and selling stock and paying off the borrowing (or investing in the risk-free asset) when market values are falling. The key variable in this strategy is the delta value, which measures the change in the price of a call
Handbook of Financial Econometrics,. . . (Vol. 3)
Synthetic Options, Portfolio Insurance, and Contingent Immunization
option with respect to the change in the value of the portfolio of risky stocks. For deep-in-the-money options, the delta value will be close to one because a $1 change in the stock value will result in approximately a $1 change in the option value. Thus, to replicate the option with cash and stock, almost one share must be purchased and the amount borrowed will be approximately equal to the exercise price. For deep out-of-the-money options, the value of the delta will be close to zero, and the replicating portfolio will contain very few shares and little or no borrowing. Hence, in its simplest form, the delta value largely depends on the relationship between the exercise price and the stock price. As the market moves to new levels, the value of the delta will change; hence, the synthetic option portfolio must be rebalanced periodically to maintain the proper mix between equity and borrowing or cash. In a similar manner, a portfolio manager can create replicated put options through a combination of selling short the asset and lending. The number of shares sold short is equal to one minus the call's delta value (that is, the absolute value of the put's delta). As the market decreases in value, more of the equity is sold (the short position increases), with the proceeds invested at the risk-free rate. If the market increases in value, money is borrowed to buy the stock and reduce the short position. The logic behind a call-replicating strategy is shown in Figure 89.7, where the exercise price of the option and the current share price are both $100. The straight line starting from the origin and labeled "1 Stock + 0 Borrowing" is the value of an unleveraged long position in the stock. When the stock is worth zero, the position is worth zero; and for every $1 increase in the value of the stock, the value of the position increases by $1.
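The dependence of delta on moneyness described here can be illustrated with the standard Black–Scholes call delta. The chapter gives no code, so the function below and its parameter values are a sketch of ours, assuming a non-dividend-paying stock:

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_delta(S: float, K: float, r: float, sigma: float, t: float) -> float:
    """Black-Scholes delta N(d1) of a European call on a non-dividend stock."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    return norm_cdf(d1)

# Deep in the money: delta near 1 (hold ~1 share, borrow ~the exercise price).
print(bs_call_delta(200, 100, 0.05, 0.2, 0.25))  # close to 1
# Deep out of the money: delta near 0 (hold few shares, little borrowing).
print(bs_call_delta(50, 100, 0.05, 0.2, 0.25))   # close to 0
```

In the replicating portfolio, delta is the number of shares held per option replicated; the put replication holds delta − 1 shares (a short position).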
The straight line that starts from −100 on the profit axis (labeled "1 Stock + $100 Borrowed") is the leveraged position in which one share of stock is owned and the exercise price of the call ($100) is borrowed. The curved line labeled CO is the value of the call option and is an increasing function of the stock price. When the option is deep out of the money, changes in the stock price do not have much effect on the value of the call option (delta is small); hence the line is almost horizontal. However, as the stock price increases, the rate of change in the call price (the slope of CO) increases until it approaches one for an in-the-money call. The slope of CO is the number of shares to be held in the replicating portfolio: at low stock prices, few shares are held; at higher stock prices, more shares are held. The amount of borrowing needed to replicate the portfolio is represented by the dashed line BB′. The intercept of BB′ with the stock-price axis represents the amount of borrowing, and the line segment EF represents the amount of paying off borrowing or holding a risk-free asset. As the stock price rises, the dashed line BB′ pivots counterclockwise, taking on increasing
Figure 89.7: Synthetic call option.
slope and an intercept farther from zero along the vertical axis. As the stock price falls, the BB′ line pivots clockwise, with decreasing slope and an intercept closer to zero. The value C of the call option is shown on the profit axis. Thus, at any stock price, the value of the stock held in the replicating portfolio equals the value of the call C plus the amount borrowed B; the net value of the position (stock minus borrowing) is therefore the call value C. The line CO shows how the value of the insured portfolio reacts to changes in the stock price. As the stock price increases (decreases), the slope of the curved line becomes steeper (flatter) and the amount of borrowing increases (decreases). If the call is in the money at the expiration date, the investor will own one share in the replicating portfolio and owe an amount equal to the exercise price. If the call finishes out of the money, no stock will be owned and the borrowing will be fully repaid. This is equivalent to the position of the call buyer at expiration. Hence, a purchased call position can be replicated by a strategy of buying shares plus borrowing, where shares are bought (sold)
and the borrowing is increased (decreased) as the stock price rises (falls). The accuracy of the replicating strategy depends on four considerations. First, since the strategy may involve frequent trading, transaction costs must be low. Second, it must be possible to borrow whatever amount is required. Third, trading in the stock may not provide continuous prices; there may be jumps or gaps, in which case the strategy will not be able to exactly replicate the price movement of a traded call. Fourth, there may be uncertainty surrounding future interest rates, stock volatility, or dividends. A change in any of these may affect the price of a traded call without an accompanying change in the stock price; the value of the replicated option would then remain unchanged while the traded option price changed. Figure 89.8 shows a synthetic put position in which the stock and exercise prices are $100. As the stock value increases (decreases), the slope of a line tangent to the put-value curve becomes flatter (steeper) and the number of shares sold short in the synthetic put decreases (increases). The intercept on the value axis gives the amount of lending. The lending position increases as the stock declines in value because more proceeds are available from the short sale of the stock. If the put is in the money at expiration, the investor will be short one share of the stock (worth, say, $80) and will simultaneously lend
Figure 89.8: Synthetic put option.
an amount equal to the exercise price, $100. The total value of this position will equal the value of the listed put, $20.
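The in-the-money synthetic put just described can be checked numerically. A minimal sketch using the chapter's $80 stock / $100 exercise-price example (the function names are ours):

```python
def synthetic_put_value(stock_price: float, exercise_price: float) -> float:
    """Value at expiration of the synthetic put: a short position in one
    share (a liability equal to stock_price) plus lending of exercise_price."""
    return exercise_price - stock_price

def listed_put_payoff(stock_price: float, exercise_price: float) -> float:
    """Payoff of a listed put at expiration."""
    return max(exercise_price - stock_price, 0.0)

# The chapter's example: stock at $80, exercise price $100.
print(synthetic_put_value(80, 100))  # 20
print(listed_put_payoff(80, 100))    # 20.0
```

For any in-the-money stock price at expiration, the two values coincide, which is the sense in which the short-stock-plus-lending position replicates the put.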
89.3.4 Portfolio insurance with dynamic hedging

Portfolio insurance in this form is a method of hedging a portfolio of stocks against market risk by short selling stock-index futures. The technique is frequently used by institutional investors when the market direction is uncertain or volatile: short selling index futures can offset market downturns, but it also limits gains. Various financial instruments such as equities, debt, and derivatives are combined in such a way that degradation of the portfolio's value is limited. It is a dynamic hedging strategy that uses stock-index futures and implies periodically buying and selling securities in order to maintain a floor on the portfolio's value. Its working is akin to buying an index put option, and it can also be implemented using listed index options. The technique, invented by Hayne Leland and Mark Rubinstein in 1976, is often associated with the October 19, 1987, stock-market crash. As a practical matter, rather than buying, selling, or short selling stocks, a portfolio manager may trade (buy or sell) stock-index futures to adjust the exposure of a portfolio. If the stock market is declining, futures contracts are sold to effectively reduce the position in the portfolio. The selling of stock-index futures has the same effect on the portfolio as the sale of stocks and the investment of the proceeds in a risk-free asset. As the market turns around and begins to go up in value, the portfolio manager purchases futures contracts to cover the short position by liquidating the investment in the risk-free asset. This procedure is called dynamic hedging. Whether it works depends upon a high degree of correlation between the value of index-futures contracts and the value of the underlying index. For most time periods, there is a very high degree of comovement between the index and the futures contract. However, this may not always be the case.
Suppose that a portfolio manager wishes to insure a portfolio of well-diversified stocks with a value of $1 million, and the portfolio beta is 1.1 when measured relative to the S&P 500. (This means that if the S&P 500 goes up or down by 1%, the value of the portfolio changes by 1.1%.) If the portfolio manager is worried about a market decline, he or she can insure the portfolio by selling S&P 500 index futures contracts; the fall in the value of the portfolio would then be offset by gains in the futures market. If the strategy for portfolio insurance incorporates the beta, the following relationship is necessary to determine
the number of contracts required to insure the portfolio:

    Number of contracts = (Value of portfolio / Value of futures contracts) × β.    (89.7)

Equation (89.7) is a hedge ratio that does not consider the risk-return tradeoff or the degree of risk aversion; hedge ratios that incorporate these factors can be found in Chen et al. (2019). Assume the S&P 500 index is selling at $110, and the portfolio is worth $1 million on day 1. If the market falls by 10%, the value of the portfolio falls 11% to $890,000, the value of the S&P 500 futures contract falls to $99, and the scenario is as follows. (1) The portfolio has lost $110,000. (2) Each S&P 500 futures contract sold can be repurchased for $99 × 500 = $49,500, for a gain of $5,500 ($55,000 − $49,500). In order to have a perfect hedge, the portfolio manager would need $110,000/$5,500 = 20 contracts. Using equation (89.7), we see that adjusting the number of contracts for the volatility or beta of the portfolio yields

    Number of contracts = ($1,000,000 / ($110 × 500)) × 1.1 = 20 contracts.
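Equation (89.7) and the worked example can be verified in a few lines. This is a sketch; the function name is ours, and the contract multiplier of 500 is the one used in the example:

```python
def num_contracts(portfolio_value: float, futures_price: float,
                  multiplier: int, beta: float) -> float:
    """Equation (89.7): (portfolio value / value of one futures contract) * beta."""
    return portfolio_value / (futures_price * multiplier) * beta

n = round(num_contracts(1_000_000, 110, 500, 1.1))
print(n)  # 20

# Check the hedge: the market falls 10%, so the beta-1.1 portfolio falls 11%.
portfolio_loss = 1_000_000 * 11 // 100          # $110,000
futures_gain = n * (110 - 99) * 500             # 20 contracts x $11 x 500
print(portfolio_loss == futures_gain)  # True
```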
This result of a perfect hedge is dependent upon two critical assumptions. (1) The relationship of the volatility of the portfolio to the volatility of the index remains the same or the beta does not change. (2) The movement of the price of the futures contract is in lock step with the movement of the index; they are perfectly correlated. In spite of the problems for portfolio insurance associated with the validity of these critical assumptions, the use of futures contracts for portfolio hedging is widespread. The main reason why futures have been accepted as a trading vehicle for portfolio insurance is the low trading costs. This and the following reasons have led to rapid growth in the use of futures contracts for insuring portfolios during the early part of the 1980s. (1) Low trading costs allow for rapid and frequent adjustments in implementing the portfolio strategy. (2) Futures markets are liquid enough for insurance programs for large institutional portfolios. (3) The use of futures allows for independence between management of the risky assets in the portfolio and the portfolio insurance.
89.4 Comparison of Alternative Portfolio-Insurance Strategies

This section compares alternative portfolio-insurance strategies — that is, synthetic options, listed put options, and dynamic hedging.

89.4.1 Synthetic options

As previously discussed, synthetic options are not options but a strategy for allocating assets that replicates the return on a call option. There are four advantages to using a dynamic-hedging strategy instead of a synthetic-option approach. First, futures contracts have lower transaction costs than stocks. Kidder, Peabody reports that index funds pay one-way transaction costs of $125 per $100,000 of stock, or 0.125% (assuming a commission of $0.05 per share at an average price of $40), and institutional clients pay approximately $0.06 per share (Hill, 1987). In contrast, one-way futures transaction costs for institutional investors are $15 per $100,000, or 0.015%. Thus, the transaction costs for futures are one-eighth to one-tenth of the cost of trading stocks, thereby reducing the cost of dynamic hedging compared with synthetic options. Second, liquidity is much greater in the futures market than in the options or stock markets. For example, during the large downside movement in the stock market on September 11–12, 1986, the consulting arm of Leland O'Brien Rubinstein Associates sold more than $500 million of stock-index futures (Anders, 1986); such large transactions could not be made in the options markets. The third advantage is that futures contracts are highly levered, so less cash is necessary to carry out a dynamic-hedging strategy: as little as $2,000 can control each futures contract. The fourth advantage is that because stocks are not sold, the management of the stock portfolio is independent of the portfolio insurance. In contrast, managers must be active when implementing a synthetic option because the underlying stock is bought or sold to replicate the returns on an option.
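The transaction-cost comparison can be checked quickly; a sketch using the Hill (1987) figures quoted above:

```python
# One-way transaction costs as cited above (Hill, 1987).
stock_cost_pct = 0.125    # $125 per $100,000 of stock traded
futures_cost_pct = 0.015  # $15 per $100,000 of futures traded

ratio = stock_cost_pct / futures_cost_pct
print(round(ratio, 1))  # 8.3 -- consistent with "one-eighth to one-tenth"
```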
The main disadvantage of dynamic hedging relative to synthetic options is the tracking error that might occur because the futures are not perfectly correlated with the stock; this is not a problem with synthetic options because the stock itself is purchased or sold. Another disadvantage of implementing either a dynamic-hedging or a synthetic-option strategy is that both require continuous monitoring to ensure that the delta of the replicated put equals the delta of the theoretical option. The replicated put must be rebalanced at certain intervals to
maintain a proper mix of stock and futures. Here a tradeoff results: the more frequent the rebalancing, the better the replication of the theoretical put; however, more frequent rebalancing also results in higher transaction costs. Thus, determining when to rebalance the portfolio is a major decision when implementing a synthetic-option or dynamic-hedging strategy in the presence of transaction costs. There might also be a price advantage or disadvantage to a dynamic-hedging strategy due to futures mispricing. When shorting futures for hedging purposes, the manager hopes that the futures sell at a premium and that the basis (the cash price minus the futures price) widens. By selling futures when they are expensive and buying them back at a lower price, a profit results on the hedge. On the other hand, losses on the hedge will probably result if "cheap" futures are sold for hedging purposes. The following examples illustrate these concepts. Suppose that a manager holds a portfolio worth $250 and shorts futures on the same portfolio that are worth $260. The basis is $250 − $260 = −$10. Because the basis must equal zero at expiration, the negative basis will have to increase. In this example, the investor makes a profit of $10 on the hedge, regardless of how the portfolio performs. The profit is a result of selling "expensive" futures and waiting for the basis to increase (see Table 89.3). In contrast, suppose that an investor with a portfolio worth $250 shorts futures worth $245. In this example, the investor will lose money if he waits until expiration, because the basis will decrease (see Table 89.4). Cheap futures will increase the costs of implementing a dynamic-hedging strategy, and expensive futures will reduce the costs: the dynamic hedger hopes to sell futures at a premium and buy them back at a discount.
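The two hedging examples just given (and Tables 89.3 and 89.4) can be reproduced with a short sketch; the only assumption is the one the chapter makes, namely that the futures price converges to the stock price at expiration:

```python
def hedge_pl(stock_start: float, futures_start: float, stock_end: float) -> float:
    """P/L of holding the stock and shorting one futures contract until
    expiration, when the futures price converges to the stock price
    (i.e., the basis goes to zero)."""
    stock_pl = stock_end - stock_start
    futures_pl = futures_start - stock_end  # short at futures_start, buy back at stock_end
    return stock_pl + futures_pl

# Table 89.3: "expensive" futures ($260 against $250 of stock) lock in +$10.
print(hedge_pl(250, 260, 270), hedge_pl(250, 260, 230))  # 10 10
# Table 89.4: "cheap" futures ($245) lock in -$5.
print(hedge_pl(250, 245, 270), hedge_pl(250, 245, 230))  # -5 -5
```

Note that the locked-in amount is exactly the initial basis, independent of where the stock finishes.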
The synthetic-call approach is usually used when implementing a portfolio-insurance strategy without futures. Thus, more of the risky asset is purchased when performance is favorable and the stock is sold when the
Table 89.3: Profit or loss of expensive futures.

Value of stock at       Value of futures   Profit or loss   Profit or loss   Total profit
expiration of futures   at expiration      on stock         on futures       or loss
270                     270                +20              −10              +10
230                     230                −20              +30              +10
Table 89.4: Profit or loss of cheap futures.

Value of stock at       Value of futures   Profit or loss   Profit or loss   Total profit
expiration of futures   at expiration      on stock         on futures       or loss
270                     270                +20              −25              −5
230                     230                −20              +15              −5
asset declines. When implementing a dynamic-hedging strategy, the underlying stocks are not bought or sold; rather, stock-index futures are shorted to create a put option on the portfolio. In both cases insurance is created for the portfolio, but the methods are different: in the synthetic-option case, the underlying portfolio behaves like a call option; in the dynamic-hedging case, the portfolio is not altered, and only the futures position is changed to create a put option. A further discussion of the synthetic-call approach can be found in Becker (1988).

89.4.2 Listed put options

Portfolio insurance with listed put options does not require continuous monitoring because the delta of the listed option changes automatically when the price of the underlying asset changes. Because of this automatic adjustment of the delta value, a listed-put strategy requires less monitoring and lower trading frequency than a synthetic-option or dynamic-hedging strategy. There are, however, a number of problems with using stock-index puts for portfolio insurance. First, because index options have at most a nine-month life, an investor may have to buy a sequence of these short-term options even though the planning horizon might be much longer. These cumulative purchases of puts would cost much more than a single longer-term option: for a long-term one-year put to finish in the money, the market price must be below the exercise price after one year, whereas a short-term option has positive value whenever the market price finishes below the exercise price within its nine-month life. Second, the purchaser of portfolio insurance is not interested in exercising the option early, because doing so would leave the portfolio uninsured. The purchaser of insurance would prefer European options over American options because European options are cheaper. (However, the most popular index option, the OEX S&P 100, is American.) This early-exercise feature is of
no value to the purchaser of portfolio insurance and adds to the cost of the insurance. The third problem is that managers of large portfolios cannot purchase put options for insurance purposes because options markets do not provide the liquidity necessary to purchase large amounts of puts. Low liquidity might be even more of a problem for deep in- or out-of-the-money options or for options in the six- to nine-month range of maturities. As with futures, there might also be a tracking problem between the stock or stocks held and the instrument underlying the option. A replicated-option approach requires more management than a listed-put strategy, but the dynamic-hedging approach uses the highly liquid futures market to rebalance the portfolio, whereas managers of large portfolios cannot purchase listed puts for portfolio insurance because of the lack of liquidity in the options market. Futures mispricing will increase or decrease the cost of dynamic hedging: if the hedger shorts expensive futures, profits will result from the hedges; if cheap futures are shorted, losses will be incurred.

89.4.3 Dynamic hedging and listed put options

The first difference between dynamic hedging and listed put options for portfolio insurance is that the delta value of a put option changes automatically, while it must be adjusted continuously in a dynamic-hedging framework. Thus, dynamic hedging requires continuous monitoring and more frequent trading than listed puts. Second, insurance costs (the premium paid for the option) for a listed-put strategy are known and paid up front; after the put is purchased, no further adjustments or monitoring are necessary. In contrast, the costs of a dynamic-hedging strategy are unknown at the beginning of the period and are realized as the strategy is implemented: the cost is the forgone profit that results from shorting futures.
For example, suppose that a portfolio consists of $100,000 of stock and that $50,000 of futures contracts are shorted to replicate a put option (assume that this is one futures contract). In the second period, the value of the portfolio has increased to $120,000, whereas the value of the futures contract has appreciated to $60,000. If no futures were held, the profit on the portfolio would be $20,000; however, the dynamic hedger lost $10,000 on the futures (see Table 89.5). Thus, the cost of the dynamic-hedging strategy is the gain on the insured portfolio minus the gain on the uninsured portfolio ($10,000 − $20,000 = −$10,000).
Table 89.5: Dynamic hedging.

           Portfolio   Futures   Profit on    Loss on    Profit on insured
           value       value     portfolio    futures    portfolio
Period 1   100,000     50,000
Period 2   120,000     60,000    20,000       −10,000    10,000
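The cost calculation in Table 89.5 can be reproduced as follows (a sketch using the table's figures: $100,000 of stock and one short futures contract initially worth $50,000):

```python
# Figures from Table 89.5 (both the stock and the futures rise over the period).
portfolio_0, portfolio_1 = 100_000, 120_000
futures_0, futures_1 = 50_000, 60_000

uninsured_gain = portfolio_1 - portfolio_0      # +20,000 with no hedge
futures_pl = futures_0 - futures_1              # -10,000: the short loses as futures rise
insured_gain = uninsured_gain + futures_pl      # +10,000 on the insured portfolio

# Cost of insurance = gain on insured minus gain on uninsured portfolio.
print(insured_gain - uninsured_gain)  # -10000
```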
Another difference is that a listed-option strategy is confined to fixed-interval exercise prices, whereas a dynamic-hedging strategy can be implemented around any exercise price. Changing volatility can also affect the cost of a portfolio-insurance strategy. For listed options, the cost of put protection is effectively higher if volatility subsequently declines, because the purchaser of the put locks in the volatility at the time of purchase: if volatility declines, the put value declines. Conversely, an increase in volatility lowers the effective cost of put protection because the price of the put increases. With a dynamic-hedging strategy, costs are lowered if volatility declines, because the delta value will be reduced and fewer futures will be sold. Finally, a replicated-option approach can be used for any asset: portfolios of stocks, various types of bonds, agricultural commodities, currencies, or metals. Since options markets are limited, replicated options must be created to simulate the returns on these investments.

89.5 Impact of Portfolio Insurance on the Stock Market and Pricing of Equities

Portfolio insurance is not formal insurance in the sense of a guarantee against loss. Rather, it is a method of hedging a portfolio by selling either a certain portion of the risky assets themselves or futures contracts on stock indexes when the market falls. Both techniques ultimately have the same effect, but index futures tend to be more popular because they entail smaller transaction costs. When the stock market rose almost uninterruptedly from December 1986 to August 1987, the hedge was hardly used, as investors participated in the bull market. But when stocks began to fall in the autumn of 1987, investors started selling futures, usually contracts tied to the S&P 500 index, thereby making up losses from falling prices with gains on the futures contracts. The portfolio insurer sells futures contracts at index levels that, in a falling
market, are higher than they will be later. The money from the sale of the futures is then invested in money-market instruments until it is needed to buy back the futures contracts at lower prices, when the market has bottomed out. Consider the following sequence of events involving the linkages between the sale of futures contracts for portfolio insurance and the market index.

(1) Your portfolio, valued at $100, is equally divided between the risk-free asset and risky assets. The market index is 1000, and there is no futures discount from the index level.

(2) The market declines to an index of 900, and futures trade at a discount from the index of 10. As the market declines, portfolio insurance requires reducing your investment in the risky asset. This can be done by selling stock directly or by selling futures on the index. You increase your investment in the risk-free asset to $55, leaving your investment in the risky asset as long stocks $45 and short index futures $5.

(3) Market arbitrageurs can now take advantage of mispricing between the index level and the price of futures, because the investor's selling of futures contracts has increased the discount between the futures price and the market index level: the discount has gone from 0 to −10. These so-called program traders begin to sell stocks and buy index futures to lock in a risk-free return. This causes the market index to fall; and in a falling market, the investor will want to reduce his exposure to the risky asset further by selling futures and investing the funds in the risk-free asset. This action will again widen the discount between the index and the futures value and again encourage program traders to execute arbitrage trades. Once the cycle starts, it may be difficult to stop.

Could the scenario described in the last few paragraphs ever really take place? Unfortunately, the answer is yes.
On Monday, October 19, 1987 — ever after to be known as Black Monday — this is precisely what happened on the New York Stock Exchange. Figures 89.9–89.11 show the precipitous drop of 508 points in the Dow Jones Industrial Average that took place on Black Monday. Figure 89.10 shows that for most of the day futures were selling at a discount to the index; what is more revealing is that every time the discount widened, the index started to fall faster. The bottom panels of Figures 89.10 and 89.11 show the percentages of trading volume for futures and the NYSE. It is apparent that a substantial amount of the record trading was being done by institutions trying to implement their portfolio-insurance strategies. What is especially unfortunate about Black Monday is that no
Figure 89.9: Daily movement of the Dow Jones industrial average, October 19–22, 1987. Source: Knight-Ridder Tradecenter, 1988. Courtesy of Knight-Ridder Financial Information.
Figure 89.10: S&P Index and futures contracts, Monday, October 19, 1987. Source: The Washington Post, October 9, 1988. Copyright © 1988 by The Washington Post.
Figure 89.11: Dow Jones industrial one-minute chart, Monday, October 19, 1987. Source: The Washington Post, October 9, 1988. Copyright © 1988 by The Washington Post.
one foresaw what would happen if every institution implemented a similar portfolio strategy simultaneously. Even the generally liquid NYSE could not handle the crushing volume of sell orders generated by portfolio-insurance strategies. More recently, Blume et al. (1989) have documented the breakdown in the linkage between futures prices and the spot index on October 19 and 20, 1987, as well as breakdowns in the linkage among NYSE stocks.

89.5.1 Regulation and the Brady report

In the aftermath of the October 1987 crash, the role of regulation in both the futures and the stock markets was hotly debated. Figure 89.12 shows the relationships of Congress, the federal agencies, and the Federal Reserve with the two leading futures and stock exchanges. Some have argued that the separate regulation of futures and stocks contributed to the severity of
Figure 89.12: Regulating the markets. Source: The Washington Post, October 9, 1988. Copyright © 1988 by The Washington Post.
the October 1987 crash. The role of portfolio insurance and index arbitrage in linking the futures and stock markets has brought into question the separation of regulatory authority. At present, what form the future regulatory structure will take is a matter of conjecture. One reasoned answer has been presented by the Brady Commission report, which is likely to provide the framework for the debate over how to regulate US security markets. The Brady report makes five primary recommendations.

(1) Information systems should be established to monitor transactions and conditions in related markets.
(2) Clearing methods should be unified across marketplaces to reduce financial risk.
(3) Margins should be made consistent across marketplaces to control speculation and financial leverage.
(4) One agency should coordinate the few but critical regulatory issues that have an impact across the related market segments and throughout the financial system. The Federal Reserve is considered a reasonable candidate for this role.
(5) Circuit-breaker mechanisms such as price limits and coordinated trading halts should be formulated and implemented to protect the market system.

How many of these recommendations will be acted upon in restructuring the futures and stock markets remains to be seen. However, it is clear that practices such as portfolio insurance, which link different markets together, have forced study of the problems of a segmented regulatory system. In the paper entitled "Stock Index Futures: Does the Tail Wag the Dog?", Finnerty and Park (1987) discuss in detail how index futures trading can affect the spot market.

89.6 Empirical Studies of Portfolio Insurance¹

Much of the early research on portfolio insurance focused on simulation models to see whether synthetic-option strategies could successfully limit losses on portfolios of risky assets.
In the early to mid-1980s, most of the empirical research on portfolio insurance was conducted by financial institutions such

_______________
¹Part of this section is from "Portfolio Insurance Strategies" by L. C. Ho, in the Encyclopedia of Finance, 2nd edn (forthcoming, 2012).
as Drexel Burnham Lambert and Salomon Brothers. More recently, however, articles on the subject have appeared in journals such as the Financial Analysts Journal and the Journal of Portfolio Management.

89.6.1 Leland (1985)

Leland (1985) developed a replication strategy in the presence of transaction costs and tested the approach by simulation. As Leland pointed out, because transaction costs increase without limit as the revision period becomes shorter, it may be very costly to assure a given level of accuracy in the synthetic option. In addition, "transaction costs associated with replicating strategies are path-dependent: they depend not only on the initial and final stock prices, but also on the entire sequence of stock prices in between." Another problem with trying to create portfolio insurance in the presence of transaction costs is that the transaction costs are correlated with the market. Leland poses the question, "In the presence of transaction costs, is there an alternative to the Black–Scholes replicating strategy which will overcome these problems?" Leland proposed a procedure that adjusts the volatility estimate used in the replicating procedure upward; the modification is a positive function of transaction costs and the replication period. The increase in the volatility estimate reflects the fact that each time a transaction occurs, the associated trading costs make the purchase cost of the stock higher, and the selling proceeds lower, than they would be without these costs.
“This accentuation of up or down movements of the stock price can be modeled as if the volatility of the actual stock price was higher.” The adjustment to volatility is

    σ_A² = σ² [1 + √(2/π) · k/(σ√t)],    (89.8)

where σ_A² is the transaction-cost-adjusted volatility; σ is the annualized standard deviation of the natural logarithm of the price; k is the round-trip transaction cost as a proportion of the volume of the transaction; and t is the revision interval as a proportion of a year. The modification to the volatility estimate has the following characteristics.
(1) Transaction costs remain bounded as the revision period becomes short.
(2) The strategy replicates the option return inclusive of transaction costs with an error that is uncorrelated with the market and approaches zero as the revision period becomes short.
(3) Expected turnover and associated transaction costs of the modified replicating strategy can easily be calculated, given the revision interval.

Since the error inclusive of transaction costs is uncorrelated with the market, these transaction costs put bounds on the option prices. The delta value of this strategy takes transaction costs into account and will be higher than without transaction costs. As can be seen from equation (89.8), the volatility estimate increases as the revision period and transaction costs increase. Leland believes that the modified replication strategy is superior to following the Black–Scholes strategy and paying the resulting transaction costs because the expected error is negative, “reflecting the fact that transaction costs are not ‘covered’ by the initial option price.” He assumes that the Black–Scholes strategy will result in higher turnover than the modified strategy because the latter “is based on a higher volatility which tends to ‘smooth’ the required trading.”

Leland uses a simulation to determine the effectiveness of this strategy. The risk-free rate in the simulation is 10%; the expected return for the stock is 16%, with a standard deviation of 20%. Leland uses rebalancing periods of one week, four weeks, and eight weeks to replicate options expiring in three months, six months, and twelve months. Here, ΔH is the difference between the listed call and the synthetic option after the revision period, Δt:

    ΔH = ΔC − [C_s ΔS + r (C − C_s S)],    (89.9)

where ΔC is the change in the listed call option over the revision period, C_s is the delta value (the number of shares in the replicating portfolio), ΔS is the change in the stock price, and r is the interest rate over the revision period. If the replicated option perfectly matches the changes in the listed option, ΔH will equal zero.

For each of the revision periods, the mean and standard deviation of the errors are calculated. Using simulated data with no transaction costs, the Black–Scholes strategy replicates the option with mean errors equal to zero for a revision period of one week. The standard deviation increases as the time until expiration decreases. As expected, the mean error increases as the revision period increases, with the greatest error occurring for the eight-week rebalancing period.

Leland uses his modification to the volatility estimate to calculate the accuracy of option replication with transaction costs. He compares the errors from the modified strategy with a Black–Scholes replicating scheme in which delta values and borrowing amounts are derived from the Black–Scholes
model and then transaction costs are added on. In every case, the accuracy of the modified strategy exceeds that of the Black–Scholes design when round-trip transaction costs of 1% are used.

89.6.2 Asay and Edelsburg (1986)

Asay and Edelsburg used Monte Carlo simulations to determine how closely a synthetic-option strategy can match the returns on options on Treasury-bill futures. They state that there are three objections to a synthetic-option plan. First, changing delta values mean that positions must be continuously adjusted for the returns to be similar. However, rebalancing may not be possible if prices change overnight. If the position is not adjusted, too few or too many shares will be held in the underlying asset and the returns will not be the same as in the option position. The second problem with a synthetic strategy is that transaction costs may be significant, particularly if the rebalancing period is short. The third problem is that the position could be whipsawed. Whipsawing occurs when the underlying asset increases enough to trigger rebalancing; after more shares are added, the underlying asset decreases in value and the additional shares are sold at a lower price than was paid for them. A common remedy for this problem is to use a larger adjustment gap, or filter rule; however, a larger filter could leave the wrong number of shares held, particularly if the stock moved in a linear manner. Whipsawed positions commonly occur when the asset fluctuates around a constant level.

Asay and Edelsburg used a Monte Carlo simulation of an option on a Treasury-bill future to determine if these and other problems would cause a divergence between the theoretical and synthetic positions. Profits and losses on the synthetic option are compared with those that would be earned through the purchase and sale of the option at the theoretical price if it were held to maturity.
The profit on the replicated option is equal to the value of the option at maturity minus the price paid for the option. They use adjustment gaps of 5–50 basis points in the simulated one-year interest-rate futures to trigger rebalancing. Fifty simulations on six-month options were run. The option prices generated by the synthetic approach were very close to those of options purchased at their theoretical values. Larger adjustment gaps proved more accurate when the futures traded in a narrow range, but were “disastrous” when the futures moved in one direction; thus, they were unable to recommend a single best adjustment gap. Asay and Edelsburg found the transaction costs of the replicating strategy to be insignificant.
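The adjustment-gap (filter-rule) logic described above, in which the synthetic position is rebalanced only when the underlying has moved by more than a threshold since the last adjustment, can be sketched in a few lines. This is a minimal illustration, not the authors' simulation: the random-walk price path, the gap sizes, and the use of a Black–Scholes call delta as the hedge ratio are all assumptions.

```python
import math
import random

def bs_call_delta(s, k, r, sigma, t):
    """Black-Scholes call delta, N(d1), used here as the synthetic hedge ratio."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    return 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))

def rebalance_count(path, gap, k=100.0, r=0.05, sigma=0.2, t=0.5):
    """Count adjustments under a filter rule: rebalance only when the price
    has moved by more than `gap` since the last rebalance."""
    last_price = path[0]
    count = 0
    for price in path[1:]:
        if abs(price - last_price) > gap:
            bs_call_delta(price, k, r, sigma, t)  # new hedge ratio would be set here
            last_price = price
            count += 1
    return count

# Illustrative random-walk price path (1% daily log-volatility, fixed seed).
random.seed(0)
path = [100.0]
for _ in range(250):
    path.append(path[-1] * math.exp(random.gauss(0.0, 0.01)))

small_gap = rebalance_count(path, gap=0.05)  # narrow gap: frequent adjustment
large_gap = rebalance_count(path, gap=0.50)  # wide gap: fewer, cheaper adjustments
```

A wider gap lowers turnover (and hence transaction costs) but lets the held delta drift further from the required one, mirroring the accuracy/cost tradeoff the authors report.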
To determine the amount of mispricing due to using an incorrect volatility estimate, they used a “correct” volatility measure for the listed option and an “incorrect” measure for the synthetic option. They found that input of a wrong estimate into the option-pricing model seems “to have only a minimal effect through the resulting incorrect hedge ratio.” The authors concluded that problems such as transaction costs, whipsawing, and frequent rebalancing do not negate the effectiveness of the synthetic option on Treasury-bill futures (Asay and Edelsburg, 1986).

89.6.3 Eizman (1986)

The purpose of Eizman’s paper is to determine which rebalancing strategy is the most effective in a dynamic-hedging portfolio-insurance framework. He considered three rebalancing disciplines: (1) fully rebalance the required stock/cash mix at discrete time intervals; (2) fully rebalance every time the market changes by a specified percentage; and (3) rebalance only when the actual mix lags the required mix by more than a specified lag factor in either direction, “and then rebalance only to the extent of bringing the actual mix up to a lag factor away from the required mix. . .”

Lognormal returns used to generate one-year puts are obtained from a multiyear Monte Carlo simulation. Eizman assumes a standard deviation of 13%, an interest rate of 8%, futures transaction costs of 0.15%, and a dividend yield of 4%. Put premiums and delta values for the synthetic put were generated by the Black–Scholes model. Then average annual transaction costs and average annual replication errors are calculated. In all three strategies, there is a tradeoff between transaction costs and accuracy: the more liberal adjustment strategies had lower transaction costs and higher replication errors. For the discrete-time adjustment strategy, he rebalanced monthly, weekly, semiweekly, daily, and hourly. Transaction costs increased and replication errors decreased as the rebalance periods shortened.
He used a utility function of min(X + Y), where X is the average annual transaction cost and Y is the average annual replication error, to determine the most effective rebalance period. Using this criterion, the best method is the weekly method, with the semiweekly strategy coming in second. For the second strategy, market moves from 5% to 0% were used to trigger rebalancing. Again, larger percentage rebalance triggers lead to lower transaction costs and larger errors. The third discipline, in which the portfolio is rebalanced only if the actual futures/stock mix lags the required mix by more than a prescribed percentage, works the best, according to Eizman. Under this method, the adjustment is made so that the actual mix is brought
up to the 3% boundary and not up to the required mix. The lag that provided the best cost/error tradeoff was 3%.

89.6.4 Rendleman and McEnally (1987)

This study addresses the issue of portfolio-insurance cost. It tries to answer the following questions. (1) What is the probability that the portfolio will earn no more than its insured floor return? (2) What are the expected returns and utilities of return of the insured portfolio versus those of a reasonable alternative strategy? (3) What are the probabilities that the insured portfolio or the alternative will have the higher return? (4) Over the long run, how will the accumulation of value from the insured portfolio stack up against the alternative?

They used a Monte Carlo simulation with a 16% expected return and a 10% interest rate. The Black–Scholes model is used to derive delta values for the put option. In addition, a logarithmic utility function is specified to measure the utility of a portfolio-insurance strategy. Rendleman and McEnally simulate one- and three-year put options using minimum returns of 5%, 0%, and −5%, stock volatilities of 20% and 18%, and interest rates of 10% and 6%. Various simulations are run, and the costs of the put positions are calculated along with the expected utility for the insured portfolio and an uninsured portfolio. Probability distributions are generated to determine the probability that the insured and uninsured portfolios have a return greater than the minimum floor.

They concluded that the “insured portfolio is . . . more likely than the optimal portfolio to achieve only the guaranteed minimum return.” In addition, they found that portfolio-insurance strategies “. . . suffer in comparison with an ‘optimal’ portfolio designed to maximize the rate of growth of portfolio value over time.” Using the various inputs on the simulations, they determined that the insured portfolio can expect to have a lower return than a similar optimal portfolio.
They concluded that portfolio-insurance strategies are optimal only when investors are highly risk averse.

89.6.5 Garcia and Gould (1987)

The main goal of this article is to compute the costs of a synthetic-call portfolio-insurance strategy. They do not deal with dynamic hedging because
“. . . sufficient historical experience is not yet available to simulate meaningfully the results from long equities/short futures implementations.” They use closing prices of the S&P 500 index from January 1, 1963 to December 31, 1983. They generate returns for 240 overlapping years by taking 20 January-to-January returns, then 20 February-to-February returns, and so on. This is the first published portfolio-insurance study to use real data as opposed to simulated data.

The returns for hedged and unhedged portfolios were calculated. Over the period of study, the arithmetic mean of the unhedged S&P 500 returns was 9.63% with a standard deviation of 16.22%; the mean return for a portfolio-insurance strategy with a 0% floor and 0.5% transaction costs was 7.08%, with a lower standard deviation of 9.33%. The mean return for the strategy with a −5% floor was 8.18%, with a standard deviation of 11.80%. Opportunity costs of the portfolio insurance were also calculated: if the return on the unhedged portfolio was 10% and the return on the hedged portfolio was 5%, the opportunity cost is 5%. They conclude that with zero-floor insurance, investors gain 10.27% in a down year, while in an up market hedged portfolios forgo 7.21%. They concluded that the cost of portfolio insurance is 170 basis points for zero-floor insurance and 83 basis points for a −5% strategy, and that a dynamically balanced portfolio will not outperform a buy-and-hold (BH) portfolio.

89.6.6 Zhu and Kavee (1988)

They evaluated and compared the performances of two traditional portfolio-insurance strategies, the synthetic-put approach of Rubinstein and Leland (1981) and the constant-proportion approach of Black and Jones (1987).
They employed a Monte Carlo simulation, assuming log-normally distributed daily returns with an annual mean return of 15%, paired with different values of market volatility, to discover whether these strategies can really guarantee the floor return and how much investors have to pay for the protection. Both strategies are able to reshape the return distribution so as to reduce downside risk and retain a certain part of the upside gains. However, they demonstrated that a certain degree of protection can be achieved only at a considerable cost. There are two types of costs in implementing a portfolio-insurance strategy. The first is the explicit cost, that is, the transaction costs. The other is the implicit cost, which is the average return forgone in exchange for protection against downside risk. When the market becomes
volatile, the protection error of the synthetic-put approach increases, and the transaction costs may be unbearable. Conversely, while the constant-proportion approach may have lower transaction costs, its implicit cost may still be substantial.

89.6.7 Perold and Sharpe (1988)

Using simulated stock and bill prices, they examined and compared how four dynamic asset-allocation strategies, namely, BH, constant mix (CM), CPPI, and option-based portfolio insurance (OBPI), perform in bull, bear, and flat markets and in volatile and not-so-volatile markets. CPPI and OBPI strategies sell stocks as the market falls and buy stocks as the market rises. This dynamic allocation rule represents the purchase of portfolio insurance and has a convex payoff function, which results in better downside protection and better upside potential than a BH strategy. However, these strategies do worse in relatively trendless, volatile markets. In contrast, a CM strategy (holding a constant fraction of wealth in stocks) buys stocks as the market falls and sells them as it rises. This rebalancing rule effectively represents the sale of portfolio insurance and has a concave payoff function, which provides less downside protection and less upside potential than a BH strategy. However, it does best in relatively trendless and volatile markets.

They suggested that no one particular type of dynamic strategy is best in all situations. Financial analysts can help investors understand the implications of various strategies, but they cannot, and should not, choose a strategy without a substantial understanding of an investor’s circumstances and desires.

89.6.8 Rendleman and O’Brien (1990)

They addressed the issue that misestimation of the volatility input can have a significant impact on the final payoff of a portfolio using a synthetic-put strategy. In an OBPI strategy, the daily portfolio adjustments depend on the delta of the put option on the risky asset.
In the original Black and Scholes (1973) valuation equation, the delta at each date is a function of the following variables:

    D_t = f(S_t, k_0, r_0, σ_0, T),    (89.10)

where S_t is the price of the risky asset at date t, k_0 the strike price, r_0 the annual continuously compounded riskless rate of interest, σ_0 the ex ante
volatility parameter at the beginning of the insurance period, and T the maturity of the option. Assume that at the beginning of the insurance period, one manager predicts a volatile market and uses a 20% annualized volatility, while a second manager expects a calm market and uses a 10% volatility. The (absolute) put delta for the first manager is higher, which means he would buy more insurance and allocate less to the risky asset. Assume that the ex post volatility turns out to be 10%. The second manager will have made the proper allocation between risky and riskless assets; in contrast, the high-estimate manager will have invested too little in the risky asset and would miss the opportunity to participate in the price appreciation of a strong market. Thus, a manager who underestimates volatility will typically end up buying less insurance than is necessary to ensure a given return, while a manager who overestimates volatility will buy more insurance than is necessary and forgo gains.

As for the issue of portfolio rebalancing, they examined adjustment frequencies by time interval, namely, daily, weekly, monthly, and bimonthly. The effect (error) is measured as the difference in horizon insurance values between non-continuous and continuous trading. They suggested that weekly rebalancing produces an amount of error that appears tolerable to most portfolio managers.

More importantly, they addressed the biggest potential risk of implementing portfolio-insurance strategies, the gap risk, by simulating the performance of the OBPI strategy over the period of the October 1987 market crash. They indicated that most insured portfolios would have fallen short of their promised values because the managers would not have been able to adjust the portfolio in time before a big drop in the market. The gap risk will be discussed further in the financial stability section.
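The effect of the volatility input on the size of the insurance position can be illustrated with the Black–Scholes put delta; the parameter values below are illustrative assumptions, not those used in the study.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_put_delta(s, k, r, sigma, t):
    """Black-Scholes delta of a European put: N(d1) - 1 (a negative number)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    return norm_cdf(d1) - 1.0

# Two managers insure the same at-the-money position (S = K = 100, r = 5%,
# T = 1 year) but disagree on the volatility input.
delta_low = bs_put_delta(100.0, 100.0, 0.05, 0.10, 1.0)   # 10% volatility estimate
delta_high = bs_put_delta(100.0, 100.0, 0.05, 0.20, 1.0)  # 20% volatility estimate
```

The 20% manager carries the larger (more negative) put delta, i.e., buys more insurance and allocates less to the risky asset; if realized volatility turns out to be 10%, that manager has over-insured and forgoes upside, exactly the misallocation described above.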
89.6.9 Loria, Pham, and Sim (1991)

They simulated the performance of a synthetic-put strategy using futures contracts based on the Australian All Ordinaries Index for the period from April 1984 to March 1989. Their study contains 20 consecutive non-overlapping three-month insurance periods whose expiration dates coincide with the expiration dates of each SPI futures contract traded on the Sydney Futures Exchange. Four implementation scenarios are examined: a zero floor versus a −5% floor, which correspond to portfolio returns of 0% and −5% per annum, respectively; and realized volatility versus Leland’s (1985) modified volatility. They reported that there is no perfect guarantee of
loss prevention under any scenario. Even in the scenario with a −5% floor and modified volatility, two out of 20 contracts do not meet the desired floor. In addition, the OBPI strategy is most effective under severe market conditions. In other periods, characterized by insignificant market declines, the value of the insured portfolio is below that of the market portfolio. They suggested futures mispricing as one potential culprit for this outcome.

89.6.10 Do and Faff (2004)

This empirical paper is an extension of Loria et al. (1991). They examined two approaches (OBPI and CPPI) and conducted simulations across two implementation strategies (via the Australian stock index and bills, and via SPI futures and the stock index). Furthermore, they considered the use of implied volatility as an input into the model, and the use of a zero floor versus a −5% floor. The dataset consists of 59 non-overlapping three-month insurance periods spanning October 1987 to December 2002. Thus, their key contributions relative to Loria et al. (1991) are the examination of a futures-based CPPI, a fine-tuned algorithm that allows for dividend payments, the consideration of ex ante volatility information, and more up-to-date data.

In terms of floor protection, the futures-based portfolio-insurance implementation generally dominates its index-and-bill rival under both floor specifications, which reflects the low transaction costs in the futures market. Furthermore, perfect floor protection is possible when implied volatility is used rather than ex post volatility. From the cost-of-insurance perspective, the futures-based strategy generally induces a lower cost of insurance than its index-and-bill rival under both floor specifications, for the same reason. However, the cost of insurance is higher when implied volatility is used than with ex post volatility.
The possible explanation is that implied volatility is often higher than the same-period ex post volatility, which results in over-hedging. As for the relative performance of the OBPI and CPPI approaches, there is no strong evidence to distinguish between them. Within the futures-based implementation, the synthetic put appears to dominate the CPPI with respect to floor protection, while the latter appears to slightly outperform in terms of upside participation.

They also examined whether portfolio-insurance strategies work under stress by assessing the strategies’ effectiveness during tranquil and turbulent periods. Tranquil periods are defined as ones that return more
than −4% and have a realized volatility of less than 15%; violation of either one or both of these conditions is regarded as indicating a turbulent period. All portfolio-insurance strategies achieve 100% floor protection during tranquil periods, with the futures-based OBPI approach recording the highest portfolio return. During turbulent times, futures-based portfolio insurance continues to perform quite well. The 1987 stock market crash makes the assessment difficult because of trading halts. However, assuming the futures continued to trade during that crisis, from the algorithm’s perspective, the futures-based CPPI maintains a positive return, while the OBPI results in a negative return.

89.6.11 Cesari and Cremonini (2003)

This study is an extensive comparison of a wide variety of traditional portfolio-insurance strategies. There are basically five dynamic asset-allocation strategies: (1) BH; (2) CM; (3) constant proportion (without and with the lock-in of profits, CP and CPL); (4) the option-based approach (with three variations, BCDT, NL, and PS); and (5) technical strategies (with two kinds of stop-loss mechanism, MA and MA2); nine strategies in total are therefore considered. For each strategy, eight risk-return and risk-adjusted performance measures are calculated, namely, mean return, standard deviation, asymmetry, kurtosis, downside deviation, Sharpe ratio, Sortino ratio, and return at risk. The strategies are then compared in different market situations (bear, no-trend, and bull markets) and in different market-volatility regimes (low, medium, and high), taking into account transaction costs and discrete rebalancing of portfolios. The three market situations are defined by whether the market average return falls into the ranges (−30%, −5%), (−5%, +5%), or (+5%, +30%), and the three volatility regimes by whether volatility falls into (10%, 15%), (15%, 25%), or (25%, 30%).
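Two of the risk-adjusted measures in the list above, the Sharpe and Sortino ratios, can be computed as follows; the use of population moments, a zero risk-free rate, and a zero target return are illustrative simplifications, not the paper's exact conventions.

```python
import math

def sharpe_ratio(returns, riskfree=0.0):
    """Mean excess return divided by the (population) standard deviation."""
    excess = [r - riskfree for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((r - mean) ** 2 for r in excess) / len(excess)
    return mean / math.sqrt(var)

def sortino_ratio(returns, target=0.0):
    """Mean excess return over the target divided by the downside deviation,
    which penalizes only returns that fall below the target."""
    mean = sum(returns) / len(returns) - target
    downside = sum(min(r - target, 0.0) ** 2 for r in returns) / len(returns)
    return mean / math.sqrt(downside)

rets = [0.02, -0.01, 0.03, -0.02, 0.01]
sharpe = sharpe_ratio(rets)
sortino = sortino_ratio(rets)  # downside deviation here is 0.01, so the ratio is 0.6
```

Because the Sortino ratio ignores upside variability, an insured (left-truncated) return distribution typically looks better under it than under the Sharpe ratio, which is why both measures are reported.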
Transaction costs are treated in two ways: as a cost proportional to the value traded, and as a correction to Leland’s (1985) option volatility. Two main rebalancing disciplines are used: a time discipline with weekly adjustment, and a price discipline with adjustment only when prices have risen or fallen by 2.5% relative to the previous rebalance. Monte Carlo simulations of MSCI World, North America, Europe, and Pacific stock-market returns show that no strategy is dominant in all market situations. However, in bear and no-trend markets, the CP and CPL strategies appear to be the best choice. In a bull market, or in a no-trend but highly volatile market, the CM strategy is preferable. If the market phase
is unknown, the CP, CPL, and BCDT strategies are recommended. In addition, these results are independent of the volatility level and of the risk-adjusted performance measure adopted.

89.6.12 Herold, Maurer, and Purschaker (2005)

By constructing a fixed-income portfolio in which the risky asset is the JPMorgan government bond index and the riskless asset is cash (the one-month yield), they compared the hedging performance of the traditional CPPI strategy with that of a risk-based (specifically, VaR-based) strategy over one-year investment horizons beginning in each year from 1987 to 2003. CPPI avoids losses in the bear years of 1994 and 1999, but its mean return is inferior to (about 40 bp below) that of the risk-based strategy, and CPPI also produces a higher turnover.

89.6.13 Hamidi, Jurczenko, and Maillet (2007)

Although they proposed a conditional CPPI multiplier determined either by VaR or by expected shortfall (ES), only the VaR-based measure is studied in this empirical work, leaving the ES-based measure for future research. The dataset contains 18 years of daily returns of the Dow Jones Index, from January 2, 1987 to May 20, 2005, 4,641 observations in total. They used a rolling window of 3,033 returns to estimate the VaR, leaving 1,608 VaRs in the out-of-sample period. They considered eight methods of VaR calculation: one non-parametric method using the historical-simulation approach; three parametric methods based on distributional assumptions, namely, the normal VaR, the RiskMetrics VaR based on the normal distribution, and the GARCH VaR based on the Student-t distribution; and four semiparametric methods using quantile regression to estimate the conditional autoregressive VaR (CAViaR), namely, the Symmetric Absolute Value CAViaR, the Asymmetric Slope CAViaR, the IGARCH(1,1) CAViaR, and the Adaptive CAViaR. According to the 1,608 back-testing results, the Asymmetric Slope CAViaR fits the data best.
After the VaR values have been calculated, the conditional multipliers can be determined. The estimated multipliers range between 1.5 and 6, which is compatible with the multiplier values used by practitioners in the market (between 3 and 8). Using the time-varying CPPI multipliers estimated by the different methods, and a “multi-start” analysis in which a fixed one-year investment horizon begins on every day of the out-of-sample period, they found that the final returns of these insured portfolios are not significantly different.
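The link between a VaR estimate and a conditional CPPI multiplier can be sketched as follows. Treating the multiplier as the reciprocal of the estimated worst loss between two rebalances is the standard gap-risk heuristic; the cap, floor, and numbers below are illustrative assumptions, not the authors' estimation procedure.

```python
def conditional_multiplier(var_estimate, m_min=1.0, m_max=8.0):
    """Multiplier implied by a VaR estimate: the cushion survives a loss of
    size `var_estimate` on the risky exposure only if m <= 1 / VaR."""
    return max(m_min, min(1.0 / var_estimate, m_max))

def cppi_exposure(portfolio, floor, var_estimate):
    """Risky exposure = multiplier x cushion, capped at the portfolio value."""
    cushion = max(portfolio - floor, 0.0)
    return min(conditional_multiplier(var_estimate) * cushion, portfolio)

# A 20% estimated worst loss implies a multiplier of 5; with a cushion of 10,
# the risky exposure is 5 x 10 = 50.
exposure = cppi_exposure(portfolio=100.0, floor=90.0, var_estimate=0.20)
```

As the VaR estimate rises in turbulent markets, the implied multiplier falls, which is precisely the time-varying behavior of the conditional multipliers described above.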
89.6.14 Ho, Cadle, and Theobald (2008)

This empirical study presents a complete framework for comparing traditional portfolio-insurance strategies (OBPI, CPPI) with modern risk-based portfolio-insurance strategies (VaR- and ES-based RBPI). By constructing a currency portfolio in which the risky asset is the Australian dollar and the riskless asset is a US-dollar overnight deposit, they compared the dynamic hedging performance of the traditional and the modern strategies over one-year investment horizons beginning in each year from 2001 to 2007.

When implementing the OBPI strategy, the delta is calculated from the modified Black–Scholes formula D_t = f(S_t, k_0, r_t, σ_t, T). That is, daily annualized historical volatilities are used as inputs in the put-replication process, instead of the constant ex ante or implied volatility of the original model, to mitigate the volatility-misestimation problem; moreover, the latter two parameters would make daily portfolio returns more volatile. The interest rates are also updated daily for the same reason. When CPPI is implemented, the possible upper bounds of the multiplier are examined via extreme value theory and range from 4 to 6, corresponding to different confidence levels. When the risk-based approaches are employed, both the historical distribution and the normal distribution with exponentially weighted volatility of risky-asset returns are assumed. A daily rebalancing principle is adopted in their research, without any modification, to show the unadjusted hedging results.

The performances are evaluated from six differing perspectives. In terms of the Sharpe ratio and the volatility of portfolio returns, the CPPI is the best performer, while the VaR-based strategy under the normal distribution is the worst.
From the perspective of shifting the return distribution of the hedged portfolio to the right, and in terms of both the average and the cumulative portfolio returns across years, the ES-based strategy using the historical distribution ranks first. Moreover, the ES-based strategy results in a lower turnover within the investment horizon, thereby saving transaction costs.

89.7 Contingent Immunization and Bond Portfolio Management

Contingent immunization allows a bond-portfolio manager to pursue the highest yields available through active strategies while relying on the techniques of bond immunization to assure that the portfolio achieves a given minimal return over the investment horizon. Using this strategy, the portfolio manager attempts to earn returns in excess of the immunized return while constraining or controlling the losses that may result from poor forecasts of interest-rate movements.
Risk control is the major objective of contingent immunization. At the inception of the strategy, the manager determines the degree of risk he or she is willing to accept. If a sequence of interest-rate movements causes the portfolio to approach the predetermined risk level, the manager alters the portfolio’s duration to completely immunize it from any further risk. On the other hand, if the interest-rate movements provide additional returns, the portfolio manager does nothing.

The difference between the minimal, or floor, rate of return and the rate of return on the market is called the cushion spread. Equation (89.11) shows the relationship among the market rate of return R_m, the cushion C, and the floor rate of return R_FL:

    R_FL = R_m − C.    (89.11)
Interest-rate movements favorable to the bond-portfolio manager’s position will enlarge the spread; that is, R_m goes up while the portfolio is long bonds, thereby increasing the realized return. Adverse interest-rate movements will reduce the cushion spread until R_FL = R_m. At this point, the portfolio manager will immunize the portfolio, which ensures that the realized return will equal R_FL.
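The trigger logic implied by equation (89.11) can be sketched as follows; the function names and the numbers (returns expressed in percent) are illustrative assumptions.

```python
def floor_return(market_return, cushion):
    """Equation (89.11): R_FL = R_m - C (all values in percent)."""
    return market_return - cushion

def should_immunize(potential_return, r_floor):
    """Immunize once adverse rate movements drive the potential return down
    to the floor; otherwise keep managing actively."""
    return potential_return <= r_floor

r_fl = floor_return(market_return=12.0, cushion=2.0)  # a 10% floor

stay_active = not should_immunize(14.0, r_fl)  # favorable move: cushion enlarged
lock_floor = should_immunize(10.0, r_fl)       # cushion exhausted: immunize now
```

Once `should_immunize` fires, the manager switches to a duration-matched immunized portfolio and the realized return is locked in at R_FL, as described above.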
Figure 89.13: Contingent immunization.
Figure 89.13 is a graphical presentation of contingent immunization. The realized change in the market rate of return is shown on the horizontal axis, and the potential rate of return for the portfolio is shown on the vertical axis. The potential return is a function of the market return, the floor return, and the cushion. If the interest-rate change is +2%, the portfolio manager shifts to an immunization strategy that locks in RFL at 10% for the planning horizon. Regardless of interest-rate movements thereafter, the portfolio will realize a return of 10%. If interest rates were to go down, the portfolio would earn a return in excess of RFL because the manager can hold a portfolio with a duration larger than the investment horizon.

The contingent-immunization approach allows the portfolio manager to follow the active investment strategies discussed earlier in this chapter while protecting a minimum return through a duration-based immunization strategy. There is an old saying in the market: "Let your winners run and cut your losses." This is exactly the philosophy behind the contingent-immunization strategy.

89.8 Summary

This chapter has discussed basic concepts and methods of portfolio insurance for stocks. Strategies and implementation of portfolio insurance have been explored in detail, along with other issues related to portfolio insurance and dynamic hedging. Portfolio insurance was described not as an insurance technique but rather as an asset-allocation or hedging technique. The general methods of portfolio insurance were discussed: (1) stop-loss orders, (2) market-traded index options, (3) synthetic options, and (4) futures trading. Finally, one of the major issues facing investors and regulators during the late 1980s was introduced: the effect of portfolio insurance on the stock market.
Chapter 90

Alternative Security Valuation Model: Theory and Empirical Results

Cheng Few Lee

Contents
90.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3144
90.2 Warren and Shelton Model . . . . . . . . . . . . . . . . . . . . 3144
90.3 Johnson & Johnson as a Case Study . . . . . . . . . . . . . . . 3149
    90.3.1 Data sources and parameter estimations . . . . . . . . 3149
    90.3.2 Procedure for calculating WS model . . . . . . . . . . 3157
90.4 Francis and Rowell Model . . . . . . . . . . . . . . . . . . . . 3165
    90.4.1 The FR model specification . . . . . . . . . . . . . . . 3170
    90.4.2 A brief discussion of FR's empirical results . . . . . . 3177
90.5 Feltham–Ohlson Model for Determining Equity Value . . . . . . 3177
90.6 Combined Forecasting Method to Determine Equity Value . . . . 3180
90.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3180
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3181
Appendix 90A: Procedure of Using Microsoft Excel to Run FINPLAN Program . . . . 3182
Appendix 90B: Program of FINPLAN with an Example . . . . . . . . . 3184
Abstract
In this chapter, we discuss four alternative security valuation models: (i) the Warren and Shelton model, (ii) the Francis and Rowell model, (iii) the Feltham–Ohlson model, and (iv) a combined forecasting model. We also show how accounting, stock price, and economic information can be used to determine
Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]
security values in terms of finance theory. Algebraic simultaneous equations, econometric models, and Excel programs are used for the empirical studies.

Keywords: Financial ratios • Francis and Rowell model • Feltham–Ohlson model • Warren and Shelton model • Production cost • Common stock valuation.
90.1 Introduction

Following Lee et al. (2017), either a linear-programming approach or a simultaneous-equation approach can be used for security valuation. In this chapter, we consider two models based upon the simultaneous-equation approach to financial planning. The first is the Warren and Shelton (1971) (hereafter, WS) model. The other, that of Francis and Rowell (1978) (hereafter, FR), is an expansion of the WS model. Both models allow the financial planner/manager to analyze important operating and financial variables, and both are computer-based and much less complicated for the user than the Carleton model. To illustrate some of the uses of WS-type simultaneous-equation models, we present a case study of Johnson & Johnson. In addition, we discuss the Feltham–Ohlson model for determining equity value. Finally, we explore the usefulness of integrating the WS model and the Feltham–Ohlson model to improve the determination of equity value.

In Section 90.2, we discuss the Warren and Shelton model. Johnson & Johnson as a case study is presented in Section 90.3. The Francis and Rowell model is explored in Section 90.4, while the Feltham–Ohlson model is discussed in Section 90.5. The combined forecasting method to determine equity value is discussed in Section 90.6. Finally, Section 90.7 summarizes the chapter.

90.2 Warren and Shelton Model

A good financial-planning model will have the following characteristics:

(1) The model results and the assumptions should be plausible and/or credible.
(2) The model should be flexible so that it can be adapted and expanded to meet a variety of circumstances.
(3) It should improve on current practice in a technical or performance manner.
(4) The model inputs and outputs should be comprehensible to the user without extensive additional training.
(5) It should take into account the interrelated investment, financing, dividend, and production decisions and their effects on the market value of the firm.
(6) The model should be fairly simple for the user to operate without extensive intervention of non-financial personnel and tedious formulation of the input.

With the exception of point (6), Carleton's (1970) model fits this framework. In an effort to improve upon Carleton's model, WS devised a simultaneous-equation model. The WS model has some similarities to Carleton's model: WS take greater account of the interrelation of the financing, dividend, and investment decisions, and they rely upon a sales forecast as a critical input to the model. Unlike Carleton, WS explicitly use various operating ratios in their model; Carleton had those ratios in his model implicitly, through the manner in which the forecasts were made up. The explicit positioning of the ratios means that the WS method is computationally less tedious, and thus its use is more time-efficient.

As can be seen from examining Table 90.1, the WS model has four distinct segments corresponding to the sales, investment, financing, and return-to-investment concepts in financial theory. The entire model is a system of 20 equations of a semi-simultaneous nature; the actual solution algorithm is recursive, between and within segments.

Now we will consider in detail an Excel computer program called FINPLAN, which is used to solve the WS model.1 First, we will consider the inputs to the WS model. Second, we will delve into the interaction of the equations in the model. Third, we will look at the output of the FINPLAN model. The 20-equation model appears in Table 90.1, and the parameters used as inputs to the model are demonstrated in the second part of Table 90.2.
As in the Carleton model, the driving force of the model is the sales-growth estimate (GSALS_t). Equation (1) shows that sales for period t are simply the product of sales in the prior period and the growth rate in sales for period t. We then derive earnings before interest and taxes (EBIT)
1. The original FINPLAN is a financial-forecasting model implemented as a FORTRAN computer program based upon the Warren and Shelton (1971) simultaneous-equation approach to financial planning. A detailed description of the computer program can be found in Appendix 90B.
Table 90.1: The WS model.

I. Generation of sales and earnings before interest and taxes for period t
(1) SALES_t = SALES_{t-1} × (1 + GSALS_t)
(2) EBIT_t = REBIT_t × SALES_t

II. Generation of total assets required for period t
(3) CA_t = RCA_t × SALES_t
(4) FA_t = RFA_t × SALES_t
(5) A_t = CA_t + FA_t

III. Financing the desired level of assets
(6) CL_t = RCL_t × SALES_t
(7) NF_t = (A_t − CL_t − PFDSK_t) − (L_{t-1} − LR_t) − S_{t-1} − R_{t-1} − b_t{(1 − T_t)[EBIT_t − i_{t-1}(L_{t-1} − LR_t)] − PFDIV_t}
(8) NF_t + b_t(1 − T_t)[i^e_t NL_t + U^L_t NL_t] = NL_t + NS_t
(9) L_t = L_{t-1} − LR_t + NL_t
(10) S_t = S_{t-1} + NS_t
(11) R_t = R_{t-1} + b_t{(1 − T_t)[EBIT_t − i_t L_t − U^L_t NL_t] − PFDIV_t}
(12) i_t = i_{t-1}[(L_{t-1} − LR_t)/L_t] + i^e_t(NL_t/L_t)
(13) L_t/(S_t + R_t) = K_t

IV. Generation of per share data for period t
(14) EAFCD_t = (1 − T_t)[EBIT_t − i_t L_t − U^L_t NL_t] − PFDIV_t
(15) CMDIV_t = (1 − b_t)EAFCD_t
(16) NUMCS_t = NUMCS_{t-1} + NEWCS_t
(17) NEWCS_t = NS_t/[(1 − U^S_t)P_t]
(18) P_t = m_t EPS_t
(19) EPS_t = EAFCD_t/NUMCS_t
(20) DPS_t = CMDIV_t/NUMCS_t

Notes: The above system is "complete" in 20 equations and 20 unknowns. The unknowns are listed and defined in Table 90.2, together with the parameters (inputs) that management is required to provide.
Source: Warren and Shelton (1971, Appendix I). Reprinted by permission.
as a percentage of sales (equation (2)), through the REBIT ratio. Similarly, current assets and fixed assets are derived as percentages of sales, as in equations (3) and (4), through the use of the CA/SALES and FA/SALES ratios. The sum of CA and FA is the total assets for the period (equation (5)). The financing of the desired level of assets is undertaken in Section III. In equation (6), current liabilities in period t are derived from the ratio of
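To make the mechanics of these first segments concrete, here is a minimal Python sketch of equations (1) through (6). The function name and the placeholder inputs are mine, not part of the WS model or FINPLAN.

```python
# Minimal sketch of the sales and asset segments of the WS model
# (equations (1)-(6)). Variable names follow Table 90.2; the inputs
# below are placeholders rather than the JNJ estimates.

def ws_sales_and_assets(sales_prev, gsals, rebit, rca, rfa, rcl):
    sales = sales_prev * (1 + gsals)   # (1) SALES_t
    ebit = rebit * sales               # (2) EBIT_t
    ca = rca * sales                   # (3) current assets
    fa = rfa * sales                   # (4) fixed assets
    a = ca + fa                        # (5) total assets
    cl = rcl * sales                   # (6) current payables
    return {"SALES": sales, "EBIT": ebit, "CA": ca, "FA": fa,
            "A": a, "CL": cl}

out = ws_sales_and_assets(sales_prev=1000.0, gsals=0.05, rebit=0.27,
                          rca=0.60, rfa=0.90, rcl=0.30)
print(out)
```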
Table 90.2: List of unknowns and list of parameters provided by management.

I. Unknowns
1. SALES_t: Sales
2. CA_t: Current Assets
3. FA_t: Fixed Assets
4. A_t: Total Assets
5. CL_t: Current Payables
6. NF_t: Needed Funds
7. EBIT_t: Earnings before Interest and Taxes
8. NL_t: New Debt
9. NS_t: New Stock
10. L_t: Total Debt
11. S_t: Common Stock
12. R_t: Retained Earnings
13. i_t: Interest Rate on Debt
14. EAFCD_t: Earnings Available for Common Dividends
15. CMDIV_t: Common Dividends
16. NUMCS_t: Number of Common Shares Outstanding
17. NEWCS_t: New Common Shares Issued
18. P_t: Price per Share
19. EPS_t: Earnings per Share
20. DPS_t: Dividends per Share

II. Provided by Management
21. SALES_{t-1}: Sales in Previous Period
22. GSALS_t: Growth in Sales
23. RCA_t: Current Assets as a Percent of Sales
24. RFA_t: Fixed Assets as a Percent of Sales
25. RCL_t: Current Payables as a Percent of Sales
26. PFDSK_t: Preferred Stock
27. PFDIV_t: Preferred Dividends
28. L_{t-1}: Debt in Previous Period
29. LR_t: Debt Repayment
30. S_{t-1}: Common Stock in Previous Period
31. R_{t-1}: Retained Earnings in Previous Period
32. b_t: Retention Rate
33. T_t: Average Tax Rate
34. i_{t-1}: Average Interest Rate in Previous Period
35. i^e_t: Expected Interest Rate on New Debt
36. REBIT_t: Operating Income as a Percent of Sales
37. U^L_t: Underwriting Cost of Debt
38. U^S_t: Underwriting Cost of Equity
39. K_t: Ratio of Debt to Equity
40. NUMCS_{t-1}: Number of Common Shares Outstanding in Previous Period
41. m_t: Price-Earnings Ratio

Source: Warren and Shelton (1971, Table 1). Reprinted by permission.
CL/SALES multiplied by SALES. Equation (7) gives the funds required (NF_t). Like Carleton's model, FINPLAN assumes that the amount of preferred stock is constant over the planning horizon. In determining the needed funds, FINPLAN uses accounting identities. As equation (7) shows, the assets for period t are the basis for financing needs. Current liabilities, as determined in equation (6), are one source of funds and are therefore subtracted from asset levels. As mentioned above, preferred stock is constant and must likewise be subtracted. After the first parenthesis in equation (7), we have the financing that must come from internal sources (retained earnings) and long-term external sources (debt and stock issues). The second parenthesis takes into account the remaining old debt outstanding, after retirements, in period t. Then the funds provided by existing stock and retained earnings are subtracted. The last quantity is the funds provided by operations during period t.

Once the funds needed for operations are defined, equation (8) specifies that new funds, after taking into account underwriting costs and additional interest costs from new debt, are to come from long-term debt and new stock issues. Equations (9) and (10) simply update the debt and equity accounts for the new issuances. Equation (11) updates the retained-earnings account for the portion of earnings available to common shares as a result of operations during period t. The term b_t is the retention rate in period t (i.e., the complement of the payout ratio), and (1 − T_t) is the after-tax percentage, which is multiplied by the earnings for the period after netting out interest costs on both new and old debt. Since preferred stockholders must be paid before common stockholders, preferred dividends must be subtracted from the funds available for common dividends. Equation (12) calculates the new weighted-average interest rate on the firm's debt. Equation (13) sets the new debt-to-equity ratio for period t.
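As a rough illustration of how the financing segment hangs together, the following Python sketch solves a simplified version of equations (7) through (13), assuming zero underwriting costs (U^L_t = U^S_t = 0) so that equation (8) reduces to NF_t = NL_t + NS_t. The function and its sample inputs are hypothetical, and the full WS solution also nets new-debt interest out of retained earnings, which this sketch omits.

```python
# Simplified sketch of the WS financing segment (equations (7)-(13)),
# assuming zero underwriting costs so equation (8) reduces to
# NF = NL + NS. New debt NL is then chosen so that the target
# debt-to-equity ratio K = L / (S + R) of equation (13) holds.
# Argument names follow Table 90.2; the sample numbers are made up.

def ws_financing(a, cl, pfdsk, l_prev, lr, s_prev, r_prev,
                 ebit, b, tax, i_prev, i_new, pfdiv, k):
    old_debt = l_prev - lr
    # Funds retained from operations at the old interest burden
    # (the last term of equation (7)).
    retained_ops = b * ((1 - tax) * (ebit - i_prev * old_debt) - pfdiv)
    nf = (a - cl - pfdsk) - old_debt - s_prev - r_prev - retained_ops  # (7)
    # With NF = NL + NS, equation (13) becomes linear in NL:
    #   old_debt + NL = k * (s_prev + r_prev + retained_ops + NF - NL)
    base_equity = s_prev + r_prev + retained_ops + nf
    nl = (k * base_equity - old_debt) / (1 + k)
    ns = nf - nl
    l = old_debt + nl                                  # (9)
    i = (i_prev * old_debt + i_new * nl) / l           # (12)
    return {"NF": nf, "NL": nl, "NS": ns, "L": l,
            "S_plus_R": base_equity - nl, "i": i}

res = ws_financing(a=1575.0, cl=315.0, pfdsk=0.0, l_prev=300.0, lr=20.0,
                   s_prev=100.0, r_prev=700.0, ebit=283.5, b=0.6,
                   tax=0.3, i_prev=0.07, i_new=0.08, pfdiv=0.0, k=0.3)
print(round(res["L"] / res["S_plus_R"], 6))  # 0.3: the target ratio holds
```

In the actual WS system the financing and per-share blocks are solved simultaneously, since equations (8), (11), and (12) all involve the new-debt terms; the one-equation reduction above only works because the underwriting costs are assumed away.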
Section IV of Table 90.1 is concerned with the common stockholder, dividends, and market value. Equation (14) defines the earnings available for common dividends; it is the same expression that appears inside equation (11). Equation (15) determines common dividends by multiplying the complement of the retention rate by the earnings available for common dividends. Equation (16) updates the number of common shares for new issues. As equation (17) shows, the number of new common shares is determined by the total amount of the new stock issue divided by the stock price after discounting for issuance costs. Equation (18) determines the price of the stock through the use of a price-earnings ratio (m_t). Equation (19) determines EPS, as usual, by dividing earnings available for
common by the number of common shares outstanding. Equation (20) determines dividends per share in a similar manner. This completes the model of 20 equations in 20 unknowns.

Table 90.3 shows the variable numbers and their input formats. A sample FINPLAN input is demonstrated, together with a sensitivity analysis. Sensitivity analysis is accomplished by changing one parameter and noting the effect the change has on the result. FINPLAN allows sensitivity analysis to be built into a single input deck through the use of the run code; the procedure for performing the sensitivity analysis is indicated in Table 90.3. Sensitivity analysis is very helpful in answering questions about what the results might have been if a different decision had been made. Since the future cannot be forecast with perfect certainty, the manager/planner must know how a deviation from the forecast will affect his or her plans, and must also make contingency plans for probable deviations from the forecast.

While the WS model allows the user greater control over more details than does the Carleton model, it does not explicitly consider the production segment of the firm. In an effort to deal with the production function and other issues, FR have formulated a simultaneous-equation model that expands on the WS model.

90.3 Johnson & Johnson as a Case Study

In this section, a case study is used to demonstrate how the WS model set forth in Table 90.1 can be used to perform financial analysis, planning, and forecasting for an individual firm.

90.3.1 Data sources and parameter estimations

In this case study, the Johnson & Johnson (JNJ) company is chosen to perform financial planning and analysis using the WS model. The base year of the planning is 2009 and the planning period is one year, that is, 2010. Accounting and market data are required to estimate the parameters of the WS financial-planning model.
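A minimal Python sketch of the per-share segment, together with a FINPLAN-style one-parameter sensitivity run, might look as follows. The inputs are loosely based on the JNJ base-year figures in Table 90.3, and the new-debt underwriting term of equation (14) is set to zero for simplicity; the function name is mine, not FINPLAN's.

```python
# Sketch of the per-share segment (equations (14)-(20)) plus a
# FINPLAN-style sensitivity run over a single parameter. The new-debt
# underwriting term U^L_t * NL_t of equation (14) is omitted here.

def per_share_block(ebit, i, l, tax, b, pfdiv, numcs, m):
    eafcd = (1 - tax) * (ebit - i * l) - pfdiv   # (14)
    cmdiv = (1 - b) * eafcd                      # (15)
    eps = eafcd / numcs                          # (19)
    dps = cmdiv / numcs                          # (20)
    price = m * eps                              # (18)
    return eps, dps, price

# Sensitivity analysis: vary the price-earnings ratio m_t, holding the
# other (JNJ-style) inputs fixed, and watch the implied share price.
for m in (12.0, 14.5, 17.0):
    eps, dps, price = per_share_block(ebit=19550.0, i=0.0671, l=8223.0,
                                      tax=0.2215, b=0.5657, pfdiv=0.0,
                                      numcs=2754.3, m=m)
    print(f"m = {m:4.1f}: EPS = {eps:.2f}, DPS = {dps:.2f}, P = {price:.2f}")
```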
The COMPUSTAT data file is the major source of accounting and market information. The following paragraphs briefly discuss the parameter-estimation process. All dollar terms are in millions, and the number of shares outstanding is also in millions. Using the parameter estimates given in Table 90.3, the 20 unknown variables related to the income statement and balance sheet can be solved for algebraically. The calculations are set forth in the following section.
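As an illustration of the estimation step, several of the Table 90.3 inputs can be reproduced directly as base-year ratios of JNJ's 2009 accounting items (in $ millions). The dictionary keys below are illustrative names, not COMPUSTAT mnemonics.

```python
# Sketch of the parameter-estimation step: most WS ratio inputs are
# simple base-year (2009) ratios of JNJ accounting items, $ millions.
# Key names are illustrative, not COMPUSTAT mnemonics.

jnj_2009 = {
    "sales": 61897.0,
    "current_assets": 39541.0,
    "total_assets": 94682.0,
    "oper_income_before_dep": 19550.0,
    "depreciation": 2774.0,
    "pretax_income": 15755.0,
    "income_taxes": 3489.0,
}

# RCA: current assets as a percent of sales.
rca = jnj_2009["current_assets"] / jnj_2009["sales"]
# RFA: noncurrent (fixed) assets as a percent of sales.
rfa = (jnj_2009["total_assets"] - jnj_2009["current_assets"]) / jnj_2009["sales"]
# REBIT: operating income (after depreciation) as a percent of sales.
rebit = ((jnj_2009["oper_income_before_dep"] - jnj_2009["depreciation"])
         / jnj_2009["sales"])
# T: average tax rate = income taxes / pretax income.
tax = jnj_2009["income_taxes"] / jnj_2009["pretax_income"]

print(round(rca, 4), round(rfa, 4), round(rebit, 4), round(tax, 4))
# -> 0.6388 0.8909 0.271 0.2215  (matching the Table 90.3 inputs)
```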
Table 90.3: FINPLAN input format.

Variable Number*  Variable       Data**    Description
21                SALES_{t-1}    61897.0   Net Sales at t − 1 = 2009
22                GSALS_t        −0.2900   Growth in Sales
23                RCA_{t-1}      0.6388    Current Assets as a Percentage of Sales
24                RFA_{t-1}      0.8909    Fixed Assets as a Percentage of Sales
25                RCL_{t-1}      0.3109    Current Payables as a Percentage of Sales
26                PFDSK_{t-1}    0.0000    Preferred Stock
27                PFDIV_{t-1}    0.0000    Preferred Dividends
28                L_{t-1}        8223.0    Long-Term Debt in Previous Period
29                LR_{t-1}       219.0     Long-Term Debt Repayment (Reduction)
30                S_{t-1}        3120.0    Common Stock in Previous Period
31                R_{t-1}        67248.0   Retained Earnings in Previous Period
32                b_{t-1}        0.5657    Retention Rate
33                T_{t-1}        0.2215    Average Tax Rate (Income Taxes/Pretax Income)
34                i_{t-1}        0.0671    Average Interest Rate in Previous Period
35                i^e_{t-1}      0.0671    Expected Interest Rate on New Debt
36                REBIT_{t-1}    0.2710    Operating Income as a Percentage of Sales
37                U^L            0.0671    Underwriting Cost of Debt
38                U^E            0.1053    Underwriting Cost of Equity
39                K_t            0.1625    Ratio of Debt to Equity
40                NUMCS_{t-1}    2754.3    Number of Common Shares Outstanding in Previous Period
41                m_{t-1}        14.5      Price-Earnings Ratio

Notes: *Variable number as defined in Table 90.2. **Data obtained from JNJ Balance Sheets and Income Statements.
Historical or Base-Period Input

JOHNSON & JOHNSON (TICKER SYMBOL: JNJ, SIC Code: 2834)
ANNUAL BALANCE SHEET, SELECTED ITEMS ($ MILLIONS)
[Only the rows that could be recovered cleanly from the source are shown.]

Item                          Dec-09    Dec-08    Dec-07    Dec-06    Dec-05    Dec-04    Dec-03    Dec-02    Dec-01
Total Current Assets          39,541    34,377    29,945    22,975    31,394    27,320    22,995    19,266    18,473
TOTAL ASSETS                  94,682    84,912    80,954    70,556    58,025    53,317    48,263    40,556    38,488
Total Current Liabilities     21,731    20,852    19,837    19,161    12,635    13,927    13,448    11,449     8,044
Long-Term Debt (Total)         8,223     8,120     7,074     2,014     2,017     2,565     2,955     2,022     2,217
Common Stock                   3,120     3,120     3,120     3,120     3,120     3,120     3,120     3,120     3,120
Retained Earnings             67,248    58,424    54,587    47,172    40,716    34,708    29,913    25,729    22,536
TOTAL STOCKHOLDERS' EQUITY    50,588    42,511    43,319    39,318    37,871    31,813    26,869    22,697    24,233
COMMON SHARES OUTSTANDING   2,754.32  2,769.18  2,840.22  2,893.23  2,974.48  2,971.02  2,967.97  2,968.30  3,047.22
JOHNSON & JOHNSON (TICKER SYMBOL: JNJ, SIC: 2834)
ANNUAL INCOME STATEMENT COMPARING HISTORICAL AND RESTATED INFORMATION, SELECTED ITEMS ($ MILLIONS, EXCEPT PER SHARE)
[Only the rows that could be recovered cleanly from the source are shown.]

Item                                    Dec-09   Dec-08   Dec-07   Dec-06   Dec-05   Dec-04   Dec-03   Dec-02   Dec-01
Sales                                   61,897   63,747   61,035   53,194   50,434   47,348   41,862   36,298   33,004
Cost of Goods Sold                      15,560   15,679   14,974   12,880   11,861   11,298   10,307    8,785    7,931
Selling, General, and Admin. Expense    26,787   29,067   28,131   24,558   23,189   21,063   18,815   16,173   15,583
Operating Income Before Depreciation    19,550   19,001   17,930   15,756   15,384   14,987   12,740   11,340    9,490
Depreciation and Amortization            2,774    2,832    2,777    2,177    2,093    2,124    1,869    1,662    1,605
Interest Expense                           552      582      426      181      165      323      315      258      248
Pretax Income                           15,755   16,929   13,283   14,587   13,656   12,838   10,308    9,291    7,898
Income Taxes (Total)                     3,489    3,980    2,707    3,534    3,245    4,329    3,111    2,694    2,230
Net Income (Loss)                       12,266   12,949   10,576   11,053   10,411    8,509    7,197    6,597    5,668
EPS (Primary, Incl. Extra. Items)         4.45     4.62     3.67     3.76     3.50     2.87     2.42     2.20     1.87
EPS (Fully Diluted, Incl. Extra. Items)   4.40     4.57     3.63     3.73     3.46     2.84     2.40     2.16     1.84
STATEMENT OF CASH FLOWS, SELECTED ITEMS ($ MILLIONS)
[Only the rows that could be recovered cleanly from the source are shown.]

Item                                  Dec-09   Dec-08   Dec-07   Dec-06   Dec-05   Dec-04   Dec-03   Dec-02   Dec-01
Operating Activities (Net Cash Flow)  16,571   14,972   15,249   14,248   11,877   11,131   10,595    8,176    8,864
Capital Expenditures                   2,365    3,066    2,942    2,666    2,632    2,175    2,262    2,099    1,731
Dec-08
Handbook of Financial Econometrics,. . . (Vol. 3)
JOHNSON & JOHNSON TICKER SYMBOL: JNJ SIC: 2834 ANNUAL STATEMENT OF CASH FLOWS ($ MILLIONS)
page 3155
July 6, 2020
Dec-07
Dec-06
Dec-05
Dec-04
Dec-03
Dec-02
Dec-01
533 2,363.00
525 4,068.00
314 4,099.00
143 4,250.00
151 3,429.00
222 3,880.00
206 3,146.00
141 2,006.00
185 2,090.00
9.61in x 6.69in
882 1,486.00 1,562.00 1,135.00 696 642 311 390 514 2,130.00 6,651.00 5,607.00 6,722.00 1,717.00 1,384.00 1,183.00 6,538.00 2,570.00 5,327.00 5,024.00 4,670.00 4,267.00 3,793.00 3,251.00 2,746.00 2,381.00 2,047.00 9 1,638.00 5,100.00 6 6 17 1,023.00 22 14 219 24 18 13 196 395 196 245 391 2,693.00 1,111.00 −2,065.00 3,752.00 483 −777 −1,072.00 1,799.00 −771 0 0 0 0 0 0 0 0 0 −4,092.00 −7,464.00 −5,698.00 −6,109.00 −4,521.00 −5,148.00 −3,863.00 −6,953.00 −5,251.00 161 −323 275 180 −225 190 277 110 −40 −520 5,042.00 2,998.00 3,687.00 −11,972.00 6,852.00 3,826.00 2,483.00 −864
C. F. Lee
FINANCING ACTIVITIES Sale of Common and Preferred Stock Purchase of Common and Preferred Stock Cash Dividends Long-Term Debt — Issuance Long-Term Debt — Reduction Current Debt — Changes Financing Activities — Other Financing Activities — Net Cash Flow Exchange Rate Effect Cash and Cash Equivalents — Increase (Decrease) DIRECT OPERATING ACTIVITIES Interest Paid — Net Income Taxes Paid
Dec-08
Handbook of Financial Econometrics,. . . (Vol. 3)
Dec-09
15:54
3156
(Continued )
Note: The above data of financial statements are downloaded from the COMPUSTAT dataset; @NA represents data are not available. b3568-v3-ch90 page 3156
July 6, 2020
15:54
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch90
Alternative Security Valuation Model
page 3157
3157
90.3.2 Procedure for calculating WS model
By using the data above, we are able to calculate the unknown variables as follows:
(1) Sales_t = Sales_{t-1} × (1 + GSALS_t) = 61,897.0 × (1 − 0.2900) = 43,946.87,
(2) EBIT_t = REBIT_{t-1} × Sales_t = 0.2710 × 43,946.87 = 11,909.60,
(3) CA_t = RCA_{t-1} × Sales_t = 0.6388 × 43,946.87 = 28,073.26,
(4) FA_t = RFA_{t-1} × Sales_t = 0.8909 × 43,946.87 = 39,152.27,
(5) A_t = CA_t + FA_t = 28,073.26 + 39,152.27 = 67,225.53,
(6) CL_t = RCL_{t-1} × Sales_t = 0.3109 × 43,946.87 = 13,663.08,
(7) NF_t = (A_t − CL_t − PFDSK_t) − (L_{t-1} − LR_t) − S_{t-1} − R_{t-1} − b_t × {(1 − T_t)[EBIT_t − i_{t-1}(L_{t-1} − LR_t)] − PFDIV_t}
= (67,225.53 − 13,663.08 − 0) − (8,223.0 − 219.0) − 3,120.0 − 67,248.0 − 0.5657 × {(1 − 0.2215) × [11,909.60 − 0.0671 × (8,223.0 − 219.0)] − 0} = −29,817.99,
(12) i_t L_t = i_{t-1}(L_{t-1} − LR_t) + i^e_{t-1} × NL_t = 0.0671 × (8,223.0 − 219.0) + 0.0671 × NL_t = 537.0684 + 0.0671 × NL_t,
(8) NF_t + b_t(1 − T_t)[i^e_{t-1} × NL_t + U^L_t × NL_t] = NL_t + NS_t,
−29,817.99 + 0.5657 × (1 − 0.2215) × [0.0671NL_t + 0.0671NL_t]
= NL_t + NS_t,
−29,817.99 + 0.0591 × NL_t = NL_t + NS_t,
(a) NS_t + 0.9409NL_t = −29,817.99,
(9) L_t = L_{t-1} − LR_t + NL_t,
(b) L_t = 8,223.0 − 219.0 + NL_t, i.e., L_t − NL_t = 8,004,
(10) S_t = S_{t-1} + NS_t,
(c) −NS_t + S_t = 3,120.0,
(11) R_t = R_{t-1} + b_t{(1 − T_t)[EBIT_t − i_t L_t − U^L_t NL_t] − PFDIV_t}
= 67,248.0 + 0.5657 × {(1 − 0.2215) × [11,909.60 − i_t L_t − 0.0671NL_t] − 0}.
Substitute (12) into (11):
R_t = 67,248.0 + 0.5657 × {0.7785 × [11,909.60 − (537.0684 + 0.0671NL_t) − 0.0671NL_t]} = 67,248.0 + 5,008.4347 − 0.0591NL_t,
(d) 72,256.435 = R_t + 0.0591NL_t,
(13) L_t = (S_t + R_t)K_t, i.e., L_t = 0.1625S_t + 0.1625R_t,
(e) L_t − 0.1625S_t − 0.1625R_t = 0,
(b) − (e) = (f): 0 = (L_t − NL_t − 8,004) − (L_t − 0.1625S_t − 0.1625R_t), so 8,004 = 0.1625S_t + 0.1625R_t − NL_t,
(f) − 0.1625(c) = (g): 8,004 − 507 = (0.1625S_t + 0.1625R_t − NL_t) − 0.1625(−NS_t + S_t), so 7,497 = 0.1625NS_t − NL_t + 0.1625R_t,
(g) − 0.1625(d) = (h): 7,497 − 0.1625 × 72,256.435 = (0.1625NS_t − NL_t + 0.1625R_t) − 0.1625(R_t + 0.0591NL_t), so −4,244.67 = 0.1625NS_t − 1.0096NL_t,
(h) − 0.1625(a) = (i): 0.1625NS_t − 1.0096NL_t − 0.1625(NS_t + 0.9409NL_t) = −4,244.67 + 0.1625 × (29,817.99), so −1.1625NL_t = 600.7533 and NL_t = −600.7533/1.1625 = −516.777.
Substitute NL_t in (a): NS_t + 0.9409 × (−516.777) = −29,817.99, NS_t = −29,331.755.
Substitute NL_t in (b): L_t = 8,223.0 − 219.0 − 516.777 = 7,487.223.
Substitute NS_t in (c): 29,331.755 + S_t = 3,120.0, S_t = −26,211.755.
Substitute NL_t in (d): 72,256.43 = R_t + 0.0591NL_t, R_t = 72,256.43 − 0.0591 × (−516.777) = 72,286.98.
Substitute NL_t and L_t in (12): i_t × (7,487.223) = 537.0684 + 0.0671 × (−516.777), i_t = 0.0671.
(14) EAFCD_t = (1 − T_t)(EBIT_t − i_t L_t − U^L_t NL_t) − PFDIV_t = 0.7785 × [11,909.60 − (0.0671)(7,487.223) − 0.0671(−516.777)] = 8,907.51,
(15) CMDIV_t = (1 − b_t)EAFCD_t = 0.4343 × (8,907.51) = 3,868.53,
(16) NUMCS_t = X1 = NUMCS_{t-1} + NEWCS_t, X1 = 2,754.3 + NEWCS_t,
(17) NEWCS_t = X2 = NS_t/[(1 − U^E_t)P_t], X2 = −29,331.755/[(1 − 0.1053)P_t],
(18) P_t = X3 = m_t × EPS_t, X3 = 14.5(EPS_t),
(19) EPS_t = X4 = EAFCD_t/NUMCS_t, X4 = 8,907.5075/NUMCS_t,
(20) DPS_t = X5 = CMDIV_t/NUMCS_t, X5 = 3,868.53/NUMCS_t.
(A) From (18) and (19), we obtain X3 = 14.5 × (8,907.51)/NUMCS_t = 129,158.9/X1.
Substitute (A) into equation (17) to calculate (B):
(B) X2 = −29,331.755/[(1 − 0.1053) × 129,158.9/X1], i.e., X2 = −0.2538X1.
Substitute (B) into equation (16) to calculate (C):
(C) X1 = 2,754.3 − 0.2538X1, so X1 = 2,196.76.
Substitute (C) into (B): X2 = −0.2538 × 2,196.76 = −557.54.
From equations (19) and (20), we obtain X4, X5, and X3:
X4 = 8,907.5075/2,196.76 = 4.0548,
X5 = 3,868.53/2,196.76 = 1.7610,
X3 = 14.5 × (4.0548) = 58.79.
The results of the above calculations allow us to forecast the following information regarding JNJ in the 2010 fiscal year (dollars in millions, except for per share data):
◦ Sales = $43,946.87,
◦ Current Assets = $28,073.26,
◦ Fixed Assets = $39,152.27,
◦ Total Assets = $67,225.53,
◦ Current Payables = $13,663.08,
◦ Needed Funds = ($29,817.99),
◦ Earnings before Interest and Taxes = $11,909.60,
◦ New Debt = ($516.777),
◦ New Stock = ($29,331.755),
◦ Total Debt = $7,487.223,
◦ Common Stock = ($26,211.755),
◦ Retained Earnings = $72,286.98,
◦ Interest Rate on Debt = 6.71%,
◦ Earnings Available for Common Dividends = $8,907.51,
◦ Common Dividends = $3,868.53,
◦ Number of Common Shares Outstanding = 2,196.76,
◦ New Common Shares Issued = (557.54),
◦ Price per Share = $58.79,
◦ Earnings per Share = $4.0548,
◦ Dividends per Share = $1.7610.
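The elimination above can be reproduced with a short script. The sketch below is ours (it is not the FINPLAN code of Appendix 90A); it encodes equations (1)-(20) with the Table 90.5 parameter values and uses the fact that combining steps (a) and (h) collapses the simultaneous block to (1 + k)NL_t = k(S_{t-1} + R_0) − (L_{t-1} − LR_t) + k·NF_t, where R_0 denotes retained earnings before the new-debt terms.

```python
# Sketch (ours): one-period WS-model solution for JNJ fiscal 2010,
# following equations (1)-(20) of Section 90.3.2.

# Parameters from Table 90.5; the hand calculation uses a P/E of 14.5.
sales_prev = 61897.0
g = -0.2900                              # growth rate of sales
rca, rfa, rcl, rebit = 0.6388, 0.8909, 0.3109, 0.2710
b, tax = 0.5657, 0.2215                  # retention rate, average tax rate
i_prev = i_new = uw_debt = 0.0671        # old/new interest rates and debt underwriting cost
uw_eq, k, pe = 0.1053, 0.1625, 14.5
L_prev, LR = 8223.0, 219.0               # long-term debt and its repayment
S_prev, R_prev = 3120.0, 67248.0         # common stock, retained earnings
shares_prev = 2754.3

# Equations (1)-(6)
sales = sales_prev * (1 + g)             # 43,946.87
ebit = rebit * sales
assets = (rca + rfa) * sales
cl = rcl * sales

# (7) needed funds (preferred stock and preferred dividends are zero)
old_debt = L_prev - LR
internal = b * (1 - tax) * (ebit - i_prev * old_debt)
nf = (assets - cl) - old_debt - S_prev - R_prev - internal

# Steps (a)-(i): reduce (8)-(13) to a linear equation in NL, then back out NS
c1 = b * (1 - tax) * (i_new + uw_debt)   # coefficient 0.0591 in the text
r0 = R_prev + internal                   # retained earnings before new-debt terms
nl = (k * (S_prev + r0) - old_debt + k * nf) / (1 + k)   # new debt
ns = nf - (1 - c1) * nl                                  # new stock

debt = old_debt + nl                     # (9)
stock = S_prev + ns                      # (10)
retained = r0 - c1 * nl                  # (11) with (12) substituted
interest = i_prev * old_debt + i_new * nl

# (14)-(20): earnings available, dividends, and the per-share block
eafcd = (1 - tax) * (ebit - interest - uw_debt * nl)
cmdiv = (1 - b) * eafcd
shares = shares_prev / (1 - ns / ((1 - uw_eq) * pe * eafcd))
eps, dps = eafcd / shares, cmdiv / shares
price = pe * eps
print(round(nl, 2), round(ns, 2), round(eps, 3), round(price, 2))
```

Small differences from the FINPLAN column of Table 90.4 (for example, in common dividends) arise because FINPLAN uses the 14.47 price-earnings ratio of Table 90.5 and its own rounding.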
Eighteen of the 20 unknowns are listed in Table 90.4; the actual data are also listed to allow the calculation of the forecast errors. In the last column of Table 90.4, the relative absolute forecasting errors (|(A − F)/A|) are calculated to indicate the performance of the WS model in forecasting important financial variables. It was found that the quality of the sales-growth-rate estimate is the key to successfully using the WS model in financial planning and forecasting. By comparing the forecast and actual values in Table 90.4, we find that the forecasting numbers generated by FINPLAN are very close to the ones on actual financial statements. During the financial-planning period, the company's financial policy does not change, and the economy is neither in a deep recession nor booming. This provides an environment in which the historical data are useful for financial planning. From the solution, we know that, under the assumed parameter values, the company must issue both debt and equity. If the company wants to avoid equity financing, the WS model also enables us to investigate alternative ways to achieve this goal by changing the parameter values. Therefore, the model can answer what-if questions and, hopefully, the company can choose the best alternative. Finally, the model can also help us understand the impacts of changes in parameters on key financial variables, such as earnings per share (EPS), price
Table 90.4: The comparison of financial forecast of JNJ: Hand calculation and FINPLAN forecasting.

Category | Manual calculation | Financial plan model | Variance (|(A − F)/A|) (%)
INCOME STATEMENT
Sales | 43,946.87 | 43,946.87 | 0.0
Operating Income | 11,909.60 | 11,909.60 | 0.0
Interest Expense | 502.39 | 502.39 | 0.0
Income before taxes | 11,372.53 | 11,372.53 | 0.0
Taxes | 2,519.02 | 2,519.02 | 0.0
Net Income | 8,853.52 | 8,853.52 | 0.0
Common Dividends | 3,868.53 | 3,845.08 | 0.6
Debt Repayments | 219.00 | 219.00 | 0.0
BALANCE SHEET
Assets
Current Assets | 28,073.26 | 28,073.26 | 0.0
Fixed Assets | 39,152.27 | 39,152.27 | 0.0
Total Assets | 67,225.53 | 67,225.53 | 0.0
LIABILITIES AND NET WORTH
Current Payables | 13,663.08 | 13,663.24 | 0.0
Total Debt | 7,487.22 | 7,487.20 | 0.0
Common Stock | (26,211.7) | (26,211.89) | 0.0
Retained Earnings | 72,286.98 | 72,286.98 | 0.0
Total Liabilities and Net Worth | 67,225.53 | 67,225.53 | 0.0
PER SHARE DATA
Price per Share | 58.79 | 58.51 | 0.5
Earnings per Share (EPS) | 4.05 | 4.04 | 0.5
Dividends per Share (DPS) | 1.76 | 1.75 | 0.5
per share (PPS), dividends per share (DPS), and earnings before interest and taxes (EBIT), through the complicated interactions among the investment, financing, and dividend policies. To do multiperiod forecasting and sensitivity analysis, the FINPLAN program for Microsoft Excel, listed in Appendix 90A, can be used. Using these programs, the pro forma financial statements listed in Tables 90.6 and 90.7 can be produced. The input parameters and the values used to produce the output in Tables 90.6 and 90.7 are listed in Table 90.5. The list of these parameters can be found in Table 90.3. To perform the
Table 90.5: FINPLAN input.

FINPLAN input, 2009
Value of data | Variable number* | Beginning period | Last period | Description
4 | 1 | 0 | 0 | The number of years to be simulated
61,897.0000 | 21 | 0 | 0 | Net Sales at t−1 = 2009
−0.2900 | 22 | 1 | 4 | Growth in Sales
0.6388 | 23 | 1 | 4 | Current Assets as a Percentage of Sales
0.8909 | 24 | 1 | 4 | Fixed Assets as a Percentage of Sales
0.3109 | 25 | 1 | 4 | Current Payables as a Percentage of Sales
0.0000 | 26 | 1 | 4 | Preferred Stock
0.0000 | 27 | 1 | 4 | Preferred Dividends
8,223.0000 | 28 | 0 | 0 | Long-Term Debt in Previous Period
219.0000 | 29 | 1 | 4 | Long-Term Debt Repayment (Reduction)
3,120.0000 | 30 | 0 | 0 | Common Stock in Previous Period
67,248.0000 | 31 | 0 | 0 | Retained Earnings in Previous Period
0.5657 | 32 | 1 | 4 | Retention Rate
0.2215 | 33 | 1 | 4 | Average Tax Rate (Income Taxes/Pretax Income)
0.0671 | 34 | 0 | 0 | Average Interest Rate in Previous Period
0.0671 | 35 | 1 | 4 | Expected Interest Rate on New Debt
0.2710 | 36 | 1 | 4 | Operating Income as a Percentage of Sales
0.0671 | 37 | 1 | 4 | Underwriting Cost of Debt
0.1053 | 38 | 1 | 4 | Underwriting Cost of Equity
0.1625 | 39 | 1 | 4 | Ratio of Debt to Equity
2,754.321 | 40 | 0 | 0 | Number of Common Shares Outstanding in Previous Period
14.4700 | 41 | 1 | 4 | Price–Earnings Ratio

Note: *Variable numbers except the number of years to be simulated are as defined in Table 90.2.
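For scripted experiments, the FINPLAN input of Table 90.5 can be transcribed into a small mapping. This is our transcription (not part of FINPLAN itself), keyed by the Table 90.2 variable number, with each entry holding (value, beginning period, last period).

```python
# Sketch (ours): Table 90.5 FINPLAN input as a Python dict.
# Keys are the Table 90.2 variable numbers; values are
# (value of data, beginning period, last period).
finplan_input = {
    "years_simulated": 4,
    21: (61897.0, 0, 0),   # net sales at t-1 = 2009
    22: (-0.2900, 1, 4),   # growth in sales
    23: (0.6388, 1, 4),    # current assets / sales
    24: (0.8909, 1, 4),    # fixed assets / sales
    25: (0.3109, 1, 4),    # current payables / sales
    26: (0.0, 1, 4),       # preferred stock
    27: (0.0, 1, 4),       # preferred dividends
    28: (8223.0, 0, 0),    # long-term debt, previous period
    29: (219.0, 1, 4),     # long-term debt repayment
    30: (3120.0, 0, 0),    # common stock, previous period
    31: (67248.0, 0, 0),   # retained earnings, previous period
    32: (0.5657, 1, 4),    # retention rate
    33: (0.2215, 1, 4),    # average tax rate
    34: (0.0671, 0, 0),    # average interest rate, previous period
    35: (0.0671, 1, 4),    # expected interest rate on new debt
    36: (0.2710, 1, 4),    # operating income / sales
    37: (0.0671, 1, 4),    # underwriting cost of debt
    38: (0.1053, 1, 4),    # underwriting cost of equity
    39: (0.1625, 1, 4),    # debt-to-equity ratio
    40: (2754.321, 0, 0),  # common shares outstanding, previous period
    41: (14.47, 1, 4),     # price-earnings ratio
}
```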
Table 90.6: Pro forma balance sheet of JNJ: 2010–2013.

Item/Year | 2010 | 2011 | 2012 | 2013
Assets
Current assets | 28,073.26 | 19,932.01 | 14,151.73 | 10,047.73
Fixed assets | 39,152.27 | 27,798.11 | 19,736.66 | 14,013.03
Total assets | 67,225.53 | 47,730.12 | 33,888.39 | 24,060.76
Liabilities and Net Worth
Current liabilities | 13,663.24 | 9,700.90 | 6,887.64 | 4,890.22
Long-term debt | 7,489.12 | 5,317.28 | 3,775.27 | 2,680.44
Preferred stock | 0.00 | 0.00 | 0.00 | 0.00
Common stock | −26,214.00 | −43,199.96 | −55,258.11 | −63,817.52
Retained earnings | 72,287.17 | 75,911.90 | 78,483.59 | 80,307.61
Total liabilities and net worth | 67,225.53 | 47,730.12 | 33,888.39 | 24,060.76
Computed DBT/EQ | 0.16 | 0.16 | 0.16 | 0.16
Int. rate on total debt | 0.07 | 0.07 | 0.07 | 0.07
Per Share Data
Earnings | 4.04 | 3.43 | 2.95 | 2.54
Dividends | 1.75 | 1.49 | 1.28 | 1.10
Price | 58.42 | 49.59 | 42.68 | 36.74

Table 90.7: Pro forma income statement of JNJ: 2010–2013.

Item/Year | 2010 | 2011 | 2012 | 2013
Sales | 43,946.87 | 31,202.28 | 22,153.62 | 15,729.07
Operating income | 11,909.60 | 8,455.82 | 6,003.63 | 4,262.58
Interest expense | 502.74 | 356.94 | 253.43 | 179.93
Underwriting commission–debt | 34.56 | 131.09 | 88.81 | 58.79
Income before taxes | 11,372.30 | 7,967.78 | 5,661.39 | 4,023.85
Taxes | 2,518.44 | 1,764.49 | 1,253.73 | 891.10
Net income | 8,853.87 | 6,203.29 | 4,407.65 | 3,132.75
Preferred dividends | 0.00 | 0.00 | 0.00 | 0.00
Available for common dividends | 8,853.87 | 6,203.29 | 4,407.65 | 3,132.75
Common dividends | 3,845.14 | 2,694.03 | 1,914.20 | 1,360.52
Debt repayments | 219.00 | 219.00 | 219.00 | 219.00
Actual funds needed for investment | −29,848.88 | −18,938.80 | −13,381.16 | −9,435.24
sensitivity analysis, both high and low values are assigned to the growth rate of sales (g), the retention rate (b), and the target leverage ratio (k). The results of the sensitivity analysis for EPS, DPS, and PPS are presented in Table 90.8, which indicates that increases in g, b, and k generally have positive impacts on EPS, DPS, and PPS.
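The comparative statics of Table 90.8 can be explored with a compact re-implementation of the one-period solution as a function of (g, b, k). The sketch below is ours; because it follows the hand calculation (P/E of 14.5) rather than the FINPLAN code, its output matches the table only approximately, but the direction of each effect agrees.

```python
# Sketch (ours): one-period WS solution as a function of the sales growth
# rate g, retention rate b, and target debt-to-equity ratio k, using the
# JNJ 2009 base data of Table 90.5 and the hand-calculation P/E of 14.5.
def ws_forecast(g, b=0.5657, k=0.1625):
    sales = 61897.0 * (1 + g)
    ebit = 0.2710 * sales
    assets = (0.6388 + 0.8909) * sales
    cl = 0.3109 * sales
    tax, i_rate, uw_eq, pe = 0.2215, 0.0671, 0.1053, 14.5
    old_debt = 8223.0 - 219.0
    s_prev, r_prev, shares_prev = 3120.0, 67248.0, 2754.3
    internal = b * (1 - tax) * (ebit - i_rate * old_debt)
    nf = (assets - cl) - old_debt - s_prev - r_prev - internal
    c1 = b * (1 - tax) * (i_rate + i_rate)  # new-debt rate equals underwriting cost here
    r0 = r_prev + internal
    nl = (k * (s_prev + r0) - old_debt + k * nf) / (1 + k)   # new debt
    ns = nf - (1 - c1) * nl                                  # new stock
    interest = i_rate * (old_debt + nl)
    eafcd = (1 - tax) * (ebit - interest - i_rate * nl)
    cmdiv = (1 - b) * eafcd
    shares = shares_prev / (1 - ns / ((1 - uw_eq) * pe * eafcd))
    eps = eafcd / shares
    return eps, cmdiv / shares, pe * eps    # EPS, DPS, PPS
```

Varying one parameter at a time, as in Table 90.8, confirms that a higher growth rate raises EPS while a lower retention rate raises DPS through the larger payout ratio (1 − b).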
Table 90.8: Results of sensitivity analysis.

Year | 2010 | 2011 | 2012 | 2013
b_{t-1} = 0.5657, K_t = 0.1625, GSALS_t = −0.2900:
EPS | 4.04 | 3.43 | 2.95 | 2.54
DPS | 1.75 | 1.49 | 1.28 | 1.10
PPS | 58.42 | 49.59 | 42.68 | 36.74
b_{t-1} = 0.5657, K_t = 0.1625, GSALS_t = −0.4:
EPS | 3.69 | 2.88 | 2.29 | 1.82
DPS | 1.60 | 1.25 | 0.99 | 0.79
PPS | 53.47 | 41.71 | 33.10 | 26.27
b_{t-1} = 0.5657, K_t = 0.1625, GSALS_t = 0.09:
EPS | 5.09 | 5.65 | 6.23 | 6.86
DPS | 2.21 | 2.46 | 2.70 | 2.98
PPS | 73.61 | 81.81 | 90.11 | 99.26
b_{t-1} = 0.3, K_t = 0.1625, GSALS_t = −0.2900:
EPS | 3.97 | 3.31 | 2.80 | 2.37
DPS | 2.78 | 2.32 | 1.96 | 1.66
PPS | 57.46 | 47.92 | 40.52 | 34.27
b_{t-1} = 0.7, K_t = 0.1625, GSALS_t = −0.2900:
EPS | 4.07 | 3.49 | 3.03 | 2.63
DPS | 1.22 | 1.05 | 0.91 | 0.79
PPS | 58.90 | 50.44 | 43.80 | 38.03
b_{t-1} = 0.5657, K_t = 0.1, GSALS_t = −0.2900:
EPS | 3.97 | 3.46 | 2.99 | 2.58
DPS | 1.72 | 1.50 | 1.30 | 1.12
PPS | 57.42 | 50.02 | 43.23 | 37.37
b_{t-1} = 0.5657, K_t = 0.5, GSALS_t = −0.2900:
EPS | 3.94 | 3.39 | 2.86 | 2.42
DPS | 1.71 | 1.47 | 1.24 | 1.05
PPS | 56.97 | 49.01 | 41.40 | 34.98
90.4 Francis and Rowell Model²
The model presented here extends the simultaneous linear-equation model of the firm developed by WS in 1971. The object of this model is to generate
²A major portion of this section is reprinted from Francis and Rowell (1978, pp. 29–44). By permission of the authors and Financial Management.
Table 90.9: List of variables for FR model.

Endogenous:
Sales^p_t: Potential industry sales (units)
S^FC_t: Full-capacity unit output (company)
S^a_t: Actual company unit output
S^p_t: Potential company unit output
γ_1t: Measure of necessary new investment (based on units)
γ_2t: Measure of slack due to underutilization of existing resources
K_t: Units of capital stock
NK_t: Desired new capital (capital units)
FA_t: Fixed assets (current $)
NF_t: Desired new investment (current $)
P^s_t: Output price
$S_t: Sales dollars (current $)
COG_t: Cost of goods (current $)
OC_t: Overhead, selling, cost of goods (current $)
OC2_t: Nonoperating income (current $)
D_t: Depreciation expense (current $)
INV_t: Inventory (current $)
L_t: Long-term debt
i^L_t: Cost of new debt (%)
NL_t: New long-term debt needed ($)
NS_t: New common stock (equity) needed ($)
NIAT_t: Net income after tax (current $)
RE_t: Retained earnings
EBIT_t: Earnings before interest and taxes
i^A_t: Weighted average cost of long-term debt
vEBIT: Coefficient of variation of EBIT
i^s_t: Cost of new stock issue
vNIAT: Coefficient of variation of NIAT
TEV_t: Total equity value
g^a_t: Growth rate in $S_t
EAFCD_t: Earnings available for common dividend
CMDIV_t: Common dividend
ΔRE_t: Contributions to RE made in the t-th period
GOP_t: Gross operating profit (current $)

Exogenous:
GSALS_t: Growth rate in potential industry sales
Sales^p_{t-1}: Previous-period potential industry sales (units)
S^FC_{t-1}: Previous-period company full-capacity unit output
INV_{t-1}: Previous-period company finished goods inventory
FA_{t-1}: Previous-period company fixed asset base ($)
γ_t: Capacity utilization index
c_t: Desired market share
θ: Proportionality coefficient of S^FC_t to K_t
P_Kt: GNP component index for capital equipment
p: Percentage markup of output price over the ratio GOP_t/INV_t
δ_2: Proportionality coefficient of OC_t to $S_t
Φ: Proportionality coefficient of D_t to FA_t
N: Proportionality coefficient of INV_t to $S_t
LR_t: Repayment of long-term debt
T_t: Corporate tax rate
b_t: Retention rate
U^L_t: Underwriting cost of new debt
PFDIV_t: Preferred dividend
i^A_{t-1}: Previous-period weighted average cost of long-term debt
L_{t-1}: Previous-period long-term debt
k: Optimal capital structure assumption
α_L, β_L: Coefficients in risk–return tradeoff for new debt
α_s, β_s: Coefficients in risk–return tradeoff for new stock
GOP_{t-1}: Gross operating profit of previous period
δ_1: Ratio of COG_t to actual net sales
δ_3: Ratio of OC2_t to net sales
α_1, α_2, α_3: Production function coefficients
l_1: Ratio of CA_t to net sales
l_2: Ratio of CL_t to net sales
σ_{Sales^p}: Standard deviation of potential industry sales
pro forma financial statements that describe the future financial condition of the firm for any assumed pattern of sales. Parameters of various equations in the system can be changed to answer what-if questions, perform sensitivity analysis, and explore various paths toward goals that may or may not be optimal. The FR model is composed of 10 sectors with a total of 36 equations (see Tables 90.9 and 90.10). The model incorporates an explicit treatment of risk by allowing for stochastic variability in industry sales forecasts. The exogenous input of sales variance is transformed (through simplified linear relations in the model) to coefficients of variation for EBIT and net income after taxes (NIAT) (see Table 90.13). These are used in risk–return functions that determine the costs of new financing. Lee and Rahman (1997) use a dynamic optimal control model to discuss the interactions of investment, financing, and dividend decisions; their approach can be integrated with the FR model.
Table 90.10: List of equations for FR model.

1. Industry Sales
(1) Sales^p_t = Sales^p_{t-1}(1 + GSALS_t)

2. Company Production Sector
(2) S^FC_t = α_1 S^FC_{t-1} + α_2 INV_{t-1} + α_3 FA_{t-1}
(3) S^a_t/S^FC_t = γ_t → S^a_t = γ_t S^FC_t
(4) S^p_t = c_t Sales^p_t

3. Capital Stock Requirements Sector
(5) S^p_t − S^a_t = (S^FC_t − S^a_t) + (S^p_t − S^FC_t)
(6) S^FC_t − S^a_t = γ_2t
(7) S^p_t − S^FC_t = γ_1t (0 ≤ γ_1t)
(8) K_t = θ S^FC_t
(9) NK_t = θ γ_1t

4. Pricing Sector
(10) P_Kt · K_t = FA_t, or FA_t/K_t = P_Kt
(11) P_Kt · NK_t = NF_t
(12) P^s_t · S^a_t = $S^a_t
(13) P^s_t = p(GOP_{t-1}/INV_{t-1})

5. Production Cost Sector
(14) OC_t = δ_2($S^a_t)
(15) COG_t = δ_1($S^a_t)
(16) GOP_t = $S^a_t − COG_t
(17) OC2_t = δ_3($S^a_t)

6. Income Sector
(18) INV_t = N($S^a_t)
(19) EBIT_t = $S^a_t − OC_t + OC2_t − D_t
(20) NIAT_t = (EBIT_t − i^A_t L_t)(1 − T)
(20') CL_t = l_2($S^a_t)

7. New Financing Required Sector
(21) NF_t + b_t(1 − T)[i^L_t NL_t + U^L_t NL_t] = NLS_t + ΔRE_t + (CL_t − CL_{t-1})
(22) NLS_t = NS_t + NL_t
(23) ΔRE_t = b_t{(1 − T)[EBIT_t − i^A_t L_t − U^L_t NL_t] − PFDIV_t}
(24) i^A_t = i^A_{t-1}[(L_{t-1} − LR_t)/L_t] + i^L_t(NL_t/L_t)
(25) NL_t/(NS_t + ΔRE_t) = k
(26) L_t = L_{t-1} − LR_t + NL_t

8. Risk Sector
(27) σ²_EBIT = θ_1² · θ_2² · σ²_{Sales^p_t}
(28) σ²_NIAT = θ_5² · θ_6² · θ_2² · σ²_{Sales^p_t}

9. Costs of Financing Sector
(29) i^L_t = α_L + β_L · vEBIT
(30) vEBIT = σ_EBIT/mean(EBIT)
(31) i^s_t = α_s + β_s · vNIAT
(32) vNIAT = σ_NIAT/mean(NIAT)

10. Valuation of Equity Sector
(33) TEV_t = CMDIV_t/(i^s_t − g^a_t)
(34) EAFCD_t = (1 − T_t)[EBIT_t − i^A_t L_t − U^L_t NL_t] − PFDIV_t
(35) CMDIV_t = (1 − b_t)EAFCD_t
(36) g^a_t = ($S_t − $S_{t-1})/$S_{t-1}
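The risk and costs-of-financing sectors, equations (27)-(33), can be sketched in a few lines. The following illustration is ours only: the θ, α, and β coefficients and every dollar figure are hypothetical placeholders, whereas FR estimate these relationships econometrically.

```python
# Sketch (ours) of FR equations (27)-(33). All numeric values below are
# hypothetical illustrations, not estimates from the FR model.
import math

sigma_sales_p = 40.0           # std. dev. of potential industry sales (hypothetical)
theta1, theta2 = 0.9, 1.1      # linear transformation coefficients (hypothetical)
theta5, theta6 = 0.8, 0.7
mean_ebit, mean_niat = 900.0, 500.0   # expected EBIT and NIAT (hypothetical)

# (27)-(28): industry-sales variance propagated to EBIT and NIAT
sigma_ebit = math.sqrt(theta1**2 * theta2**2 * sigma_sales_p**2)
sigma_niat = math.sqrt(theta5**2 * theta6**2 * theta2**2 * sigma_sales_p**2)

# (30), (32): coefficients of variation
v_ebit = sigma_ebit / mean_ebit
v_niat = sigma_niat / mean_niat

# (29), (31): risk-return tradeoffs for the costs of new debt and new stock
alpha_l, beta_l = 0.05, 0.10   # hypothetical intercept/slope for new debt
alpha_s, beta_s = 0.08, 0.15   # hypothetical intercept/slope for new stock
i_l = alpha_l + beta_l * v_ebit
i_s = alpha_s + beta_s * v_niat

# (33): total equity value as a growing perpetuity of the common dividend
cmdiv, g_a = 60.0, 0.03        # hypothetical dividend and growth rate in $S
tev = cmdiv / (i_s - g_a)
```

The chain makes the model's treatment of risk concrete: a larger exogenous industry-sales variance raises the coefficients of variation, which raises both financing costs and lowers the valuation of equity.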
The model also incorporates some variables external to the firm that are important from a financial-planning viewpoint. These industry or economy-wide variables are introduced in every sector to enable the financial planner to explore their influence on plans. They include market share, an industry capacity-utilization index, the tax rate, and a GNP component price index for explicit analysis of the effects of inflation. The FR model explicitly allows for divergence between planned (or potential) and actual levels in both sales and production. That is, sales forecasts and production potential are compared to determine the existence of slack or idle capacity and company expansion possibilities. Any positive difference between potential or forecasted company sales and actual company sales is decomposed into the portion absorbed by idle capacity and the portion requiring new facilities. As a result, a forecasted sales increase need not lead to investment in new capital. Likewise, a forecasted sales downturn would not lead to a divestiture of capital. An advantage of this disaggregation is that it allows for greater realism: it permits a lagged production response to sales upturns and downturns, as well as lags, overadjustment, and underadjustment in new investment decisions.
The FR model's disaggregation of the sales equation into separate market-share, production, and pricing equations has several distinct advantages. It offers the opportunity to treat sales forecasts in physical units that can be compared to technical production capabilities in physical units, for both potential and actual levels of sales and production. Such disaggregation also allows a distinction between physical units of sales and production and dollar units; therefore, the pricing decision can be treated separately. This feature is helpful in analyzing the effect of changing prices. Another aspect of the FR financial-forecasting model is its econometric advantages. The FR model's risk–return functions and its production function are estimated econometrically. Additionally, standard econometric techniques to evaluate the goodness-of-fit and predictive power of a simultaneous-equation system are reported. In the remaining subsections of this section, the FR model is explained in its general form. Then the coefficients of the equations are set equal to the values that characterize the operations of an existing company, and the active operations of a well-known firm are simulated, to test this financial model empirically.

90.4.1 The FR model specification
The FR model is composed of 10 sectors: (1) industry sales, (2) production sector, (3) fixed capital-stock requirements, (4) pricing, (5) production costs, (6) income, (7) new financing required, (8) risk, (9) costs of financing, and (10) common stock valuation. This is illustrated in the equation specifications as defined in Table 90.10. The flowchart conveniently illustrates the simultaneity discussed above. All 10 sectors are portrayed, labeled, and outlined by dot-dash borders, with arrows displaying their interaction. This is summarized for sectors 1-10 in the interdependence table (Table 90.11). An "X" is placed in the table to represent the direction of an arrow (from explaining to explained) on the flowchart. Looking more deeply reveals that the FR model is, to a large extent (but not entirely), recursive between sectors. All entries of the sector interdependence table, with the exception of one (between sectors seven and nine), are below the diagonal. The model has been structured in this manner for the specific purpose of ease of exposition and computation. The simultaneity of the FR model is primarily within each sector's equations. This is illustrated for sector seven in the variable-interdependence table (Table 90.12).
Table 90.11: Sector interdependence.
[Matrix not recoverable from the source: rows are the explained sectors 1-10 and columns the explaining sectors 1-10, with an "X" marking each dependence. As noted in the text, all entries except one (the dependence of sector seven on sector nine) lie below the diagonal.]

Table 90.12: Variable interdependence within sector seven.
[Matrix not recoverable from the source: both the explained variables (rows) and explaining variables (columns) are RE_t, L_t, NL_t, NS_t, i^A_t, and NLS_t, with an "X" marking each dependence.]
SECTOR ONE: INDUSTRY SALES
The primary importance of the industry-sales forecast sector is highlighted by its upper-left position on the flowchart. It directly influences the risk sector and the production sector and, indirectly, every sector of the model. The industry-sales sector can be of any size and is abbreviated here to a single equation (see equation (1) in Table 90.10). The industry-sales equation shows that an industry-sales forecast must be made by some means over a predefined forecast period and given as an exogenous input to the FR model. Although sales remain the driving force for the FR model, it is industry rather than company sales that drive the model, since forecasting experience indicates that industry sales can usually be forecasted more accurately than company sales. In addition, two parameters of the industry-sales forecast are employed: the mean and the standard deviation. The mean enters
the model in the conventional way, whereas the standard deviation is mathematically transformed to obtain the standard deviations of its derivative quantities, the company's NIAT and EBIT.³

SECTOR TWO: COMPANY SALES AND PRODUCTION
Company sales are obtained through a market-share assumption, which is typically a more stable parameter than a company's dollar sales level. Potential company sales are obtained from forecasted industry sales through this market-share assumption; equation (4) in Table 90.10 shows the relationship explicitly. The FR model distinguishes between potential and actual sales levels, which allows a realistic treatment of slack, or idle capacity, in the firm. Because assets may be underutilized, it is not necessary that every sales upturn be translated directly into an increase in the asset base: some or all of the sales upturn can be absorbed by more complete utilization of available resources. Company production potential is obtained from a production function that defines full-capacity company production. This is determined by previous-period full-capacity production, inventory, and fixed assets (see equation (2) in Table 90.10 for the exact specification). Actual company production is derived from full-capacity production by a capacity-utilization index in equation (3) of Table 90.10. The production function allows explicit definition of the company's full-capacity production levels. It serves the useful purpose of relaxing the unrealistic assumption (used in many models) that whatever is produced is sold. Full-capacity production is typically adjusted gradually, or dynamically, over the long run to upward changes in potential sales and is often not responsive to downturns. The non-proportionality and asymmetry discussed earlier with respect to the distinctions between actual and potential sales also
The FR model could easily be linked to a macroeconomic forecasting model to obtain the sales forecast for the industry and the firm. The expanded macroeconomic and microeconomic model could provide detailed forecasts of the economy, the firm's industry, the firm itself, and the firm's equity returns. A small simultaneous-equation model to explain a single firm's changes in earnings per share and stock price per share has been developed by Francis (1977). Francis's model is driven by macroeconomic factors, with some forces from within the firm treated as unexplained residuals (called unsystematic risk). If the Francis quarterly equity-returns model were provided with exogenous input data about aggregate profits and a stock-market index, it could be modified to operate with the FR model and provide detailed analysis of period-by-period equity returns.
apply to the distinctions between potential full capacity and actual production. For instance, slack (that is, idle capacity) may be decreased to meet a sales upturn without increasing the firm's investment in manufacturing machinery.

SECTOR THREE: FIXED CAPITAL-STOCK REQUIREMENTS
Necessary new investment is not linked directly to company sales in the FR model, but instead results from a comparison between potential and actual company sales. Equation (7) of Table 90.10 measures the company expansion possibility by the difference between potential company sales (influenced by management's industry-sales forecast and company market-share assumption) and full-capacity sales. The units of required new capital are derived from this difference in equation (9), shown in Table 90.10.⁴ A capacity-utilization index for the simulated company and industry translates full-capacity output (from the production function) into actual company sales, just as a market-share assumption is used to translate potential industry sales into potential company sales. Any positive difference between potential company sales and actual company sales is decomposed into the contribution due to idle capacity and the contribution due to company expansion possibility, as shown mathematically in equation (5) of Table 90.10.

SECTOR FOUR: PRICING
The pricing sector of the model plays a key role by relating the real, or unit, sectors to the nominal, or dollar, sectors. The real sectors of industry sales, company sales and production, and fixed capital-stock requirements are all denominated in physical units of output. However, the nominal sectors of production costs, income, financing required, and valuation are all dollar-denominated. The real sectors and the nominal sectors are connected by the pricing sector.
4. Through this specification, the FR model recognizes the asymmetrical response of the asset base to changes in sales levels. A strict ratio between sales and asset levels, such as those used in other pro forma models derived by Pindyck and Rubinfeld (1976) and by Salzman (1967), presumes a proportionate and symmetrical response of asset levels to both sales upturns and downturns. The FR distinction between actual and potential sales and the concept of slack allow realistic non-proportionality and asymmetry in the simulation. (For instance, a sales downturn need not, and usually does not, lead to a reduction in asset levels; instead, it typically causes a decrease in capacity utilization.)
This sector separation allows explicit treatment of the product-pricing decision apart from the sales and production decisions. It also maintains the important distinction between real and nominal quantities and thus permits an analysis of inflation's impact on the firm (as suggested by the Securities and Exchange Commission (1976)). FR equation (13) is a simple formula that generates product price by relating it, through a markup, to the ratio of previous-period gross operating profit to inventory. Real units of company sales are priced out in FR equation (12). Required new capital units are priced out using the average unit capital cost specified in FR equation (11) of Table 90.10.

SECTOR FIVE: PRODUCTION COSTS

The production-cost sector is similar to those of previous models; production cost and inventory are related directly to actual company sales dollars. Depreciation is linked directly to existing fixed investment.

SECTOR SIX: INCOME

As in the production-cost sector, the income sector ties inventory, earnings before interest and taxes (EBIT), and net income after taxes (NIAT) directly to actual company sales dollars. This simplicity is preserved to create a linearly determined income statement that produces EBIT as a function of actual company sales (given a few simplifying assumptions). NIAT is derived from EBIT after deduction of interest expense (also linearly related to actual sales levels) and taxes.

SECTOR SEVEN: NEW FINANCING REQUIRED

The new-financing-required sector is composed primarily of accounting relationships that determine the dollar amount of external financing required from the new capital requirements (Sector Three) and the internal financing capability (Sector Six). In the FR model, equation (21) obtains this external financing requirement. The retained-income portion of internal financing is derived from FR equation (23) of Table 90.10.
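A minimal sketch of such a linearly determined income statement, with hypothetical coefficients rather than the FR model's actual parameters:

```python
# Linearly determined income statement: EBIT, interest, and NIAT all scale
# with actual company sales dollars. All coefficients are hypothetical.
def income_statement(sales, op_margin=0.18, interest_to_sales=0.02, tax_rate=0.40):
    ebit = op_margin * sales                      # EBIT as a linear function of sales
    interest = interest_to_sales * sales          # interest tied linearly to sales
    niat = (1.0 - tax_rate) * (ebit - interest)   # NIAT after interest and taxes
    return ebit, niat

ebit, niat = income_statement(1_000_000.0)        # EBIT = 180,000; NIAT = 96,000
```

Because every line item is proportional to sales, the whole statement collapses to constants times sales dollars, which is what makes the risk derivation of Sector Eight tractable.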
Finally, the breakdown of new external financing into new equity and new debt occurs in FR equation (25), where the notion of an optimal capital structure is exploited. The weighted-average cost of debt, FR equation (24), is a weighted sum of the cost of new debt and the cost of existing debt. The cost of the new debt is not exogenous in this model; it is estimated from a simplified risk-return tradeoff in Sector Nine.
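The idea behind FR equation (24) is an ordinary weighted average; a sketch with hypothetical debt amounts and rates:

```python
# Weighted-average cost of debt in the spirit of FR equation (24):
# a book-weighted mix of existing debt and newly issued debt (numbers invented).
existing_debt, cost_existing = 800.0, 0.06   # book value and rate of old debt
new_debt, cost_new = 200.0, 0.09             # cost_new would come from Sector Nine

avg_cost_of_debt = (existing_debt * cost_existing + new_debt * cost_new) \
                   / (existing_debt + new_debt)   # = 0.066
```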
SECTOR EIGHT: RISK

The linear derivation of both EBIT and NIAT in the income sector is used (with simplifying assumptions) in the risk sector to obtain the standard deviation of each income measure. The derivation (presented in Table 90.13) demonstrates how management's judgment about the variability (i.e., standard deviation) of forecasted industry sales affects the risk character, both business risk and financial risk, of the company. This risk character influences the costs of financing new stock and debt in the risk-return tradeoff equations of Sector Nine. In this way, risk is explicitly accounted for as the principal determinant of financing costs, and financing costs are made endogenous to the model. In addition, a risk relationship runs from the ratio of fixed to variable costs (an operating-leverage measure) to the standard deviation of EBIT. The debt-to-equity ratio (a financial-leverage ratio) also positively influences the NIAT standard deviation. Thus, the leverage structure of the firm endogenously influences the costs of financing in a realistic way.

SECTOR NINE: COST OF FINANCING

Market factors enter into the determination of financing costs through the slope (β1 and β2) and intercept (α1 and α2) coefficients of the risk-return tradeoff functions, namely, equations (29) and (31) of Table 90.10. At present, all four coefficients must be exogenously provided by management. However, this is not a difficult task. Historical coefficients can be estimated empirically using simple linear regression. The regression coefficients would establish a plausible range of values that management might use to determine the present or future coefficient values.

SECTOR TEN: COMMON STOCK VALUATION

The valuation model used finds the present value of dividends, which are presumed to grow perpetually at a constant rate. This venerable model can be traced from Williams (1938) through more recent analysts.
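For Sector Nine, the slope and intercept of a risk-return tradeoff can be recovered by ordinary least squares from historical pairs of risk and financing cost; the data here are invented purely for illustration:

```python
# OLS estimation of a risk-return tradeoff: cost = alpha + beta * risk.
# The risk/cost pairs below are hypothetical.
risk = [0.05, 0.08, 0.11, 0.14, 0.17]        # e.g., NIAT standard deviation (scaled)
cost = [0.061, 0.068, 0.073, 0.081, 0.088]   # observed cost of new financing

n = len(risk)
mean_x = sum(risk) / n
mean_y = sum(cost) / n
beta = sum((x - mean_x) * (y - mean_y) for x, y in zip(risk, cost)) \
       / sum((x - mean_x) ** 2 for x in risk)
alpha = mean_y - beta * mean_x
# alpha and beta then bracket a plausible range for management's exogenous inputs
```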
Algebraically reduced to its simplest form, the single-share valuation model is shown below:

    Share price = (Cash dividend per year) / [(Equity capitalization rate, i_st) − (Growth rate, g_ta)].
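The constant-growth formula above can be wrapped in a small function; the inputs below are hypothetical:

```python
# Constant-growth (Williams-style) share valuation; inputs are hypothetical.
def share_price(dividend_per_year, capitalization_rate, growth_rate):
    # The model requires the capitalization rate i_st to exceed the growth rate.
    assert capitalization_rate > growth_rate
    return dividend_per_year / (capitalization_rate - growth_rate)

price = share_price(2.00, 0.10, 0.04)   # 2.00 / 0.06, roughly 33.33 per share
```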
Equation (33) of Table 90.10 differs slightly from the per-share valuation model above because it values the firm’s total equity outstanding. This
Table 90.13: Transformation of industry sales moments to company EBIT and NIAT moments.

EBIT:

    EBIT_t = $S^a_t − OC_t − D_t
           = $S^a_t − δ2 $S^a_t − Φ FA_t
           = $S^a_t − δ2 $S^a_t − Φ P_kt θ (1/γ_t) ($S^a_t / P^s_t)
           = [1 − δ2 − Φ (P_kt / P^s_t) θ (1/γ_t)] $S^a_t
           = θ1 $S^a_t.

If S^p_t = S^FC_t, then S^FC_t = c_t Sales^p_t, so S^p_t = c_t Sales^p_t. Since S^a_t = γ_t S^FC_t = γ_t [c_t Sales^p_t], then $S^a_t = P^s_t S^a_t = P^s_t γ_t [c_t Sales^p_t] = θ2 Sales^p_t.

Hence

    σ²_EBIT,t = θ1² θ2² σ²_Sales^p,t.

NIAT:

    NIAT_t = (1 − T)[EBIT_t − i^A_t L_t − U^L NL_t], assuming the underwriting-cost term U^L NL_t = 0,

with L_t = θ4 $S^a_t, so that

    NIAT_t = (1 − T)[θ1 $S^a_t − i^A_t θ4 $S^a_t]
           = (1 − T)(θ1 − i^A_t θ4) $S^a_t
           = (1 − T)(θ1 − i^A_t θ4) θ2 Sales^p_t
           = θ5 θ6 θ2 Sales^p_t,

so

    σ²_NIAT,t = θ5² θ6² θ2² σ²_Sales^p,t,

where

    θ1 = 1 − δ2 − Φ (P_k / P^s_t) θ (1/γ_t),
    θ2 = P^s_t γ_t c_t,
    θ4 = the coefficient linking total debt L_t to sales dollars (a function of the target debt-to-equity ratio k; see Table 90.10),
    θ5 = 1 − T_t,
    θ6 = θ1 − i^A_t θ4,

and CA_t is a fixed proportion of $S^a_t, D_t = Φ FA_t; the parameters δ2, θ, and γ_t are defined in the list of equations (Table 90.10).
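The moment transformation in Table 90.13 boils down to scaling a sales-forecast variance by squared constants; a sketch with hypothetical parameter values:

```python
# sigma^2_EBIT = theta1^2 * theta2^2 * sigma^2_sales, per Table 90.13.
theta1 = 0.15         # illustrative value of 1 - delta2 - Phi*(Pk/Ps)*theta*(1/gamma)
theta2 = 25.0         # illustrative value of Ps * gamma * c
sigma_sales = 400.0   # std dev of the industry/company sales forecast, in units

sigma_ebit = theta1 * theta2 * sigma_sales   # 0.15 * 25 * 400 = 1500.0
var_ebit = sigma_ebit ** 2                   # = theta1^2 * theta2^2 * sigma_sales^2
```

This is how a judgment about sales-forecast variability flows through to the EBIT (business-risk) and, with θ5 and θ6, the NIAT (financial-risk) standard deviations.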
change was accomplished merely by multiplying both sides of the valuation equation shown above by the number of shares outstanding. The remaining equations of this sector are accounting statements.

90.4.2 A brief discussion of FR's empirical results

FR (1978) used Anheuser-Busch Company annual reports to perform full simulation experiments and showed one prediction comparison between the FR model and the WS model. Overall, FR found their model very useful for financial planning and forecasting. In addition, FR argued that their model has superior explanatory power over a wide range of applications (see footnote 4). A detailed discussion of FR's empirical results is beyond the scope of this book; it is omitted and left for students' further study. A case study using both the FR and Carleton's (1970) models to analyze and forecast General Motors' financial position can be found in Lee (1984).

90.5 Feltham-Ohlson Model for Determining Equity Value

The Ohlson Model introduced the clean surplus relation (CSR) assumption, which requires that income over a period equal net dividends plus the change in the book value of equity. CSR ensures that all changes in shareholder equity that do not result from transactions with shareholders (such as dividends, share repurchases, or share offerings) are reflected in the income statement. In other words, CSR is an accounting system recognizing that the value created during a period is distinguished from the value distributed. Let NIAT_t denote the earnings for period (t − 1, t), TEV_t denote the book value of equity at time t, Rf denote the risk-free rate plus one, CMDIV_t denote common dividends, and NIAT^a_t = NIAT_t − (Rf − 1) TEV_t denote the abnormal earnings at time t. The change in the book value of equity between two dates equals earnings minus dividends, so the clean surplus relation TEV_t = TEV_{t−1} + NIAT_t − CMDIV_t implies that

    P^s_t = TEV_t + Σ_{τ=1}^∞ Rf^{−τ} E_t[NIAT^a_{t+τ}],    (90.1)

that is, the price of the firm's equity (P^s_t) is equal to its book value of equity adjusted for the present value of expected future abnormal earnings. The variables on the right-hand side of (90.1) are still forecasts, not past realizations. To deal with this problem, the Ohlson Model introduced information dynamics to
link the value to the contemporaneous accounting data. Assume {NIAT^a_t}_{τ≥1} follows the stochastic process

    NIAT^a_{t+1} = ω NIAT^a_t + v_t + ε̃_{1,t+1},
    ṽ_{t+1} = γ v_t + ε̃_{2,t+1},    (90.2)

where v_t is value-relevant information other than abnormal earnings and 0 ≤ ω, γ ≤ 1. Based on equations (90.1) and (90.2), the Ohlson Model demonstrated that the value of the equity is a function of contemporaneous accounting variables as follows:

    P^s_t = TEV_t + α̂1 NIAT^a_t + α̂2 v_t,    (90.3)

where α̂1 = ω̂/(Rf − ω̂) and α̂2 = Rf/[(Rf − ω̂)(Rf − γ̂)], or equivalently,

    P^s_t = κ(ϕ x_t − d_t) + (1 − κ) TEV_t + α̂2 v_t,    (90.4)

where κ = (Rf − 1) ω̂/(Rf − ω̂) and ϕ = Rf/(Rf − 1). Equations (90.3) and (90.4) imply that the market value of the equity is equal to the book value adjusted for (i) the current profitability as measured by abnormal earnings and (ii) other information that modifies the prediction of future profitability.

One major limitation of the Ohlson Model is that it assumes unbiased accounting. Feltham and Ohlson (1995) (hereafter FO) introduce additional dynamics to deal with the issue of biased (conservative) accounting data. The FO Model analyzes how firm value relates to the accounting information that discloses the results of both operating and financial activities. For the financial activities, markets are relatively perfect, and the accounting book values of these assets are reasonably close to their market values. For the operating assets, however, accrual accounting usually produces a discrepancy between book value and market value, since these assets are not traded in a market; this discrepancy influences the goodwill of the firm. Similar to the Ohlson Model, the information dynamics in the FO Model are

    ox^a_{t+1} = ω10 + ω11 ox^a_t + ω12 oa_t + ω13 v_{1t} + ε̃_{1,t+1},
    oa_{t+1} = ω20 + ω22 oa_t + ω24 v_{2t} + ε̃_{2,t+1},
    ṽ_{1,t+1} = ω30 + ω33 v_{1t} + ε̃_{3,t+1},
    ṽ_{2,t+1} = ω40 + ω44 v_{2t} + ε̃_{4,t+1},    (90.5)
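A small numeric sketch of the basic Ohlson pricing in equation (90.3), with hypothetical inputs (the FO pricing function that follows generalizes the same mechanics):

```python
# Ohlson valuation (90.3): price = book value + alpha1 * abnormal earnings
# + alpha2 * other information, with alpha1, alpha2 built from the
# persistence parameters omega and gamma of the dynamics (90.2).
Rf = 1.10                          # one plus the risk-free rate
omega, gamma = 0.6, 0.3            # persistence of NIAT^a and of v_t
TEV, niat_a, v = 100.0, 5.0, 2.0   # hypothetical book value, NIAT^a_t, v_t

alpha1 = omega / (Rf - omega)                 # 0.6 / 0.5 = 1.2
alpha2 = Rf / ((Rf - omega) * (Rf - gamma))   # 1.1 / (0.5 * 0.8) = 2.75
price = TEV + alpha1 * niat_a + alpha2 * v    # 100 + 6 + 5.5 = 111.5
```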
where ox^a_t is the abnormal operating earnings, oa_t is the operating assets, and v_{1t} and v_{2t} are the other value-relevant information variables for the firm at time t. The operating assets and the financial assets are calculated as follows:

    Operating Assets = Total Assets − Financial Assets,
    Operating Liabilities = Preferred Shares + Total Liabilities − Financial Liabilities,
    Financial Assets = Cash and Cash Equivalents + Investments and Advancements + Short-term Investments,
    Financial Liabilities = Long-term Debt + Debt in Current Liabilities + Notes Payable,
    Net Operating Assets = Operating Assets − Operating Liabilities,
    Net Financial Assets = Financial Assets − Financial Liabilities.

The derived implied pricing function is

    P_t = y_t + λ̂0 + λ̂1 ox^a_t + λ̂2 oa_t + λ̂3 v_{1t} + λ̂4 v_{2t},    (90.6)

where

    λ̂0 = (1 + r)[ω̂10 (1 + r − ω̂22)(1 + r − ω̂33)(1 + r − ω̂44)
                  + ω̂12 ω̂20 (1 + r − ω̂33)(1 + r − ω̂44)
                  + ω̂13 ω̂30 (1 + r − ω̂22)(1 + r − ω̂44)
                  + ω̂14 ω̂40 (1 + r − ω̂22)(1 + r − ω̂33)]
          / [r (1 + r − ω̂11)(1 + r − ω̂22)(1 + r − ω̂33)(1 + r − ω̂44)],
    λ̂1 = ω̂11 / [r (1 + r − ω̂11)],
    λ̂2 = (1 + r) ω̂12 / [(1 + r − ω̂11)(1 + r − ω̂22)],
    λ̂3 = (1 + r) ω̂13 / [(1 + r − ω̂11)(1 + r − ω̂33)],
    λ̂4 = (1 + r) ω̂14 / [(1 + r − ω̂11)(1 + r − ω̂44)],    (90.7)

or equivalently,

    P_t = κ(φ x_t − d_t) + (1 − κ) y_t + λ̂2 oa_t + λ̂3 v_{1t} + λ̂4 v_{2t},    (90.8)

where κ = (Rf − 1) ω̂11/(Rf − ω̂11) and φ = Rf/(Rf − 1). The implied valuation functions in equations (90.6) and (90.8) show a weighted average of the firm's operating earnings, the firm's book value, and the other value-relevant information, with an adjustment for the understatement of the operating
assets resulting from accrual accounting. The major contribution of the FO Model is that it incorporates accounting conservatism into equity valuation.

90.6 Combined Forecasting Method to Determine Equity Value

Chen et al. (2015) investigate the stock-price forecasting ability of the Ohlson (1995) model, the FO (1995) model, and the WS (1971) model. They use a simultaneous-equation estimation approach to estimate the information dynamics for the Ohlson model and the FO model and to forecast future stock prices. Empirical results show that simultaneous-equation estimation of the information dynamics improves the ability of the Ohlson model and the FO model to capture the dynamics of the abnormal-earnings process. Chen et al. (2015) also find that the WS model generates smaller prediction errors for future stock prices than the Ohlson model and the FO model, indicating that the WS model has better forecasting ability for determining future stock prices. Its superior accuracy compared to the Ohlson model and the FO model is due to the incorporation of both the operating and the financing decisions of the firm. Using various time-varying-parameter models proposed by Granger and Newbold (1973) and Diebold and Pauly (1987), Chen et al. (2015) further examine whether forecast combination provides better prediction accuracy. They also employ linear and quadratic deterministic time-varying-parameter models to produce time-varying weights. The evidence shows that the combined-forecast method can reduce prediction errors. Using a similar technique, Chen et al. (2016) investigated how technical, fundamental, and combined information can separate winners from losers in stock selection.

90.7 Summary

Two simultaneous-equation financial-planning models were discussed in detail in this chapter. There are 20 equations and 20 unknowns in the WS model. Annual financial data from the JNJ company were used to show how the WS model can be used to perform financial analysis and planning.
A computer program of the WS model is presented in Appendix 90B. The FR model is a generalized WS financial-planning model. There are 36 equations and 36 unknowns in the FR model. The two simultaneous-equation financial-planning models discussed in this chapter are alternatives to
Carleton’s linear-programming model, to perform financial analysis, planning, and forecasting. In this chapter, we have also briefly discussed Felthan–Ohlson model for determining equity value. In addition, we have explored the usefulness of integrating WS model and FO model to improve the determination of equity value. Bibliography R. A. Brealey, S. C. Myers, and F. Allen (2016). Principles of Corporate Finance, 12th ed. Burr Ridge, IL: McGraw-Hill. W. T. Carleton (1970). An Analytical Model for Long-Range Financial Planning. Journal of Finance, 25, 291–315. H. Y. Chen, C. F. Lee, and W. K. Shih (2016). Technical, Fundamental, and Combined Information for Separating Winners from Losers. Pacific-Basin Finance Journal, 39, 224–242. H. Y. Chen, C. F. Lee, and W. K. Shih (2015). Alternative Equity Valuation Models. In Cheng F. L. and J. C. Lee (eds.), Handbook of Financial Econometrics and Statistics, pp. 2401–2444. B. E. Davis, G. J. Caccappolo, and M. A. Chandry (1973). An Econometric Planning Model for American Telephone and Telegraph Company. The Bell Journal of Economics and Management Science, 4, 29–56. F. X. Diebold and P. Pauly (1987). Structural Change and the Combination of Forecasts. Journal of Forecasting, 6, 21–40. W. J. Elliott (1972). Forecasting and Analysis of Corporate Financial Performance with an Econometric Model of the Firm. Journal of Financial and Quantitative Analysis, 1499–1526. G. A. Feltham and J. A. Ohlson (1995). Valuation and Clean Surplus Accounting for Operating and Financial Activities. Contemporary Accounting Research, 11, 689–731. J. C. Francis (1977). Analysis of Equity Returns: A Survey with Extensions. Journal of Economics and Business, 181–192. J. C. Francis and D. R. Rowell (1978). A Simultaneous-Equation Model of the Firm for Financial Analysis and Planning. Financial Management, 7, 29–44. G. W. Gershefski (1969). Building a Corporate Financial Model. Harvard Business Review, 61–72. C. W. J. Granger and P. Newbold (1973). 
Some Comments on the Evaluation of Economic Forecasts. Applied Economics, 5, 35–47.
A. C. Lee, J. C. Lee, and C. F. Lee (2017). Financial Analysis, Planning and Forecasting: Theory and Application, 3rd ed. Singapore: World Scientific Publishing Company.
C. F. Lee (1984). Alternative Financial Planning and Forecasting Models: An Integration and Extension. Mimeo: The University of Illinois at Urbana-Champaign.
C. F. Lee, J. E. Finnerty, and E. A. Norton (1997). Foundations of Financial Management, 3rd ed. Minneapolis/St. Paul: West Publishing Co.
C. F. Lee, J. C. Lee, and A. C. Lee (2000). Statistics for Business and Financial Economics. Singapore: World Scientific Publishing Co.
C. F. Lee and S. Rahman (1997). Interaction of Investment, Financing, and Dividend Decisions: A Control Theory Approach. Advances in Financial Planning and Forecasting, 7.
E. Lerner and W. T. Carleton (1964). The Integration of Capital Budgeting and Stock Valuation. American Economic Review, 54, 683–702.
S. C. Myers and A. Pogue (1974). A Programming Approach to Corporate Financial Management. Journal of Finance, 29, 579–599.
J. A. Ohlson (1995). Earnings, Book Values, and Dividends in Equity Valuation. Contemporary Accounting Research, 11, 661–687.
R. S. Pindyck and D. L. Rubinfeld (1976). Econometric Models and Economic Forecasts. New York: McGraw-Hill Book Co.
S. A. Ross, R. W. Westerfield, and J. F. Jaffe (2002). Corporate Finance, 6th ed. Boston, MA: McGraw-Hill Irwin Publishing Co.
S. Salzman (1967). An Econometric Model of a Firm. Review of Economics and Statistics, 49, 332–342.
Securities and Exchange Commission, Release No. 5695 (1976). Notice of Adoption of Amendments to Regulation S-X Requiring Disclosure of Certain Replacement Cost Data, March 23.
Securities and Exchange Commission, Release No. 33–5699 (1976). April 23.
J. M. Warren and J. P. Shelton (1971). A Simultaneous-Equation Approach to Financial Planning. Journal of Finance, 26, 1123–1142.
J. B. Williams (1938). The Theory of Investment Value. Cambridge, MA: Harvard University Press.

Websites
http://finance.yahoo.com.
Appendix 90A: Procedure of Using Microsoft Excel to Run the FINPLAN Program

The FINPLAN program is available on the website: http://centerforpbbefr.rutgers.edu/.
Appendix 90B: Program of FINPLAN with an Example

This program is composed under the Visual Basic for Applications (VBA) environment.

Sub FinPlan()
Dim i As Integer                 'Looping control variable
Dim bNYEARFound As Boolean       'Check if year being simulated is found
Dim NDATE As Integer             'Year immediately preceding the first forecasted year
Dim NUMVR As Integer             'Variable code number
Dim NYEAR() As Integer           'Year being simulated

Dim N As Integer                 '1  The number of years to be simulated
Dim SALES() As Double            '21 Sales in the simulation year
Dim GSALS() As Double            '22 Growth rate of sales
Dim CARAT() As Double            '23 Ratio of current assets to sales
Dim FARAT() As Double            '24 Ratio of fixed assets to sales
Dim CLRAT() As Double            '25 Ratio of current liabilities to sales
Dim PFDSK() As Double            '26 Preferred stock
Dim PFDIV() As Double            '27 Preferred dividends
Dim ZL() As Double               '28 Long term debt
Dim ZLR() As Double              '29 Debt repayments
Dim S() As Double                '30 Common stock
Dim R() As Double                '31 Retained earnings
Dim B() As Double                '32 Retention rate
Dim T() As Double                '33 Federal income tax rate
Dim ZI() As Double               '34 Interest rate on total debt
Dim ZIE() As Double              '35 Interest rate on new debt
Dim ORATE() As Double            '36 Operating income rate (EBIT/SALES)
Dim UL() As Double               '37 Underwriting commission of new debt
Dim US() As Double               '38 Underwriting commission of new stock
Dim ZK() As Double               '39 Desired debt to equity ratio
Dim ZNUMC() As Double            '40 Cumulative number of common stock shares outstanding
Dim PERAT() As Double            '41 Price / Earnings ratio

Dim O() As Double                'Operating income
Dim CA() As Double               'Current assets
Dim FA() As Double               'Fixed assets
Dim A() As Double                'Total assets
Dim CL() As Double               'Current liabilities
Dim ZNF() As Double              'Estimated needed funds
Dim ZNL() As Double              'Value of new debt issued
Dim EXINT() As Double            'Interest expense
Dim DBTUC() As Double            'Debt underwriting commission
Dim EAIBT() As Double            'Earnings after interest and before tax
Dim TAX() As Double              'Federal income taxes
Dim EAIAT() As Double            'Earnings after interest and after tax
Dim EAFCD() As Double            'Earnings available for common dividends
Dim COMDV() As Double            'Common stock dividends
Dim ZNS() As Double              'Value of new common stock issued
Dim TLANW() As Double            'Total liabilities and net worth
Dim COMPK() As Double            'Computed debt to equity
Dim ANF() As Double              'Actual needed funds
Dim P() As Double                'Per share market price of common stock
Dim ZNEW() As Double             'Value of new common stock shares issued
Dim EPS() As Double              'Common stock earnings per share
Dim DPS() As Double              'Common stock dividends per share
On Error GoTo ErrorHandler
Columns("a").ColumnWidth = 29            'Set default column A width

Range("a2").Select
NDATE = ActiveCell.Value                 'Get the year being simulated from cell A2

Range("b5").Select
NUMVR = ActiveCell.Value                 'Get the variable code number from cell B5

bNYEARFound = False
While NUMVR <> Empty And Not bNYEARFound
    If NUMVR = 1 Then                    'If the number of years to be simulated is found
        N = ActiveCell.Previous.Value + 1
        bNYEARFound = True
    End If
    ActiveCell.Offset(1, 0).Activate
    NUMVR = ActiveCell.Value
Wend
If Not bNYEARFound Then N = 5            'If the number of years to be simulated is not
                                         'found, then set the default of N as 5

ReDim NYEAR(N)
ReDim SALES(N)
ReDim GSALS(N)
ReDim ORATE(N)
ReDim T(N)
ReDim CARAT(N)
ReDim FARAT(N)
ReDim CLRAT(N)
ReDim ZL(N)
ReDim ZI(N)
ReDim ZIE(N)
ReDim ZLR(N)
ReDim PFDSK(N)
ReDim PFDIV(N)
ReDim S(N)
ReDim ZNUMC(N)
ReDim R(N)
ReDim B(N)
ReDim ZK(N)
ReDim PERAT(N)
ReDim UL(N)
ReDim US(N)

NYEAR(1) = NDATE
For i = 2 To N
    NYEAR(i) = NYEAR(i - 1) + 1
Next
Range("b5").Select
NUMVR = ActiveCell.Value
While NUMVR <> Empty
    Select Case NUMVR
        Case 21
            SALES(1) = ActiveCell.Previous.Value
        Case 22
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                GSALS(i) = ActiveCell.Previous.Value
            Next
        Case 23
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                CARAT(i) = ActiveCell.Previous.Value
            Next
        Case 24
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                FARAT(i) = ActiveCell.Previous.Value
            Next
        Case 25
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                CLRAT(i) = ActiveCell.Previous.Value
            Next
        Case 26
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                PFDSK(i) = ActiveCell.Previous.Value
            Next
        Case 27
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                PFDIV(i) = ActiveCell.Previous.Value
            Next
        Case 28
            ZL(1) = ActiveCell.Previous.Value
        Case 29
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                ZLR(i) = ActiveCell.Previous.Value
            Next
        Case 30
            S(1) = ActiveCell.Previous.Value
        Case 31
            R(1) = ActiveCell.Previous.Value
        Case 32
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                B(i) = ActiveCell.Previous.Value
            Next
        Case 33
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                T(i) = ActiveCell.Previous.Value
            Next
        Case 34
            ZI(1) = ActiveCell.Previous.Value
        Case 35
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                ZIE(i) = ActiveCell.Previous.Value
            Next
        Case 36
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                ORATE(i) = ActiveCell.Previous.Value
            Next
        Case 37
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                UL(i) = ActiveCell.Previous.Value
            Next
        Case 38
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                US(i) = ActiveCell.Previous.Value
            Next
        Case 39
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                ZK(i) = ActiveCell.Previous.Value
            Next
        Case 40
            ZNUMC(1) = ActiveCell.Previous.Value
        Case 41
            For i = ActiveCell.Next.Value + 1 To ActiveCell.Next.Next.Value + 1
                PERAT(i) = ActiveCell.Previous.Value
            Next
    End Select
    ActiveCell.Offset(1, 0).Activate
    NUMVR = ActiveCell.Value
Wend

ReDim O(N)
ReDim CA(N)
ReDim FA(N)
ReDim A(N)
ReDim CL(N)
ReDim ZNF(N)
ReDim ZNL(N)
ReDim EXINT(N)
ReDim DBTUC(N)
ReDim EAIBT(N)
ReDim TAX(N)
ReDim EAIAT(N)
ReDim EAFCD(N)
ReDim COMDV(N)
ReDim ZNS(N)
ReDim TLANW(N)
ReDim COMPK(N)
ReDim ANF(N)
ReDim P(N)
ReDim ZNEW(N)
ReDim EPS(N)
ReDim DPS(N)
For i = 2 To N                           'Solve simultaneous equations for N periods
    SALES(i) = SALES(i - 1) * (1 + GSALS(i))
    O(i) = ORATE(i) * SALES(i)
    CA(i) = CARAT(i) * SALES(i)
    FA(i) = FARAT(i) * SALES(i)
    A(i) = CA(i) + FA(i)
    CL(i) = CLRAT(i) * SALES(i)
    ZNF(i) = (A(i) - CL(i) - PFDSK(i)) - (ZL(i - 1) - ZLR(i)) - S(i - 1) - R(i - 1) _
        - B(i) * ((1 - T(i)) * (O(i) - ZI(i - 1) * (ZL(i - 1) - ZLR(i))) - PFDIV(i))
    ZNL(i) = (ZK(i) / (1 + ZK(i))) * (A(i) - CL(i) - PFDSK(i)) - (ZL(i - 1) - ZLR(i))
    ZL(i) = (ZL(i - 1) - ZLR(i)) + ZNL(i)
    ZI(i) = ZI(i - 1) * ((ZL(i - 1) - ZLR(i)) / ZL(i)) + ZIE(i) * (ZNL(i) / ZL(i))
    If ZNL(i)

SR̂∗ − SR̂ > 0.4. Compared to the conditional F-test (Shanken's Bayesian
July 6, 2020
15:55
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch93
L.-J. Kao, H. C. Soo & C. F. Lee
Table 93.4: CRSP value-weighted portfolio's efficiency tests.

                                    Accept H0    Reject H0    Total
A. Conditional F-Test (1941.1–1973.12)
SR̂∗ − SR̂ > 0.4                         82          175         257
SR̂∗ − SR̂ ≤ 0.4                         10           69          79
Total                                   92          244         336

B. Proposed Bayesian Test (1941.1–1973.12)
SR̂∗ − SR̂ > 0.4                        200           57         257
SR̂∗ − SR̂ ≤ 0.4                         10           69          79
Total                                  210          126         336

C. Shanken's Bayesian Test (1941.1–1973.12)
SR̂∗ − SR̂ > 0.4                         67          190         257
SR̂∗ − SR̂ ≤ 0.4                         10           69          79
Total                                   77          259         336

Notes: In panel A, the p-values of the conditional F-test are compared to the significance level α = 0.10. In panel B, the proposed Bayes factor B10 of (93.8) is compared to the threshold h∗ = 12; when the Bayes factor B10 is greater than h∗ = 12, the null hypothesis H0: ρ = 1 is rejected. In panel C, the null hypothesis H0: ρ = 1 is rejected when Shanken's Bayes factor (93.3) is greater than h∗∗ = 2. The median(ρ) is 0.75. The asymptotic variance VGMM in (93.9) is 0.0236.

Table 93.5: CRSP value-weighted portfolio's efficiency tests.

                                    Accept H0    Reject H0    Total
A. Conditional F-Test (1980.1–2012.12)
SR̂∗ − SR̂ > 0.4                         28          160         188
SR̂∗ − SR̂ ≤ 0.4                          2          146         148
Total                                   30          306         336

B. Proposed Bayesian Test (1980.1–2012.12)
SR̂∗ − SR̂ > 0.4                        142           46         188
SR̂∗ − SR̂ ≤ 0.4                          2          146         148
Total                                  144          192         336

C. Shanken's Bayesian Test (1980.1–2012.12)
SR̂∗ − SR̂ > 0.4                        100           88         188
SR̂∗ − SR̂ ≤ 0.4                          2          146         148
Total                                  102          234         336

Notes: In panel A, the p-values of the conditional F-test are compared to the significance level α = 0.10. In panel B, the proposed Bayes factor B10 of (93.8) is compared to the threshold h∗ = 18; when the Bayes factor B10 is greater than h∗ = 18, the null hypothesis H0: ρ = 1 is rejected. In panel C, the null hypothesis H0: ρ = 1 is rejected when Shanken's Bayes factor (93.3) is greater than h∗∗ = 1. The median(ρ) is 0.45. The asymptotic variance VGMM in (93.9) is 0.0193.
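The decision rules summarized in the table notes reduce to a one-line predicate, and the "acceptance likelihoods" discussed in the text are simple row proportions; the counts below are taken from panels A and B of Table 93.4:

```python
# Decision rule from the table notes: reject H0 (rho = 1) when the Bayes
# factor B10 exceeds the threshold h* calibrated for the sample period.
def reject_h0(b10, h_star):
    return b10 > h_star

# Row proportions for the SR-hat* - SR-hat > 0.4 row of Table 93.4 (1941.1-1973.12):
accept_F, total = 82, 257    # conditional F-test accepts H0 in this row
accept_B = 200               # proposed Bayesian test accepts H0 in this row
lik_F = accept_F / total     # about 0.32
lik_B = accept_B / total     # about 0.78
```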
test), the likelihoods are 82/257 (67/257) and 28/188 (100/188) for the first and second time periods, respectively.

A simulation study of M = 10,000 runs is given in Table 93.6, where the maximum Sharpe ratio SR∗ = 0.4, the number of risky assets N = 10, and the length of the sampling period T = 60 months. As most of the empirical Sharpe ratios of CRSP indexes are between 0 and 0.4 (Gibbons, Ross, and Shanken, 1989), a uniform distribution on the interval (0, 0.4) is chosen for the prior distribution π(·) of the ex ante Sharpe ratio SR. Powers of rejecting the null H0: ρ = 1 for the conditional F-test, the proposed Bayes test, and Shanken's Bayesian test when the ex ante Sharpe ratio SR = 0.1, 0.2, 0.3, 0.4, respectively, are calculated. Three different levels of SR̂ are chosen: the low, medium, and high Sharpe ratios SR̂ are set to SR − 2σSR, SR, and SR + 2σSR, respectively, where σSR = VGMM^(1/2) and VGMM is the asymptotic variance in (93.9). When the sampling window length T = 60, the standard error σSR is 0.0840. In panel B, the median of the ratio ρ = SR/SR∗ is assumed to be 0.50 and 0.75, respectively. The null H0: ρ = 1 is rejected when the Bayes factor > 1.0.

Table 93.6: Simulation study of portfolio efficiency test of H0: ρ = 1.

                      SR = 0.4000       SR = 0.3000       SR = 0.2000       SR = 0.1000
                      SR̂      Power     SR̂      Power     SR̂      Power     SR̂       Power
A. The conditional F-test
Low                  0.2321   0.1000   0.1321   0.2978   0.0321   0.4601   −0.0679   0.5482
Medium               0.4000   0.1000   0.3000   0.2832   0.2000   0.4461    0.1000   0.5459
High                 0.5679   0.1000   0.4679   0.2619   0.3679   0.4156    0.2679   0.5209

B. Power of Shanken's Bayesian test
Median(ρ) = 0.50              0.0920            0.3900            0.6920             0.8440
Median(ρ) = 0.75              0.2380            0.5620            0.7540             0.8720

C. Power of proposed Bayesian test
                              0.0900            0.5320            0.8980             0.9940

Notes: The number of risky assets N = 10 and the sampling window length T = 60. Powers of the three tests when the ex ante maximum Sharpe ratio SR∗ = 0.4 and the ex ante Sharpe ratio SR = 0.1, 0.2, 0.3, 0.4, respectively, are calculated. In panel A, the low, medium, and high Sharpe ratios SR̂ are set to SR − 2σSR, SR, and SR + 2σSR, respectively, where the standard error σSR = VGMM^(1/2) = 0.0840. In panel B, 0.50 and 0.75 are assumed for the median of the ratio ρ = SR/SR∗. In panel C, the prior distribution of the ex ante Sharpe ratio SR is chosen to be uniform on (0, 0.4). The null H0 is rejected when the Bayes factor > 1.0.

From Table 93.6, the powers of the three tests are all increasing as the ex ante Sharpe ratio SR deviates further from the maximum Sharpe ratio
July 6, 2020
15:55
Handbook of Financial Econometrics,. . . (Vol. 3)
3256
9.61in x 6.69in
b3568-v3-ch93
L.-J. Kao, H. C. Soo & C. F. Lee
SR∗ = 0.4. Among the three tests, the proposed Bayes test has the largest power when the ex ante Sharpe ratio SR = 0.1 and 0.2. On the other hand, when the ex ante Sharpe ratio SR = SR∗ = 0.4, the proposed Bayes test has the smallest power, i.e., the smallest probability of incorrectly rejecting the null H0 : ρ = 1. It can also be seen in Table 93.6 that the conditional F-test always has the smallest power under the alternative hypothesis H1 : ρ < 1. When the ex ante Sharpe ratio SR = 0.3, the performance of the proposed Bayes test relative to Shanken's depends on the prior information about median(ρ), where ρ = SR/SR∗. When median(ρ) is set to 0.50, the proposed Bayes test outperforms Shanken's Bayes test in power. However, when median(ρ) is set to 0.75, the power of the proposed Bayes test almost ties with that of Shanken's Bayes test. This also highlights the fact that Shanken's test is very sensitive to the parameter value chosen for median(ρ), as can be seen in panel B of Table 93.6.
93.5 Conclusion

Because the expected mean and covariance matrix of all the asset returns are not observed in practice, a "plug-in" portfolio has been widely adopted as the market portfolio, with its sample mean and sample covariance matrix used as proxies. However, the out-of-sample performance of the plug-in portfolio is usually poor (Ao et al., 2018), and the mean–variance efficiency of the market portfolio is often controversial (Levy and Roll, 2010). To resolve the problem, researchers seek to improve portfolio performance by plugging in better estimates of the underlying mean and covariance matrix. Regarding estimation of the covariance matrix, a widely used alternative estimator is the linear shrinkage estimator proposed by Ledoit and Wolf (2003, 2004), which estimates the covariance matrix using a suitable linear combination of the sample covariance matrix and a target matrix. More recently, Ledoit and Wolf (2017) propose a nonlinear shrinkage estimator of the covariance matrix, together with a factor-model-adjusted version, that is suitable for portfolio optimization.
The Bayesian test of the market portfolio's efficiency, which accounts for the sampling error associated with the ex post Sharpe ratio SR̂, provides guidance on deciding whether a "plug-in" portfolio is mean–variance efficient and can be used as a proxy for the market portfolio. Both an empirical analysis and a simulation study are given to show the performance of the proposed Bayesian test. It not only successfully resolves the concerns over the power of the conditional F-test of Gibbons, Ross, and Shanken (1989) but also outperforms
the Bayesian test developed by Shanken (1987). Compared to previously developed Bayesian tests of portfolio efficiency in the literature, the proposed Bayesian test requires less prior information. More appropriate conclusions can therefore be expected from the proposed Bayesian test.
Bibliography

M. Ao, Y. Li, and X. Zheng (2018). Approaching Mean–Variance Efficiency for Large Portfolios. Available at SSRN: https://ssrn.com/abstract=2699157.
F. Black, M. C. Jensen, and M. Scholes (1972). The Capital Asset Pricing Model: Some Empirical Tests. In M. C. Jensen, ed., Studies in the Theory of Capital Markets, New York: Praeger.
M. Brière, B. Drut, V. Mignon, K. Oosterlinck, and A. Szafarz (2013). Is the Market Portfolio Efficient? A New Test of Mean–Variance Efficiency When All Assets are Risky. Finance, 34(1), 7–41.
S. Christie (2005). Is the Sharpe Ratio Useful in Asset Allocation? MAFC Research Paper No. 31, Applied Finance Center, Macquarie University.
E. F. Fama and J. D. MacBeth (1973). Risk, Return, and Equilibrium: Empirical Tests. Journal of Political Economy, 81, 607–636.
M. R. Gibbons (1982). Multivariate Tests of Financial Models: A New Approach. Journal of Financial Economics, 10, 3–28.
M. R. Gibbons, S. Ross, and J. Shanken (1989). A Test of the Efficiency of a Given Portfolio. Econometrica, 57, 1121–1152.
C. R. Harvey and G. Zhou (1990). Bayesian Inference in Asset Pricing Tests. Journal of Financial Economics, 26, 221–254.
H. Jeffreys (1961). Theory of Probability. London: Oxford University Press.
J. D. Jobson and B. M. Korkie (1985). Some Tests of Linear Asset Pricing with Multivariate Normality. Canadian Journal of Administrative Science, 2, 114–138.
R. E. Kass and A. E. Raftery (1995). Bayes Factors. Journal of the American Statistical Association, 90(430), 773–795.
S. Kandel, R. McCulloch, and R. F. Stambaugh (1995). Bayesian Inference and Portfolio Efficiency. Review of Financial Studies, 8, 1–53.
O. Ledoit and M. Wolf (2003). Improved Estimation of the Covariance Matrix of Stock Returns with an Application to Portfolio Selection. Journal of Empirical Finance, 10(5), 603–621.
O. Ledoit and M. Wolf (2004). A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices. Journal of Multivariate Analysis, 88(2), 365–411.
O. Ledoit and M. Wolf (2017). Nonlinear Shrinkage of the Covariance Matrix for Portfolio Selection: Markowitz Meets Goldilocks. The Review of Financial Studies, 30(12), 4349–4388.
M. Levy and R. Roll (2010). The Market Portfolio may be Mean–Variance Efficient After All. Review of Financial Studies, 23, 2464–2491.
J. Lintner (1965). The Valuation of Risk Assets and Selection of Risky Investments in Stock Portfolios and Capital Budgets. Review of Economics and Statistics, 47, 13–27.
A. C. MacKinlay (1987). On Multivariate Tests of the CAPM. Journal of Financial Economics, 18, 341–371.
A. C. MacKinlay and M. P. Richardson (1991). Using Generalized Method of Moments to Test Mean–Variance Efficiency. Journal of Finance, 46, 511–527.
H. M. Markowitz (1952). Portfolio Selection. Journal of Finance, 7(1), 77–91.
D. F. Morrison (1976). Multivariate Statistical Methods, 2nd ed., New York: McGraw-Hill.
J. D. Opdyke (2007). Comparing Sharpe Ratios: So Where are the p-Values? Journal of Asset Management, 8(5), 308–336.
R. Roll (1977). A Critique of the Asset Pricing Theory's Tests; Part I: On Past and Potential Testability of the Theory. Journal of Financial Economics, 4, 129–176.
R. Roll and S. A. Ross (1994). On the Cross-Sectional Relation Between Expected Returns and Betas. Journal of Finance, 49, 101–121.
J. Shanken (1985). Multivariate Tests of the Zero-Beta CAPM. Journal of Financial Economics, 14, 327–348.
J. Shanken (1987). A Bayesian Approach to Testing Portfolio Efficiency. Journal of Financial Economics, 19, 195–215.
J. Shanken (1996). Statistical Methods in Tests of Portfolio Efficiency: A Synthesis. In Handbook of Statistics, 14, Elsevier Science, 693–711.
W. F. Sharpe (1964). Capital Asset Prices: A Theory of Market Equilibrium Under Conditions of Risk. Journal of Finance, 19(3), 425–442.
Appendix 93A: Derivation of F-Test Statistics Vu by Gibbons, Ross and Shanken

Rewrite the multivariate normal system (93.1) as rt = α + βRt + εt, where rt = (r1t, . . ., rN t) and Rt are the time-t excess returns of the N risky assets and the test portfolio p, respectively, α = (α1, . . ., αN ) are the N intercepts, and β = (β1, . . ., βN ) are the N slopes, 1 ≤ t ≤ T. The cross-sectional disturbances εt = (ε1t, . . ., εN t) ∼ MN(0, Σ), 1 ≤ t ≤ T. In addition, the time series of random disturbances ε1, . . ., εT are i.i.d. Thus, the ordinary least-squares estimators α̂ = (α̂1, . . ., α̂N ) of the N intercepts α, conditional on the test portfolio p's excess returns R = (R1, . . ., RT ), have the multivariate normal distribution

    α̂ ∼ MN( α, ((1 + SR̂²)/T) Σ ).

Note that α̂ and the maximum likelihood estimator Σ̂ of the covariance matrix are independent. As (T − 2)Σ̂ has a Wishart distribution, by Morrison (1976, p. 131), the F-test statistic

    Vu = [T(T − N − 1)/(N(T − 2))] · α̂′ Σ̂⁻¹ α̂ / (1 + SR̂²),
has a non-central F distribution with degrees of freedom N and (T − N − 1), and non-centrality parameter

    λ = T(SR∗² − SR²)/(1 + SR̂²),

where SR is the ex ante Sharpe ratio of the test portfolio p, and SR∗ is the ex ante maximum Sharpe ratio attainable by the N risky assets and the test portfolio p.

Appendix 93B: Bayesian Test Using Posterior-Odds Ratio or Bayes Factor

The posterior-odds ratio summarizes a prior belief as well as the evidence provided by the sample data about the hypotheses being tested. Specifically, let pr(H0) and pr(H1) = 1 − pr(H0) denote the prior probabilities for the null and alternative hypotheses, respectively, and let D be the sample dataset. The posterior-odds ratio is defined as

    Posterior-odds = pr(H0|D)/pr(H1|D) = [pr(D|H0)/pr(D|H1)] × [pr(H0)/pr(H1)].    (93B.1)
Note that when the priors are even, pr(H1) = pr(H0) = 1/2, i.e., the prior odds is one, the posterior-odds ratio can be expressed through the Bayes factor B10:

    B10 = p(D|H1)/p(D|H0).    (93B.2)
From Bayes' theorem, the posterior probabilities are

    pr(Hk|D) = pr(D|Hk) pr(Hk) / [pr(D|H0) pr(H0) + pr(D|H1) pr(H1)],

where pr(D|Hk) is the likelihood under Hk for k = 0, 1. When the hypothesis Hk, k = 0, 1, is composite and involves an unknown parameter θ, the likelihood pr(D|Hk) becomes

    pr(D|Hk) = ∫ pr(D|θ, Hk) π(θ) dθ,    (93B.3)

where π(·) is the prior distribution of the parameter θ (Kass and Raftery, 1995).
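The machinery of (93B.1)–(93B.3) can be illustrated with a minimal, self-contained toy example: a point null against a composite alternative whose likelihood is integrated against the prior by Monte Carlo averaging. The normal likelihood and uniform prior below are illustrative assumptions, not the chapter's model:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Toy data: one observation D from N(mu, 1).
D = 0.8

# Point null H0: mu = 0  ->  pr(D|H0) is the likelihood at mu = 0.
p_d_h0 = norm.pdf(D, loc=0.0, scale=1.0)

# Composite H1: mu ~ Uniform(0, 1) prior pi(.)  ->  (93B.3) integrates the
# likelihood over the prior; approximate by averaging over prior draws.
mu_draws = rng.uniform(0.0, 1.0, size=100_000)
p_d_h1 = norm.pdf(D, loc=mu_draws, scale=1.0).mean()

B10 = p_d_h1 / p_d_h0    # Bayes factor (93B.2)
prior_odds = 1.0         # even priors pr(H0) = pr(H1) = 1/2
posterior_odds_h1 = B10 * prior_odds
```

Here B10 exceeds one, i.e., the positive observation mildly favors the composite alternative over the point null.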
page 3259
July 6, 2020
15:55
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch93
L.-J. Kao, H. C. Soo & C. F. Lee
3260
Appendix 93C: Derivation of Bayes Factor B10 by Shanken

Note that the null hypothesis H0 : λ = 0 is a point mass, whereas the alternative hypothesis H1 : λ ≠ 0 is composite, where λ is the non-centrality parameter in (93.3). Following Jeffreys (1961), when a point null H0 is to be tested against a composite alternative H1, some prior probability π is first assigned to the point null H0, and the remaining mass is spread out according to some continuous density. Let g(λ) be the continuous density for the composite alternative H1 : λ ≠ 0. Here g(λ) is chosen to be the prior distribution of λ given in (93.4). By (93B.1), the posterior-odds ratio of H1 to H0 is

    pr(D|H1) pr(H1) / [pr(D|H0) pr(H0)] = (1 − π) ∫₀^∞ f(u|λ) g(λ) dλ / [π f(u|λ = 0)],    (93C.1)

where f(·|λ) is the density of the non-central F distribution with N and T − N − 1 degrees of freedom and non-centrality parameter λ. By Shanken (1985), the numerator of (93C.1) is

    (1 − π) ∫₀^∞ f(u|λ) g(λ) dλ = (1 − π)(1 + Tc)⁻¹ f((1 + Tc)⁻¹ u | λ = 0),

together with the fact that the density of a central F distribution with N and T − N − 1 degrees of freedom satisfies

    f(u|λ = 0) ∝ u^((N−2)/2) [1 + N(T − N − 1)⁻¹ u]^(−(T−1)/2).

Letting the prior probability π of the point null H0 : λ = 0 be 0.5, one obtains the Bayes factor in (93.6).
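The right-hand side of (93C.1) can also be evaluated by direct numerical integration: the numerator integrates the non-central F density against g(λ), and the denominator is the central F density at the observed statistic. Since the prior g(λ) of (93.4) is specified elsewhere in the chapter, an exponential density is used below purely as a stand-in, and the values of u and the prior scale are illustrative:

```python
import numpy as np
from scipy.stats import ncf, expon
from scipy.integrate import quad

N, T = 10, 60
dfn, dfd = N, T - N - 1
u = 1.8      # observed value of the F statistic V_u (illustrative)
pi0 = 0.5    # prior mass on the point null H0: lambda = 0

# Stand-in for g(lambda) in (93C.1); the chapter's actual prior is (93.4).
g = expon(scale=5.0).pdf

# Numerator: integral of f(u|lambda) g(lambda) over lambda in (0, inf).
num, _ = quad(lambda lam: ncf.pdf(u, dfn, dfd, lam) * g(lam), 0, np.inf)
# Denominator: central F density f(u | lambda = 0).
den = ncf.pdf(u, dfn, dfd, 0.0)

posterior_odds_h1 = ((1 - pi0) * num) / (pi0 * den)
```

With π = 0.5 the prior-odds term drops out, so this quantity is also the Bayes factor of H1 against H0.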
Appendix 93D: Derivation of Bayes Factor in (93.8)

Consider the two-dimensional test statistic D = (Vu, SR̂), where SR̂ is the ex post Sharpe ratio of the test portfolio p, and the statistic Vu is defined in (93.2). Recall that, given SR̂, the statistic Vu is non-central F distributed with N and (T − N − 1) degrees of freedom, respectively, and the non-centrality parameter λ is given in (93.3). Note that the non-centrality parameter λ depends on the ex ante Sharpe ratio SR; thus, when SR = s, the
non-centrality parameter can be rewritten as

    λ(s) = T(SR∗² − s²)/(1 + SR̂²),

where SR∗ is the ex ante maximum Sharpe ratio attainable by the N risky assets. When the ex ante Sharpe ratio SR = s, the joint likelihood pr(D|Hk), k = 0, 1, is

    g(SR̂|SR = s) f_{N,T−N−1}(Vu | λ(s)),    (93D.1)

where g(·|·) is the p.d.f. of SR̂, and f_{N,T−N−1}(·|λ(s)) is the p.d.f. of the non-central F distribution. Note that since λ(s) = 0 under the null H0 : ρ = 1, the non-central F distribution f_{N,T−N−1}(·|λ(s)) becomes a central F distribution f_{N,T−N−1}(·) with N and (T − N − 1) degrees of freedom, respectively. The joint likelihood pr(D|H1) for the composite alternative hypothesis H1 : ρ < 1 is derived as follows. Let π(·) be the prior distribution of the ex ante Sharpe ratio SR, and assume the ex ante maximum Sharpe ratio SR∗ equals the ex post maximum Sharpe ratio SR̂∗. By (93A.2), the joint likelihood p(D|H1) is

    ∫ g(SR̂|SR = w) f_{N,T−N−1}(Vu | λ∗(w)) π(w) dw,    (93D.2)

where λ∗(w) = T(SR̂∗² − w²)/(1 + SR̂²). The integral (93D.2) can be approximated via Monte Carlo numerical integration by drawing K realizations w1, . . ., wK of the ex ante Sharpe ratio SR, for sufficiently large K, from its prior distribution π(·) to yield

    p(D|H1) = (1/K) Σ_{j=1}^{K} g(SR̂|wj) f_{N,T−N−1}(Vu | λ∗(wj)).    (93D.3)

The Bayes factor in (93.8) can be obtained accordingly from (93D.2) and (93D.3).
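The Monte Carlo approximation (93D.3) can be sketched as follows. The normal form assumed for g(SR̂|SR = w), and the illustrative values of V_u, SR̂, and SR̂∗, are assumptions of this sketch rather than the chapter's exact specification:

```python
import numpy as np
from scipy.stats import ncf, norm, f as fdist

rng = np.random.default_rng(42)

N, T = 10, 60
dfn, dfd = N, T - N - 1
sr_star_hat = 0.4    # ex post maximum Sharpe ratio SR_hat* (illustrative)
sr_hat = 0.25        # ex post Sharpe ratio of the test portfolio (illustrative)
vu = 1.6             # observed value of the statistic V_u (illustrative)
sigma_sr = 0.0840    # GMM standard error of the Sharpe ratio, as in Table 93.6

# g(SR_hat | SR = w): sampling density of the ex post Sharpe ratio, taken
# here as normal with the GMM standard error (an assumption of this sketch).
def g(w):
    return norm.pdf(sr_hat, loc=w, scale=sigma_sr)

# H0: rho = 1 -> SR = SR*, lambda = 0, central F density for V_u (93D.1).
p_d_h0 = g(sr_star_hat) * fdist.pdf(vu, dfn, dfd)

# H1: rho < 1 -> average the joint likelihood over K draws from the
# uniform (0, 0.4) prior, as in (93D.3).
K = 20_000
w = rng.uniform(0.0, sr_star_hat, size=K)
lam = T * (sr_star_hat**2 - w**2) / (1.0 + sr_hat**2)
p_d_h1 = np.mean(g(w) * ncf.pdf(vu, dfn, dfd, lam))

B10 = p_d_h1 / p_d_h0    # Bayes factor of (93.8); H0 rejected when B10 > 1
```

The same skeleton, with the chapter's exact g(·|·) and observed statistics plugged in, produces the Bayes factor used in the empirical analysis and in Panel C of Table 93.6.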
Chapter 94
Does Revenue Momentum Drive or Ride Earnings or Price Momentum?∗

Hong-Yi Chen, Sheng-Syan Chen, Chin-Wen Hsin and Cheng Few Lee

Contents
94.1 Introduction . . . 3264
94.2 Revenue, Earnings, and Price Momentum Strategies . . . 3269
     94.2.1 Measures for earnings surprises and revenue surprises . . . 3269
     94.2.2 Measuring the profitability of revenue, earnings, and price momentum strategies . . . 3270
94.3 Data and sample descriptions . . . 3270
     94.3.1 Data . . . 3270
     94.3.2 Sample descriptions . . . 3271
     94.3.3 Descriptive statistics for stocks grouped by SURGE, SUE, and prior returns . . . 3274
Hong-Yi Chen National Chengchi University e-mail: [email protected] Sheng-Syan Chen National Chengchi University e-mail: [email protected] Chin-Wen Hsin Yuan Ze University e-mail: [email protected] Cheng Few Lee Rutgers University e-mail: cfl[email protected] ∗
This chapter is a reprint of the paper “Does revenue momentum drive or ride earnings or price momentum?,” which was published in Journal of Banking and Finance, Vol. 38, pp. 166–185 (2014). 3263
94.4 Empirical Results of Univariate Momentum Strategies . . . 3278
94.5 Interrelation of Revenue, Earnings, and Price Momentum . . . 3281
     94.5.1 Testing for dominance among the momentum strategies . . . 3281
     94.5.2 Two-way sorted portfolio returns and momentum cross-contingencies . . . 3290
     94.5.3 Combined momentum strategies . . . 3299
94.6 Persistency and Seasonality . . . 3303
     94.6.1 Persistence of momentum effects . . . 3303
     94.6.2 Seasonality . . . 3311
94.7 Conclusions . . . 3312
Bibliography . . . 3313
Appendix 94: Measures of Earnings and Revenue Surprises . . . 3316
Abstract

This chapter examines the profits of revenue, earnings, and price momentum strategies in an attempt to understand how investors react when facing multiple pieces of information on firm performance in various scenarios. We first offer evidence that there is no dominating momentum strategy among the revenue, earnings, and price momentums, suggesting that revenue surprises, earnings surprises, and prior returns each carry some exclusive unpriced information content. We next show that the profits of momentum driven by firm fundamental performance information (revenue or earnings) depend upon the accompanying firm market performance information (price), and vice versa. The robust monotonicity in multivariate momentum returns is consistent with the argument that the market underestimates not only the individual information but also the joint implications of multiple pieces of information on firm performance, particularly when they point in the same direction. A three-way combined momentum strategy may offer a monthly return as high as 1.44%. The information conveyed by revenue surprises and earnings surprises combined accounts for about 19% of price momentum effects, a finding that adds to the large literature on tracing the sources of price momentum.

Keywords: Revenue surprises • Earnings surprises • Post-earnings-announcement drift • Momentum strategies.
94.1 Introduction

Financial economists have long been puzzled by two robust and persistent anomalies in the stock market: price momentum (see Jegadeesh and Titman, 1993, 2001; Rouwenhorst, 1998) and post-earnings-announcement drift (see Ball and Brown, 1968; Foster et al., 1984; Bernard and Thomas, 1989; Chan et al., 1996). More recently, Jegadeesh and Livnat (2006b) also find that
price reactions to revenue surprises on announcement dates only partially reflect the incremental information conveyed by the surprises. The information contents carried by revenue, earnings, and stock prices are intrinsically linked through firm operations and investor evaluation, and there is evidence of mutual predictability of their respective future values (e.g., see Jegadeesh and Livnat, 2006b). Nonetheless, investors, even if aware of the linkages among the information content conveyed by revenue, earnings, and prices (see Ertimur et al., 2003; Raedy et al., 2006; Heston and Sadka, 2008), may still fail to take full account of their joint implications when pricing stocks. This chapter investigates how investors price securities when facing multiple information contents of a firm, particularly the firm performance information most accessible to investors: price, earnings, and revenue.1 The long–short momentum strategy, widely used in the literature, provides a venue to detect market reactions toward individual and multiple information contents. Accordingly, this study starts by documenting revenue momentum profits and reconfirming earnings and price momentum profits. Explorations with momentum strategies are expected to yield implications that answer our two research questions. First, among the performance information of revenue surprises, earnings surprises, and prior returns, does each carry some exclusive information content that is not priced by the market? Second, do investors misreact toward the joint implications as well as the individual information of firm revenue, earnings, and price? Our first research question is explored by testing momentum dominance. One momentum strategy is said to be dominated if its payoffs can be fully captured by the information measure serving as the sorting criterion of another momentum strategy. Note that our emphasis here is not asset pricing tests; instead, as in Chan et al.
(1996) and Heston and Sadka (2008), we focus on the return anomalies based on revenue surprises, earnings surprises, and
1 Research in the literature offers some evidence on the information linkage among revenue, earnings, and prices. For example, Lee and Zumwalt (1981) find that revenue information is complementary to earnings information in determining security rates of return. Bagnoli et al. (2001) find that revenue surprises, but not earnings surprises, can explain stock prices both during and after the internet bubble. Swaminathan and Weintrop (1991) and Ertimur et al. (2003) suggest that the market reacts significantly more strongly to revenue surprises than to expense surprises. Rees and Sivaramakrishnan (2001) and Jegadeesh and Livnat (2006b) also find that, conditional on earnings surprises, there is still a certain extent of market reaction to the information conveyed by revenue surprises. Ghosh et al. (2005) find that sustained increases in earnings are supported by sustained increases in revenues rather than by cost reductions.
prior returns. Results from both a pairwise-nested comparison and a regression analysis indicate that revenue surprises, earnings surprises, and prior returns each lead to significant momentum returns that cannot be explained away by one another. That is, revenue momentum neither drives nor rides earnings or price momentum. Following the information diffusion hypothesis of Hong and Stein (1999), our evidence then suggests that revenue surprises, earnings surprises, and prior returns each contribute to the phenomenon of gradual information flow, or that each carries some exclusive information content that is not priced by the market.2 Further regression tests indicate that earnings surprise and revenue surprise information account for about 14% and 10% of price momentum returns, respectively, and that these two fundamental performance measures combined account for just about 19% of price momentum effects. These results provide additional evidence in the literature on the sources of price momentum (e.g., see Moskowitz and Grinblatt, 1999; Lee and Swaminathan, 2000; Piotroski, 2000; Grundy and Martin, 2001; Chordia and Shivakumar, 2002, 2005; Ahn et al., 2003; Griffin et al., 2003; Bulkley and Nawosah, 2009; Chui et al., 2010; Novy-Marx, 2012). Our second research question asks how the market reacts to the joint implications of multiple information measures. The three measures under study all carry important messages on innovations in firm performance and are therefore expected to trigger investor reactions. They thus become ideal targets for studying how investors process multiple pieces of information interactively in pricing stocks. The results from two-way sorted portfolios show that the market anomalies vary monotonically with the joint condition of revenue surprises, earnings surprises, and prior returns, and that anomalies tend to be strongest when stocks show the strongest signals in the same direction.
The cross-contingencies of momentums are observed in that the momentum returns driven by fundamental performance information (revenue surprises or earnings surprises) change with the accompanying market performance information (prior returns), and vice versa. Such a finding, as interpreted through the gradual-information-diffusion model, is consistent
2 The asset pricing tests of Chordia and Shivakumar (2006) support the view that price momentum is subsumed by the systematic component of earnings momentum, even though they also find that earnings surprises and past returns have independent explanatory power for future returns. This latter finding is consistent with the results of Chan et al. (1996) and our results, as reported later. In comparison, Chan et al. (1996), Jegadeesh and Livnat (2006b), and we focus on whether and how firm characteristics, such as revenue surprises, earnings surprises, and prior returns, are related to future cross-sectional returns, while Chordia and Shivakumar (2006) also conduct asset pricing tests.
with the suggestion that the market not only underreacts to individual firm information but also underestimates the significance of the joint implications of revenue, earnings, and price information.3 These results also have interesting implications for investment strategies: fundamental performance information plays an important role in differentiating future returns among price winners, while market performance information is particularly helpful in predicting future returns for stocks with high surprises in revenue or earnings. Specifically, price winners, compared to price losers, yield higher returns from revenue/earnings momentum strategies; stocks with greater surprises in fundamentals yield greater returns from price momentums. The results of our dominance tests and multivariate momentum analysis suggest that a combined momentum strategy should yield better results than single-criterion momentum strategies. A combined momentum strategy using all three performance measures is found to yield monthly returns as high as 1.44%, which amounts to an annual return of 17.28%. Such a combined momentum strategy outperforms single-criterion momentum strategies by at least 0.72 percentage points in monthly return. Our conclusions remain robust whether we use raw returns or risk-adjusted returns, whether we include January results or not, and whether we use dependent or independent sorts. Chan et al. (1996), Piotroski (2000), Griffin et al. (2005), Mohanram (2005), Sagi and Seasholes (2007), Asem (2009), and Asness et al. (2013) conduct similar tests on combined momentum strategies using alternative sorting criteria.4 In comparison, our study is the first to document results considering these three firm performance measures (revenue surprises, earnings surprises, and prior returns) altogether.
3 The firm performance measures (revenue, earnings, and stock price) do not only share common origins endogenously but also have added implications for future values of one another. Jegadeesh and Livnat (2006b) have documented evidence on the temporal linkages among these variables. In this chapter, we focus on the further inquiry of whether investors fully exploit such temporal linkages among these firm performance measures in pricing stocks.
4 Chan et al. (1996) and Griffin et al. (2005) find that when sorting on prior price performance and earnings surprises together, the profits of a zero-investment portfolio are higher than those from single sorting. Piotroski (2000) and Mohanram (2005) develop fundamental indicators, FSCORE and GSCORE, to separate winners from losers. Sagi and Seasholes (2007) find that the price momentum strategy becomes even more profitable when applied to stocks with high revenue growth volatility, low costs, or valuable growth options. Asness et al. (2013) find that a combination of value and momentum strategies can perform better than either one alone. Asem (2009) finds that momentum profits can be enhanced by combining prior price returns and dividend behaviors.
In terms of persistency, the earnings momentum strategy is found to exhibit the strongest persistence, while the revenue momentum strategy is relatively short-lived. All the same, the short-lived revenue momentum effect is prolonged when the strategy is executed using stocks with the best prior price performance and more positive earnings surprises. In fact, the general conclusion supports our claim of cross-contingencies of momentum as applied to momentum persistence. This study contributes to the finance literature in several respects. First, we specifically identify the profitability of revenue momentum and its relation with earnings surprises and prior returns in terms of momentum strength and persistence. A revenue momentum strategy executed with a six-month formation period and a six-month holding period yields an average monthly return of 0.61% for the period between 1974 and 2009. Second, this study identifies empirical inter-relations of anomalies arising from three firm performance measures: revenue, earnings, and price. To the best of our knowledge, we are the first to offer evidence that there is no dominating momentum strategy among the three, and that the profits of momentum driven by firm fundamental performance information (revenue or earnings) depend upon the accompanying firm market performance information (price), and vice versa.5 Third, aside from academic interest, the aforementioned findings may well serve as useful guidance for asset managers seeking profitable investment strategies. Fourth, this study also adds to the large literature attempting to trace the sources of price momentum. Our numbers indicate that the information conveyed by revenue surprises and earnings surprises combined accounts for about 19% of price momentum effects.
Last, our results offer additional evidence to the literature on the behavioral explanation for momentums.6 Our empirical results are consistent with the suggestion that revenue surprises, earnings surprises, and prior returns each carry some exclusive unpriced information content. Moreover, the monotonicity of abnormal returns found in multivariate momentums suggests that

5 Heston and Sadka (2008) and Novy-Marx (2012) also provide evidence that earnings surprises are unable to explain price momentum. However, this study is the first to consider earnings surprises and revenue surprises at the same time in explaining price momentum.
6 Barberis et al. (1998), Daniel et al. (1998), Hong and Stein (1999), Jackson and Johnson (2006), Verardo (2009), and Moskowitz et al. (2012) provide evidence in support of a behavioral explanation of the momentum effect, while Grundy and Martin (2001), Johnson (2002), Ahn et al. (2003), Sagi and Seasholes (2007), Li et al. (2008), Liu and Zhang (2008), and Wang and Wu (2011) attribute the momentum effect to missing risk factors. In addition, Korajczyk and Sadka (2004) and Lesmond et al. (2004) re-examine the profitability of momentum strategies after taking transaction costs into account and get mixed results.
the market underestimates not only the individual information but also the joint implications of multiple pieces of information on firm performance. Such a suggestion is new to the literature and may also present a venue to track the sources of price momentum.
The chapter is organized as follows. In Section 94.2, we develop our models and describe the methodologies. In Section 94.3, we describe the data. In Section 94.4, we report the results on momentum strategies based on a single criterion. In Section 94.5, we discuss the empirical results of the exploration of inter-relations among revenue, earnings, and price momentums using strategies built on multiple sorting criteria. In Section 94.6, we test the persistency and seasonality of momentum strategies. Finally, Section 94.7 concludes.

94.2 Revenue, Earnings, and Price Momentum Strategies

94.2.1 Measures for earnings surprises and revenue surprises

We follow Jegadeesh and Livnat (2006a, b) and measure revenue surprises and earnings surprises based on historical revenues and earnings.7 Assuming that both quarterly revenue and quarterly earnings per share follow a seasonal random walk with a drift, we define the measure of revenue surprises for firm i in quarter t, standardized unexpected revenue growth (SURGE), as

    SURGE_{i,t} = (Q^R_{i,t} − E(Q^R_{i,t})) / σ^R_{i,t},    (94.1)

where Q^R_{i,t} is the quarterly revenue of firm i in quarter t, E(Q^R_{i,t}) is the expected quarterly revenue prior to the earnings announcement, and σ^R_{i,t} is the standard deviation of quarterly revenue growth. The same method is applied to measure earnings surprises, specifically standardized unexpected earnings (SUE), defined as

    SUE_{i,t} = (Q^E_{i,t} − E(Q^E_{i,t})) / σ^E_{i,t},    (94.2)
where QE i,t is the quarterly earnings per share from continuing operations, E(QE i,t ) is the expected quarterly earnings per share prior to earnE is the standard deviation of quarterly earnings ings announcement, and σi,t growth. 7
See Appendix for a detailed discussion of measures to estimate revenue and earnings surprises.
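Under the seasonal-random-walk-with-drift assumption, both surprise measures can be computed with the same routine. The sketch below is illustrative, not the authors' code; in particular, the eight-quarter estimation window is an assumption here (the chapter's exact estimators are discussed in the Appendix).

```python
import numpy as np

# A sketch of equations (94.1)-(94.2) under the seasonal random walk
# with drift. The eight-quarter estimation window is an illustrative
# assumption; the chapter's exact estimators are in the Appendix.
def standardized_surprise(quarterly_values, lookback=8):
    """quarterly_values: quarterly revenue (for SURGE) or quarterly EPS
    from continuing operations (for SUE), oldest first."""
    q = np.asarray(quarterly_values, dtype=float)
    growth = q[4:] - q[:-4]            # seasonal (year-over-year) changes
    history = growth[-lookback:][:-1]  # changes observed before quarter t
    drift = history.mean()             # estimated drift of seasonal growth
    sigma = history.std(ddof=1)        # std of quarterly growth
    expected = q[-5] + drift           # E(Q_t) = Q_{t-4} + drift
    return (q[-1] - expected) / sigma
```

Feeding the function a firm's quarterly revenue series yields its SURGE for the latest quarter; feeding it the EPS series yields its SUE.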
H.-Y. Chen et al.
94.2.2 Measuring the profitability of revenue, earnings, and price momentum strategies

We construct all three momentum strategies based on the approach suggested by Jegadeesh and Titman (1993). To evaluate the information effect of earnings surprises on stock returns, we form an earnings momentum strategy analogous to the one designed by Chordia and Shivakumar (2006). At the end of each month, we sort sample firms by SUE and then group the firms into ten deciles.8 Decile 1 includes stocks with the most negative earnings surprises, and Decile 10 includes those with the most positive earnings surprises. The SUEs used in each formation month are obtained from the most recent earnings announcements, made within three months before the formation date. We hold a zero-investment portfolio, long the most-positive-earnings-surprises portfolio and short the most-negative-earnings-surprises portfolio, for the K (K = 3, 6, 9, and 12) subsequent months, without rebalancing the portfolios during the holding period. This positive-minus-negative (PMN) strategy thus holds K different long-positive and short-negative portfolios in each month. Accordingly, we obtain a series of zero-investment portfolio returns, which are the monthly returns to this earnings momentum strategy. Similarly, we apply the PMN method to construct a revenue momentum strategy. In the case of price momentum, we form a zero-investment portfolio each month by taking a long position in the top-decile portfolio (winner) and a short position in the bottom-decile portfolio (loser), and we hold this winner-minus-loser (WML) portfolio for the subsequent K months. We thus obtain a series of zero-investment portfolio returns, i.e., the returns to the price momentum strategy.

94.3 Data and sample descriptions

94.3.1 Data

We collect from Compustat the firms' basic information, earnings announcement dates, and accounting data.
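The overlapping-portfolio construction of Section 94.2.2 can be sketched as follows. This is a simplified illustration, not the authors' code: the array names are hypothetical, and for brevity the sketch equal-weights stocks each month instead of tracking unrebalanced buy-and-hold positions within each cohort.

```python
import numpy as np

# A simplified sketch of the Jegadeesh-Titman overlapping construction:
# each formation month a long-top-decile / short-bottom-decile cohort is
# formed and held for K months; the strategy's return in month m
# averages the K live cohorts.
def zero_investment_returns(signal, ret, K=6, n_groups=10):
    """signal, ret: (T, N) arrays of sorting values (SUE, SURGE, or
    prior returns) and monthly returns for N firms over T months."""
    T, _ = ret.shape
    out = []
    for m in range(K, T):
        cohorts = []
        for j in range(1, K + 1):              # cohorts formed at m-1..m-K
            s = signal[m - j]
            lo, hi = np.quantile(s, [1 / n_groups, 1 - 1 / n_groups])
            cohorts.append(ret[m, s >= hi].mean() - ret[m, s <= lo].mean())
        out.append(np.mean(cohorts))
    return np.array(out)
```

The same routine produces the PMN series when `signal` holds SUE or SURGE, and the WML series when it holds prior 6-month returns.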
Stock prices, stock returns, share codes, and exchange codes are retrieved from the Center for Research in Security Prices (CRSP) files. The sample period is from 1974 to 2009. 8
Note that we sort the sample firms into five quintile portfolios on each criterion in our later construction of multivariate momentum strategies. To conform to the same sorting break points, we also test the single momentum strategies based on quintile portfolios and find the results remain similar to those based on decile portfolios.
Only common stocks (SHRCD = 10, 11) of firms listed on the New York Stock Exchange, the American Stock Exchange, or Nasdaq (EXCHCD = 1, 2, 3, 31, 32, 33) are included in our sample. We exclude from the sample regulated industries (SIC = 4000-4999) and financial institutions (SIC = 6000-6999). We also exclude firms with stock prices below $5 on the formation date, considering that investors generally pay only limited attention to such stocks. For the purpose of estimating their revenue surprises (SURGE), earnings surprises (SUE), and prior price performance, firms in the sample must have at least eight consecutive quarterly earnings announcements and six consecutive monthly returns before each formation month. To examine the return drift following the estimated SURGE, SUE, and prior price performance, firms in the sample must have at least 12 consecutive monthly returns following each formation month. Firms in the sample must also have corresponding SURGE, SUE, size, and book-to-market factors available in each formation month.

94.3.2 Sample descriptions

Table 94.1 presents the summary statistics for firm size, estimates of revenue surprises, and estimates of earnings surprises for our sample firms between 1974 and 2009. Panel A shows that there are 223,831 firm-quarters during the sample period. The median firm market capitalization is $235 million. Panels B and C describe the distributions of the revenue surprises (SURGE) and the earnings surprises (SUE) across firms of different market capitalization and different book-to-market ratios. Around 54% of revenue surprises and 50% of earnings surprises are positive.9 The values of SURGE and SUE are expected to be positively correlated. After all, a firm's income statement starts with revenue (sales) and ends with earnings; these two attributes share common firm operational information to a great extent, and their innovations, SURGE and SUE, should be correlated as well.
Table 94.2 shows the time-series average of the cross-sectional correlations between 1974 and 2009. Panels A and B present, respectively, the Pearson correlations and Spearman rank correlations. The average of 9
To ensure that firm accounting information is available to public investors at the time the stock returns are recorded, we follow the approach of Fama and French (1992) and match the accounting data for all fiscal years ending in calendar year t − 1 with the returns for July of year t through June of t + 1. The market capitalization is calculated by the closing price on the last trading day of June of a year times the number of outstanding shares at the end of June of that year.
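The timing convention in footnote 9 can be sketched as a small helper. This is an illustration of the Fama and French (1992) alignment, with hypothetical function names.

```python
import datetime

# A sketch of the Fama-French (1992) timing convention in footnote 9:
# accounting data for fiscal years ending in calendar year t-1 are
# matched with returns from July of year t through June of year t+1.
def accounting_year_for_return_month(d: datetime.date) -> int:
    t = d.year if d.month >= 7 else d.year - 1  # the "year t" of the return
    return t - 1                                # use fiscal years ending in t-1

# June market capitalization: closing price on the last trading day of
# June times shares outstanding at the end of June of that year.
def june_market_cap(june_close: float, shares_end_of_june: float) -> float:
    return june_close * shares_end_of_june
```

For example, a return recorded in March 1996 falls in the July 1995 to June 1996 window, so it is matched with accounting data for fiscal years ending in 1994.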
Table 94.1: Summary statistics of sample firm characteristics.

Panel A: Sample size and firm market capitalization

         Number of        Market cap (million dollars)
         firm-quarters    Mean     Median   Min    Max
ALL      223,831          2,276    235      0.91   602,433

Panel B. Descriptive statistics of SURGE

           Positive SURGE                  Negative SURGE                   Zero SURGE
           N        Mean   Median  STD     N        Mean    Median  STD     N
ALL        121,525  3.31   2.84    2.34    102,306  -3.00   -2.56   2.21    0
Growth     45,670   3.63   3.25    2.40    27,829   -2.84   -2.35   2.21    0
Mid-BM     50,881   3.21   2.73    2.32    46,309   -3.05   -2.62   2.25    0
Value      24,974   2.91   2.41    2.20    28,168   -3.06   -2.69   2.15    0
Small      61,827   3.19   2.70    2.31    54,935   -2.96   -2.57   2.14    0
Mid-Size   38,338   3.41   2.98    2.37    30,591   -3.02   -2.56   2.28    0
Large      21,360   3.45   2.99    2.40    16,780   -3.06   -2.56   2.33    0

Panel C. Descriptive statistics of SUE

           Positive SUE                    Negative SUE                     Zero SUE
           N        Mean   Median  STD     N        Mean    Median  STD     N
ALL        112,068  2.42   1.89    1.94    111,330  -2.92   -2.11   2.59    433
Growth     37,928   2.47   1.98    1.92    35,407   -2.83   -2.10   2.43    164
Mid-BM     48,767   2.41   1.88    1.94    48,221   -2.92   -2.09   2.60    202
Value      25,373   2.37   1.79    1.95    27,702   -3.04   -2.17   2.76    67
Small      56,746   4.42   1.87    1.94    59,765   -2.86   -2.04   2.54    251
Mid-Size   35,031   2.43   1.91    1.93    33,773   -2.98   -2.18   2.65    125
Large      20,291   2.42   1.92    1.92    17,792   -3.01   -2.21   2.66    57

Notes: This table presents the descriptive statistics for major characteristics of our sample stocks. Our sample includes stocks listed on the NYSE, the AMEX, and Nasdaq with data available to compute book-to-market ratios, revenue surprises, and earnings surprises. All financial service operations and utility companies are excluded. Firms with prices below $5 as of the earnings announcement date are also excluded. Panel A lists numbers of firm-quarter observations between January 1974 and December 2009. Panel B and Panel C, respectively, list the mean and median values of the measure of revenue surprises (SURGE) and of the measure of earnings surprises (SUE) across all firm-quarters in our sample. Statistics for positive surprises, negative surprises, and zero surprises are presented separately. Sample firms are also classified into bottom 30%, middle 40%, and top 30% groups by their respective market capitalizations or book-to-market ratios. The breakpoints for the size subsamples are based on ranked values of market capitalization of NYSE firms. The breakpoints for the book-to-market subsamples are based on ranked values of book-to-market ratio of all sample firms.
both types of correlations between SURGE and SUE is 0.32, while prior price performance is not as strongly correlated with SURGE or SUE, with average correlations of about 0.15 and 0.19, respectively. We then partition the sample by book-to-market ratio (B/M) and size. Value firms and small firms are found to exhibit slightly higher correlations among SURGE, SUE, and prior price performance than growth firms and large firms, although the differences in correlations across B/M and size groups are not significant. Table 94.2 also shows the fractions of months in which non-zero correlations are significant at the 1% level. These numbers again confirm that the correlations between SURGE and SUE tend to be strongest across the various classifications of firms, followed by the correlations between SUE and prior returns, and then those between SURGE and prior returns. These preliminary results suggest that revenue surprises and earnings surprises share highly correlated information, while each still has distinctive content, a conclusion consistent with Swaminathan and Weintrop (1991) and Jegadeesh and Livnat (2006b). The information content conveyed by market information, i.e., prior returns, differs more from that carried by the two fundamental information measures, SURGE and SUE.
94.3.3 Descriptive statistics for stocks grouped by SURGE, SUE, and prior returns

We next compare the firm characteristics of portfolios characterized by different revenue surprises (SURGE), earnings surprises (SUE), and prior returns. All sample stocks are sorted into quintiles based on their SURGE, SUE, and prior 6-month returns independently. The characteristics of those quintile portfolios are reported in Table 94.3. Several interesting observations emerge. The price level, as expected, is found to be lowest for the price losers (P1). Stocks with negative revenue surprises (R1) or negative earnings surprises (E1) also have lower price levels, although the pattern is not as pronounced as for price losers. We also find that price losers (P1) and price winners (P5) tend to be smaller stocks. Another interesting observation, revealed in the book-to-market ratios, is that stocks with the most positive SURGE or the most winning prior returns tend to be growth stocks. Stocks with the most positive SUE also have lower B/M ratios, but to a much lesser degree. This suggests that growth stocks are characterized by strong revenue but not necessarily strong earnings.
Table 94.2: Correlation among revenue surprises, earnings surprises, and prior price performance.

Panel A. Pearson correlations among SURGE, SUE, and prior 6-month returns

                                      Subsample by B/M                    Subsample by Size
Correlated variables    All firms   Value      Mid        Growth     Small      Mid        Large
(SURGE, SUE)            0.3200***   0.3331***  0.3361***  0.2818***  0.3641***  0.2917***  0.2362***
                        (101.17)    (84.46)    (107.04)   (65.93)    (118.69)   (69.91)    (42.64)
                        [100%]      [100%]     [100%]     [100%]     [100%]     [100%]     [71.1%]
(SURGE, Prior returns)  0.1458***   0.1272***  0.1263***  0.1353***  0.1686***  0.1304***  0.1061***
                        (44.09)     (33.86)    (35.67)    (35.36)    (55.44)    (29.78)    (17.78)
                        [88.7%]     [41.5%]    [64.6%]    [62.7%]    [86.9%]    [55.4%]    [35.9%]
(SUE, Prior returns)    0.1868***   0.2120***  0.2015***  0.1496***  0.2330***  0.1523***  0.0959***
                        (65.54)     (57.68)    (54.40)    (47.01)    (75.82)    (40.74)    (20.93)
                        [98.4%]     [81.9%]    [92.7%]    [68.1%]    [98.8%]    [67.1%]    [23.7%]

Panel B. Spearman rank correlations among SURGE, SUE, and prior 6-month returns

                                      Subsample by B/M                    Subsample by Size
Correlated variables    All firms   Value      Mid        Growth     Small      Mid        Large
(SURGE, SUE)            0.3231***   0.3367***  0.3397***  0.2828***  0.3652***  0.2952***  0.2407***
                        (106.09)    (93.92)    (112.08)   (68.22)    (124.45)   (72.92)    (45.40)
                        [100%]      [100%]     [100%]     [99.8%]    [100%]     [100%]     [74.4%]
(SURGE, Prior returns)  0.1426***   0.1227***  0.1255***  0.1315***  0.1647***  0.1285***  0.1032***
                        (42.61)     (33.68)    (36.33)    (33.09)    (55.45)    (29.37)    (17.58)
                        [86.6%]     [41.8%]    [63.4%]    [58.2%]    [87.8%]    [53.3%]    [35.0%]
(SUE, Prior returns)    0.1834***   0.2117***  0.1980***  0.1383***  0.2314***  0.1501***  0.0959***
                        (63.98)     (59.29)    (54.79)    (41.56)    (76.68)    (39.50)    (20.21)
                        [97.2%]     [84.0%]    [91.1%]    [62.0%]    [99.3%]    [64.8%]    [23.2%]

Notes: This table presents the correlations among SURGE, SUE, and prior returns of our sample firms. At the end of each month, each sample firm should have its corresponding most current SUE, most current SURGE, and previous six-month return. SURGE and SUE are winsorized at 5% and 95%, setting all SURGE and SUE values greater than the 95th percentile to the value of the 95th percentile and all SURGE and SUE values smaller than the 5th percentile to the value of the 5th percentile. Panel A lists the average Pearson correlations among SUE, SURGE, and prior returns between 1974 and 2009. Panel B lists the average Spearman rank correlations, where all sample firms are grouped into 10 portfolios based on SURGE, SUE, and prior six-month returns independently at the end of each month. Decile 1 consists of firms with the lowest value of the attribute (SURGE, SUE, or prior six-month returns), and Decile 10 consists of firms with the highest value of the attribute. The correlations are calculated at the end of each month; the values reported in the table are monthly averages of those correlations. Sample firms are further classified into bottom 30%, middle 40%, and top 30% groups by their respective market capitalizations or book-to-market ratios at the end of the formation months. The breakpoints for the size subsamples are based on ranked values of market capitalization of NYSE firms. The breakpoints for the book-to-market subsamples are based on ranked values of book-to-market ratio of all sample firms. The numbers in parentheses are the average t-statistics under the null hypothesis that the correlation is zero. ***, **, and * indicate statistical significance at 1%, 5%, and 10%, respectively. Percentages in brackets represent the fraction of the months with non-zero correlations that are significant at the 1% level.
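The procedure behind Table 94.2 can be sketched as follows: winsorize each month's cross-section at the 5th and 95th percentiles, compute the cross-sectional correlation month by month, and average over time. The function names are illustrative, and ties are given ordinal rather than averaged ranks in this simplified Spearman variant.

```python
import numpy as np

def winsorize_5_95(x):
    # Clamp values at the 5th/95th percentiles, as in the table notes.
    lo, hi = np.percentile(x, [5, 95])
    return np.clip(x, lo, hi)

def _ordinal_ranks(x):
    # Ordinal ranks; tied values are not averaged in this sketch.
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(len(x), dtype=float)
    return r

def avg_monthly_correlation(months_a, months_b, spearman=True):
    """months_a, months_b: sequences of same-length 1-D arrays, one pair
    of firm attributes (e.g. SURGE and SUE) per month. Returns the
    time-series average of the monthly cross-sectional correlations."""
    corrs = []
    for a, b in zip(months_a, months_b):
        a = winsorize_5_95(np.asarray(a, dtype=float))
        b = winsorize_5_95(np.asarray(b, dtype=float))
        if spearman:
            a, b = _ordinal_ranks(a), _ordinal_ranks(b)
        corrs.append(np.corrcoef(a, b)[0, 1])
    return float(np.mean(corrs))
```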
Table 94.3: Descriptive statistics of characteristics of various portfolio groups.

                SURGE quintiles
                R1        R2        R3        R4        R5
Price
  Mean          22.05     23.50     24.13     25.03     27.27
  Median        16.38     17.75     18.13     19.00     21.13
  STD           25.92     26.41     27.11     30.71     29.78
Mkt Cap (million dollars)
  Mean          2,117     2,312     2,483     2,400     2,567
  Median        218       239       239       250       286
  STD           12,173    13,316    14,720    12,593    14,426
B/M
  Mean          0.7426    0.7130    0.6782    0.6389    0.5529
  Median        0.6085    0.5769    0.5381    0.4948    0.4133
  STD           0.5284    0.5215    0.5119    0.5034    0.4695
Prior 6-month returns
  Mean          -0.0026   0.0056    0.0107    0.0153    0.0220
  Median        -0.0025   0.0052    0.0100    0.0145    0.0209
  STD           0.0452    0.0446    0.0457    0.0464    0.0488
SUE
  Mean          -1.9168   -0.7191   -0.0497   0.5215    1.0800
  Median        -1.5894   -0.4592   0.1297    0.6600    1.1971
  STD           3.5535    3.2176    3.1100    3.1184    3.4312
SURGE
  Mean          -4.7404   -1.6453   0.3942    2.4354    5.7428
  Median        -4.5579   -1.5386   0.5175    2.5045    5.6739
  STD           1.8859    1.3789    1.2745    1.2251    1.7428

                SUE quintiles
                E1        E2        E3        E4        E5
Price
  Mean          23.16     23.59     23.93     25.26     26.03
  Median        17.80     17.38     17.75     19.50     19.63
  STD           27.27     32.49     26.73     25.36     28.05
Mkt Cap (million dollars)
  Mean          2,247     2,111     2,189     2,771     2,561
  Median        238       227       236       275       253
  STD           12,883    12,774    11,838    15,122    14,516
B/M
  Mean          0.6868    0.6762    0.6765    0.6485    0.6378
  Median        0.5446    0.5389    0.5371    0.5103    0.4988
  STD           0.5250    0.5226    0.5157    0.4975    0.4948
Prior 6-month returns
  Mean          -0.0037   0.0054    0.0107    0.0156    0.0229
  Median        -0.0038   0.0047    0.0098    0.0145    0.0216
  STD           0.0460    0.0461    0.0459    0.0453    0.0469
SUE
  Mean          -5.2584   -1.5994   0.0029    1.4750    4.3051
  Median        -4.9957   -1.4842   0.0169    1.4329    4.1362
  STD           2.2130    0.8896    0.6252    0.6952    1.5130
SURGE
  Mean          -1.2127   -0.2628   0.4276    1.0688    2.1560
  Median        -1.5664   -0.4180   0.4178    1.1661    2.3349
  STD           4.0386    3.7035    3.5492    3.5099    3.6600

                Prior 6-month return quintiles
                P1        P2        P3        P4        P5
Price
  Mean          16.86     23.39     26.62     28.29     26.83
  Median        12.63     18.12     21.13     22.50     20.13
  STD           21.33     27.84     27.77     29.17     31.85
Mkt Cap (million dollars)
  Mean          1,316     2,524     3,018     3,120     1,902
  Median        169       256       310       322       222
  STD           8,718     14,233    15,384    16,527    10,874
B/M
  Mean          0.7774    0.7317    0.6793    0.6148    0.5223
  Median        0.6408    0.6027    0.5471    0.4822    0.3836
  STD           0.5595    0.5209    0.4991    0.4760    0.4566
Prior 6-month returns
  Mean          -0.0443   -0.0109   0.0090    0.0297    0.0676
  Median        -0.0403   -0.0093   0.0096    0.0294    0.0637
  STD           0.0322    0.0236    0.0220    0.0236    0.0359
SUE
  Mean          -1.4463   -0.6197   -0.1473   0.2722    0.8561
  Median        -1.0550   -0.3205   0.0709    0.4132    0.8788
  STD           3.5826    3.3691    3.3150    3.2623    3.2648
SURGE
  Mean          -0.6392   -0.0739   0.8732    1.6730    0.3424
  Median        -0.7582   -0.1127   0.9336    1.7770    0.3592
  STD           3.8580    3.7885    3.7634    3.7450    3.7861

Notes: This table presents the descriptive statistics of firm characteristics for stocks sorted on SURGE, SUE, and prior returns. All sample stocks are sorted independently according to their SURGE, SUE, and prior 6-month returns. R1 (E1) represents the quintile portfolio of stocks with the most negative SURGE (SUE), and R5 (E5) represents the quintile portfolio of stocks with the most positive SURGE (SUE). Similarly, P1 denotes the quintile portfolio of stocks with the lowest prior six-month returns, while P5 denotes the portfolio of stocks with the highest prior six-month returns. Reported characteristics include price level, market capitalization, B/M ratio, SURGE, SUE, and prior six-month returns for the component stocks in each corresponding quintile portfolio. The reported mean values are the equally weighted averages for stocks in each quintile portfolio.
The last three sections of Table 94.3 list the SURGE, SUE, and prior returns for those sorted portfolios. Stocks with strong SURGE also tend to have higher SUE and higher prior returns. A similar pattern is seen for stocks with high SUE or high prior returns. Stocks with strong SURGE, strong SUE, or winning prior returns tend to excel on all three information dimensions. This relation is consistent with the positive correlations reported in Table 94.2.

94.4 Empirical Results of Univariate Momentum Strategies

Table 94.4 presents the monthly returns to momentum strategies based on firms' revenue surprises (SURGE), earnings surprises (SUE), and prior price performance, termed revenue momentum, earnings momentum, and price momentum strategies, respectively. Decile portfolio results are reported here. We first examine the profitability of revenue momentum. We are interested in whether the well-documented post-announcement revenue drift also enables a profitable investment strategy. Following the earnings momentum strategy of Chordia and Shivakumar (2006), we define a revenue momentum portfolio as a zero-investment portfolio that buys stocks with the most positive revenue surprises and sells stocks with the most negative revenue surprises. Panel A of Table 94.4 reports significant returns to the revenue momentum strategies. These strategies yield average monthly returns of 0.89%, 0.61%, and 0.36%, respectively, when the relative-strength portfolios are held for 3, 6, and 9 months. This research, to the best of our knowledge, is the first to document specific evidence on the profitability of revenue momentum. We also test with more recent data the profitability of earnings momentum and price momentum strategies, which have both been studied in the literature. Panel B of Table 94.4 reports the results for the earnings momentum strategies.
We again find that these positive-minus-negative (PMN) zero-investment portfolios yield significantly positive returns for holding periods ranging from 3 to 12 months. The profit is strongest when the PMN portfolios are held for three months, yielding an average monthly return of 0.99%, significant at the 1% level. The results are consistent with those of Bernard and Thomas (1989) and Chordia and Shivakumar (2006). Chordia and Shivakumar (2006) find a significant monthly return of 0.96% for a 6-month holding-period earnings momentum strategy executed over 1972–1999, while we show a significant monthly return of 0.71% for a sample period extending to 2009.
page 3278
July 6, 2020
15:56
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch94
Does Revenue Momentum Drive or Ride Earnings or Price Momentum?
Table 94.4: strategies.
page 3279
3279
Returns to revenue momentum, earnings momentum, and price momentum
Panel A. Revenue Momentum Returns Holding period 3 months 6 months 9 months 12 months
Low ∗∗∗
0.0074 (2.56) 0.0097∗∗∗ (3.34) 0.0118∗∗∗ (4.01) 0.0131∗∗∗ (4.43)
High ∗∗∗
0.0163 (5.37) 0.0158∗∗∗ (5.17) 0.0154∗∗∗ (5.03) 0.0145∗∗∗ (4.78)
PMN ∗∗∗
0.0089 (7.19) 0.0061∗∗∗ (5.10) 0.0036∗∗∗ (3.03) 0.0014 (1.24)
CAPM Adj. (1) FF3 Adj. (2) 0.0084∗∗∗ (6.88) 0.0056∗∗∗ (4.71) 0.0030∗∗∗ (2.58) 0.0010 (0.87)
0.0105∗∗∗ (9.22) 0.0079∗∗∗ (7.32) 0.0054∗∗∗ (5.16) 0.0034∗∗∗ (3.36)
Panel B. Earnings Momentum Returns Holding period 3 months 6 months 9 months 12 months
Low
High
PMN
0.0079∗∗∗ (2.71) 0.0098∗∗∗ (3.35) 0.0116∗∗∗ (3.92) 0.0127∗∗∗ (4.28)
0.0178∗∗∗ (6.14) 0.0169∗∗∗ (5.81) 0.0164∗∗∗ (5.65) 0.0155∗∗∗ (5.37)
0.0099∗∗∗ (9.77) 0.0071∗∗∗ (7.82) 0.0048∗∗∗ (5.68) 0.0028∗∗∗ (3.60)
CAPM Adj. (1) FF3 Adj. (2) 0.0099∗∗∗ (9.71) 0.0070∗∗∗ (7.71) 0.0048∗∗∗ (5.59) 0.0028∗∗∗ (3.64)
0.0102∗∗∗ (9.90) 0.0077∗∗∗ (8.42) 0.0056∗∗∗ (6.63) 0.0037∗∗∗ (4.47)
Panel C. Price Momentum Returns Holding period 3 months 6 months 9 months 12 months
Loser
Winner
WMK
0.0085∗∗ (2.18) 0.0088∗∗ (2.29) 0.0099∗∗∗ (2.62) 0.0109∗∗∗ (2.94)
0.0179∗∗∗ (4.81) 0.0182∗∗∗ (4.94) 0.0183∗∗∗ (5.02) 0.0171∗∗∗ (4.72)
0.0094∗∗∗ (3.23) 0.0093∗∗∗ (3.47) 0.0084∗∗∗ (3.57) 0.0061∗∗∗ (2.93)
CAPM Adj. (1) FF3 Adj. (2) 0.0101∗∗∗ (3.48) 0.0098∗∗∗ (3.62) 0.0085∗∗ (3.62) 0.0062∗∗∗ (2.94)
0.0113∗∗∗ (3.80) 0.0112∗∗∗ (4.09) 0.103∗∗∗ (4.32) 0.0085∗∗∗ (4.06)
Notes: This table presents monthly returns and associated t-statistics from revenue, earnings, and price momentum strategies executed during the period from 1974 through 2009. For the revenue momentum strategy, firms are grouped into 10 deciles based on the measure SURGE during each formation month. Decile 1 represents the most negative revenue surprises, and Decile 10 represents the most positive revenue surprises. The values of SURGE for each formation month are computed using the most recent revenue announcements made within three months before the formation date. (Continued )
July 6, 2020
15:56
3280
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch94
H.-Y. Chen et al. Table 94.4:
(Continued )
The zero-investment portfolios — long the most positive revenue surprises portfolio and short the most negative revenue surprises portfolio (PMN) — are held for K (K = 3, 6, 9, and 12) subsequent months and are not rebalanced during the holding period. Panel A lists the average monthly returns earned from the portfolio of those firms with the most negative SURGE (low), from the portfolio of those with the most positive SURGE (high), and from the earnings momentum strategies (PMN). Earnings momentum strategies are developed with the same approach of revenue momentum strategies, by buying stocks with the most positive earnings surprises and selling stocks with the most negative earnings surprises. The zero investment portfolios are then held for K subsequent months. Panel B lists the average monthly returns earned from the portfolio of those firms with the most negative SUE (low), from the portfolio of those with the most positive SUE (high), and from the earnings momentum strategies (PMN). For the price momentum strategy, firms are sorted into 10 ascending deciles on the basis of previous 6-month returns. Portfolios of buying Decile 1 (winner) and selling Decile 10 (loser) are held for K subsequent months and not rebalanced during the holding period. The average monthly returns of winner, loser, and price momentum strategies are presented in Panel C. Risk-adjusted momentum returns are also provided in this table. Adj. (1) is momentum returns adjusted by CAPM, and Adj. (2) is momentum returns adjusted by the Fama–French 3-factor model. ∗∗∗ and ∗∗ indicate statistical significance at 1% and 5%, respectively.
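The risk adjustments reported as Adj. (1) and Adj. (2) amount to regressing the monthly zero-investment return series on factor returns; the regression intercept is the risk-adjusted monthly return. A minimal sketch, with hypothetical array names:

```python
import numpy as np

# Intercept (alpha) from regressing zero-investment strategy returns on
# pricing factors. Because the strategy is a long-short spread, the raw
# return is regressed directly on the factors.
def factor_adjusted_return(strategy_ret, factors):
    """strategy_ret: (T,) momentum returns; factors: (T,) or (T, k)
    factor returns, e.g. [Mkt-RF] for the CAPM or [Mkt-RF, SMB, HML]
    for the Fama-French three-factor model."""
    T = len(strategy_ret)
    X = np.column_stack([np.ones(T), factors])  # constant plus factors
    coef, *_ = np.linalg.lstsq(X, strategy_ret, rcond=None)
    return coef[0]  # alpha: the risk-adjusted monthly return
```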
Panel C shows the performance of the price momentum strategies. Similar to the results in Jegadeesh and Titman (1993), price momentum strategies yield average monthly returns of 0.94%, 0.93%, 0.84%, and 0.61% for the 3-, 6-, 9-, and 12-month holding periods, respectively. A comparison of the three momentum strategies indicates that the highest returns accrue to price momentum, followed by earnings momentum and then revenue momentum. Meanwhile, the profitability of the earnings momentum portfolio deteriorates faster than that of price momentum as the holding period extends from 3 to 12 months.10 The revenue momentum strategy yields the smallest and shortest-lived profits, with returns diminishing to an insignificant level when the holding period is extended to 12 months.

Following an approach similar to Fama and French (1996) and Jegadeesh and Titman (2001), we implement the capital asset pricing model and the Fama-French three-factor (FF-3) model to examine whether the momentum returns can be explained by pricing factors.11 The last two columns in Panel A of Table 94.4 list the risk-adjusted returns to revenue momentum, which remain significant. The market risk premium, size factor, and book-to-market factor, while serving to capture partial effects of the revenue momentum strategy, are still unable to explain away the abnormal returns entirely. The FF-3-adjusted return for 6 months remains strong at 0.79%, with a t-statistic of 7.32. The risk-adjusted returns to earnings momentum and price momentum in Panels B and C of Table 94.4 are similar to those in the literature (see Jegadeesh and Titman, 1993; Chordia and Shivakumar, 2006) and generally confirm the conclusion of Fama (1998) that post-earnings-announcement drift and price momentum profits remain significant.

10 We show later that earnings momentum actually demonstrates stronger persistence than price momentum when the momentum portfolios are held over 2 years.
11 We obtain monthly data on the market return, the risk-free rate, and SMB and HML from Kenneth French's website (http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/).

94.5 Interrelation of Revenue, Earnings, and Price Momentum

We further examine the interrelation of the momentum strategies through tests of dominance, cross-contingencies, and combined strategies. The objective is to find empirical support for hypotheses addressing our two research questions. First, we hypothesize that revenue surprises, earnings surprises, and prior returns each carry some exclusive information content that is not captured by the market. Under this hypothesis, a particular univariate momentum strategy should not be subsumed by another strategy, which we examine through dominance tests. Second, we hypothesize that the market not only underreacts to individual firm information, but also underestimates the significance of the joint implications of revenue, earnings, and price information. Under this hypothesis, return anomalies are likely to be most pronounced when the information variables all point in the same direction.

94.5.1 Testing for dominance among the momentum strategies

To examine the interrelation of the momentum effects, we first explore whether any of the three momentum strategies is entirely subsumed by another strategy. The stock price represents the firm value assessed by investors in the aggregate, given their available information. The most important fundamental information for investors is undoubtedly firm earnings, which summarize firm performance.
Jegadeesh and Livnat (2006b) point out that an important reference for investors regarding the persistence of firm earnings is offered by firm revenue information. Obviously, these three pieces of firm-specific information, revenue, earnings, and stock price, share significant information content with one another. The anomalies of their corresponding momentum effects may therefore arise from common sources. That is, payoffs to a momentum strategy based on one measure, be it revenue surprises, earnings surprises,
or prior returns, may be fully captured by another measure. The dominance tests serve to test for such a possibility.

We first apply the pairwise nested comparison model introduced by George and Hwang (2004) and test whether one particular momentum strategy dominates another. Table 94.5 reports the results in three panels. Panel A compares the revenue momentum and earnings momentum strategies. In Panel A.1, stocks are first sorted on earnings surprises, with each quintile further sorted on revenue surprises. We find that, controlling for the level of earnings surprises, the revenue momentum strategy still yields significant profits; the zero-investment portfolio returns for six-month holding periods range from 0.26% to 0.36%. In Panel A.2, stocks are first sorted on revenue surprises and then on earnings surprises. Likewise, the returns to an earnings momentum strategy, controlling for the level of revenue surprises, are still significantly positive. These paired results indicate that neither earnings momentum nor revenue momentum dominates the other. We follow the same process in comparing the revenue momentum and price momentum strategies. The results in Panel B indicate that all the nested revenue momentum strategies and all the nested price momentum strategies are profitable, with the exception of revenue momentum in the loser stock group. In general, we still conclude that neither revenue momentum nor price momentum is dominated by the other. Panel C of Table 94.5 presents the results of the nested momentum strategies based on two-way sorts on earnings surprises and prior returns. Returns to all these nested momentum strategies remain significantly positive. The pairwise nested comparisons suggest that revenue surprises, earnings surprises, and prior returns each convey some unpriced information that is not shared by the others and that therefore contributes further to a momentum effect.
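A single-cross-section sketch of these two-way dependent sorts follows. The array names are illustrative; the chapter's tests average such spreads over all formation months.

```python
import numpy as np

# Pairwise nested (two-way dependent) sorts: sort on the control
# variable into quintiles, then within each control quintile take the
# top-minus-bottom quintile return spread on the second signal.
def nested_spreads(control, signal, future_ret, n_groups=5):
    edges = np.quantile(control, np.linspace(0, 1, n_groups + 1))
    spreads = []
    for g in range(n_groups):
        if g < n_groups - 1:
            mask = (control >= edges[g]) & (control < edges[g + 1])
        else:  # make the top group inclusive of the upper edge
            mask = (control >= edges[g]) & (control <= edges[g + 1])
        s, r = signal[mask], future_ret[mask]
        lo, hi = np.quantile(s, [1 / n_groups, 1 - 1 / n_groups])
        spreads.append(r[s >= hi].mean() - r[s <= lo].mean())
    return np.array(spreads)
```

With SUE as `control` and SURGE as `signal` this yields Panel A.1-style spreads; swapping the two yields Panel A.2, and substituting prior returns yields Panels B and C.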
A second approach allows us to simultaneously isolate the returns contributed by each momentum portfolio. Taking advantage of George and Hwang's (2004) model, we implement a panel data analysis with six performance dummies:

R_{i,t} = α_{jt} + β_{1jt} R_{i,t−1} + β_{2jt} size_{i,t−1} + β_{3jt} R1_{i,t−j} + β_{4jt} R5_{i,t−j} + β_{5jt} E1_{i,t−j} + β_{6jt} E5_{i,t−j} + β_{7jt} P1_{i,t−j} + β_{8jt} P5_{i,t−j} + e_{i,t},    (94.3)

where j = 1, ..., 6. We first regress firm i's return in month t on control variables and six dummies for the portfolio ranks. We include the previous-month return R_{i,t−1} to control for the bid-ask bounce effect and the market
Table 94.5: Momentum strategies: Two-way dependent sorts by revenue surprises, earnings surprises, and prior returns.

Panel A. Revenue momentum vs. earnings momentum

A.1 Revenue momentum in various SUE groups

Portfolios          Ave. monthly return
classified by SUE   R1 (Low)   R5 (High)   R5-R1    t-stat
E1 (Low)            0.0065     0.0101      0.0036   (3.24)
E2                  0.0086     0.0115      0.0028   (2.85)
E3                  0.0090     0.0119      0.0029   (3.29)
E4                  0.0096     0.0122      0.0026   (2.70)
E5 (High)           0.0116     0.0149      0.0033   (3.22)

A.2 Earnings momentum in various SURGE groups

Portfolios            Ave. monthly return
classified by SURGE   E1 (Low)   E5 (High)   E5-E1    t-stat
R1 (Low)              0.0064     0.0104      0.0040   (4.66)
R2                    0.0079     0.0113      0.0034   (4.91)
R3                    0.0089     0.0131      0.0042   (6.03)
R4                    0.0096     0.0140      0.0043   (5.59)
R5 (High)             0.0112     0.0152      0.0040   (4.74)

Panel B. Revenue momentum vs. price momentum

B.1 Revenue momentum in various prior-return groups

Portfolios                Ave. monthly return
classified by prior ret   R1 (Low)   R5 (High)   R5-R1    t-stat
P1 (Loser)                0.0070     0.0077      0.0008   (0.67)
P2                        0.0083     0.0099      0.0015   (1.82)
P3                        0.0091     0.0123      0.0032   (4.33)
P4                        0.0089     0.0132      0.0042   (5.53)
P5 (Winner)               0.0106     0.0175      0.0070   (7.03)

B.2 Price momentum in various SURGE groups

Portfolios            Ave. monthly return
classified by SURGE   P1 (Loser)   P5 (Winner)   P5-P1    t-stat
R1 (Low)              0.0072       0.0095        0.0024   (1.35)
R2                    0.0084       0.0110        0.0026   (1.51)
R3                    0.0092       0.0135        0.0042   (2.29)
R4                    0.0092       0.0149        0.0057   (3.35)
R5 (High)             0.0080       0.0176        0.0096   (4.82)

(Continued)
July 6, 2020 15:56
Portfolios classified by prior ret
P1 (Loser)
E1 (Low) E5 (High) E5-E1 E1 (Low) E5 (High) E5-E1 E1 (Low) E5 (High) E5-E1 E1 (Low) E5 (High) E5-E1 E1 (Low) E5 (High) E5-E1
P2 P3 P4 P5 (Winner)
Ave. monthly return 0.0063 0.0096 0.0034 0.0082 0.0106 0.0024 0.0090 0.0126 0.0036 0.0091 0.0137 0.0046 0.0104 0.0178 0.0073
t-stats
Portfolios classified by prior ret E1 (Low)
(3.73) (3.67) (5.96) (7.69) (8.78)
E2 E3 E4 E5 (High)
Portfolios classified by SURG P1 (Loser) P5 (Winner) P5-P1 P1 (Loser) P5 (Winner) P5-P1 P1 (Loser) P5 (Winner) P5-P1 P1 (Loser) P5 (Winner) P5-P1 P1 (Loser) P5 (Winner) P5-P1
Ave. monthly return 0.0066 0.0097 0.0031 0.0083 0.0118 0.0035 0.0081 0.0134 0.0052 0.0096 0.0143 0.0047 0.0100 0.0177 0.0077
t-stats
(1.62) (1.80) (2.87) (2.65) (4.16)
b3568-v3-ch94
3285
Notes: This table presents the results of pairwise nested comparison between momentum strategies. Panel A shows the comparison between revenue momentum and earnings momentum during the period from 1974 to 2009. In each month, stocks are first sorted into five groups by earnings surprises (revenue surprises), then further sorted by revenue surprises (earnings surprises) in each group. All portfolios are held for 6 months. The monthly returns to 10 extreme portfolios and five conditional earnings (revenue) momentum strategies are presented. Pair tests are provided under the hypothesis that conditional earnings (revenue) momentum profits are the same. Panel B shows the comparison between revenue and price momentum strategies, and Panel C shows the comparison between earnings and price momentum strategies.
9.61in x 6.69in
Portfolios classified by SURGE
Handbook of Financial Econometrics,. . . (Vol. 3)
C.1 Earnings momentum in various prior ret groups C.2 Price momentum in various SUE groups
Does Revenue Momentum Drive or Ride Earnings or Price Momentum?
Panel C. Earnings momentum vs. price momentum
page 3285
July 6, 2020
15:56
3286
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch94
H.-Y. Chen et al.
capitalization sizei,t−1 to control for the size effect in the cross-sectional regressions. Momentum portfolio dummies, R1i,t−j , R5i,t−j , E1i,t−j , E5i,t−j , P 1i,t−j , and P 5i,t−j , indicate whether firm i is included in one or more momentum portfolios based on their scores in month t−j. To obtain momentum profits corresponding to the Jegadeesh and Titman (1993) strategies, we average the estimated coefficients of the independent variable over j = 1, . . . , 6, and then subtract the coefficient average for the bottom quintile portfolio from that for the top quintile portfolio. These are the returns contributed by each momentum strategy when the contributions from other momentum strategies are controlled for. Panel A of Table 94.6 reports the regression results. The returns isolated for revenue momentum, earnings momentum, and price momentum are listed in the last three rows. The results are all significant in terms of either raw returns or FF-3 factor adjusted returns when all months are included or when all non-January months are included. Note, however, that the isolated returns to revenue momentum (R5–R1) and to price momentum (P5–P1) strategies are no longer significantly positive in January. The insignificant returns in January are consistent with the tax-loss-selling hypothesis, proposing that investors sell poorly performing stocks in October through December and buy them back in January (e.g., see Keim, 1989; Odean, 1998; Grinblatt and Moskowitz, 2004). The overall significant profits contributed by R5–R1 (E5–E1 or P5–P1) indicate market underreactions with respect to the information content of revenue surprises (earnings surprises or prior price performance) unrelated to the other two information measures. The isolated returns are greatest for price momentum (0.66%), followed by earnings momentum (0.43%) and then revenue momentum (0.28%). This is similar to our earlier results on single-criterion momentum. 
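The regression-and-averaging procedure of equation (94.3) can be sketched as follows. The data are synthetic, and the sample sizes, quintile cutoffs, and variable names are illustrative only, not the authors' implementation:

```python
import numpy as np

# Sketch of equation (94.3) plus the Jegadeesh-Titman style averaging:
# for each month t, run one cross-sectional OLS per ranking month t-j
# (j = 1..6), average the dummy-coefficient spread over j, then compute
# the time-series mean and t-statistic of the monthly averages.
rng = np.random.default_rng(0)
n, T = 400, 120                       # stocks per month, months (illustrative)

def quintile_dummies(score):
    """0/1 indicators for bottom- and top-quintile membership."""
    lo, hi = np.quantile(score, [0.2, 0.8])
    return (score <= lo).astype(float), (score >= hi).astype(float)

r5_minus_r1 = np.empty(T)
for t in range(T):
    ret = rng.normal(0.01, 0.06, n)           # R_{i,t}
    ret_lag = rng.normal(0.01, 0.05, n)       # R_{i,t-1}
    size = rng.lognormal(4.0, 1.0, n)         # size_{i,t-1}
    spreads = []
    for j in range(6):                        # ranks formed in month t-j
        r1, r5 = quintile_dummies(rng.normal(size=n))   # revenue ranks
        e1, e5 = quintile_dummies(rng.normal(size=n))   # earnings ranks
        p1, p5 = quintile_dummies(rng.normal(size=n))   # prior-return ranks
        X = np.column_stack([np.ones(n), ret_lag, size,
                             r1, r5, e1, e5, p1, p5])
        beta, *_ = np.linalg.lstsq(X, ret, rcond=None)
        spreads.append(beta[4] - beta[3])     # R5 dummy minus R1 dummy
    r5_minus_r1[t] = np.mean(spreads)         # average over j for month t

mean_profit = r5_minus_r1.mean()
t_stat = mean_profit / (r5_minus_r1.std(ddof=1) / np.sqrt(T))
```

With real data the dummy columns would come from actual SURGE, SUE, and prior-return rankings; here everything is randomized, so `mean_profit` should be near zero.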
Such a finding again rejects the existence of a dominating momentum strategy among the three. We do not find that the information leading to revenue momentum or earnings momentum fully captures the price momentum returns. Similar findings are documented by Chan et al. (1996), Heston and Sadka (2008), and Novy-Marx (2012) for the relation between earnings surprises and price momentum.

We would like to examine specifically how much of the price momentum can be explained by revenue surprise and/or earnings surprise information. For this reason, we perform similar regressions that include only a subset of the portfolio dummies. The results are reported in Panel B of Table 94.6. In the case of raw returns, the return to price momentum without isolating other momentum sources is 0.81%; it is reduced only to 0.73% after controlling for revenue momentum, to 0.70% after controlling for earnings momentum, and to 0.66% after controlling for both. In other words, the information leading to revenue momentum and earnings momentum accounts for about 10% and 14% of price momentum, respectively, and the two pieces of information combined account for just about 19% of the price momentum effect. The results for risk-adjusted returns are similar. This conclusion adds to the large literature attempting to trace the sources of price momentum. Our numbers indicate that the information conveyed by revenue surprises or earnings surprises makes only a limited contribution to price momentum.

Results of the pairwise nested comparisons in Table 94.5 and the regression analysis in Table 94.6 both support the hypothesis that revenue surprises, earnings surprises, and prior returns each have some unpriced information content that is exclusive to that measure. This conclusion also suggests that one can improve momentum strategies by using all three information measures together.

Table 94.6: Comparison of revenue, earnings, and price momentum strategies.

Panel A. Contribution of momentum returns solely from prior performance information (average coefficients; t-statistics in parentheses)

Raw returns:
              All months         Jan.               Feb.-Dec.
  Intercept   0.0130 (4.96)      0.0354 (3.13)      0.0110 (4.14)
  Ri,t−1      −0.0412 (−8.95)    −0.1146 (−6.04)    −0.0346 (−7.55)
  Size        >−0.0001 (−2.94)   >−0.0001 (−2.64)   >−0.0001 (−1.88)
  R1 Dummy    −0.0015 (−3.39)    0.0042 (2.81)      −0.0020 (−4.43)
  R5 Dummy    0.0013 (2.18)      −0.0002 (−0.07)    0.0014 (2.31)
  E1 Dummy    −0.0018 (−5.07)    −0.0037 (−2.89)    −0.0017 (−4.42)
  E5 Dummy    0.0024 (6.86)      0.0045 (3.61)      0.0023 (6.09)
  P1 Dummy    −0.0026 (−2.14)    0.0125 (1.96)      −0.0040 (−3.35)
  P5 Dummy    0.0040 (2.92)      0.0023 (0.53)      0.0041 (2.88)
  R5−R1       0.0028 (3.23)      −0.0044 (−1.50)    0.0035 (3.82)
  E5−E1       0.0043 (7.89)      0.0082 (4.64)      0.0039 (6.92)
  P5−P1       0.0066 (3.35)      −0.0102 (−1.15)    0.0081 (4.10)

Risk-adjusted returns:
              All months         Jan.               Feb.-Dec.
  Intercept   0.0051 (6.31)      0.0075 (2.79)      0.0048 (5.64)
  Ri,t−1      −0.0371 (−8.54)    −0.0851 (−4.60)    −0.0327 (−7.48)
  Size        >−0.0001 (−1.12)   >−0.0001 (−0.02)   >−0.0001 (−0.99)
  R1 Dummy    −0.0020 (−4.68)    0.0015 (1.02)      −0.0023 (−5.20)
  R5 Dummy    0.0021 (4.09)      0.0019 (1.06)      0.0021 (3.81)
  E1 Dummy    −0.0016 (−4.43)    −0.0028 (−1.96)    −0.0015 (−4.03)
  E5 Dummy    0.0026 (6.81)      0.0055 (4.05)      0.0023 (6.06)
  P1 Dummy    −0.0036 (−3.17)    0.0072 (1.13)      −0.0048 (−4.43)
  P5 Dummy    0.0044 (3.39)      0.0033 (0.67)      0.0045 (3.33)
  R5−R1       0.0041 (5.45)      0.0004 (0.18)      0.0044 (5.48)
  E5−E1       0.0041 (7.45)      0.0083 (4.34)      0.0038 (6.66)
  P5−P1       0.0080 (3.95)      −0.0039 (−0.39)    0.0092 (4.59)

Panel B. Univariate price momentum return and conditional price momentum returns

Four specifications are estimated: (1) price dummies only; (2) price and revenue dummies; (3) price and earnings dummies; (4) all dummies. Only the momentum spreads are shown here; each specification also includes the intercept, Ri,t−1, size, and the individual portfolio dummies.

Raw returns:
            (1)             (2)             (3)             (4)
  R5−R1     --              0.0039 (4.26)   --              0.0028 (3.23)
  E5−E1     --              --              0.0050 (8.07)   0.0043 (7.89)
  P5−P1     0.0081 (4.02)   0.0073 (3.70)   0.0070 (3.52)   0.0066 (3.35)

Risk-adjusted returns:
            (1)             (2)             (3)             (4)
  R5−R1     --              0.0051 (6.45)   --              0.0041 (5.45)
  E5−E1     --              --              0.0051 (8.26)   0.0041 (7.45)
  P5−P1     0.0096 (4.70)   0.0086 (4.29)   0.0085 (4.18)   0.0080 (3.95)

Notes: This table presents returns to relative-strength portfolios and momentum strategies. Each month during the period from 1974 through 2009, six cross-sectional regressions of equation (94.3) are estimated for the revenue, earnings, and price momentum strategies, where Rit and sizei,t are the return and the market capitalization of stock i in month t, and R1i,t−j (R5i,t−j) is the most negative (positive) revenue surprise dummy, taking the value of 1 if the revenue surprise for stock i is ranked in the bottom (top) quintile in month t−j and zero otherwise. The dummies with respect to earnings surprises (E1i,t−j and E5i,t−j) and with respect to prior six-month price returns (P1i,t−j and P5i,t−j) are defined analogously. The estimated coefficients of each independent variable are averaged over j = 1, . . . , 6. The numbers reported for raw returns are the time-series averages of these averages, with t-statistics calculated from the time series in parentheses. The risk-adjusted returns are intercepts from Fama-French three-factor regressions on the raw returns; their t-statistics are in parentheses. Panel A presents returns to relative-strength portfolios and momentum strategies belonging solely to each of prior price performance, earnings surprises, and revenue surprises. Panel B presents the raw return and conditional returns of the price momentum strategy.

94.5.2 Two-way sorted portfolio returns and momentum cross-contingencies

Here and in the next section, we examine momentum strategies using multiple sorting criteria. These results serve to answer the research question of whether investors underestimate the implications of the joint information in revenue surprises, earnings surprises, and prior returns. Given that the market usually provides investors with not just a single piece but multiple pieces of firm information, the incremental information content of additional firm data is likely to be contingent upon other information about the stock. Jegadeesh and Livnat (2006b) suggest that the information content of SURGE has implications for the future value of SUE and that such information linkage is particularly significant when both measures point in the same direction.
Jegadeesh and Livnat (2006a) further find that the market, including financial analysts, underestimates the joint implications of these measures and thus firm market value. Our second research question extends Jegadeesh and Livnat (2006b) by additionally considering the information of prior price performance. We hypothesize that return anomalies should be most pronounced when the joint implications of multiple measures are most underestimated by the market, and this likely occurs when all information variables point in the same direction. In addition, a different but related issue is that any momentum profits driven by one measure may well depend on the accompanying alternative
information, which we call the cross-contingencies of momentum. We use multivariate sorted portfolios to test this hypothesis.

94.5.2.1 Two-way sorts on revenue surprises and earnings surprises

We start by testing the performance of investment strategies based on the joint information of revenue surprises and earnings surprises. On each portfolio formation date, we sort stocks into quintiles on the basis of their revenue surprises during the six-month formation period and then, independently, into quintiles based on earnings surprises. Panel A of Table 94.7 presents the raw returns of these 25 two-way sorted portfolios. The intersection of R1 and E1, labeled R1 × E1, is the portfolio formed by the stocks with both the lowest SURGE and the lowest SUE, and the intersection of R5 and E5, labeled R5 × E5, represents the portfolio formed by the stocks with both the highest SURGE and the highest SUE.

We first note that the next-period returns of the 25 two-way sorted portfolios increase monotonically with SURGE as well as with SUE. The return to portfolios with a similar level of SURGE increases with SUE (e.g., the return increases from 0.88% for R1 × E1 to 1.21% for R1 × E5). Similarly, the payoffs to portfolios of stocks with a similar level of SUE increase with SURGE (e.g., the return increases from 1.21% for R1 × E5 to 1.70% for R5 × E5). That is, stocks that have performed well in terms of revenue and earnings continue to outperform expectations and yield higher future returns.

Panel D of Table 94.7 shows the corresponding risk-adjusted abnormal returns for each of the 5 × 5 double-sorted portfolios based on SURGE and SUE. The monotonicity we see in the raw returns in Panel A persists for the risk-adjusted returns. The most positive abnormal returns are for the portfolio of high-SURGE and high-SUE stocks (R5 × E5), while the most negative abnormal returns are for the portfolio of low-SURGE and low-SUE stocks (R1 × E1). This provides direct and robust evidence that the return anomalies tend to be most pronounced when SURGE and SUE point in the same direction. The evidence of monotonicity suggests that the market underreaction is at its extreme when different elements of stock performance information signal in the same direction, i.e., the scenarios of R1 × E1 or R5 × E5. These are the scenarios where the information in SURGE and SUE is expected to have the most significant joint implications for firm value, while market underestimation of their joint implications is found to be the strongest,
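An independent two-way sort of this kind differs from the nested sort only in that the two quintile assignments ignore each other. A minimal pandas sketch with synthetic data (column names illustrative):

```python
import numpy as np
import pandas as pd

# Independent 5 x 5 sort: SURGE quintiles and SUE quintiles are assigned
# separately, then portfolios are formed on the intersections.
rng = np.random.default_rng(3)
n = 2000
df = pd.DataFrame({
    "surge": rng.normal(size=n),
    "sue": rng.normal(size=n),
    "ret": rng.normal(0.012, 0.05, size=n),
})
df["r_q"] = pd.qcut(df["surge"], 5, labels=False) + 1   # R1..R5
df["e_q"] = pd.qcut(df["sue"], 5, labels=False) + 1     # E1..E5

cell = df.groupby(["r_q", "e_q"])["ret"].mean().unstack("e_q")  # 5 x 5 grid
combined = cell.loc[5, 5] - cell.loc[1, 1]   # R5 x E5 minus R1 x E1 spread
```

Unlike a dependent sort, the 25 intersection cells here need not be equally populated, which is why the chapter reports them as simple equal-weighted averages.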
leading to the most pronounced return drifts in the next period. This observation is consistent with the suggestion by Jegadeesh and Livnat (2006a and 2006b).

Table 94.7: Momentum strategies: Two-way sorts by revenue surprises, earnings surprises, and prior returns.

Panel A. Raw returns sorted on revenue surprises (SURGE) and earnings surprises (SUE)

             E1(Low)  E2      E3      E4      E5(High)   E5−E1 arb.
  R1(Low)    0.0088   0.0107  0.0109  0.0112  0.0121     0.0049 (4.63)
  R2         0.0098   0.0112  0.0117  0.0129  0.0139     0.0041 (4.71)
  R3         0.0106   0.0121  0.0134  0.0142  0.0154     0.0048 (5.40)
  R4         0.0108   0.0124  0.0133  0.0137  0.0165     0.0057 (5.84)
  R5(High)   0.0123   0.0141  0.0141  0.0146  0.0170     0.0039 (3.43)
  R5−R1 arb. 0.0043   0.0034  0.0032  0.0034  0.0036
             (2.86)   (2.59)  (2.84)  (2.84)  (2.70)
  Revenue-earnings combined momentum strategy (R5 × E5 − R1 × E1): 0.0081 (6.25)

Panel B. Raw returns sorted on revenue surprises (SURGE) and prior price performance

             P1(Loser)  P2      P3      P4      P5(Winner)  P5−P1 arb.
  R1(Low)    0.0089     0.0104  0.0109  0.0109  0.0122      0.0034 (1.45)
  R2         0.0099     0.0112  0.0121  0.0121  0.0135      0.0036 (1.59)
  R3         0.0108     0.0125  0.0133  0.0139  0.0161      0.0053 (2.26)
  R4         0.0100     0.0125  0.0131  0.0141  0.0176      0.0076 (3.66)
  R5(High)   0.0090     0.0112  0.0143  0.0156  0.0198      0.0108 (4.67)
  R5−R1 arb. 0.0001     0.0008  0.0033  0.0048  0.0078
             (0.06)     (0.79)  (3.69)  (5.21)  (6.43)
  Revenue-price combined momentum strategy (R5 × P5 − R1 × P1): 0.0109 (4.53)

Panel C. Raw returns sorted on earnings surprises (SUE) and prior price performance

             P1(Loser)  P2      P3      P4      P5(Winner)  P5−P1 arb.
  E1(Low)    0.0083     0.0103  0.0109  0.0107  0.0105      0.0045 (1.94)
  E2         0.0098     0.0115  0.0119  0.0126  0.0141      0.0044 (1.89)
  E3         0.0099     0.0117  0.0127  0.0134  0.0162      0.0062 (2.73)
  E4         0.0106     0.0120  0.0133  0.0138  0.0168      0.0062 (2.81)
  E5(High)   0.0107     0.0127  0.0149  0.0160  0.0201      0.0092 (4.01)
  E5−E1 arb. 0.0030     0.0023  0.0040  0.0053  0.0078
             (2.66)     (3.10)  (5.49)  (7.51)  (7.79)
  Earnings-price combined momentum strategy (E5 × P5 − E1 × P1): 0.0118 (5.47)

Panel D. Risk-adjusted returns sorted on revenue surprises (SURGE) and earnings surprises (SUE)

             E1(Low)  E2       E3       E4       E5(High)  E5−E1 arb.
  R1(Low)    −0.0043  −0.0026  −0.0020  −0.0013  0.0005    0.0049 (4.52)
  R2         −0.0029  −0.0017  −0.0008  0.0002   0.0013    0.0043 (4.79)
  R3         −0.0016  −0.0004  0.0008   0.0017   0.0032    0.0049 (5.38)
  R4         −0.0008  0.0006   0.0011   0.0018   0.0044    0.0052 (5.21)
  R5(High)   0.0021   0.0027   0.0023   0.0033   0.0054    0.0033 (2.86)
  R5−R1 arb. 0.0064   0.0046   0.0053   0.0043   0.0045
             (4.78)   (4.32)   (3.98)   (4.16)   (3.54)
  Revenue-earnings combined momentum strategy (R5 × E5 − R1 × E1): 0.0097 (7.86)

Panel E. Risk-adjusted returns sorted on revenue surprises (SURGE) and prior price performance

             P1(Loser)  P2       P3       P4       P5(Winner)  P5−P1 arb.
  R1(Low)    −0.0050    −0.0028  −0.0018  −0.0015  0.0002      0.0052 (2.22)
  R2         −0.0037    −0.0018  −0.0004  −0.0002  0.0015      0.0052 (2.26)
  R3         −0.0025    −0.0002  0.0009   0.0017   0.0040      0.0066 (2.76)
  R4         −0.0026    0.0002   0.0013   0.0025   0.0059      0.0085 (4.05)
  R5(High)   −0.0032    −0.0005  0.0029   0.0044   0.0086      0.0118 (4.98)
  R5−R1 arb. 0.0018     0.0023   0.0047   0.0059   0.0087
             (1.38)     (2.49)   (5.80)   (6.86)   (7.49)
  Revenue-price combined momentum strategy (R5 × P5 − R1 × P1): 0.0136 (5.75)

Panel F. Risk-adjusted returns sorted on earnings surprises (SUE) and prior price performance

             P1(Loser)  P2       P3       P4       P5(Winner)  P5−P1 arb.
  E1(Low)    −0.0051    −0.0024  −0.0012  −0.0010  0.0010      0.0062 (2.65)
  E2         −0.0039    −0.0014  −0.0005  0.0006   0.0027      0.0066 (2.79)
  E3         −0.0033    −0.0011  0.0004   0.0013   0.0042      0.0075 (3.20)
  E4         −0.0025    −0.0004  0.0012   0.0020   0.0051      0.0076 (3.35)
  E5(High)   −0.0018    0.0003   0.0029   0.0041   0.0083      0.0096 (4.07)
  E5−E1 arb. 0.0036     0.0027   0.0041   0.0051   0.0072
             (3.25)     (3.64)   (5.61)   (7.07)   (7.05)
  Earnings-price combined momentum strategy (E5 × P5 − E1 × P1): 0.0133 (6.09)

Notes: For each month, we form equal-weighted portfolios according to the breakpoints of two of three firm characteristics: a firm's revenue surprises (SURGE), its earnings surprises (SUE), and its prior six-month stock performance. Panels A and D present raw returns and risk-adjusted returns of the 25 portfolios independently sorted on SURGE and on SUE. The returns of a revenue-earnings combined momentum strategy are obtained by buying the portfolio of stocks with the best SURGE and the best SUE (SURGE = 5 and SUE = 5) and selling the portfolio of stocks with the poorest SURGE and the poorest SUE (SURGE = 1 and SUE = 1). Panels B and E present raw returns and risk-adjusted returns of the 25 portfolios independently sorted on SURGE and on prior price performance; the returns of a revenue-price combined momentum strategy are obtained by buying stocks in the portfolio with the best SURGE and the highest price performance and selling stocks in the portfolio with the poorest SURGE and the lowest price performance. Panels C and F present raw returns and risk-adjusted returns of the 25 portfolios independently sorted on SUE and on prior price performance; the returns of an earnings-price combined momentum strategy are obtained analogously. We also present the arbitrage returns and risk-adjusted arbitrage returns of single-sorted portfolios based on the quintiles of price performance, SUE, or SURGE at the bottom and on the right-hand side of each panel for comparison. The risk-adjusted return is the intercept of the Fama-French three-factor regression in which the dependent variable is the arbitrage return or the excess return, i.e., the difference between the raw return and the risk-free rate.

Investors may execute various long-short strategies with those 25 portfolios. The strategies listed in the far-right column of Panel A indicate earnings momentum returns for stocks with a particular level of SURGE, while those listed in the last row are returns on revenue momentum for stocks with a given level of SUE.12 We now examine the cross-contingencies of momentum. The revenue momentum measure is 0.36% per month in the high-SUE subsample E5 and 0.43% per month in the low-SUE subsample E1. Meanwhile, the earnings momentum measure is 0.39% per month in the high-SURGE subsample R5 and 0.49% per month in the low-SURGE subsample R1. We do not observe significant variations in momentum returns across SUE or SURGE. Panel D shows similar patterns when returns to momentum portfolios are adjusted for size and B/M risk factors. All of the profits generated by earnings momentum strategies or revenue momentum strategies remain significantly positive.
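As a quick arithmetic restatement of the conditional spreads just quoted from Table 94.7, Panel A:

```python
# Conditional momentum spreads (monthly) from the printed arbitrage rows:
# revenue momentum within extreme SUE quintiles, and earnings momentum
# within extreme SURGE quintiles.
rev_mom = {"E1": 0.0043, "E5": 0.0036}   # R5 - R1 arbitrage returns
earn_mom = {"R1": 0.0049, "R5": 0.0039}  # E5 - E1 arbitrage returns

# Neither spread collapses or flips sign across subsamples; the largest
# gap between conditional spreads is only about 10 basis points.
max_gap = max(abs(rev_mom["E1"] - rev_mom["E5"]),
              abs(earn_mom["R1"] - earn_mom["R5"]))
```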
94.5.2.2 Two-way sorts on revenue surprises and prior returns

We apply similar sorting procedures based on the joint information of revenue surprises and prior price performance. The results for raw returns, as shown in Panel B of Table 94.7, generally exhibit a pattern similar to Panel A but with the following differences. Although the future returns still rise with SURGE among the average and winner stocks, they become insensitive to SURGE for loser stocks. A closer look at the returns from portfolio R1 × P1 down to portfolio R5 × P1 indicates that loser portfolio returns simply do not vary much with the level of SURGE.

Panel E lists risk-adjusted returns for the 5 × 5 portfolios sorted on prior returns and SURGE. A similar monotonic pattern, now in relation to SURGE as well as to prior returns, is observed for most of those abnormal returns. That is, stocks that have performed well in terms of revenue (firm fundamental information) and prior returns (firm market information) continue to outperform expectations and yield higher future returns, and vice versa.

As to the cross-contingencies of momentum, the results in Panel B indicate that revenue momentum strategies executed with winner stocks yield higher returns than those executed with loser stocks. For example, the revenue momentum strategy executed with the most winning stocks yields a monthly return of 0.78% (R5 × P5 − R1 × P5), while with the most losing stocks it yields a monthly return of only 0.01% (R5 × P1 − R1 × P1). Likewise, the price momentum strategy executed with stocks with greater SURGE yields higher returns than with those with lower SURGE. For example, the price momentum strategy executed with the lowest-SURGE stocks yields a monthly return of 0.34% (R1 × P5 − R1 × P1), while with the highest-SURGE stocks it yields a monthly return as high as 1.08% (R5 × P5 − R5 × P1). The difference of 0.74 percentage points between the R1 and R5 subsamples is statistically and economically significant, with price momentum profits more than 200% higher in R5 than in R1. These observations suggest that revenue surprise information is priced least efficiently among winner stocks, producing the greatest revenue drift in the next period, and that prior return information is priced least efficiently among stocks with the most positive SURGE, producing the strongest return continuation. One noteworthy point is that revenue momentum is no longer profitable among loser stocks. Panel E shows similar patterns of momentum cross-contingencies when returns to momentum portfolios are adjusted for size and B/M risk factors.

The message for investment strategy is that prior returns are most helpful in distinguishing future returns among stocks with high SURGE, and the same is true for the implications of revenue surprises for stocks with high prior returns.

12 Similar to Hong et al. (2000), one may characterize the former strategies as earnings momentum strategies that are "revenue-momentum-neutral" and the latter as revenue momentum strategies that are "earnings-momentum-neutral".
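The Panel B cross-contingencies just described can be read directly off the grid of raw cell returns. The small gaps versus the printed arbitrage rows arise because the table estimates those rows monthly rather than as differences of the cell averages:

```python
import numpy as np

# Raw SURGE x prior-return grid from Table 94.7, Panel B
# (rows R1..R5, columns P1..P5).
grid = np.array([
    [0.0089, 0.0104, 0.0109, 0.0109, 0.0122],   # R1 (low SURGE)
    [0.0099, 0.0112, 0.0121, 0.0121, 0.0135],
    [0.0108, 0.0125, 0.0133, 0.0139, 0.0161],
    [0.0100, 0.0125, 0.0131, 0.0141, 0.0176],
    [0.0090, 0.0112, 0.0143, 0.0156, 0.0198],   # R5 (high SURGE)
])

rev_spread_by_p = grid[4] - grid[0]          # R5 - R1 within each prior-return quintile
price_spread_by_r = grid[:, 4] - grid[:, 0]  # P5 - P1 within each SURGE quintile

# Revenue momentum is essentially zero among losers and largest among
# winners; the price momentum gap between R5 and R1 is about 0.75
# percentage points, close to the 0.74 quoted in the text.
gap = price_spread_by_r[-1] - price_spread_by_r[0]
```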
On the other hand, when a stock is priced unfavorably by the market, the information in revenue surprises does not offer much help in predicting its future returns.

94.5.2.3 Two-way sorts on earnings surprises and prior returns

Panel C of Table 94.7 shows the raw returns for multivariate momentum strategies based on the joint information of earnings surprises and prior returns. Several findings emerge. First, as in the cases shown in Panels A and B, the next-period returns of the 25 two-way sorted portfolios increase monotonically with SUE as well as with prior returns. For example, when a firm has a highly positive earnings surprise (E5) and has had winning stock returns (P5), these two pieces of information together are likely to have particularly strong joint implications for firm value. Such a condition leads to an average monthly return as high as 2.01% in the next six-month period, possibly attributable to even greater investor underreaction.

Panel F of Table 94.7 shows the risk-adjusted abnormal returns for each of the 5 × 5 double-sorted portfolios based on SUE and prior returns. The monotonicity we see in the raw returns in Panel C persists for the risk-adjusted returns. The most positive abnormal returns are for the portfolio of high-SUE and high-prior-return stocks (E5 × P5), while the most negative abnormal returns are for the portfolio of low-SUE and low-prior-return stocks (E1 × P1).

Looking now at the cross-contingencies between earnings momentum and price momentum, the earnings momentum strategy executed with winner stocks yields higher returns (0.78%) than that executed with loser stocks (0.30%), and the price momentum strategy executed with positive-SUE stocks yields higher returns (0.92%) than that executed with negative-SUE stocks (0.45%). Panel F shows risk-adjusted returns for these momentum strategies and reveals a pattern similar to that in Panel C for raw returns. The results indicate that market underreaction to price performance is contingent upon the accompanying earnings performance, and vice versa.

Can we reconcile our results on momentum cross-contingencies with the behavioral explanations for momentum returns? Barberis et al. (1998) observe that a conservatism bias might lead investors to underreact to information and thereby generate momentum profits. The conservatism bias, described by Edwards (1968), suggests that investors underweight new information in updating their prior beliefs. If we accept the conservatism bias explanation for momentum profits, one might interpret our results as follows.
Investors update their expectations of stock value using firm fundamental performance information as well as technical information, and their information updates are subject to conservatism biases. The evidence of momentum cross-contingencies suggests that the speed of adjustment to market performance information (historical price) is contingent upon the accompanying fundamental performance information (earnings and/or revenue), and vice versa. Our results in Panels B and C of Table 94.7 suggest that stock prices suffer from a stronger conservatism bias, and thus adjust with greater delay to firm fundamental performance information (earnings or revenue), when those stocks experience good news rather than bad news about market performance (prior returns). This then leads to greater earnings or revenue momentum returns for winner stocks than for loser stocks. A similar scenario also leads to greater price momentum returns for high-SUE or high-SURGE stocks than for low-SUE or low-SURGE stocks. This would mean that investors are subject to a conservatism bias that is asymmetric with respect to good news vis-à-vis bad news. That is, investors tend to be even more conservative in reacting to information on firm fundamental performance (market performance) for stocks issuing good news than for those issuing bad news about their market performance (fundamental performance).

94.5.3 Combined momentum strategies

The negative results on the dominance tests in Tables 94.5 and 94.6 mean that each of the information variables, SURGE, SUE, and prior returns, at least to some extent independently leads to abnormal returns. This suggests that a combined momentum strategy using more than one of these information measures should offer improved momentum profits. While Chan et al. (1996), Piotroski (2000), Griffin et al. (2005), Mohanram (2005), Sagi and Seasholes (2007), Asness et al. (2013), and Asem (2009) have examined the profitability of combined momentum strategies based on other measures, to the best of our knowledge we offer the first evidence on the profitability of combined momentum strategies using the three most accessible pieces of information on firm performance, i.e., prior returns, earnings surprises, and revenue surprises, altogether.

94.5.3.1 Bivariate combined momentums

Table 94.8 compares and analyzes the combined momentum returns. Panel A shows raw and FF-3 factor-adjusted returns to momentum strategies based on one-way, two-way, and three-way sorts. We start with bivariate combined momentum strategies.
If we buy stocks with the highest SURGE and the highest SUE (R5 × E5) while selling stocks with the lowest SURGE and the lowest SUE (R1 × E1), such a revenue-and-earnings combined momentum strategy yields a monthly return as high as 0.81%, which is higher than the univariate momentum return earned solely on the basis of revenue surprises (0.47%) or earnings surprises (0.58%) when using quintile portfolios. This result is also a consequence of our observation that the sorted portfolio returns increase monotonically with both SURGE and SUE.
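The construction of such a bivariate combined momentum portfolio can be sketched in a few lines. The sketch below runs on simulated data; the column names (month, surge, sue, ret), the quintile cutoffs, and the return-generating process are illustrative assumptions, not the chapter's actual data or code.

```python
import numpy as np
import pandas as pd

def combined_momentum(df, keys=("surge", "sue"), n=5):
    """Monthly long-short return of a combined momentum strategy: sort
    stocks independently into n quantiles on each signal, buy the
    intersection of top quantiles (e.g., R5 x E5) and sell the
    intersection of bottom quantiles (R1 x E1)."""
    out = {}
    for month, grp in df.groupby("month"):
        q = {k: pd.qcut(grp[k], n, labels=False).to_numpy() for k in keys}
        top = np.logical_and.reduce([q[k] == n - 1 for k in keys])
        bot = np.logical_and.reduce([q[k] == 0 for k in keys])
        if top.any() and bot.any():
            r = grp["ret"].to_numpy()
            out[month] = r[top].mean() - r[bot].mean()
    return pd.Series(out)

# Simulated panel (24 months, 200 stocks each) in which returns load on
# both signals, so the combined spread should be positive on average.
rng = np.random.default_rng(0)
month = np.repeat(np.arange(24), 200)
surge = rng.normal(size=month.size)
sue = rng.normal(size=month.size)
ret = 0.01 * surge + 0.01 * sue + rng.normal(scale=0.05, size=month.size)
panel = pd.DataFrame({"month": month, "surge": surge, "sue": sue, "ret": ret})
spread = combined_momentum(panel)
print(round(float(spread.mean()), 4))
```

Passing keys=("surge", "prior") or ("sue", "prior") would give the revenue-price and earnings-price combinations in the same way.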
Table 94.8: Comparisons of assorted single and combined momentum strategies.

Panel A. Summary of momentum returns from various single/multiple sorting criteria

One-way sorts
Momentum strategy   Raw return          Adj. return
Mom(R)              0.0047∗∗∗ (4.42)    0.0063∗∗∗ (6.77)
Mom(E)              0.0058∗∗∗ (8.17)    0.0063∗∗∗ (8.81)
Mom(P)              0.0072∗∗∗ (3.36)    0.0087∗∗∗ (4.01)

Two-way sorts
Mom(R+E)            0.0081∗∗∗ (6.25)    0.0097∗∗∗ (7.86)
Mom(R+P)            0.0109∗∗∗ (4.53)    0.0136∗∗∗ (5.75)
Mom(E+P)            0.0118∗∗∗ (5.47)    0.0133∗∗∗ (6.09)

Three-way sorts
Mom(R+E+P)          0.0144∗∗∗ (6.06)    0.0168∗∗∗ (7.12)

Panel B. Contribution of momentum returns from single prior performance information

Incremental return contribution of revenue momentum
Diff. in momentum strategies    Return difference
Mom(R+P) − Mom(P)               0.0038∗∗∗ (3.91)
Mom(R+E) − Mom(E)               0.0023∗∗ (2.28)
Mom(R+E+P) − Mom(E+P)           0.0024∗∗∗ (2.70)

Incremental return contribution of earnings momentum
Mom(E+P) − Mom(P)               0.0048∗∗∗ (6.69)
Mom(R+E) − Mom(R)               0.0035∗∗∗ (5.76)
Mom(R+E+P) − Mom(R+P)           0.0033∗∗∗ (4.47)

Incremental return contribution of price momentum
Mom(R+P) − Mom(R)               0.0061∗∗∗ (3.48)
Mom(E+P) − Mom(E)               0.0063∗∗∗ (3.58)
Mom(R+E+P) − Mom(R+E)           0.0062∗∗∗ (4.04)

Panel C. Contribution of momentum returns from multiple prior performance information

Incremental return contribution of (revenue + earnings) momentum
Mom(R+E+P) − Mom(P)             0.0072∗∗∗ (5.47)

Incremental return contribution of (revenue + price) momentum
Mom(R+E+P) − Mom(E)             0.0085∗∗∗ (4.38)

Incremental return contribution of (earnings + price) momentum
Mom(R+E+P) − Mom(R)             0.0096∗∗∗ (5.54)

Notes: This table presents the return contribution from considering an additional sorting criterion, namely revenue surprises, earnings surprises, or prior returns. In the table, R, E, and P refer to the revenue momentum, earnings momentum, and price momentum strategies, respectively. Momentum strategies based on combined criteria are indicated with plus signs. For example, R + P denotes the revenue-price combined momentum strategy, that is, R5 × P5 − R1 × P1. Panel A summarizes raw returns and risk-adjusted returns obtained from momentum strategies based on one-way sorts, two-way sorts, and three-way sorts. The risk-adjusted return is the intercept of the Fama–French 3-factor regression on the raw return. Panel B lists the return contributions of each additional sorting criterion based on the return differences. The associated t-statistics are in parentheses. Panel C lists the incremental returns obtained by applying two additional sorting criteria. All returns are expressed as monthly returns. ∗∗∗ and ∗∗ indicate statistical significance at 1% and 5%, respectively.
H.-Y. Chen et al.
Panel A of Table 94.8 also shows that investors earn an average monthly return of 1.09% by buying stocks with the highest SURGE and the most winning prior returns (R5 × P5) and selling stocks with the lowest SURGE and the most losing prior returns (R1 × P1). This revenue-and-price combined momentum strategy again outperforms the simple revenue momentum strategy (0.47%) and the simple price momentum strategy (0.72%). Similarly, an earnings-and-price combined momentum strategy offers an average monthly return of 1.18%, which outperforms the univariate earnings momentum (0.58%) and price momentum (0.72%) strategies. Note that the strategy using SURGE and SUE yields a poorer return (0.81%) than that using SURGE and prior returns (1.09%) or that using SUE and prior returns (1.18%). This suggests that it is important to exploit market information (prior returns) as well as firm fundamental information (SURGE and SUE) when formulating investment strategies.

94.5.3.2 Multivariate combined momentums

Next, we further sort stocks into quintiles independently and simultaneously based on SURGE, SUE, and prior price performance to obtain three-way sorted portfolios. A revenue-earnings-price combined momentum strategy buys the stocks with the most positive revenue surprises, the most positive earnings surprises, and the highest prior returns (R5 × E5 × P5), and sells the stocks with the most negative revenue surprises, the most negative earnings surprises, and the lowest prior returns (R1 × E1 × P1). This leads to a monthly momentum return of 1.44%, the highest of all the momentum strategies discussed so far. Panels B and C of Table 94.8 present the differences in portfolio performance, which indicate the incremental contribution to momentum portfolio returns from each additional sorting criterion. The results are straightforward.
The joint consideration of each additional performance measure, whether revenue surprises, earnings surprises, or prior returns, significantly improves the profits of momentum strategies. The net contribution from price momentum is the greatest (0.62%), followed by earnings momentum (0.33%) and then revenue momentum (0.24%). This result further supports the argument that revenue, earnings, and price each convey, to some extent, exclusive but as yet unpriced information.
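The incremental contributions reported in Panels B and C are differences between a combined strategy's average return and that of a strategy built on a subset of the signals. A minimal sketch on simulated data (the column names surge, sue, prior and the return process are assumptions for illustration, not the chapter's data):

```python
import numpy as np
import pandas as pd

def ls_return(df, keys, n=5):
    """Average long-short return from independent n-quantile sorts on the
    signals in `keys` (long the top-quantile intersection, short the
    bottom-quantile intersection), averaged over months."""
    spreads = []
    for _, g in df.groupby("month"):
        q = {k: pd.qcut(g[k], n, labels=False).to_numpy() for k in keys}
        top = np.logical_and.reduce([q[k] == n - 1 for k in keys])
        bot = np.logical_and.reduce([q[k] == 0 for k in keys])
        if top.any() and bot.any():
            r = g["ret"].to_numpy()
            spreads.append(r[top].mean() - r[bot].mean())
    return float(np.mean(spreads))

# Simulated panel in which all three signals carry independent return
# information (36 months, 400 stocks each).
rng = np.random.default_rng(1)
month = np.repeat(np.arange(36), 400)
sig = {k: rng.normal(size=month.size) for k in ("surge", "sue", "prior")}
ret = sum(0.01 * v for v in sig.values()) + rng.normal(scale=0.05, size=month.size)
panel = pd.DataFrame({"month": month, "ret": ret, **sig})

mom_rp = ls_return(panel, ["surge", "prior"])          # Mom(R+P)
mom_rep = ls_return(panel, ["surge", "sue", "prior"])  # Mom(R+E+P)
incr_sue = mom_rep - mom_rp  # incremental contribution of earnings surprises
print(round(incr_sue, 4))
```

Because each added signal carries return information in this simulation, the three-way strategy earns more than the two-way strategy, mirroring the pattern in Panel B.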
94.5.3.3 Dependent sorts versus independent sorts

With highly correlated sorting criteria, as indicated in Table 94.2, independent multiple sorts may result in portfolios with limited numbers of stocks and therefore insufficient diversification. This will then lead to results that might be confounded by factors other than the intended sorting features. More importantly, only dependent sorts provide a way to identify the precise conditional momentum returns. Table 94.9 presents the returns and the associated t-statistics for two-way and three-way sorted combined momentum strategies using independent sorts and dependent sorts in different orders. For two-way sorted combined momentum strategies, dependent sorts are found to generate returns that are insignificantly different from those from independent sorts. For three-way sorted combined momentum strategies, however, the results vary significantly with the sorting method. Three-way dependent sorts, in any order, yield investment strategies that significantly outperform those using independent sorts; independent sorts create an average monthly return of 1.44%, while dependent sorts lead to average monthly returns ranging from 1.66% to 1.89%. For simplicity of presentation, however, we report results from only independent sorts in Tables 94.7 and 94.8; the general conclusions we have drawn remain unchanged with dependent sorts.

94.6 Persistency and Seasonality

94.6.1 Persistence of momentum effects

We next examine the persistence of momentum effects driven by revenue surprises, earnings surprises, and prior price performance. Stock prices tend to adjust slowly to information, and abnormal returns will not continue once information is fully incorporated into prices.
Following the argument of conservatism bias (see Edwards, 1968; Barberis et al., 1998), an examination of the persistence of momentum returns will reveal the speed of adjustment in reaction to revenue surprises, earnings surprises, and prior returns. More interestingly, the variations of persistence in conditional momentums will demonstrate how one element of information (e.g., revenue surprises) affects the speed of adjustment to another (e.g., prior returns). Table 94.10 presents the cumulative returns from revenue, earnings, and price momentum strategies. The formation period is kept at six months,
Table 94.9: Returns of combined momentum strategies — A comparison between dependent sorts and independent sorts.

Two-way sorts
Momentum strategy   Independent sorts   Dependent sorts
Mom(R+E)            0.0081∗∗∗ (6.25)    SURGE | SUE: 0.0084∗∗∗ (6.95)    SUE | SURGE: 0.0088∗∗∗ (6.88)
  Dep sorts − Indep sorts (t-statistic only):       (0.55)                            (1.49)
Mom(R+P)            0.0109∗∗∗ (4.53)    SURGE | P6: 0.0104∗∗∗ (4.66)     P6 | SURGE: 0.0106∗∗∗ (5.19)
  Dep sorts − Indep sorts (t-statistic only):       (−1.17)                           (−0.57)
Mom(E+P)            0.0118∗∗∗ (5.47)    P6 | SUE: 0.0111∗∗∗ (5.24)       SUE | P6: 0.0115∗∗∗ (6.20)
  Dep sorts − Indep sorts (t-statistic only):       (−1.76)                           (−0.55)

Three-way sorts
Mom(R+E+P)          Independent sorts: 0.0144∗∗∗ (6.06)
  P6 | SURGE | SUE: 0.0175∗∗∗ (4.16); Dep − Indep t-statistic: (1.86)
  SURGE | P6 | SUE: 0.0166∗∗∗ (4.12); Dep − Indep t-statistic: (1.45)
  P6 | SUE | SURGE: 0.0189∗∗∗ (4.29); Dep − Indep t-statistic: (2.44)
  SUE | P6 | SURGE: 0.0188∗∗∗ (4.45); Dep − Indep t-statistic: (2.60)
  SURGE | SUE | P6: 0.0171∗∗∗ (4.47); Dep − Indep t-statistic: (1.61)
  SUE | SURGE | P6: 0.0168∗∗∗ (4.36); Dep − Indep t-statistic: (1.39)

Notes: This table presents returns and the associated t-statistics from two-way and three-way sorted combined momentum strategies, which are formed using independent sorts or dependent sorts. A momentum strategy formed on the basis of multiple criteria, which we call a combined momentum strategy, is said to apply independent sorts if portfolios are independently sorted into quintiles according to their SURGE, SUE, and prior price performance, with the partition points being independent across these criteria. A combined momentum strategy is said to apply dependent sorts if portfolios are sorted into quintiles according to their SURGE, SUE, and prior price performance in a particular sorting order. For example, a two-way sorted momentum strategy based on SURGE and SUE using dependent sorts could be formed by first sorting on SURGE and then on SUE (SUE | SURGE) or by first sorting on SUE and then on SURGE (SURGE | SUE). We present here the returns of momentum strategies following all possible sequences of two-way dependent sorts and three-way dependent sorts. ∗∗∗, ∗∗, and ∗ indicate statistical significance at 1%, 5%, and 10%, respectively.
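The mechanical difference between the independent and dependent sorts compared in Table 94.9 can be seen in a small simulation: with highly correlated signals, independent quintile sorts leave corner cells over-populated and off-diagonal cells nearly empty, while dependent (sequential) sorts give every cell the same share of stocks. The data and column names below are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Two highly correlated signals (cf. the correlation between SURGE and SUE).
x = rng.normal(size=5000)
y = 0.8 * x + 0.6 * rng.normal(size=5000)   # corr(x, y) = 0.8 by construction
df = pd.DataFrame({"x": x, "y": y})

# Independent sorts: quintile each signal over the full cross-section.
df["qx"] = pd.qcut(df["x"], 5, labels=False)
df["qy_ind"] = pd.qcut(df["y"], 5, labels=False)

# Dependent sorts ("y | x"): quintile y *within* each x-quintile, so each
# (qx, qy) cell holds exactly 1/25 of the sample.
df["qy_dep"] = df.groupby("qx")["y"].transform(
    lambda s: pd.qcut(s, 5, labels=False))

corner = int(((df.qx == 4) & (df.qy_ind == 4)).sum())  # over-populated
sparse = int(((df.qx == 4) & (df.qy_ind == 0)).sum())  # nearly empty
dep = int(((df.qx == 4) & (df.qy_dep == 0)).sum())     # always 5000/25 = 200
print(corner, sparse, dep)
```

The nearly empty off-diagonal cells under independent sorts are the diversification problem the text describes; dependent sorts avoid it by construction.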
Table 94.10: Cumulative returns from revenue, earnings, and price momentum strategies.

          Panel A. Revenue momentum      Panel B. Earnings momentum     Panel C. Price momentum
t         Neg.     Pos.     PMN          Neg.     Pos.     PMN          Loser    Winner   WMN
(month)   SURGE(%) SURGE(%) (%)          SUE(%)   SUE(%)   (%)          (%)      (%)      (%)
1         0.68     1.69     1.02∗∗∗      0.66     1.83     1.17∗∗∗      1.12     1.48     0.36
2         1.50     3.26     1.75∗∗∗      1.53     3.47     1.94∗∗∗      1.96     3.19     1.23∗∗∗
3         2.45     4.66     2.21∗∗∗      2.50     4.95     2.44∗∗∗      2.79     4.76     1.97∗∗∗
4         3.57     6.06     2.49∗∗∗      3.59     6.42     2.83∗∗∗      3.69     6.42     2.74∗∗∗
5         4.78     7.49     2.71∗∗∗      4.76     7.92     3.17∗∗∗      4.67     8.10     3.40∗∗∗
6         6.13     8.87     2.75∗∗∗      5.97     9.40     3.43∗∗∗      5.66     9.86     4.21∗∗∗
7         7.49     10.21    2.72∗∗∗      7.19     10.80    3.61∗∗∗      6.64     11.62    4.99∗∗∗
8         8.92     11.51    2.59∗∗∗      8.51     12.14    3.63∗∗∗      7.76     13.21    5.45∗∗∗
9         10.40    12.82    2.42∗∗∗      9.88     13.50    3.62∗∗∗      9.00     14.77    5.78∗∗∗
10        11.93    14.02    2.09∗∗∗      11.27    14.75    3.49∗∗∗      10.22    16.18    5.95∗∗∗
11        13.44    15.19    1.76∗∗∗      12.66    16.00    3.34∗∗∗      11.52    17.54    6.02∗∗∗
12        14.95    16.39    1.44∗∗∗      14.05    17.33    3.28∗∗∗      12.91    18.80    5.89∗∗∗
13        16.26    17.57    1.31∗∗∗      15.28    18.69    3.41∗∗∗      14.31    19.91    5.60∗∗∗
14        17.59    18.78    1.19∗∗∗      16.49    20.05    3.57∗∗∗      15.73    21.04    5.31∗∗∗
15        18.86    20.01    1.15∗∗∗      17.66    21.42    3.76∗∗∗      17.13    22.19    5.06∗∗∗
16        20.23    21.33    1.09∗∗       18.95    22.89    3.94∗∗∗      18.64    23.43    4.79∗∗∗
17        21.61    22.67    1.07∗∗       20.26    24.41    4.15∗∗∗      20.15    24.72    4.57∗∗∗
18        22.96    24.03    1.07∗∗       21.56    25.94    4.37∗∗∗      21.59    26.09    4.50∗∗∗
19        24.40    25.38    0.98∗∗       22.90    27.44    4.54∗∗∗      22.91    27.71    4.79∗∗∗
20        25.94    26.79    0.85∗        24.34    28.95    4.62∗∗∗      24.33    29.29    4.96∗∗∗
21        27.45    28.13    0.68         25.77    30.41    4.64∗∗∗      25.79    30.89    5.10∗∗∗
22        28.91    29.48    0.57         27.21    31.88    4.67∗∗∗      27.23    32.38    5.14∗∗∗
23        30.37    30.91    0.54         28.67    33.39    4.72∗∗∗      28.74    33.90    5.17∗∗∗
24        31.83    32.38    0.55         30.18    34.90    4.72∗∗∗      30.29    35.41    5.12∗∗∗
25        33.24    33.79    0.54         31.62    36.36    4.74∗∗∗      31.87    36.67    4.79∗∗∗
26        34.68    35.19    0.51         33.11    37.80    4.69∗∗∗      33.48    38.00    4.52∗∗∗
27        36.08    36.57    0.49         34.54    39.22    4.68∗∗∗      35.03    39.29    4.26∗∗∗
28        37.53    37.98    0.45         36.01    40.69    4.67∗∗∗      36.67    40.61    3.94∗∗∗
29        39.06    39.41    0.35         37.52    42.18    4.66∗∗∗      38.38    41.90    3.53∗∗∗
30        40.58    40.85    0.26         39.04    43.62    4.58∗∗∗      40.00    43.31    3.31∗∗∗
31        42.13    42.38    0.25         40.54    45.14    4.60∗∗∗      41.54    44.86    3.32∗∗∗
32        43.77    43.93    0.16         42.08    46.66    4.59∗∗∗      43.10    46.50    3.39∗∗∗
33        45.38    45.44    0.06         43.60    48.14    4.53∗∗∗      44.70    48.09    3.39∗∗∗
34        46.96    46.95    −0.01        45.11    49.62    4.51∗∗∗      46.35    49.61    3.27∗∗∗
35        48.49    48.46    −0.03        46.59    51.20    4.60∗∗∗      47.86    51.15    3.29∗∗∗
36        50.06    49.97    −0.10        48.15    52.80    4.65∗∗∗      49.46    52.68    3.22∗∗∗

Notes: This table reports the cumulative returns of the zero-cost momentum portfolios in each month following the formation period. t is the month after portfolio formation. Three different momentum strategies are tested. The sample period is from 1974 through 2009. Panel A reports the results from the revenue momentum strategy, where sample firms are grouped into five groups based on the measure SURGE during each formation month. The revenue momentum portfolios are formed by buying stocks with the most positive SURGE and selling stocks with the most negative SURGE. Listed are the cumulative portfolio returns for the portfolio with the most negative SURGE, the portfolio with the most positive SURGE, and the revenue momentum portfolio (PMN). Panel B reports the results from the earnings momentum strategy, where firms are grouped into five groups based on the measure SUE during each formation month. The earnings momentum portfolios are formed by buying stocks with the most positive SUE and selling stocks with the most negative SUE. Listed are the cumulative portfolio returns for the portfolio with the most negative SUE, the portfolio with the most positive SUE, and the earnings momentum portfolio (PMN). Panel C reports the results from the price momentum strategy. The price momentum portfolios are formed by buying winner stocks and selling loser stocks on the basis of previous six-month returns. Listed are the cumulative portfolio returns for the loser portfolio, the winner portfolio, and the price momentum portfolio (WMN). ∗∗∗, ∗∗, and ∗ indicate statistical significance at 1%, 5%, and 10%, respectively.
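The cumulative returns in Table 94.10 are simple sums of the monthly event-time returns of a zero-cost portfolio. A minimal sketch with assumed numbers that mimic the price momentum pattern (drift for about 11 months, mild reversal afterward):

```python
import numpy as np

# Hypothetical average monthly spread returns in event time, months 1..36:
# +0.6% per month through month 11, then -0.1% per month thereafter.
monthly = np.where(np.arange(1, 37) <= 11, 0.006, -0.001)

# Cumulative return as in Table 94.10: add the monthly returns from the
# formation month onward.
cumulative = np.cumsum(monthly)
peak_month = int(np.argmax(cumulative)) + 1
print(peak_month, round(float(cumulative[-1]), 4))
```

Under these assumed numbers the cumulative return peaks at month 11 and then decays, the qualitative shape reported for price momentum in Panel C.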
and the cumulative returns are calculated up to 36 months after the event time. Panel A shows that the zero-investment portfolios built upon revenue surprises maintain their return momentum for six months. The buy-and-hold returns drop to insignificance 21 months after portfolio formation. In Panel B, the profits of earnings momentum portfolios, although not as high as those of price momentum in the short term, demonstrate greater persistence than price momentum, with the cumulative returns continuing to drift upward for 25 months after portfolio formation. The cumulative returns still remain significant at 4.65% three years after portfolio formation. Panel C shows that the profits of the price momentum portfolio drift upward for 11 months after portfolio formation and start to reverse thereafter. The cumulative returns remain significant at 3.22% 36 months after portfolio formation. Figure 94.1 compares the cumulative returns to these three univariate momentum strategies. Price momentum generates the highest cumulative
Figure 94.1: Persistence of momentum effects. Notes: This figure plots the average cumulative returns (in %) of relative strength portfolios with respect to revenue surprises, earnings surprises, and prior price performance against the holding period in months. Each relative strength portfolio buys stocks in the highest quintile and sells stocks in the lowest quintile on every formation date and is held for 36 months. The cumulative returns are calculated by adding monthly returns from formation month t to month t + i.
returns in the short term (for a one-year holding period), while earnings momentum demonstrates the most persistent performance, as cumulative returns continue to grow up to two years after portfolio formation. The payoffs to revenue momentum, on the other hand, are neither as persistent nor as strong as those of the other two strategies. Figure 94.2 presents the cumulative returns for momentum strategies conditional on alternative performance measures. Figures 94.2(a) and 94.2(b) present the cumulative returns of revenue momentum conditional on high/low SUE and prior returns. They show that revenue momentum remains short-lived, regardless of the level of SUE or the level of prior returns. The portfolio returns to a revenue momentum strategy with loser stocks not only dissipate quickly in the short term but actually reverse to negative returns starting seven months after portfolio formation. Figures 94.2(c) and 94.2(d) show the cumulative returns of earnings momentum conditional on high/low SURGE and prior returns. Figure 94.2(c) shows that the earnings momentum returns remain similar for low-SURGE and high-SURGE stocks during the first 20 months after portfolio formation. This finding of momentum contingencies conforms to our results in Panel A of Table 94.8. More interestingly, as we hold the portfolio beyond 20 months, the earnings momentum strategy with low-SURGE stocks starts deteriorating, while the strategy with high-SURGE stocks maintains significantly positive returns up to 36 months after portfolio formation. Figure 94.2(d), on the other hand, shows that earnings momentum effects are both greater and longer-lasting for winner stocks than for loser stocks. The implication for investment strategy is that earnings momentum returns are higher and longer-lived when the strategy is applied to stocks with a superior price history over the past six months.
In Figures 94.2(e) and 94.2(f), price momentum strategies yield higher and more persistent returns for stocks with positive SUE or SURGE than for stocks with negative SUE or SURGE. A comparison of Figures 94.2(e) and 94.2(f) also shows that high SURGE serves as a more effective driver than high SUE for stocks to exhibit greater and more persistent price momentum. These observations on momentum persistence provide further support for our claim of momentum cross-contingencies. We find that the persistence of a momentum, just like the magnitude of momentum returns, depends on the accompanying condition of another piece of firm information. Such cross-contingencies are again not as strong in the relation between revenue momentum and SUE or between earnings momentum and SURGE, as shown in Figures 94.2(a) and 94.2(c). The results suggest that investors update their
[Figure 94.2 comprises six panels, each plotting the cumulative return (%) of a momentum strategy against the month after portfolio formation, up to 36 months: (a) revenue momentum conditional on SUE; (b) revenue momentum conditional on prior price performance; (c) earnings momentum conditional on SURGE; (d) earnings momentum conditional on prior price performance; (e) price momentum conditional on SURGE; (f) price momentum conditional on SUE. In each panel, the conditioning variable is split into its lowest (quintile 1) and highest (quintile 5) groups.]
Figure 94.2: Cumulative returns of momentum effects conditional on performance measures. Notes: These figures show the average cumulative returns of relative strength portfolios with respect to revenue surprises, earnings surprises, and prior price performance, conditional on one another. The holding period is up to 36 months. The cumulative profits are calculated by adding monthly returns from formation month t to month t + i.
expectations based on the joint information of revenue surprises, earnings surprises, and prior price performance, and the speed of adjustment to firm fundamental information (SURGE or SUE) depends on the prevailing content of firm market information (prior returns), and vice versa.
94.6.2 Seasonality

Jegadeesh and Titman (1993), Heston and Sadka (2008), Asness et al. (2013), Novy-Marx (2012), and Yao (2012) find that prior-return winners outperform losers in all months except January, so that a price momentum strategy earns positive profits in all months except January and negative profits in January. Chordia and Shivakumar (2006) also find significant seasonality in returns to the earnings momentum strategy. Do the revenue momentum strategy and the combined momentum strategies exhibit similar seasonalities? Table 94.11 presents tests of seasonal patterns in the returns to univariate and combined momentum strategies. For all types of momentum strategies, momentum profits in January are either negative or insignificantly different from zero. F-tests reject the hypothesis that the returns to momentum strategies are equal in January and non-January months. We therefore conclude that, in line with the findings cited above, there is seasonality in momentum strategies: revenue surprises, earnings surprises, and prior returns all yield significantly positive returns only in non-January months.
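The January-versus-non-January comparison in Table 94.11 can be implemented as a two-sample F-test (equivalently, the squared t-statistic on a January dummy in a regression of monthly profits). The sketch below uses simulated profits with an assumed negative January mean:

```python
import numpy as np

rng = np.random.default_rng(3)
months = np.tile(np.arange(1, 13), 30)  # 30 years of calendar months
# Hypothetical momentum profits: positive outside January, negative in it.
profit = np.where(months == 1, -0.013, 0.009)
profit = profit + rng.normal(scale=0.04, size=months.size)

jan = months == 1
n1, n2 = int(jan.sum()), int((~jan).sum())
m1, m2 = profit[jan].mean(), profit[~jan].mean()
# Pooled variance and the F-statistic for equal January/non-January means
# (the square of the two-sample t-statistic).
s2 = (((profit[jan] - m1) ** 2).sum()
      + ((profit[~jan] - m2) ** 2).sum()) / (n1 + n2 - 2)
f_stat = (m1 - m2) ** 2 / (s2 * (1 / n1 + 1 / n2))
print(round(float(f_stat), 2))
```

Under the null of equal means, f_stat follows an F(1, n1 + n2 − 2) distribution, so it can be compared with the usual critical values to produce the p-values reported in the table.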
Table 94.11: Returns of momentum strategies in January and non-January months.

Momentum strategy   All months          Jan.                Feb.–Dec.           F-statistic   p-value
Mom(R)              0.0047∗∗∗ (4.42)    −0.0061 (−1.59)     0.0057∗∗∗ (5.19)    31.66         0.00
Mom(E)              0.0058∗∗∗ (8.17)    0.0026 (0.72)       0.0061∗∗∗ (8.67)
Mom(P)              0.0072∗∗∗ (3.36)    −0.0134 (−1.32)     0.0090∗∗∗ (4.25)
Mom(R+E)            0.0081∗∗∗ (6.25)    −0.0062 (−1.09)     0.0094∗∗∗ (7.22)
Mom(R+P)            0.0109∗∗∗ (4.53)    −0.0164 (−1.44)     0.0134∗∗∗ (5.62)
Mom(E+P)            0.0118∗∗∗ (5.47)    −0.0082 (−0.73)     0.0136∗∗∗ (6.48)
Mom(R+E+P)          0.0144∗∗∗ (6.06)    −0.0131 (−1.11)     0.0169∗∗∗ (7.26)
Technical, Fundamental, and Combined Information
H.-Y. Chen, C. F. Lee & W.-K. Shih

Table 95.2: Correlation among fundamental signals, BOS ratio, and past returns.

Panel B: Fundamental signals for GSCORE

                    Ret1     Ret3     GSCORE   BOS      P12      G1       G2       G3       G4       G5       G6       G7
Ret3                0.5418
GSCORE              0.0355   0.0503
BOS                −0.0150  −0.0206  −0.1679
P12                 0.0334   0.0641   0.1018   0.2032
G1: ROA ≥ IndM      0.0239   0.0331   0.6019  −0.1448   0.0747
G2: CFO ≥ IndM      0.0289   0.0409   0.6883  −0.1201   0.0934   0.5518
G3: Accrual < 0     0.0101   0.0153   0.2676  −0.0077   0.0364  −0.0870   0.2256
G4: σNI ≤ IndM      0.0312   0.0455   0.5407  −0.1364   0.0837   0.2533   0.2022   0.0416
G5: σSG ≤ IndM      0.0261   0.0391   0.5274  −0.0752   0.0727   0.1689   0.1754   0.0547   0.3114
G6: RDINT ≥ IndM   −0.0016  −0.0038   0.2222  −0.0081  −0.0041  −0.0386  −0.0168   0.0344  −0.0658  −0.0240
G7: ADINT ≥ IndM   −0.0038  −0.0060   0.1856   0.0077  −0.0125   0.0091  −0.0025  −0.0137  −0.0027   0.0291  −0.0227
G8: CAPINT ≥ IndM   0.0065   0.0072   0.4709  −0.0840   0.0015   0.1337   0.2126   0.0614   0.1002   0.0950   0.0426  −0.0142

Notes: This table presents the average Spearman rank-order correlation among the fundamental signals, past returns, and BOS ratio for sample stocks. Panel A includes FSCORE and its fundamental signals. FSCORE is the sum of nine fundamental signals, each of which is assigned a value of 1 if the following criteria are met (and 0 otherwise): F1: ROA > 0, F2: AROA > 0, F3: CFO > 0, F4: Accrual < 0, F5: DMargin > 0, F6: DTurn > 0, F7: DLever < 0, F8: DLIQUD > 0, and F9: EQOFFER = 0. Panel B includes GSCORE and its fundamental signals. GSCORE is the sum of eight fundamental signals, each of which is assigned a value of 1 if the following criteria are met (and 0 otherwise): G1: ROA ≥ IndM, G2: CFO ≥ IndM, G3: Accrual < 0, G4: σNI ≤ IndM, G5: σSG ≤ IndM, G6: RDINT ≥ IndM, G7: ADINT ≥ IndM, and G8: CAPINT ≥ IndM. The definitions of these variables are provided in Section 2 and Table 1. P12 is the buy-and-hold return over the 12-month period for each stock before portfolio formation. The BOS ratio is defined as the covariance between the monthly return and the adjusted trading volume over the 12-month period for each stock before portfolio formation.
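The entries of Table 95.2 are average Spearman rank-order correlations; for a single cross-section they can be computed directly with pandas. The sketch below uses simulated data, and the column names merely mirror the table's labels:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 2000
p12 = rng.normal(size=n)                           # prior 12-month return
gscore = rng.integers(0, 9, size=n) + 0.1 * p12    # composite score, weak link to P12
ret1 = 0.01 * p12 + rng.normal(scale=0.1, size=n)  # next-month return (momentum)
df = pd.DataFrame({"Ret1": ret1, "GSCORE": gscore, "P12": p12})

# Spearman rank-order correlation matrix, as reported in Table 95.2.
rho = df.corr(method="spearman")
print(rho.round(3))
```

In the chapter, these correlations are computed each formation month and then averaged over months; the one-shot version above shows only the per-period calculation.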
future performance of the stocks in our sample, 0.0334 with a 1-month future return and 0.0641 with a 3-month future return, indicating that momentum profits will be observed, as suggested by Jegadeesh and Titman (1993). In addition, the correlations among past returns, the BOS ratio, and the fundamental scores are low. Five of the 12 pairs are negatively correlated, and none of the correlations is higher than 0.21, indicating that past returns, the BOS ratio, and the fundamental scores capture different information content of a firm. We therefore expect that a combined investment strategy incorporating past returns, the BOS ratio, and fundamental scores can generate better performance than the momentum strategy, which uses only prior price information.

95.4 Performance of Alternative Investment Strategies

95.4.1 Momentum strategy

Table 95.3 provides average monthly excess returns for five quintile portfolios constructed based on the past-12-month cumulative returns, and average arbitrage returns for the momentum portfolio over different holding periods. In Panel A, for all sample firms, the average monthly returns are 1.06%, 0.88%, 0.68%, and 0.53% for momentum strategies with 3-, 6-, 9-, and 12-month holding periods, respectively. Our results are consistent with those of Jegadeesh and Titman (1993): trading strategies based on past-12-month winners/losers and 1-month to 12-month holding periods exhibit strong momentum returns. Moreover, Table 95.3 reports the risk-adjusted return from the Fama–French three-factor model for each winner and loser portfolio and the long–short investment strategy. The risk-adjusted return of a portfolio relative to the three factors is the estimated intercept from the following time-series regression using monthly portfolio returns:

R_{i,t} = α_i + β_i R_{m,t} + φ_i SMB_t + ϕ_i HML_t + e_{i,t},   (95.4)
where R_{i,t} is the monthly excess return for the long–short portfolio i, R_{m,t} is the monthly excess return for the market index, SMB_t is the Fama–French small-firm factor, HML_t is the Fama–French book-to-market factor, and β_i, φ_i, and ϕ_i are the corresponding factor loadings. Consistent with the results for monthly excess returns, the momentum strategy generates significantly positive risk-adjusted returns over different holding periods. Panels B and C present momentum returns for value stocks and growth stocks, respectively. Similar to the results in Panel A, momentum returns are positive and
Technical, Fundamental, and Combined Information Table 95.3: QM 1
QM 2
page 3337
3337
Returns to momentum strategy. QM 3
QM 4
QM 5
QM 5 − QM 1
0.8371∗∗∗ (3.53) 0.8462∗∗∗ (3.59) 0.8395∗∗∗ (3.56) 0.8706∗∗∗ (3.76)
0.9628∗∗∗ (4.09) 0.9460∗∗∗ (4.02) 0.9058∗∗∗ (3.84) 0.9118∗∗∗ (3.90)
1.1399∗∗∗ (4.22) 1.0598∗∗∗ (3.92) 0.9660∗∗∗ (3.57) 0.9196∗∗∗ (3.41)
1.0580∗∗∗ (5.96) 0.8844∗∗∗ (5.22) 0.6794∗∗∗ (4.21) 0.5259∗∗∗ (3.54)
Panel A: All firms Average monthly excess returns (%) 3-month 6-month 9-month 12-month
0.0819 (0.27) 0.1754 (0.59) 0.2866 (0.97) 0.3937 (1.37)
0.6300∗∗ (2.47) 0.6684∗∗∗ (2.65) 0.7071∗∗∗ (2.81) 0.7615∗∗∗ (3.09)
Fama–French three-factor model monthly adj. returns (%) −0.7388∗∗∗ (−5.03) 6-month −0.6669∗∗∗ (−4.54) 9-month −0.5663∗∗∗ (−3.94) 12-month −0.4816∗∗∗ (−3.40)
3-month
−0.1560∗ (−1.66) −0.1334 (−1.44) −0.0969 (−1.06) −0.0610 (−0.69)
0.0892 (1.26) 0.0904 (1.27) 0.0828 (1.17) 0.0934 (1.35)
0.2441∗∗∗ (3.69) 0.2228∗∗∗ (3.34) 0.1881∗∗∗ (2.85) 0.1748∗∗∗ (2.66)
0.4423∗∗∗ (4.68) 0.3685∗∗∗ (4.06) 0.2797∗∗∗ (3.28) 0.2034∗∗ (2.47)
1.1811∗∗∗ (6.98) 1.0354∗∗∗ (6.30) 0.8459∗∗∗ (5.53) 0.6850∗∗∗ (4.87)
1.0225∗∗∗ (4.23) 1.0276∗∗∗ (4.25) 1.0102∗∗∗ (4.17) 1.0322∗∗∗ (4.34)
1.1245∗∗∗ (4.65) 1.1152∗∗∗ (4.63) 1.0692∗∗∗ (4.44) 1.0640∗∗∗ (4.50)
1.2749∗∗∗ (4.77) 1.1981∗∗∗ (4.49) 1.0934∗∗∗ (4.12) 1.0435∗∗∗ (3.97)
0.9121∗∗∗ (5.38) 0.7518∗∗∗ (4.65) 0.5557∗∗∗ (3.61) 0.4188∗∗∗ (2.97)
0.4594∗∗∗ (4.80) 0.3903∗∗∗ (4.20) 0.3043∗∗∗ (3.40) 0.2396∗∗∗ (2.74)
0.9930∗∗∗ (5.95) 0.8585∗∗∗ (5.26) 0.6861∗∗∗ (4.42) 0.5458∗∗∗ (3.71)
Panel B: Value stocks Average monthly excess returns (%) 3-month 6-month 9-month 12-month
0.3628 (1.21) 0.4463 (1.51) 0.5377∗ (1.82) 0.6247∗∗ (2.17)
0.8277∗∗∗ (3.20) 0.8764∗∗∗ (3.42) 0.9002∗∗∗ (3.51) 0.9417∗∗∗ (3.76)
Fama–French three-factor model monthly adj. returns (%) −0.5337∗∗∗ (−3.31) 6-month −0.4682∗∗∗ (−2.94) 9-month −0.3818∗∗ (−2.42) 12-month −0.3062∗∗ (−1.96)
3-month
−0.0451 (−0.44) −0.0104 (−0.10) 0.0186 (0.19) 0.0489 (0.50)
0.1870∗∗ (2.31) 0.1838∗∗ (2.32) 0.1695∗∗ (2.15) 0.1798∗∗ (2.31)
0.2923∗∗∗ (3.96) 0.2831∗∗∗ (3.87) 0.2517∗∗∗ (3.48) 0.2401∗∗∗ (3.28)
(Continued )
July 6, 2020
15:56
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch95
H.-Y. Chen, C. F. Lee & W.-K. Shih
3338
Table 95.3: (Continued)
[Panel C: growth stocks; average monthly excess returns and Fama–French three-factor adjusted monthly returns (%) for QM1–QM5 and QM5 − QM1 over 3-, 6-, 9-, and 12-month holding periods.]
Notes: This table provides momentum returns for 3-, 6-, 9-, and 12-month holding periods from a long–short portfolio constructed from the past 12 months' winner and loser stocks. We report average monthly excess returns and Fama–French three-factor model monthly adjusted returns in percentage terms. (Associated White heteroskedasticity-corrected t-statistics are reported below the returns.) The monthly excess return is the difference between the portfolio return and the monthly return on a 3-month Treasury bill. The Fama–French risk-adjusted return is the estimated intercept coefficient from the Fama–French three-factor model. At the end of each month t, stocks are sorted into five quintile portfolios independently by cumulative returns in the previous year, from month t − 12 to t − 1. QM5 (QM1) is the portfolio consisting of stocks with the past 12 months' cumulative returns in the top (bottom) 20%. QM5 − QM1 is the profit from the long–short investment strategy, in which the long position consists of past winner stocks and the short position consists of past loser stocks. We measure the difference in average 3-, 6-, 9-, and 12-month returns between the monthly rebalanced winner and loser portfolios. The differences between winner and loser portfolios are calculated by averaging monthly profits for an overlapping portfolio that in each month contains an equally weighted portfolio of the long–short momentum portfolios selected in the previous 12 months. Panel A presents momentum returns for all sample stocks. Panels B and C present momentum returns for a value stock portfolio and a growth stock portfolio, respectively. ∗∗∗, ∗∗, and ∗ indicate statistical significance at 1%, 5%, and 10%, respectively.
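As an illustration, the quintile construction described in these notes can be sketched as follows. This is a minimal sketch under our own naming, not the authors' code; it assumes a (months x stocks) array of simple returns and equal-weighted portfolios.

```python
import numpy as np

def momentum_quintiles(formation_returns):
    """Assign each stock a momentum quintile label 0 (losers, QM1) to
    4 (winners, QM5) from cumulative returns over the formation window.

    formation_returns : (T, N) array of simple monthly returns for the
    N stocks over the T formation months (t-12 to t-1 in the chapter).
    """
    cum_ret = np.prod(1.0 + formation_returns, axis=0) - 1.0
    # Cross-sectional quintile breakpoints.
    edges = np.quantile(cum_ret, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(edges, cum_ret, side="right")

def winner_minus_loser(month_returns, labels):
    """Equal-weighted QM5 - QM1 return for one post-formation month."""
    return month_returns[labels == 4].mean() - month_returns[labels == 0].mean()
```

In the chapter's overlapping design, the month-t long–short return would then be the average of `winner_minus_loser` across the portfolios formed in each of the previous 12 months.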
significant for value stocks and growth stocks, whereas higher momentum returns can be observed for growth stocks. This is consistent with Daniel and Titman's (1999) finding that lower book-to-market firms have more growth options and intangible assets, which cannot be priced efficiently and therefore lead to a stronger momentum effect.

95.4.2 Fundamental momentum strategy

Table 95.4 presents the monthly average excess return for portfolios double sorted by past-12-month returns and the fundamental score (FSCORE or GSCORE) for different holding periods. In terms of a 6-month holding period, Panel B shows that the fundamental momentum strategy based on the FSCORE can yield a monthly return of 1.3866%, or a risk-adjusted return of 1.5137%. The fundamental momentum strategy based on the FSCORE significantly outperforms the momentum strategy by 0.5022%. Panel F shows that the fundamental momentum strategy based on the GSCORE can generate a monthly return and a risk-adjusted return as high as 1.3034% and 1.4977%, respectively, which are higher than the performance of the momentum strategy. The positive returns and the superior performance of the fundamental strategies can also be found in the value stock portfolio and the growth stock portfolio. The results in Table 95.4 are therefore consistent with the findings of Piotroski (2000) and Mohanram (2005) that fundamental scores can further separate winners (losers) from the winner (loser) group.

The results in Tables 95.3 and 95.4 suggest that the returns to the momentum strategy and the fundamental momentum strategy documented in the literature also exist in the sample and the sample period we choose in this study. We next examine the strength of momentum returns when past trading volume is considered.

95.4.3 BOS momentum strategy

As discussed in Section 95.2, Wu (2007) argues that a momentum effect arises because of the information asymmetry between informed and uninformed investors in the market.
Wu (2007) also indicates that stronger momentum returns are expected for stocks subject to a larger degree of information asymmetry. Therefore, using the BOS ratio as a proxy for information asymmetry, we expect that winner (loser) stocks with lower (higher) BOS ratios are those subject to a higher degree of information asymmetry and should generate higher momentum returns.
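As a concrete illustration of the BOS ratio described here (the covariance between a stock's past returns and its trading volume), the following sketch computes it over a formation window. The exact scaling and volume definition in Wu (2007) may differ; the function and variable names are our own.

```python
import numpy as np

def bos_ratio(monthly_returns, monthly_volume):
    """Sketch of the BOS ratio: sample covariance between a stock's
    monthly returns and its standardized trading volume over the
    formation window. A high (low) value marks a stock whose returns
    co-move positively (negatively) with volume."""
    r = np.asarray(monthly_returns, dtype=float)
    v = np.asarray(monthly_volume, dtype=float)
    v = (v - v.mean()) / v.std()          # standardize volume across months
    return float(np.mean((r - r.mean()) * v))
```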
Table 95.4: Returns to fundamental momentum strategy.
[Panels A and B: FSCORE-based fundamental momentum strategy, 3- and 6-month average excess returns (%), for all stocks, value stocks, and growth stocks. Columns report QF1, QF5, (QF5 − QF1), the long–short return (QM5, QF5) − (QM1, QF1) in raw and Fama–French-adjusted terms, and the difference ΔFMOM−MOM.]
[Table 95.4 (Continued), Panels C and D: FSCORE-based fundamental momentum strategy, 9- and 12-month average excess returns (%), for all stocks, value stocks, and growth stocks.]
[Table 95.4 (Continued), Panel E: GSCORE-based fundamental momentum strategy, 3-month average excess returns (%), with the long–short return (QM5, QG5) − (QM1, QG1) and the difference ΔGMOM−MOM.]
[Table 95.4 (Continued), Panel F: GSCORE-based fundamental momentum strategy, 6-month average excess returns (%), for all stocks, value stocks, and growth stocks.]
Notes: This table provides returns for 3-, 6-, 9-, and 12-month holding periods from a long–short investment strategy based on past-12-month returns and fundamental scores. Returns to the fundamental momentum strategy for all sample stocks, value stocks, and growth stocks are presented. Average monthly excess returns and monthly returns adjusted by the Fama–French three-factor model are presented in percentage terms. (Associated White heteroskedasticity-corrected t-statistics are reported below returns.) At the end of each month, sample stocks are sorted sequentially by cumulative returns in the past 12 months and fundamental scores. QM5 (QM1) is the portfolio consisting of stocks with the past-12-month cumulative returns in the top (bottom) 20%. QF5 and QG5 (QF1 and QG1) are portfolios with the highest (lowest) FSCORE and GSCORE. ΔFMOM−MOM and ΔGMOM−MOM are the differences between returns, where ΔFMOM−MOM = [(QM5, QF5) − (QM1, QF1)] − [QM5 − QM1] and ΔGMOM−MOM = [(QM5, QG5) − (QM1, QG1)] − [QM5 − QM1]. The paired-difference t-test is used to test whether ΔFMOM−MOM (ΔGMOM−MOM) is statistically significantly different from zero. Panels A–D present returns to the fundamental momentum strategy based on FSCORE; Panels E–H present returns to the fundamental momentum strategy based on GSCORE. ∗∗∗, ∗∗, and ∗ indicate statistical significance at 1%, 5%, and 10%, respectively.
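The FSCORE used in these sorts is Piotroski's (2000) sum of nine binary fundamental signals. The sketch below illustrates the scoring logic only; the field names are hypothetical placeholders, and the precise variable definitions follow Piotroski's paper rather than this sketch.

```python
def fscore(cur, prev, issued_equity=False):
    """Simplified Piotroski (2000) FSCORE: the sum of nine binary
    signals. `cur` and `prev` are dicts of hypothetical accounting
    ratios for the current and prior fiscal year."""
    s = 0
    s += cur["roa"] > 0                                 # positive profitability
    s += cur["cfo"] > 0                                 # positive operating cash flow
    s += cur["roa"] > prev["roa"]                       # improving profitability
    s += cur["cfo"] > cur["roa"]                        # cash flow beats earnings (accruals)
    s += cur["leverage"] < prev["leverage"]             # falling leverage
    s += cur["current_ratio"] > prev["current_ratio"]   # improving liquidity
    s += not issued_equity                              # no new equity issuance
    s += cur["gross_margin"] > prev["gross_margin"]     # improving margin
    s += cur["asset_turnover"] > prev["asset_turnover"] # improving efficiency
    return int(s)
```

A financially healthy firm scores near 9; a financially constrained firm scores near 0, which is the separation the fundamental momentum sorts exploit.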
If the information asymmetry between informed and uninformed investors causes momentum returns, a trading strategy constructed from these extreme portfolios is expected to generate higher long–short portfolio returns than would a momentum strategy based solely on past returns. Therefore, we introduce the BOS momentum strategy: buying winner stocks with lower BOS ratios, selling loser stocks with higher BOS ratios, and holding the position for 3, 6, 9, or 12 months before rebalancing the portfolio.

Specifically, at the end of each month in the sample period, we sort stocks based on their past-12-month returns to form five quintile portfolios, QM1 to QM5. We then sort the stocks in each quintile portfolio based on their BOS ratios, i.e., the covariance between their past-12-month returns and trading volume, to form five quintile portfolios, QB1 to QB5. That is, the QB5 (QB1) portfolio consists of stocks with the greatest (least) covariance between past cumulative returns and past trading volume. Returns to the BOS momentum strategy, which is based on both past returns and the BOS ratio, are expected to be greater than the returns to the momentum strategy reported in Table 95.3. We therefore formulate a testable hypothesis.

H1: The BOS momentum strategy based on both past cumulative returns and the BOS ratio generates greater returns than the momentum strategy based solely on past cumulative returns.

We can test the hypothesis in the following manner:

ΔBOS−MOM = [(QM5, QB1) − (QM1, QB5)] − [QM5 − QM1] ≥ 0,   (95.5)
where ΔBOS−MOM is the return difference between the BOS momentum strategy and the momentum strategy, [(QM5, QB1) − (QM1, QB5)] is the return to the BOS momentum strategy, and [QM5 − QM1] is the return to the momentum strategy.

Table 95.5 presents the returns to portfolios double sorted with respect to previous 12-month returns and the BOS ratio. Controlling for loser momentum, the long–short investment strategy with a long position in quintile portfolio QB1 and a short position in quintile portfolio QB5 generates a significantly positive return (e.g., a 6-month average return of 0.4780% for all stocks). This return indicates that the additional sorting variable, the BOS ratio, allows investors to identify the worst losers (QM1, QB5) within the loser portfolio. Similarly, controlling for winner momentum, the portfolio (QB1 − QB5) among the winner portfolios generates positive returns, but those returns are significant only in the 9-month and 12-month holding periods for all
Table 95.5: Returns to BOS momentum strategy.
[Panels A and B: 3- and 6-month average excess returns (%) for all stocks, value stocks, and growth stocks. Columns report QB1, QB5, and (QB1 − QB5) within the loser and winner momentum quintiles, the long–short return (QM5, QB1) − (QM1, QB5) in raw and Fama–French-adjusted terms, and the difference ΔBOS−MOM.]
[Table 95.5 (Continued), Panels C and D: 9- and 12-month average excess returns (%) for all stocks, value stocks, and growth stocks.]
Notes: This table provides returns for 3-, 6-, 9-, and 12-month holding periods from a long–short investment strategy based on past-12-month returns and BOS ratios. Returns to the BOS momentum strategy for all sample stocks, value stocks, and growth stocks are presented. Average monthly excess returns and monthly returns adjusted by the Fama–French three-factor model are presented in percentage terms. (Associated White heteroskedasticity-corrected t-statistics are reported below returns.) At the end of each month, sample stocks are sorted sequentially by cumulative returns in the past 12 months and BOS ratios. QM5 (QM1) is the portfolio consisting of stocks with the past-12-month cumulative returns in the top (bottom) 20%. QB5 (QB1) is the portfolio with the highest (lowest) BOS ratio. ΔBOS−MOM is the difference between returns: ΔBOS−MOM = [(QM5, QB1) − (QM1, QB5)] − [QM5 − QM1]. The paired-difference t-test is used to test whether ΔBOS−MOM is statistically significantly different from zero. ∗∗∗, ∗∗, and ∗ indicate statistical significance at 1%, 5%, and 10%, respectively.
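The paired-difference test applied to ΔBOS−MOM amounts to a one-sample t-test on the monthly series of return differences. A minimal sketch of that computation (our own helper, not the authors' code):

```python
import math

def paired_diff_t(delta):
    """t-statistic for H0: mean(delta) = 0, where delta is the monthly
    series of return differences (e.g., BOS momentum minus plain
    momentum) over the sample period."""
    n = len(delta)
    mean = sum(delta) / n
    # Unbiased sample variance of the difference series.
    var = sum((d - mean) ** 2 for d in delta) / (n - 1)
    return mean / math.sqrt(var / n)
```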
stocks and growth stocks. These returns suggest that the BOS ratio can only marginally separate the best winners from the winner portfolio.

Table 95.5 also provides returns to the BOS momentum strategy. In terms of a 6-month holding period, the BOS momentum strategy can generate an average monthly return as high as 1.3665% with a t-statistic of 6.60 and a risk-adjusted monthly return as high as 1.4897% with a t-statistic of 7.45. Compared to the return and risk-adjusted return to a momentum strategy, the BOS momentum strategy significantly outperforms the momentum strategy by 0.4821% and 0.4543% in terms of monthly return and risk-adjusted return, respectively. This superior performance of the BOS momentum strategy can be found across the different holding periods. When we apply the BOS momentum strategy to value stocks and growth stocks, it still generates significant profits and outperforms the momentum strategy. Our results therefore demonstrate that the BOS ratio indeed helps investors measure the level of information asymmetry and identify the best (worst) stocks among winner (loser) portfolios.

In addition, one may observe a smaller difference between the high-BOS group (QB5) and the low-BOS group (QB1) for winner stocks in Table 95.5. This may indicate that the momentum effect is stronger for loser stocks, which face a more severe information asymmetry problem. It can be explained by the limits of arbitrage proposed by Shleifer and Vishny (1997) and Arena et al. (2008). As mentioned above, a momentum strategy requires investors to take a long position in winner stocks and a short position in loser stocks. In practice, however, short selling loser stocks is more difficult than buying winner stocks.
Because of this limitation on short selling, loser stocks with higher information asymmetry may take longer to have information impounded into their prices, and therefore a stronger momentum effect can be observed for loser stocks with higher BOS ratios. In contrast, investors can buy winner stocks with higher information asymmetry without such a constraint, so there is little difference between a low-BOS winner and a high-BOS winner.

In the trading volume literature, Datar et al. (1998) find a negative relationship between past trading volume and future stock returns. They demonstrate that stocks with a low trading volume in the recent past generate higher future returns than do those with a high trading volume. Lee and Swaminathan (2000) find that low-volume stocks outperform high-volume stocks after controlling for price momentum, and that momentum is stronger among high-volume stocks. Simple trading volume, however, could proxy for many different factors, such as size, liquidity, and the degree of asymmetric information.
However, the BOS ratio provides a proxy for asymmetric information by measuring the covariance between past returns and past trading volume, and it therefore narrows down the subsets relevant to our investment strategy. In general, momentum returns are stronger when past trading volume is incorporated to separate winners from losers when forming an investment strategy. Because these winner and loser stocks could have fundamentally different financial characteristics, however, we examine whether further analysis of a firm's fundamentals could aid investors in selecting the best (worst) among winner (loser) stocks. We next examine the combined investment strategy when the fundamental indicators FSCORE/GSCORE are incorporated.

95.4.4 Combined investment strategy based on technical and fundamental information

Piotroski (2000) and Mohanram (2005) show that the fundamental indicators FSCORE and GSCORE help investors separate winner stocks from loser stocks based on firm-specific financial characteristics for value stocks and growth stocks, respectively. Their results indicate that financially healthier firms will enjoy higher price appreciation than will their counterparts with more financial constraints. In this section, we propose a combined investment strategy based on past returns, the BOS ratio, and the fundamental indicators FSCORE/GSCORE. Specifically, at the end of each month in the sample period, we apply a three-way sort based on past-12-month returns, the BOS ratio, and FSCORE/GSCORE, and we group the sample stocks into 125 portfolios. The combined investment strategy holds a long position in winner stocks with a lower BOS ratio and a higher FSCORE/GSCORE and a short position in loser stocks with a higher BOS ratio and a lower FSCORE/GSCORE. As with the momentum strategy and the BOS momentum strategy, we hold the long–short portfolio for 3, 6, 9, or 12 months and then rebalance it.
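The sequential three-way sort into 125 portfolios can be sketched as follows. This is a minimal illustration under our own naming, with FSCORE standing in for the fundamental score; the chapter's actual breakpoints and tie-handling may differ.

```python
import numpy as np

def quintile(values):
    """Quintile label 0..4 from cross-sectional breakpoints."""
    values = np.asarray(values, dtype=float)
    edges = np.quantile(values, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(edges, values, side="right")

def combined_legs(mom, bos, score):
    """Sequential three-way sort: momentum quintile first, then BOS
    ratio within each momentum quintile, then fundamental score within
    each (momentum, BOS) cell, giving 5 x 5 x 5 = 125 cells. Returns
    boolean masks for the long leg (QM5, QB1, QF5) and the short leg
    (QM1, QB5, QF1)."""
    mom, bos, score = (np.asarray(x, dtype=float) for x in (mom, bos, score))
    qm = quintile(mom)
    qb = np.empty_like(qm)
    qf = np.empty_like(qm)
    for m in range(5):
        in_m = np.where(qm == m)[0]
        qb[in_m] = quintile(bos[in_m])
        for b in range(5):
            cell = in_m[qb[in_m] == b]
            qf[cell] = quintile(score[cell])
    long_leg = (qm == 4) & (qb == 0) & (qf == 4)
    short_leg = (qm == 0) & (qb == 4) & (qf == 0)
    return long_leg, short_leg
```

The combined strategy's monthly return is then the equal-weighted return of the long leg minus that of the short leg.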
We conjecture that the combination of technical information (past returns and the BOS ratio) and fundamental information (composite fundamental scores) is useful for separating momentum winners from losers. Therefore, we expect post-formation returns to the combined investment strategy to be significantly higher than those to the momentum strategy and the BOS momentum strategy. Testable hypotheses are formulated as follows:

H2: The combined investment strategy based on portfolios sorted by past cumulative returns, the BOS ratio, and FSCORE/GSCORE generates higher returns than does the momentum strategy.
H3: The combined investment strategy based on portfolios sorted by past cumulative returns, the BOS ratio, and FSCORE/GSCORE generates higher returns than does the BOS momentum strategy.

We can test these hypotheses in the following manner:

ΔCS−MOM = [(QM5, QB1, QF5) − (QM1, QB5, QF1)] − [QM5 − QM1] ≥ 0,   (95.6)

ΔCS−BOS = [(QM5, QB1, QF5) − (QM1, QB5, QF1)] − [(QM5, QB1) − (QM1, QB5)] ≥ 0,   (95.7)
where ΔCS−MOM and ΔCS−BOS are the return differences between the combined investment strategy and the momentum strategy and between the combined investment strategy and the BOS momentum strategy, respectively; [(QM5, QB1, QF5) − (QM1, QB5, QF1)] is the return to the combined investment strategy based on the FSCORE; [QM5 − QM1] is the return to the momentum strategy; and [(QM5, QB1) − (QM1, QB5)] is the return to the BOS momentum strategy. The combined investment strategy in equations (95.6) and (95.7) uses the FSCORE as the fundamental indicator. We also construct the combined investment strategy based on the GSCORE and compare it with the momentum strategy and the BOS momentum strategy.

Panel A of Table 95.6 provides a summary of returns to the combined investment strategy based on the FSCORE. We first observe that financially healthier firms indeed outperform those with more financial constraints. For example, in terms of a 6-month holding period, financially healthy firms based on the FSCORE outperform financially constrained firms by 0.8845% and 0.3046% for winner stocks and loser stocks with higher information asymmetry, respectively. In addition, the combined investment strategy can generate a significant average monthly return and risk-adjusted return as high as 1.9244% and 2.0905%, with t-statistics of 7.31 and 8.51, respectively. The returns to the combined investment strategy are significantly higher than those to the momentum strategy and the BOS momentum strategy, by 1.04% and 0.5578% in terms of a 6-month holding period. Similar findings across the different holding periods indicate that the top-quintile portfolio outperforms the bottom-quintile portfolio sorted by FSCORE after controlling for previous 12-month returns and the BOS ratio. The significantly higher return to the combined investment strategy indicates a stronger
Table 95.6: Returns to combined investment strategy.
[Panel A (Combined with FSCORE): 3-, 6-, 9-, and 12-month average excess returns and Fama–French adjusted returns (%) for the (QM1, QB5) and (QM5, QB1) portfolios split by QF1 and QF5, the combined long–short return (QM5, QB1, QF5) − (QM1, QB5, QF1), and the differences ΔCSF−MOM, ΔCSF−BOS, and ΔCSF−FMOM.]
[Table 95.6 (Continued), Panel B (Combined with GSCORE): 3-, 6-, 9-, and 12-month average excess returns and Fama–French adjusted returns (%), with the combined long–short return (QM5, QB1, QG5) − (QM1, QB5, QG1) and the differences ΔCSG−MOM, ΔCSG−BOS, and ΔCSG−GMOM.]
Notes: This table provides a summary of momentum returns when sample stocks are sorted by past returns, BOS ratio, and the fundamental indicator FSCORE or GSCORE. Average monthly excess returns and monthly returns adjusted by the Fama–French three-factor model are presented in percentage terms. (Associated White heteroskedasticity-corrected t-statistics are reported below returns.) At the end of each month, stocks are sorted sequentially by cumulative returns in the past twelve months, BOS ratio, and fundamental score. Portfolios QMi and QBi have the same definitions as in the previous tables. QF5 (QF1) is the portfolio consisting of stocks with the highest (lowest) FSCORE. QG5 (QG1) is the portfolio consisting of stocks with the highest (lowest) GSCORE. (QM5, QB1, QF5) − (QM1, QB5, QF1) is the profit generated from the long–short investment strategy with a long position in top winner-lowest BOS-highest FSCORE stocks and a short position in top loser-highest BOS-lowest FSCORE stocks. (QM5, QB1, QG5) − (QM1, QB5, QG1) is the profit generated from the long–short investment strategy with a long position in top winner-lowest BOS-highest GSCORE stocks and a short position in top loser-highest BOS-lowest GSCORE stocks. ΔCSF−MOM (ΔCSG−MOM) is the difference in long–short portfolio returns between the combined strategy and the momentum strategy. ΔCSF−BOS (ΔCSG−BOS) is the difference in long–short portfolio returns between the combined strategy and the BOS momentum strategy. ΔCSF−FMOM (ΔCSG−GMOM) is the difference in long–short portfolio returns between the combined strategy and the fundamental momentum strategies. The paired-difference t-test is used to test whether the differences are statistically significantly different from zero. ∗∗∗, ∗∗, and ∗ indicate statistical significance at 1%, 5%, and 10%, respectively.
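The "FF-adj" figures reported throughout Tables 95.3–95.6 are risk-adjusted returns, i.e., estimated intercepts from the Fama–French three-factor regression. A minimal OLS estimation sketch (our own helper; the chapter additionally reports White heteroskedasticity-corrected t-statistics, which this sketch does not compute):

```python
import numpy as np

def ff3_alpha(excess_ret, mkt_rf, smb, hml):
    """OLS intercept (alpha) of the Fama-French three-factor regression
        R_p - R_f = alpha + b*MKT_RF + s*SMB + h*HML + e,
    i.e., the risk-adjusted return reported in the tables."""
    excess_ret, mkt_rf, smb, hml = (np.asarray(x, dtype=float)
                                    for x in (excess_ret, mkt_rf, smb, hml))
    X = np.column_stack([np.ones_like(mkt_rf), mkt_rf, smb, hml])
    coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
    return float(coef[0])  # intercept = alpha
```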
momentum return when fundamental indicators are considered to identify winners and losers.

Panel B shows the summary of returns to the combined investment strategy based on the GSCORE. We find results similar to those of the combined investment strategy based on the FSCORE. Firms with healthier finances have higher future returns than do firms with financial constraints, but the differences for winner stocks with higher information asymmetry are not significant for the 3- and 6-month holding periods. The returns to the combined investment strategy based on the GSCORE are all positive at the 0.01 significance level, although they are slightly lower than the returns to the combined investment strategy based on the FSCORE. In addition, the combined investment strategy based on the GSCORE can generate momentum returns superior to those from the momentum strategy and the BOS momentum strategy. Therefore, in general, our results in Table 95.6 suggest that incorporating fundamental indicators can improve investors' ability to separate winners from losers and to obtain a higher return from the combined investment strategy.

We further investigate the effectiveness of the combined investment strategy in two subsamples: value stocks and growth stocks. Panels A and B of Table 95.7 present returns to the combined investment strategy based on the FSCORE and GSCORE for value stocks; Panels C and D present the corresponding returns for growth stocks. The results for value stocks are similar to those for the full sample. The combined investment strategy based on either the FSCORE or the GSCORE can generate significantly positive returns and outperform the momentum strategy and the BOS momentum strategy. This result indicates that fundamental indicators provide information content that the previous 12-month return and the BOS ratio do not capture and can further separate winners from losers among value stocks.
Therefore, a higher momentum return from the combined investment strategy can be observed. Panels C and D show that combined investment strategies are profitable and outperform momentum strategies for growth stocks. Compared to the BOS momentum strategies, the combined investment strategies based on the FSCORE provide better returns, whereas the combined investment strategy based on the GSCORE cannot obtain a better return for the 3-month holding period. This result implies that for growth stocks, the BOS ratio may already capture most of the short-term information content offered by fundamental scores. Therefore, we can observe only a marginal improvement
Table 95.7: Returns to combined investment strategy — value stocks and growth stocks. (Only the headline long–short returns, average monthly excess returns in percent, are reproduced here.)

Panel A: Combined with FSCORE — Value stocks. Long–short return (QM5, QB1, QF5) − (QM1, QB5, QF1): 3-month 1.7430∗∗∗; 6-month 1.7605∗∗∗; 9-month 1.5459∗∗∗; 12-month 1.3207∗∗∗.
Panel B: Combined with GSCORE — Value stocks. Long–short return (QM5, QB1, QG5) − (QM1, QB5, QG1): 3-month 1.8029∗∗∗; 6-month 1.7075∗∗∗; 9-month 1.4939∗∗∗; 12-month 1.4004∗∗∗.
Panel C: Combined with FSCORE — Growth stocks. Long–short return: 3-month 2.1733∗∗∗; 6-month 2.1490∗∗∗; 9-month 1.9217∗∗∗; 12-month 1.6289∗∗∗.
Panel D: Combined with GSCORE — Growth stocks. Long–short return: 3-month 1.9386∗∗∗; 6-month 1.9545∗∗∗; 9-month 1.7655∗∗∗; 12-month 1.5851∗∗∗.

Notes: This table provides a summary of momentum returns when sample stocks are sorted by past returns, the BOS ratio, and the fundamental indicator FSCORE or GSCORE. Average monthly excess returns and monthly returns adjusted by the Fama–French three-factor model are presented in percentage terms, with White heteroskedasticity-corrected t-statistics reported below returns. Returns to the combined investment strategy for value stocks are presented in Panels A and B, and those for growth stocks in Panels C and D. At the end of each month, stocks are sorted sequentially by cumulative returns over the past 12 months, the BOS ratio, and the fundamental score. Portfolios QMi and QBi have the same definitions as in previous tables. QF5 (QF1) is the portfolio consisting of stocks with the highest (lowest) FSCORE; QG5 (QG1) is the portfolio consisting of stocks with the highest (lowest) GSCORE. (QM5, QB1, QF5) − (QM1, QB5, QF1) is the profit generated from the long–short investment strategy with a long position in top winner–lowest BOS–highest FSCORE stocks and a short position in top loser–highest BOS–lowest FSCORE stocks; (QM5, QB1, QG5) − (QM1, QB5, QG1) is defined analogously for the GSCORE. ΔCS−MOM is the difference in long–short portfolio returns between the combined strategy and the momentum strategy, and ΔCS−BOS is the difference in long–short portfolio returns between the combined strategy and the BOS momentum strategy. The paired difference t-test is used to test whether the differences are statistically significantly different from zero. ∗∗∗, ∗∗, and ∗ indicate statistical significance at the 1%, 5%, and 10% levels, respectively.
when we introduce the GSCORE to the BOS momentum strategy with a 3-month holding period. If we apply the combined investment strategy with a longer holding period, we can still obtain a significant combined investment return that outperforms the momentum strategy and the BOS momentum strategy.

Having established that the combined investment strategy can generate a higher return than the other two investment strategies, we further compare risk–return characteristics across the three strategies. Table 95.8 provides the average long–short returns and information ratios for the momentum strategy, the BOS momentum strategy, and the combined investment strategy over different holding periods. The information ratio is defined as the active return divided by the tracking error,

    IR_i = \frac{\bar{r}_i - \bar{r}_m}{\sigma_{(r_i - r_m)}},    (95.8)

where the active return, \bar{r}_i - \bar{r}_m, is the difference between the return on a given strategy and the value-weighted market return, and the tracking error, \sigma_{(r_i - r_m)}, is the standard deviation of the active return. Panel A presents average long–short returns and information ratios for all sample stocks, and Panels B and C present them for value stocks and growth stocks, respectively. For a 6-month holding period over all sample stocks, the combined investment strategy based on the FSCORE (GSCORE) produces an information ratio of 0.1845 (0.1619), higher than 0.0579 for the momentum strategy and 0.1335 for the BOS momentum strategy. Therefore, even after accounting for risk, the combined investment strategy still outperforms the other two investment strategies.

Table 95.8 also reports the correlations between returns to the combined investment strategies and returns to the momentum strategy or the BOS momentum strategy. For all stocks with a 6-month holding period, the correlation between the combined investment strategy based on the FSCORE (GSCORE) and the momentum strategy is 0.6970 (0.7217), indicating that the combined investment strategy and the momentum strategy share some correlated information, but each retains distinctive content. That is, the information content carried by fundamental indicators differs from that of prior returns, so combining technical and fundamental information can improve investors' ability to further separate winner stocks from loser stocks. Furthermore, the correlations between the combined investment strategy based on the FSCORE (GSCORE) and the BOS momentum strategy yield a higher value, 0.8470 (0.8541).
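As a concrete illustration, the information ratio in equation (95.8) can be computed from monthly return series as in the following sketch; the simulated returns and variable names are assumptions for illustration only, not data from the chapter.

```python
import numpy as np

def information_ratio(strategy_returns, market_returns):
    # Equation (95.8): mean active return divided by the tracking error,
    # where the active return is the strategy return minus the market return
    # and the tracking error is the standard deviation of the active return.
    active = np.asarray(strategy_returns) - np.asarray(market_returns)
    return active.mean() / active.std(ddof=1)

# Hypothetical monthly long-short strategy returns and value-weighted
# market returns (simulated for illustration only).
rng = np.random.default_rng(42)
market = rng.normal(0.008, 0.045, 120)
strategy = 0.1 * market + rng.normal(0.012, 0.050, 120)
print(round(information_ratio(strategy, market), 4))
```

Multiplying by the square root of 12 would annualize the ratio; the chapter reports monthly figures.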
Table 95.8: Comparison of investment strategies.

For each holding period, entries are the long–short return (t-statistic) and information ratio (IR) for MOM, BOS, CSF, and CSG, in that order; correlations are listed in the order (MOM, BOS), (MOM, CSF), (MOM, CSG), (BOS, CSF), (BOS, CSG), (CSF, CSG).

Panel A: All stocks
3-month: 1.0580∗∗∗ (5.96), IR 0.0857; 1.5365∗∗∗ (7.08), IR 0.1568; 2.0191∗∗∗ (7.02), IR 0.1851; 1.8666∗∗∗ (5.87), IR 0.1606.
 Correlations: 0.8613; 0.6765; 0.7082; 0.8212; 0.8372; 0.7442.
6-month: 0.8844∗∗∗ (5.22), IR 0.0579; 1.3665∗∗∗ (6.60), IR 0.1335; 1.9244∗∗∗ (7.31), IR 0.1845; 1.7900∗∗∗ (6.18), IR 0.1619.
 Correlations: 0.8694; 0.6970; 0.7217; 0.8470; 0.8541; 0.7710.
9-month: 0.6794∗∗∗ (4.21), IR 0.0223; 1.1765∗∗∗ (5.91), IR 0.1053; 1.7090∗∗∗ (6.70), IR 0.1612; 1.6253∗∗∗ (5.84), IR 0.1453.
 Correlations: 0.8707; 0.7175; 0.7347; 0.8643; 0.8655; 0.7880.
12-month: 0.5259∗∗∗ (3.54), IR −0.0111; 1.0433∗∗∗ (5.55), IR 0.0813; 1.4919∗∗∗ (6.08), IR 0.1321; 1.5479∗∗∗ (5.90), IR 0.1377.
 Correlations: 0.8616; 0.7243; 0.7313; 0.8759; 0.8643; 0.7920.

Panel B: Value stocks
3-month: 0.9121∗∗∗ (5.38), IR 0.0632; 1.4922∗∗∗ (6.96), IR 0.1483; 1.7430∗∗∗ (5.11), IR 0.1362; 1.8029∗∗∗ (5.16), IR 0.1400.
 Correlations: 0.7936; 0.5567; 0.5399; 0.7343; 0.7385; 0.6575.
6-month: 0.7518∗∗∗ (4.65), IR 0.0359; 1.3047∗∗∗ (6.46), IR 0.1233; 1.7605∗∗∗ (5.96), IR 0.1538; 1.7075∗∗∗ (5.30), IR 0.1383.
 Correlations: 0.8241; 0.5814; 0.5609; 0.7496; 0.7369; 0.6474.
9-month: 0.5557∗∗∗ (3.61), IR 0.0001; 1.1127∗∗∗ (5.75), IR 0.0936; 1.5459∗∗∗ (5.58), IR 0.1323; 1.4939∗∗∗ (4.95), IR 0.1178.
 Correlations: 0.8384; 0.6001; 0.6071; 0.7705; 0.7649; 0.6669.
12-month: 0.4188∗∗∗ (2.97), IR −0.0318; 0.9497∗∗∗ (5.20), IR 0.0636; 1.3207∗∗∗ (5.01), IR 0.1020; 1.4004∗∗∗ (4.92), IR 0.1077.
 Correlations: 0.8301; 0.5995; 0.6021; 0.7803; 0.7740; 0.6761.

Panel C: Growth stocks
3-month: 1.1304∗∗∗ (5.89), IR 0.0938; 1.6014∗∗∗ (6.54), IR 0.1554; 2.1733∗∗∗ (6.49), IR 0.1863; 1.9386∗∗∗ (5.43), IR 0.1562.
 Correlations: 0.8228; 0.6073; 0.6932; 0.7662; 0.8248; 0.6724.
6-month: 0.9913∗∗∗ (5.44), IR 0.0739; 1.5223∗∗∗ (6.59), IR 0.1487; 2.1490∗∗∗ (7.01), IR 0.1954; 1.9545∗∗∗ (5.98), IR 0.1694.
 Correlations: 0.8261; 0.6553; 0.6774; 0.8133; 0.8307; 0.7183.
9-month: 0.7815∗∗∗ (4.47), IR 0.0394; 1.3458∗∗∗ (6.03), IR 0.1255; 1.9217∗∗∗ (6.59), IR 0.1762; 1.7655∗∗∗ (5.70), IR 0.1525.
 Correlations: 0.8378; 0.7025; 0.6893; 0.8403; 0.8363; 0.7334.
12-month: 0.5999∗∗∗ (3.69), IR 0.0028; 1.2026∗∗∗ (5.64), IR 0.1023; 1.6289∗∗∗ (5.81), IR 0.1404; 1.5851∗∗∗ (5.39), IR 0.1327.
 Correlations: 0.8301; 0.5995; 0.6021; 0.7803; 0.7740; 0.6761.

Notes: This table provides a comparison of investment strategies based on different sorting variables. MOM is the momentum strategy, based solely on past returns. BOS is the BOS momentum strategy, based on past returns and the BOS ratio. CSF (CSG) is the combined investment strategy, based on past returns, the BOS ratio, and the fundamental score FSCORE (GSCORE). Returns, t-statistics, and information ratios for long–short investment strategies with different holding periods are presented. The information ratio is defined as the active return divided by the tracking error; the active return is the difference between the strategy return and the NYSE/AMEX/NASDAQ value-weighted return, and the tracking error is the standard deviation of the active return. Correlations among returns to the momentum strategy, the BOS momentum strategy, and the combined investment strategies are also presented. Panel A presents the comparison for all sample stocks, and Panels B and C present comparisons for value stocks and growth stocks. ∗∗∗, ∗∗, and ∗ indicate statistical significance at the 1%, 5%, and 10% levels, respectively.
These correlations, 0.8470 (0.8541), confirm the role of the BOS ratio: despite using only price and trading-volume information, the BOS ratio captures a certain extent of the information content belonging to the FSCORE and the GSCORE.

95.5 Conclusion

In this study, we develop a BOS momentum strategy by introducing the BOS ratio into the momentum strategy and find that the BOS momentum strategy can outperform the momentum strategy. That is, the BOS ratio can effectively capture the information asymmetry between informed and uninformed investors; therefore, an investment strategy incorporating the BOS ratio can help investors choose stocks with a higher degree of information asymmetry for their portfolios and enjoy a larger price adjustment (momentum return) in the future. We also construct a combined investment strategy by incorporating the FSCORE and the GSCORE into the momentum strategy. We find that combined investment strategies generate higher returns than the momentum strategy and the BOS momentum strategy, indicating that composite fundamental scores can help investors include stocks whose information content is priced less efficiently, and which therefore exhibit a larger momentum effect in the future. Our findings suggest that fundamental analysis indeed provides investors with information beyond technical information for selecting winner and loser stocks.

We also believe our results are relevant to security analysts and portfolio managers who use momentum strategies. Momentum investors have usually succeeded during periods when the performance of winners was distinguishable from that of losers. When the market experiences an overall rally, however, such as the one in March and April of 2009, momentum investors can suffer substantially from losses on the short side of their portfolios.
By incorporating fundamental analysis into the momentum strategy, we believe our results should be useful for the security analysis and portfolio management of these investors.

Bibliography

J. S. Abarbanell and B. J. Bushee (1997). Fundamental Analysis, Future Earnings, and Stock Prices. Journal of Accounting Research, 35, 1–24.
D. H. Ahn, J. Conrad, and R. F. Dittmar (2003). Risk Adjustment and Trading Strategies. Review of Financial Studies, 16, 459–485.
Y. Amihud and H. Mendelson (1986). Liquidity and Stock Returns. Financial Analysts Journal, 42, 43–48.
M. P. Arena, K. S. Haggard, and S. X. Yan (2008). Price Momentum and Idiosyncratic Volatility. Financial Review, 43, 159–190.
E. Asem (2009). Dividends and Price Momentum. Journal of Banking & Finance, 33, 486–494.
C. S. Asness, T. J. Moskowitz, and L. H. Pedersen (2013). Value and Momentum Everywhere. Journal of Finance, 68, 929–985.
R. K. Atiase (1985). Predisclosure Information, Firm Capitalization, and Security Price Behavior around Earnings Announcements. Journal of Accounting Research, 23, 21–36.
N. Barberis, A. Shleifer, and R. Vishny (1998). A Model of Investor Sentiment. Journal of Financial Economics, 49, 307–343.
G. Bulkley and V. Nawosah (2009). Can the Cross-Sectional Variation in Expected Stock Returns Explain Momentum? Journal of Financial and Quantitative Analysis, 44, 777–794.
K. Chan, A. Hameed, and W. Tong (2000). Profitability of Momentum Strategies in the International Equity Markets. Journal of Financial and Quantitative Analysis, 35, 153–172.
L. K. C. Chan, N. Jegadeesh, and J. Lakonishok (1996). Momentum Strategies. Journal of Finance, 51, 1681–1713.
H. Y. Chen, S. S. Chen, C. W. Hsin, and C. F. Lee (2014). Does Revenue Momentum Drive or Ride Earnings or Price Momentum? Journal of Banking & Finance, 38, 166–185.
H. Y. Chen, P. H. Chou, and C. H. Hsieh (2016). Persistency of the Momentum Effect. Working paper, National Chengchi University.
T. Chordia and L. Shivakumar (2002). Momentum, Business Cycle and Time-Varying Expected Return. Journal of Finance, 57, 985–1019.
T. Chordia and L. Shivakumar (2005). Inflation Illusion and the Post-Earnings Announcement Drift. Journal of Accounting Research, 43, 521–556.
A. Chui, S. Titman, and K. C. J. Wei (2009). Individualism and Momentum Around the World. Journal of Finance, 65, 361–392.
J. S. Conrad, A. Hameed, and C. Niden (1994). Volume and Autocovariances in Short-Horizon Individual Security Returns. Journal of Finance, 49, 1305–1329.
K. Daniel, D. Hirshleifer, and A. Subrahmanyam (1998). Investor Psychology and Security Market Under- and Overreactions. Journal of Finance, 53, 1839–1885.
K. Daniel and S. Titman (1999). Market Efficiency in an Irrational World. Financial Analysts Journal, 55, 28–40.
V. T. Datar, N. Y. Naik, and R. Radcliffe (1998). Liquidity and Stock Returns: An Alternative Test. Journal of Financial Markets, 1, 203–219.
W. F. M. DeBondt and R. Thaler (1985). Does the Stock Market Overreact? Journal of Finance, 40, 793–805.
P. M. Dechow, A. P. Hutton, and R. G. Sloan (1999). An Empirical Assessment of the Residual Income Valuation Model. Journal of Accounting and Economics, 26, 1–34.
Y. Ertimur, J. Livnat, and M. Martikainen (2003). Differential Market Reactions to Revenue and Expense Surprises. Review of Accounting Studies, 8, 185–211.
E. F. Fama and K. R. French (1992). The Cross-Section of Expected Stock Returns. Journal of Finance, 47, 427–465.
E. F. Fama and K. R. French (1995). Size and Book-to-Market Factors in Earnings and Returns. Journal of Finance, 50, 131–155.
G. A. Feltham and J. A. Ohlson (1995). Valuation and Clean Surplus Accounting for Operating and Financial Activities. Contemporary Accounting Research, 11, 689–731.
M. A. Ferreira and P. A. Laux (2007). Corporate Governance, Idiosyncratic Risk, and Information Flow. Journal of Finance, 62, 951–989.
S. Gilson, E. Hotchkiss, and R. Ruback (2000). Valuation of Bankrupt Firms. Review of Financial Studies, 13, 43–74.
A. M. Goel and A. V. Thakor (2003). Why do Firms Smooth Earnings? Journal of Business, 76, 151–192.
M. J. Gordon (1962). The Investment, Financing, and Valuation of the Corporation. Irwin, Homewood, IL.
B. Graham and D. Dodd (1934). Security Analysis: The Classic 1934 Edition. McGraw-Hill, New York, NY.
C. W. J. Granger and P. Newbold (1974). Experience with Forecasting Univariate Time Series and the Combination of Forecasts. Journal of the Royal Statistical Society, Series A (General), 137, 131–165.
C. W. J. Granger and R. Ramanathan (1984). Improved Methods of Combining Forecasts. Journal of Forecasting, 3, 197–204.
J. M. Griffin, X. Ji, and S. Martin (2005). Global Momentum Strategies. Journal of Portfolio Management, 31, 23–39.
M. Grinblatt and T. Moskowitz (2004). Predicting Stock Price Movements from Past Returns: The Role of Consistency and Tax-Loss Selling. Journal of Financial Economics, 71, 541–579.
B. D. Grundy and J. S. Martin (2001). Understanding the Nature of the Risks and the Source of the Rewards to Momentum Investing. Review of Financial Studies, 14, 29–78.
H. Hong and J. C. Stein (1999). A Unified Theory of Underreaction, Momentum Trading, and Overreaction in Asset Markets. Journal of Finance, 54, 2143–2184.
S. Huddart and B. Ke (2007). Information Asymmetry and Cross-Sectional Variation in Insider Trading. Contemporary Accounting Research, 24, 195–232.
N. Jegadeesh and S. Titman (1993). Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency. Journal of Finance, 48, 65–91.
N. Jegadeesh and S. Titman (2001). Profitability of Momentum Strategies: An Evaluation of Alternative Explanations. Journal of Finance, 56, 699–720.
R. Kaplan and R. Ruback (1995). The Valuation of Cash Flow Forecasts: An Empirical Analysis. Journal of Finance, 50, 1059–1093.
J. Lakonishok, A. Shleifer, and R. W. Vishny (1994). Contrarian Investment, Extrapolation, and Risk. Journal of Finance, 49, 1541–1578.
A. Lee and J. Cummins (1998). Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers. Review of Quantitative Finance and Accounting, 10, 235–267.
C. M. C. Lee and B. Swaminathan (2000). Price Momentum and Trading Volume. Journal of Finance, 55, 2017–2069.
C. F. Lee, P. Newbold, J. E. Finnerty, and C. C. Chu (1986). On Accounting-Based, Market-Based and Composite-Based Beta Predictions: Methods and Implications. Financial Review, 21, 51–68.
B. R. Lev and S. R. Thiagarajan (1993). Fundamental Information Analysis. Journal of Accounting Research, 31, 190–215.
J. Liu, D. Nissim, and J. Thomas (2002). Equity Valuation Using Multiples. Journal of Accounting Research, 40, 135–172.
P. S. Mohanram (2005). Separating Winners from Losers Among Low Book-to-Market Stocks Using Financial Statement Analysis. Review of Accounting Studies, 10, 133–170.
R. Morck, B. Yeung, and W. Yu (2000). The Information Content of Stock Markets: Why do Emerging Markets have Synchronous Price Movements? Journal of Financial Economics, 58, 215–260.
T. J. Moskowitz and M. Grinblatt (1999). Do Industries Explain Momentum? Journal of Finance, 54, 1249–1290.
J. N. Myers (1999). Implementing Residual Income Valuation with Linear Information Dynamics. Accounting Review, 74, 1–28.
S. C. Myers and N. S. Majluf (1984). Corporate Financing and Investment Decisions when Firms have Information that Investors Do Not Have. Journal of Financial Economics, 13, 187–221.
R. Novy-Marx (2012). Is Momentum Really Momentum? Journal of Financial Economics, 103, 429–453.
J. A. Ohlson (1995). Earnings, Book Values, and Dividends in Equity Valuation. Contemporary Accounting Research, 11, 661–687.
J. A. Ou and S. H. Penman (1989). Accounting Measurement, Price-Earnings Ratio, and the Information Content of Security Prices. Journal of Accounting Research, 27, 111–144.
J. D. Piotroski (2000). Value Investing: The Use of Historical Financial Statement Information to Separate Winners from Losers. Journal of Accounting Research, 38, 1–41.
R. Roll (1988). R2. Journal of Finance, 43, 541–566.
B. Rosenberg, K. Reid, and R. Lanstein (1985). Persuasive Evidence of Market Inefficiency. Journal of Portfolio Management, 11, 9–17.
K. G. Rouwenhorst (1998). International Momentum Strategies. Journal of Finance, 53, 267–284.
J. S. Sagi and M. S. Seasholes (2007). Firm-Specific Attributes and the Cross-Section of Momentum. Journal of Financial Economics, 84, 389–434.
R. J. Shiller and J. Pound (1989). Survey Evidence on the Diffusion of Interest and Information Among Investors. Journal of Economic Behavior and Organization, 12, 46–66.
A. Shleifer and R. W. Vishny (1997). The Limits of Arbitrage. Journal of Finance, 52, 35–55.
R. G. Sloan (1996). Do Stock Prices Fully Reflect Information in Accruals and Cash Flows About Future Earnings? Accounting Review, 71, 289–315.
B. Trueman and S. Titman (1988). An Explanation for Accounting Income Smoothing. Journal of Accounting Research, 26, 127–139.
D. Wu (2007). Theory and Evidence: An Adverse-Selection Explanation of Momentum. Working paper, Massachusetts Institute of Technology.
Appendix 95A: Relationship Between the BOS Ratio and Alternative Information-Asymmetry Measures

In this appendix, we compare the BOS ratio with three information-asymmetry alternatives: market capitalization (Atiase, 1985; Huddart and Ke, 2007), idiosyncratic volatility (Roll, 1988; Morck et al., 2000; Ferreira and Laux, 2007), and institutional ownership (Shiller and Pound, 1989; Huddart and Ke, 2007). It has been documented that stocks with smaller market capitalization, higher idiosyncratic volatility, and lower institutional ownership are associated with a higher degree of information asymmetry. We try to show here that the BOS ratio can be a candidate information-asymmetry measure.

Table 95A.1: Correlation among the BOS ratio, market capitalization, idiosyncratic volatility, and institutional ownership. (Panel A: winner stocks; Panel B: loser stocks. Correlation values omitted.)

Notes: This table presents the average Spearman rank-order correlations among the BOS ratio, market capitalization, idiosyncratic volatility, and institutional ownership for winner stocks and loser stocks. A stock's market capitalization is the product of its price per share and the number of shares outstanding at the end of each month. Idiosyncratic volatility is the residual variance from regressing a firm's daily excess returns on market daily excess returns over the previous 12 months. Institutional ownership is the percentage of outstanding shares held by institutional investors. Panel A shows the average correlations for negative-BOS winner stocks, and Panel B shows the average correlations for positive-BOS loser stocks.

As Panel A of Table 95A.1 shows, for winner stocks, the BOS ratio is positively correlated with market capitalization, negatively correlated with idiosyncratic volatility, and negatively correlated with institutional ownership, indicating that winner stocks with a more negative BOS ratio may suffer a higher degree of information asymmetry. Moreover, Panel B shows that for loser stocks, a higher BOS ratio is associated with a higher degree of information asymmetry in terms of market capitalization, idiosyncratic volatility, and institutional ownership. Therefore, in addition to the theoretical model proposed by Wu (2007), the empirical evidence also shows that winner (loser) stocks with lower (higher) BOS ratios are those subject to a higher degree of information asymmetry. As Table 95A.1 also shows, the correlations among the information-asymmetry measures are low, indicating that these measures may represent different views of information asymmetry. Therefore, we admit that this method is not completely correct but, to some extent, it does provide evidence that the BOS ratio is associated with information asymmetry.
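As a sketch of how these quantities can be computed, the following illustrates the idiosyncratic-volatility measure defined in the table notes and the Spearman rank-order correlations reported in Table 95A.1. The data are simulated, and the variable names and data-generating process are assumptions for illustration, not the chapter's sample.

```python
import numpy as np
import pandas as pd

def idiosyncratic_volatility(stock_excess, market_excess):
    # Residual variance from regressing a firm's daily excess returns on the
    # market's daily excess returns, as in the table notes above.
    x = np.asarray(market_excess, dtype=float)
    y = np.asarray(stock_excess, dtype=float)
    beta = np.cov(y, x, ddof=1)[0, 1] / x.var(ddof=1)
    alpha = y.mean() - beta * x.mean()
    resid = y - alpha - beta * x
    return resid.var(ddof=1)

# Simulated cross-section driven by a latent degree of information asymmetry.
rng = np.random.default_rng(1)
n = 1000
asym = rng.normal(size=n)
proxies = pd.DataFrame({
    "bos": asym + rng.normal(scale=2.0, size=n),
    "market_cap": np.exp(-asym + rng.normal(size=n)),             # smaller cap, more asymmetry
    "idio_vol": np.exp(asym + rng.normal(size=n)),                # higher vol, more asymmetry
    "inst_own": 1.0 / (1.0 + np.exp(asym + rng.normal(size=n))),  # lower ownership, more asymmetry
})
# Spearman rank-order correlation matrix, as in Table 95A.1.
print(proxies.corr(method="spearman").round(4))
```

In the chapter's setting, the correlations would be computed each month within the winner and loser groups and then averaged over time.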
Chapter 96
Optimal Payout Ratio Under Uncertainty and the Flexibility Hypothesis: Theory and Empirical Evidence∗

Cheng Few Lee (Rutgers University, e-mail: cfl[email protected]), Manak C. Gupta (Temple University, e-mail: [email protected]), Hong-Yi Chen (National Chengchi University, e-mail: [email protected]), and Alice C. Lee (State Street Corp, e-mail: alice.fi[email protected])

Contents
96.1 Introduction 3368
96.2 Review of the Literature 3372
96.3 The Model 3373
96.4 Optimum Dividend Policy 3377
96.5 Relationship Between the Optimal Payout Ratio and the Growth Rate 3379
96.6 Relationship Between Optimal Payout Ratio and Risks 3384
  96.6.1 Case 1: Total risk 3384
  96.6.2 Case 2: Systematic risk 3386
  96.6.3 Case 3: Total risk and systematic risk 3387
  96.6.4 Case 4: No change in risk 3388
96.7 Empirical Evidence 3389
  96.7.1 Sample description 3390
  96.7.2 Univariate analysis 3392
  96.7.3 Multivariate analysis 3393
  96.7.4 Moving estimates process for structural change model 3398
96.8 Summary and Concluding Remarks 3400
Bibliography 3402
Appendix 96A: Derivation of Equation (96.19) 3406
Appendix 96B: Derivation of Equation (96.21) 3408
Appendix 96C: Derivation of Equations (96.28) and (96.29) 3409
Appendix 96D: Using Moving Estimates Process to Find the Structural Change Point in Equation (96.36) 3410

∗This chapter is an update and expansion of the paper "Optimal payout ratio under uncertainty and the flexibility hypothesis: Theory and empirical evidence," which was published in the Journal of Corporate Finance, Vol. 17, pp. 483–501 (2010).
Abstract

Following the dividend flexibility hypothesis used by DeAngelo and DeAngelo (2006), Blau and Fuller (2008), and others, we theoretically extend DeAngelo and DeAngelo's (2006) proposition of an optimal payout policy in terms of the dividend flexibility hypothesis. In addition, we introduce growth rate, systematic risk, and total risk variables into the theoretical model. To test the theoretical results derived in this paper, we use US data from 1969 to 2009 to investigate the impact of growth rate, systematic risk, and total risk on the optimal payout ratio in terms of the fixed-effect model. We find that, based on flexibility considerations, a company will reduce its payout when the growth rate increases. In addition, we find that a nonlinear relationship exists between the payout ratio and risk. In other words, the relationship between the payout ratio and risk is negative (or positive) when the growth rate is higher (or lower) than the rate of return on total assets. Our theoretical model and empirical results can therefore be used to identify whether the flexibility hypothesis or the free cash flow hypothesis should be used to determine dividend policy.

Keywords: Dividends • Flexibility hypothesis • Payout policy • Fixed-effects model.
96.1 Introduction

Corporate dividend policy has long engaged the attention of financial economists, dating back to the irrelevance theorem of Miller and Modigliani (1961, M&M hereafter), in which they show that dividend policy is irrelevant in a rational and perfect economic environment free of illusion. Since then, their rather controversial
findings have been challenged and tested by weakening the assumptions or introducing imperfections into the analysis. For example, the signaling models developed by Bhattacharya (1979), Miller and Rock (1985), and John and Williams (1985) and the free cash flow hypothesis proposed by Easterbrook (1984) and Jensen (1986) are the two well-known propositions challenging M&M's dividend irrelevance theorem; however, empirical studies examining the signaling and free cash flow hypotheses have yielded mixed results.

DeAngelo and DeAngelo (2006) reexamine the M&M dividend irrelevance theorem by allowing partial dividend payout. They argue that the original M&M (1961) irrelevance result follows from considering only two alternatives: paying out all earnings or paying out none at all. Payout policy might therefore be relevant if partial payout is allowed. In other words, DeAngelo and DeAngelo (2006) use the dividend flexibility hypothesis to establish their dividend relevance results.

Building on the flexibility hypothesis, the current paper (1) develops a theoretical model to support DeAngelo and DeAngelo's (2006) proposition of an optimal payout policy when partial payout is allowed, (2) reconciles the dispute between the free cash flow hypothesis and the flexibility hypothesis in the dividend policy literature, and (3) performs empirical tests of the theoretical results derived in this paper.

First, following DeAngelo and DeAngelo (2006), we develop a dynamic model that allows firms to hold some amount of cash for investment in positive-NPV projects for reasons of financial flexibility. Under the assumption of a stochastic rate of return and the dividend flexibility hypothesis, we carry out the optimization procedure to maximize firm value, and the final expression for the firm's optimal dividend policy is thus derived. The model is comprehensive and allows structural analysis of the different variables that could be relevant for corporate dividend policy.
For example, the model incorporates the rate of return on assets, the growth rate, and risk (systematic, firm-specific, and total risk), to name a few variables that could affect corporate dividend decisions. Comparative statics provide insights into the effect of each of these parameters on corporate dividend policy, and this is followed by an analysis of the interaction effects of these variables on corporate dividend policy.

Second, the implications of the optimization results are explained. Our results show that the relationship between the optimal payout ratio and the growth rate is negative in general. We investigate the separate and then the combined effects of market-dependent and market-independent components of risk on the optimal dividend policy. We perform comparative static analyses of the relationships between the payout ratio and (1) change
in total risk, (2) change in systematic risk, (3) simultaneous changes in both total risk and systematic risk, and (4) no change in risk. We examine in detail the effects of variations in the profitability rate, its distribution parameters, and their dynamic behavior on the optimal dividend policy of the firm.

The theoretical relationship between the payout ratio and the growth rate implies that high growth firms need to reduce the payout ratio and retain more earnings to build up "precautionary reserves" for flexibility considerations, whereas low growth firms are likely to be more mature and to have already built up their reserves. More importantly, the relationship between the payout ratio and risk reflects different dividend policies at high growth firms and low growth firms. With higher risk, the costs of external funds increase; therefore, to maximize shareholders' wealth, high growth firms tend to reduce their payouts and keep more relatively low-cost funds to sustain their high growth, whereas low growth firms tend to pay more dividends and reduce the risk for shareholders.

Third, our theoretical model and its implications lead us to three testable hypotheses. Although a large and growing body of empirical research on the optimal dividend payout policy has emerged, none of it has a solid theoretical model to support its findings or introduces a nonlinear structured model of payout policy.1 We here empirically examine three hypotheses derived from our theoretical optimal payout model. Using US data from 1969 to 2009, we analyze 28,333 dividend-paying firm years. Our empirical results show that firms' payout ratios are negatively related to firms' growth and that a negative (or positive) relationship exists between firms' risks and their dividend payout ratios among firms with growth rates higher (or lower) than their rates of return on assets.
Our empirical results are consistent with our theoretical model under the dividend flexibility hypothesis. We also find that growth and risk interact in explaining the payout ratio, indicating that the payout ratio is not linearly related to the growth rate or to the risk of the firm. Furthermore, we implement the moving estimates process to find the empirical breakpoint of the structural change for the relationship between the payout ratio and risks and confirm that the empirical breakpoint is not different from our theoretical breakpoint. The primary contributions of this paper are our theoretical derivation of an optimal payout ratio under the dividend flexibility hypothesis and our demonstration of a negative but nonlinear relationship between the payout
1 For example, Rozeff (1982), Jagannathan et al. (2000), Grullon et al. (2002), Aivazian et al. (2003), and Blau and Fuller (2008).
ratio and the growth rate. More importantly, we theoretically and empirically locate a structural change point for the relationship between the payout ratio and the growth rate.

Rozeff (1982), Aivazian et al. (2003), Blau and Fuller (2008), and others conclude that a firm's risk and optimal payout are negatively related. Contrary to these conclusions and generally held beliefs, the dynamic optimization model developed here shows that the optimal payout is not necessarily negatively related to risk; rather, it implies that payout policies differ between high growth firms and low growth firms. High growth firms set their payouts based on flexibility considerations, whereas low growth firms pay dividends to reduce their free cash flow problem. In addition, this paper also shows that the relationship between firms' payouts and their growth rates (or risks) can be affected by their risks (or growth rates). Contrary to earlier studies, the theoretical model developed here shows that the optimal payout ratio is not linearly related to the growth rate or to the risk of the firm, and whether the rate of return on assets is higher or lower than the growth rate has a significant effect on the relationships between these variables. These results are borne out by the rather extensive empirical research done in this paper based on 40 years of US data from 1969 to 2009.

The stochastic dynamic optimization model developed here challenges many commonly held beliefs and results obtained from earlier studies. It provides more meaningful relationships among total risk and its separate components (systematic and firm-specific), the growth rate, the rate of return on assets, and optimal dividend policy.
Our results represent an advance in the corporate finance literature because all the important variables included in this study were never simultaneously included in earlier studies, and, more importantly, the interaction effects among these variables were neither recognized nor fully appreciated. We believe that this study is the first of its kind to provide a theoretical basis for relating dividend policies to firm risks, growth rates, and returns on assets (ROAs) in an interrelated fashion.

The remainder of this paper is organized as follows. Section 96.2 contains a review of the literature on dividend-relevance theories and the findings of empirical work on dividend policy. In Section 96.3, we lay out a dynamic model used in subsequent sections to examine the existence, or nonexistence, of an optimal dividend policy. Section 96.4 provides the final expression of the optimal dividend policy of the firm derived from the optimization procedure to maximize firm value. In Section 96.5, we present both a detailed form and an approximated form of the relationship between the optimal dividend payout ratio and the growth rate. Section 96.6 includes a discussion of the effects of market-dependent and market-independent components of risk on the optimal dividend policy; Section 96.7 includes empirical evidence
supporting the model and implications in previous sections, and the conclusion appears in Section 96.8.
96.2 Review of the Literature

Corporate dividend policy has puzzled financial economists, dating back to the dividend irrelevance theorem proposed by M&M (1961). Since then, their controversial findings have been challenged and tested by weakening the assumptions or introducing imperfections into the analysis. The signaling models developed by Bhattacharya (1979), Miller and Rock (1985), and John and Williams (1985) suggest that, because of the asymmetric information between managers and shareholders, managers use dividends as a signal to release private information to the market; however, empirical studies examining the signaling hypothesis have yielded mixed results. Nissim and Ziv (2001), Brook et al. (1998), Bernheim and Wantz (1995), Kao and Wu (1994), and Healy and Palepu (1988) support the signaling (asymmetric information) hypothesis by finding a positive association between dividend increases and future profitability. Kalay and Lowenstein (1986) and Asquith and Mullins (1983) find that dividend changes are positively associated with stock returns in the days surrounding dividend announcements, and Sasson and Kolodny (1976) identify a positive association between the payout ratio and average rates of return. Studies by Benartzi et al. (1997), DeAngelo et al. (1996), and Grullon et al. (2002), however, reveal no support for the hypothesized relationship between dividend changes and future profitability.

Using agency cost theory, Easterbrook (1984) and Jensen (1986) propose the free cash flow hypothesis, arguing that managers cannot credibly precommit to shareholders that they will not invest excess cash in negative-NPV projects. Dividend changes may therefore convey information about how the firm will use future cash flow. Again, the results of empirical studies have been mixed at best. Several researchers, including Agrawal and Jayaraman (1994), Jensen et al.
(1992), and Lang and Litzenberger (1989), find positive support for the agency cost hypothesis, but others, for example, Howe et al. (1992), Denis et al. (1994), and Yoon and Starks (1995), find no support for this hypothesis.

DeAngelo and DeAngelo (2006) argue that M&M's (1961) dividend irrelevance result occurs because they consider either paying out all earnings or paying out none at all; therefore, payout policy might have an impact on firm value if partial payout is allowed. In other words, DeAngelo and DeAngelo (2006) use the dividend flexibility hypothesis to show their dividend relevance results. Blau and Fuller (2008) have theoretically and empirically
shown that the dividend flexibility hypothesis is indeed a reasonable dividend policy. In addition, several empirical studies show evidence of firms preferring financial flexibility, for example, Lie (2005), DeAngelo et al. (2006), Denis and Osobov (2007), and Gabudean (2007).

Besides focusing on the relevance of dividend policy, a growing body of literature deals with the determinants of optimal dividend payout policy. For example, Rozeff (1982) shows that the optimal dividend payout is related to the fraction of insider holdings, the growth of the firm, and the firm's beta coefficient. He also finds evidence that the optimal dividend payout is negatively correlated with beta risk, supporting the view that beta risk reflects the leverage level of a firm. Jagannathan et al. (2000) empirically show that operating risk is negatively related to the propensity to increase payouts. Grullon et al. (2002) show that dividend changes are related to the change in the growth rate and the change in the rate of return on assets. They also find that dividend increases are associated with subsequent declines in profitability and risk. Aivazian et al. (2003) examine eight emerging markets and show that, similar to US firms, dividend policies in emerging markets can be explained by profitability, debt, and the market-to-book ratio. None of the foregoing scholars, however, has a solid theoretical model to support their findings. We use these authors' works as a springboard to deal with the flexibility hypothesis and to develop the theoretical underpinnings and a stochastic dynamic optimization model determining the conditions for an optimal dividend policy.
We believe that ours is the first attempt ever made to include growth rate, risks, and more importantly, the rate of return and its distribution parameters in a single model, deriving the conditions for an optimal dividend policy; furthermore, this paper empirically tests the findings and conclusions of the theoretical model developed here, using an extensive data set involving 28,333 dividend-paying firm years from 1969 to 2009. The results of the empirical tests confirm and validate the findings of the theoretical model and provide new insights in the determination of optimal dividend policy, the influence of each explanatory variable listed above, and more importantly, their interaction effects on the corporate dividend policy. Notably, some of these results run counter to categorically stated conclusions reached in earlier studies.
96.3 The Model

We develop the dividend policy model under the assumptions that the capital markets represent the closest approximation to the economists' ideal of a
perfect market — zero transaction costs, rational behavior on the part of investors, and the absence of tax differentials between dividends and capital gains. It is assumed that the firm is not restricted to financing its growth only by retained earnings, and that its rate of return, $\tilde{r}(t)$, is a nonstationary random variable, normally distributed with mean $\mu$ and variance $\sigma(t)^2$. Let $A(0)$ represent the initial assets of the firm and $h$ the growth rate. Then the earnings of this firm are given by equation (96.1):

$$\tilde{x}(t) = \tilde{r}(t)A(0)e^{ht}, \tag{96.1}$$

where $\tilde{x}(t)$ represents the earnings of the firm, and the tilde ($\sim$) denotes its random character.

Based upon DeAngelo and DeAngelo's (2006) assumption, we here allow the firm to pay out its earnings partially and retain a certain amount of earnings in support of its growth.2 The retained earnings of the firm, $y(t)$, can therefore be expressed as follows:

$$y(t) = \tilde{x}(t) - m(t)\tilde{d}(t), \tag{96.2}$$

where $\tilde{d}(t)$ is the dividends per share and $m(t)$ is the total number of shares outstanding at time $t$. Equation (96.2) further indicates that the focus of the firm's decision making is on retained earnings, which implies that the dividend $\tilde{d}(t)$ also becomes a random variable.

The growth of a firm can be financed by retained earnings or by issuing new equity. The new equity raised by the firm at time $t$ can be defined as follows:

$$e(t) = \delta p(t)\dot{m}(t), \tag{96.3}$$

where $p(t)$ is the price per share, $\dot{m}(t) = dm(t)/dt$, and $\delta$ is the degree of market perfection, $0 < \delta \le 1$. A value of $\delta$ equal to one indicates that new shares can be sold by the firm at current market prices. From equations (96.1)–(96.3), investment in period $t$ is the sum of retained earnings and funds raised by new equity, so the investment in period $t$ can be written as follows:

$$hA(0)e^{ht} = \tilde{x}(t) - m(t)\tilde{d}(t) + \delta \dot{m}(t)p(t). \tag{96.4}$$

2 DeAngelo and DeAngelo (2006) carefully explain why partial payout is important to obtain an optimal ratio under perfect markets. In addition, they also argue that partial payout is important to avoid a suboptimal solution for the optimal dividend policy.
This implies that

$$\tilde{d}(t) = \left\{[\tilde{r}(t) - h]A(0)e^{ht} + \delta \dot{m}(t)p(t)\right\}/m(t), \tag{96.5}$$

and the mean and variance of the dividends per share can be expressed as follows:

$$E[\tilde{d}(t)] = \left\{[\mu - h]A(0)e^{ht} + \delta \dot{m}(t)p(t)\right\}/m(t), \qquad \mathrm{Var}[\tilde{d}(t)] = A(0)^2\sigma(t)^2 e^{2th}/m^2(t). \tag{96.6}$$

Also, let us postulate an exponential utility function of the following form3:

$$U[\tilde{d}(t)] = -e^{-\alpha \tilde{d}(t)}, \quad \text{where } \alpha > 0. \tag{96.7}$$

Following the moment generating function, we have

$$E\big(-e^{-\alpha \tilde{d}(t)}\big) = -e^{-\alpha E[\tilde{d}(t)] + \frac{\alpha^2}{2}\mathrm{Var}[\tilde{d}(t)]}, \tag{96.8}$$

where $\bar{d}(t)$ is the certainty equivalent value of $\tilde{d}(t)$.4 From equations (96.6) and (96.8), the certainty equivalent dividend stream can be written as

$$\bar{d}(t) = \frac{(\mu - h)A(0)e^{th} + \delta \dot{m}(t)p(t)}{m(t)} - \frac{\alpha' A(0)^2 \sigma(t)^2 e^{2th}}{m(t)^2}, \tag{96.9}$$

where $\alpha' = \alpha/2$. Therefore, taking advantage of the exponential utility function, we can obtain a risk-adjusted dividend stream. Furthermore, $\bar{d}(t)$ will reduce to the certainty case if we assume $\sigma(t)^2 = 0$.

In accordance with the capital asset pricing theory developed by Sharpe (1964), Lintner (1963), and Mossin (1966), the total risk can be decomposed into systematic risk and unsystematic risk; that is, $\tilde{r}(t)$ can be defined as follows:

$$\tilde{r}(t) = a + b\tilde{I}(t) + \tilde{\varepsilon}(t), \tag{96.10}$$

where $\tilde{I}(t)$ is the market index, $\tilde{\varepsilon}(t) \sim N(0, \sigma_\varepsilon^2)$, $a$ and $b$ are regression parameters, and $\mathrm{Var}(b\tilde{I}(t))$ and $\mathrm{Var}(\tilde{\varepsilon}(t))$ represent the systematic and unsystematic risk, respectively.

3 Pratt (1964) provides a detailed analysis of the various utility functions. Exponential, hyperbolic, and quadratic forms have been variously used in the literature, but the first two seem to have preference over the quadratic form because the latter has the undesirable property that it ultimately turns downward.
4 From the moment generating function discussed in Hogg and Craig (1994), we know that $E(-e^{ty}) = -e^{tE(y) + \frac{1}{2}t^2\mathrm{Var}(y)}$. Let $t = -\alpha$; then the right-hand side of (96.8) is easily obtained.
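The certainty-equivalent step in equations (96.7)–(96.9) is easy to check numerically: for a normally distributed dividend, a Monte Carlo estimate of the expected exponential utility should match the moment-generating-function expression (96.8), and the implied certainty equivalent is $E[\tilde{d}] - (\alpha/2)\mathrm{Var}[\tilde{d}]$. A minimal Python sketch (the parameter values are illustrative, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 2.0                  # risk-aversion parameter of U(d) = -exp(-alpha*d)
mean_d, var_d = 1.5, 0.25    # illustrative mean and variance of the dividend

# Monte Carlo estimate of E[-exp(-alpha*d)] for d ~ N(mean_d, var_d)
d = rng.normal(mean_d, np.sqrt(var_d), size=2_000_000)
eu_mc = np.mean(-np.exp(-alpha * d))

# Moment-generating-function value, equation (96.8)
eu_mgf = -np.exp(-alpha * mean_d + 0.5 * alpha**2 * var_d)

# Certainty equivalent implied by (96.8): d_bar = E[d] - (alpha/2)*Var[d]
d_bar = mean_d - 0.5 * alpha * var_d
print(eu_mc, eu_mgf, d_bar)
```

The utility of the certainty equivalent, `-exp(-alpha*d_bar)`, coincides with `eu_mgf` by construction, which is exactly how equation (96.9) converts the mean–variance pair in (96.6) into a single risk-adjusted dividend stream.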
Following equation (96.10), equation (96.6) can be rewritten as

$$E[\tilde{d}(t)] = \left[(a + b\bar{I} - h)A(0)e^{ht} + \delta \dot{m}(t)p(t)\right]/m(t),$$
$$\mathrm{Var}[\tilde{d}(t)] = A(0)^2\left[b^2 \mathrm{Var}(\tilde{I}(t)) + \mathrm{Var}(\tilde{\varepsilon}(t))\right]e^{2th}/m(t)^2 = A(0)^2\left[\rho(t)^2\sigma(t)^2 + (1 - \rho(t)^2)\sigma(t)^2\right]e^{2th}/m(t)^2, \tag{96.11}$$

where $\rho(t)$ is the correlation coefficient between $\tilde{r}(t)$ and $\tilde{I}$, $a$ is the market-independent component of the firm's rate of return, $b\bar{I}$ is the market-dependent component of the firm's rate of return, $\rho(t)^2\sigma(t)^2$ is the nondiversifiable risk, and $(1 - \rho(t)^2)\sigma(t)^2$ is the diversifiable risk.

The unsystematic risk usually can be diversified away by investors,5 so the certainty equivalent value in equation (96.9) should be revised as

$$\hat{d}(t) = \frac{(a + b\bar{I} - h)A(0)e^{th} + \delta \dot{m}(t)p(t)}{m(t)} - \frac{\alpha' A(0)^2 \rho(t)^2 \sigma(t)^2 e^{2th}}{m(t)^2}. \tag{96.12}$$

Following Lintner (1962), we observe that the stock price should equal the present value of this certainty equivalent dividend stream discounted at a riskless rate of return. Therefore,

$$p(0) = \int_0^T \hat{d}(t)e^{-kt}\,dt, \tag{96.13}$$

where $p(0)$ is the stock price at $t = 0$, $k$ is the risk-free rate of return, and $T$ is the planning horizon. This model will be used in subsequent sections to find the functional form of $m(t)$ and to optimize the payout ratio.

The formulation of our model is different from that of M&M (1961), Gordon (1962), Lerner and Carleton (1966a), and Lintner (1964). For example, in contrast to our model, M&M neither consider the non-stationarity of the firm's rate of return nor explicitly incorporate uncertainty in their valuation model. Their models are also essentially static and would not permit an extensive analysis of the dynamic process of moving from one equilibrium state to another. Furthermore, the formulation of our model is different from those who propose to capitalize the market-dependent and market-independent components of the uncertain stream of earnings at the risky and riskless rates, respectively.6 Instead, we view the market value of a firm as the present value of certainty

5 See Lintner (1965), Mossin (1966), and Sharpe (1964).
6 See Brennan (1973).
equivalents of random future receipts. In the following section, we carry out the optimization of equation (96.13) and derive the final expression for the optimal payout ratio.7

96.4 Optimum Dividend Policy

Based upon the evaluation model developed in the previous section, in this section we derive an optimal dividend payout ratio. Substituting equation (96.12) into equation (96.13), we obtain

$$p(0) = \int_0^T \left[\frac{(a + b\bar{I} - h)A(0)e^{th} + \delta \dot{m}(t)p(t)}{m(t)} - \frac{\alpha' A(0)^2 \rho(t)^2 \sigma(t)^2 e^{2th}}{m(t)^2}\right]e^{-kt}\,dt. \tag{96.14}$$

To maximize equation (96.14), we observe that

$$p(t) = \int_t^T \hat{d}(s)e^{-k(s-t)}\,ds = e^{kt}\int_t^T \hat{d}(s)e^{-ks}\,ds, \tag{96.15}$$

where $s$ is the proxy of time in the integration. From equation (96.15), we can formulate a differential equation as

$$\frac{dp(t)}{dt} = \dot{p}(t) = kp(t) - \hat{d}(t). \tag{96.16}$$

Substituting equation (96.12) into equation (96.16), we obtain the differential equation

$$\dot{p}(t) + \left[\frac{\delta \dot{m}(t)}{m(t)} - k\right]p(t) = -G(t), \tag{96.17}$$

where

$$G(t) = \frac{(a + b\bar{I} - h)A(0)e^{th}}{m(t)} - \frac{\alpha' A(0)^2 \rho(t)^2 \sigma(t)^2 e^{2th}}{m(t)^2}. \tag{96.18}$$

Following Kreyszig (2010) and Lee and Shi (2010), we solve the differential equation (96.17) and obtain the solution indicated in equation (96.19):

$$p(t) = \frac{e^{kt}}{m(t)^\delta}\int_t^T G(s)m(s)^\delta e^{-ks}\,ds. \tag{96.19}$$

7 For further explanation of the optimization of deterministic and stochastic control models and their applications to economic problems, please see Aoki (1967), Bellman (1990 and 2003), and Intriligator (2002).
Then, equation (96.20) can be obtained from equations (96.18) and (96.19), implying that the initial value of a stock can be expressed as the summation of present values of its earnings stream adjusted by the risk taken by the firm:

$$p(0) = \frac{1}{m(0)^\delta}\int_0^T \left[(a + b\bar{I} - h)A(0)e^{th}m(t)^{\delta-1} - \alpha' A(0)^2 \rho(t)^2 \sigma(t)^2 e^{2th}m(t)^{\delta-2}\right]e^{-kt}\,dt. \tag{96.20}$$

To maximize firm value, the Euler–Lagrange condition for the optimization of $p(0)$ is given by equation (96.21),8

$$(\delta - 1)(a + b\bar{I} - h)A(0)e^{th}m(t)^{\delta-2} - (\delta - 2)\alpha' A(0)^2 \rho(t)^2 \sigma(t)^2 e^{2th}m(t)^{\delta-3} = 0. \tag{96.21}$$

Therefore, the optimal shares outstanding at time $t$ can be derived:

$$m(t) = \frac{(2 - \delta)\alpha' A(0)e^{th}\rho(t)^2\sigma(t)^2}{(1 - \delta)(a + b\bar{I} - h)}. \tag{96.22}$$

From equations (96.18), (96.19), and (96.22), we can obtain the maximized stock value

$$p(t) = \frac{(a + b\bar{I} - h)^2(1 - \delta)e^{kt - th\delta}\int_t^T e^{\delta hs - ks}(\rho(s)\sigma(s))^{2\delta-2}\,ds}{\alpha'(2 - \delta)^2\rho(t)^{2\delta}\sigma(t)^{2\delta}}. \tag{96.23}$$

From equation (96.22), we also obtain the optimal number of shares of new equity issued at time $t$:

$$\dot{m}(t) = \frac{h(2 - \delta)\alpha' A(0)e^{th}\rho(t)^2\sigma(t)^2 + (2 - \delta)\alpha' A(0)e^{th}\left[\rho(t)^2\dot{\sigma}(t)^2 + \sigma(t)^2\dot{\rho}(t)^2\right]}{(1 - \delta)(a + b\bar{I} - h)}. \tag{96.24}$$

From equations (96.23) and (96.24), we have the amount generated from issuing new equity:

$$\dot{m}(t)p(t) = \frac{(a + b\bar{I} - h)e^{kt - (\delta-1)th}A(0)\left(h\rho(t)^2\sigma(t)^2 + \rho(t)^2\dot{\sigma}(t)^2 + \sigma(t)^2\dot{\rho}(t)^2\right)\int_t^T e^{s(\delta h - k)}(\rho(s)\sigma(s))^{2\delta-2}\,ds}{(2 - \delta)\rho(t)^{2\delta}\sigma(t)^{2\delta}}. \tag{96.25}$$

From equations (96.5) and (96.25), we can obtain $\bar{D}(t) = m(t)\bar{d}(t)$. From equations (96.1) and (96.10), we can obtain $\bar{x}(t) = (a + b\bar{I})A(0)e^{ht}$. When $\delta$

8 For the derivation of equation (96.21), please refer to Appendix 96B.
approaches unity, we can derive the optimal payout ratio as

$$\frac{\bar{D}(t)}{\bar{x}(t)} = \left[1 + \frac{e^{(h-k)(T-t)} - 1}{h - k}\left(h + \frac{\dot{\sigma}(t)^2}{\sigma(t)^2} + \frac{\dot{\rho}(t)^2}{\rho(t)^2}\right)\right]\frac{(a + b\bar{I} - h)}{(a + b\bar{I})}. \tag{96.26}$$

Equation (96.26) implies an optimal payout ratio when we use an exponential utility function to derive the stochastic dynamic dividend policy model. This result does not necessarily imply that the dividend policy results derived by M&M (1961) are false, because we allow free cash flow to be paid out partially, as assumed by DeAngelo and DeAngelo (2006), instead of paying out all free cash flows, as assumed by M&M (1961). In the following section, we use equation (96.26) to explore the implications of the stochasticity, the stationarity (in the strict sense), and the non-stationarity of the firm's rate of return for its dividend policy.9 We also investigate in detail the differential effects of variations in the systematic and unsystematic risk components of the firm's stream of earnings on the dynamics of its dividend policy.

96.5 Relationship Between the Optimal Payout Ratio and the Growth Rate

In this section, we investigate the relationship between the optimal payout ratio and the growth rate in terms of both exact and approximate approaches. Taking the partial derivative of equation (96.26) with respect to the growth rate, we obtain

$$\frac{\partial[\bar{D}(t)/\bar{x}(t)]}{\partial h} = -\frac{1}{a + b\bar{I}}\cdot\frac{-k + he^{(h-k)(T-t)}}{h - k} + \left(1 - \frac{h}{a + b\bar{I}}\right)\frac{\left[(-k) + h(h - k)(T - t)\right]e^{(h-k)(T-t)} + k}{(h - k)^2}. \tag{96.27}$$

The sign of equation (96.27) is affected not only by the growth rate ($h$) but also by the expected rate of return on assets ($a + b\bar{I}$), the duration of future dividend payments ($T - t$), and the cost of capital ($k$). Since the sign of equation (96.27) cannot be determined analytically, we use a sensitivity analysis approach to investigate it.

9 See Hamilton (1994, pp. 45–46).
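The no-risk-change special case of equation (96.26) and its analytic derivative (96.27) can be cross-checked against each other with a finite difference. A Python sketch (parameter values are illustrative):

```python
import math

def payout_ratio(h, r, k, dur):
    # Equation (96.26) with no change in risk; r = a + b*I_bar, dur = T - t
    g = (math.exp((h - k) * dur) - 1) / (h - k)
    return (1 + h * g) * (r - h) / r

def dpayout_dh(h, r, k, dur):
    # Equation (96.27)
    e = math.exp((h - k) * dur)
    term1 = -(1 / r) * (h * e - k) / (h - k)
    term2 = (1 - h / r) * ((-k + h * (h - k) * dur) * e + k) / (h - k) ** 2
    return term1 + term2

h, r, k, dur = 0.04, 0.15, 0.07, 3.0
fd = (payout_ratio(h + 1e-6, r, k, dur) - payout_ratio(h - 1e-6, r, k, dur)) / 2e-6
print(payout_ratio(h, r, k, dur), dpayout_dh(h, r, k, dur), fd)
```

At these values the payout ratio is about 0.82 and the derivative is about -5.20, matching the corresponding cell of Table 96.1 (growth 4%, ROA 15%).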
Table 96.1 shows the sign of the partial derivative in equation (96.27) under different values of the growth rate and the rate of return on assets as well as the duration ($T - t$). We find that the relationship between the optimal payout ratio and the growth rate is always negative when the growth rate is higher than the rate of return on assets. If the growth rate is lower than the rate of return on assets, the direction of the relationship essentially depends on the duration of the dividend payment ($T - t$). We find that the sign of equation (96.27) is negative if the duration ($T - t$) is small and the growth rate and rate of return on assets are within a reasonable range. In addition, the curved lines in Table 96.1 also indicate a nonlinear relationship between the growth rate and the optimal payout ratio. We can therefore conclude that the relationship between the optimal payout ratio and the growth rate is nonlinear and generally negative.

Based upon equation (96.27), Figure 96.1 plots the change in the optimal payout ratio with respect to the growth rate for different durations of dividend payments and costs of capital. We find a negative relationship between the optimal payout ratio and the growth rate, indicating that a firm with a higher rate of return on assets tends to pay out less when its growth opportunities increase. Moreover, a firm with a lower growth rate and a higher expected rate of return will not decrease its payout when its growth opportunities increase, but a firm with lower growth and a higher expected rate of return on assets is not the general case in the real world. We also find that the duration of future dividend payments is an important determinant of the dividend payout decision, but the effect of the cost of capital is relatively minor.

In the finite growth case, if $(h - k)(T - t) < 1$, then following the MacLaurin expansion, the optimal payout ratio under no change in risk defined in equation (96.26) can be written as

$$[\bar{D}(t)/\bar{x}(t)] \approx \left(1 - \frac{h}{a + b\bar{I}}\right)\left(1 + h(T - t)\right). \tag{96.28}$$

The partial derivative of equation (96.28) with respect to the growth rate is

$$\frac{\partial[\bar{D}(t)/\bar{x}(t)]}{\partial h} \approx \frac{\left[(a + b\bar{I}) - h\right](T - t) - h(T - t) - 1}{a + b\bar{I}}. \tag{96.29}$$

Equation (96.29) indicates that the relationship between the optimal dividend payout and the growth rate depends on the firm's level of growth, the rate of return on assets, and the duration of future dividend payments.10

10 See Appendix 96C for the derivation of equations (96.28) and (96.29).
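The quality of the MacLaurin approximation (96.28) is easy to gauge numerically when $(h - k)(T - t)$ is small. A sketch with illustrative numbers:

```python
import math

def exact_payout(h, r, k, dur):
    # Equation (96.26) with no change in risk; r = a + b*I_bar, dur = T - t
    g = (math.exp((h - k) * dur) - 1) / (h - k)
    return (1 + h * g) * (r - h) / r

def approx_payout(h, r, dur):
    # Equation (96.28): MacLaurin expansion, valid for small (h - k)(T - t)
    return (1 - h / r) * (1 + h * dur)

def approx_slope(h, r, dur):
    # Equation (96.29): derivative of the approximation with respect to h
    return ((r - h) * dur - h * dur - 1) / r

h, r, k, dur = 0.03, 0.12, 0.07, 2.0   # (h - k)(T - t) = -0.08, well within range
print(exact_payout(h, r, k, dur), approx_payout(h, r, dur), approx_slope(h, r, dur))
```

Here the exact and approximate payout ratios agree to within about one percentage point, and (96.29) coincides with the exact derivative of (96.28).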
Table 96.1: Sensitivity analysis of the relationship between the optimal payout and the growth rate.
Panel A. T − t = 3, Cost of Capital = 7%

Growth (%)  ROA=5%  ROA=10%  ROA=15%  ROA=20%  ROA=25%  ROA=30%  ROA=35%  ROA=40%
 1          -18.32   -7.77    -4.25    -2.49    -1.44    -0.73    -0.23     0.15
 2          -19.39   -8.26    -4.55    -2.70    -1.58    -0.84    -0.31     0.08
 3          -20.52   -8.78    -4.87    -2.92    -1.74    -0.96    -0.40     0.02
 4          -21.69   -9.32    -5.20    -3.14    -1.91    -1.08    -0.49    -0.05
 5          -22.91   -9.89    -5.55    -3.38    -2.08    -1.21    -0.59    -0.13
 6          -24.19  -10.49    -5.92    -3.63    -2.26    -1.35    -0.70    -0.21
 7          -25.56  -11.12    -6.31    -3.90    -2.46    -1.50    -0.81    -0.29
 8          -26.92  -11.75    -6.70    -4.17    -2.65    -1.64    -0.92    -0.38
 9          -28.38  -12.43    -7.12    -4.46    -2.86    -1.80    -1.04    -0.47
10          -29.90  -13.14    -7.55    -4.76    -3.09    -1.97    -1.17    -0.57
11          -31.48  -13.88    -8.01    -5.08    -3.32    -2.14    -1.31    -0.68
12          -33.14  -14.65    -8.49    -5.41    -3.56    -2.33    -1.45    -0.79
13          -34.86  -15.46    -8.99    -5.76    -3.81    -2.52    -1.60    -0.90
14          -36.66  -16.30    -9.51    -6.12    -4.08    -2.72    -1.75    -1.03
15          -38.54  -17.18   -10.06    -6.50    -4.36    -2.94    -1.92    -1.16
16          -40.50  -18.10   -10.63    -6.89    -4.65    -3.16    -2.09    -1.29
17          -42.54  -19.05   -11.22    -7.31    -4.96    -3.39    -2.28    -1.44
18          -44.67  -20.05   -11.85    -7.74    -5.28    -3.64    -2.47    -1.59
19          -46.89  -21.09   -12.49    -8.20    -5.62    -3.90    -2.67    -1.75
20          -49.20  -22.18   -13.17    -8.67    -5.97    -4.17    -2.88    -1.91

[Remaining panels of Table 96.1, for other durations (T − t) and costs of capital, are garbled in this copy and omitted.]
Figure 96.1: Sensitivity analysis of the relationship between the optimal payout and the growth rate.
Notes: The figures show the sensitivity analysis of the relationship between the optimal payout and the growth rate. In each figure, each line shows the percentage change of the optimal dividend payout with a 1% change in the growth rate. The various lines represent different levels of the rate of return on assets. The figures present different durations of dividend payments (T − t) and costs of capital.
C. F. Lee et al.
Consistent with the sensitivity analysis of equation (96.27), when a firm with a high growth rate or a low rate of return on assets faces a growth opportunity, it will decrease its dividend payout to generate more cash to meet such a new investment. A possible explanation is that high-growth firms need more retained earnings to meet their future growth opportunities, because the growth rate is the main determinant of value in the case of such companies, but low-growth firms do not need more earnings to maintain their low-growth perspective and can afford to increase their payouts. Based on flexibility concerns, the relationship between firms' payout ratios and their growth rates is therefore negative.

96.6 Relationship Between Optimal Payout Ratio and Risks

Equation (96.26) implies that the optimal payout ratio is a function of the growth rate (h), cost of capital (k), expected profitability rate (a + bĪ), age (T − t), total risk (σ(t)²), and the correlation coefficient between profit and the market rate of return (ρ(t)²). In addition, equation (96.26) is also a function of two dynamic variables: the relative time rate of change in the total risk of the firm, σ̇(t)²/σ(t)², and the relative time rate of change in the covariability of the firm's earnings with the market, ρ̇(t)²/ρ(t)². This theoretical dynamic relationship between the optimal payout ratio and its determinants can be used in empirical studies of dividend policy. The dynamic effects of variations in σ̇(t)²/σ(t)² and ρ̇(t)²/ρ(t)² on the time path of the optimal payout ratio can be investigated under the following four cases: (1) changes in total risk, (2) changes in the correlation between profit and the market rate of return (i.e., systematic risk), (3) changes in both total risk and systematic risk, and (4) no changes in risk.

96.6.1 Case 1: Total risk

First, we examine the effect of σ̇(t)²/σ(t)² on the optimal payout ratio.
By differentiating equation (96.26) with respect to σ̇(t)²/σ(t)², we obtain

∂[D(t)/x̄(t)]/∂[σ̇(t)²/σ(t)²] = (1 − h/(a + bĪ)) · [e^{(h−k)(T−t)} − 1]/(h − k).  (96.31)
In equation (96.31), the cost of capital, k, can be either larger or smaller than the growth rate h. We can show that [e^{(h−k)(T−t)} − 1]/(h − k) is always larger than 0, regardless of whether k is larger or smaller than h.11 Thus, the sign of equation (96.31) depends on the sign of (1 − h/(a + bĪ)), which depends on the growth rate h relative to (a + bĪ).

If the growth rate h is equal to (a + bĪ), then (1 − h/(a + bĪ)) is equal to zero.
Equation (96.31) is thus zero, and the change in total risk will not affect the payout ratio, because the first derivative of the optimized payout ratio, equation (96.26), with respect to σ̇(t)²/σ(t)² is always zero.

If the growth rate h is larger than (a + bĪ), then the entire first derivative of equation (96.26) with respect to σ̇(t)²/σ(t)² is negative (i.e., equation (96.31) is negative); h > (a + bĪ) implies that the growth rate of the firm is larger than its expected profitability rate, and a relative increase in the total risk of the firm would decrease its optimal payout ratio. The alternative case is h < (a + bĪ), which implies that the growth rate of the firm is less than its expected profitability rate. This situation can occur when a company is in a low-growth, no-growth, or negative-growth stage. If h < (a + bĪ), then equation (96.31) is positive, indicating that a relative increase in the risk of the firm would increase its optimal payout ratio; under this situation, a company will increase its payout ratio. Lintner (1965) and Blau and Fuller (2008) have found this kind of relationship, yet they did not theoretically show how it can be derived. Jagannathan et al. (2000) empirically show that operating risk is negatively related to the propensity to increase payouts in general and dividends in particular. Our theoretical analysis in terms of equation (96.31) shows that the change in total risk is negatively or positively related to the payout ratio, conditional on the growth rate being higher or lower than the expected profitability rate. We find negative relationships between payout and the change in total risk for high-growth firms (h > (a + bĪ)).
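The sign analysis of equation (96.31) can be illustrated with a small numeric sketch; the helper function and parameter values below are assumptions for illustration, not from the chapter:

```python
import math

def dpayout_drisk(h, k, T_t, a_bI):
    """Equation (96.31) (and, identically, (96.32)): sensitivity of the optimal
    payout ratio to a relative change in total (or systematic) risk.  The
    bracketed term (e^{(h-k)(T-t)} - 1)/(h - k) is always positive (footnote 11),
    so the sign is that of (1 - h/(a + b*I))."""
    bracket = (math.exp((h - k) * T_t) - 1) / (h - k)
    return (1 - h / a_bI) * bracket

# Illustrative: expected profitability rate a + b*I = 15%, k = 10%, T - t = 5.
high_growth = dpayout_drisk(h=0.20, k=0.10, T_t=5.0, a_bI=0.15)  # h > a+bI
low_growth = dpayout_drisk(h=0.05, k=0.10, T_t=5.0, a_bI=0.15)   # h < a+bI
```

With these assumed values the derivative is negative for the high-growth firm and positive for the low-growth firm, matching the case analysis above; at h = a + bĪ the derivative is exactly zero.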
A possible explanation is that in the case of high-growth firms, a firm must reduce the payout ratio and retain more earnings to build up "precautionary reserves," which become all the more important for a firm with volatile earnings over time. High-growth firms thus tend to retain more earnings when they face higher
11. If h > k, then e^{(h−k)(T−t)} > 1, and both the numerator and denominator of [e^{(h−k)(T−t)} − 1]/(h − k) are greater than zero, resulting in a positive value; if h < k, then e^{(h−k)(T−t)} < 1, and both the numerator and denominator are less than zero, again resulting in a positive value. Thus, [e^{(h−k)(T−t)} − 1]/(h − k) is always larger than 0, regardless of whether k is larger or smaller than h.
risk. By contrast, established low-growth firms (h < (a + bĪ)) are likely to be more mature and have most likely already built such reserves over time. They probably do not need more earnings to maintain their low-growth perspective and can afford to increase the payout; consequently, when facing higher risk on their earnings, low-growth firms can reduce their shareholders' risk by paying more dividends to their shareholders. The age of the firm (T − t), which is one of the variables in equation (96.31), becomes an important factor because the very high growth firms are also the newer firms with very little built-up precautionary reserves. Under more dynamic conditions, we provide further evidence of the validity of Lintner's (1965) observations that, ceteris paribus, optimal dividend payout ratios vary directly with the variance of the firm's profitability rates. The rationale for such a relationship, even when the systematic risk concept is incorporated into the analysis, is straightforward: holding ρ(t)² constant and letting σ(t)² increase implies that the covariance of the firm's earnings with the market does not change, though its relative proportion of the total risk increases.

96.6.2 Case 2: Systematic risk

To examine the effect of a relative change in ρ̇(t)²/ρ(t)² (i.e., systematic risk) on the dynamic behavior of the optimal payout ratio, we differentiate equation (96.26) to obtain
∂[D(t)/x̄(t)]/∂[ρ̇(t)²/ρ(t)²] = (1 − h/(a + bĪ)) · [e^{(h−k)(T−t)} − 1]/(h − k).  (96.32)

The sign of equation (96.32) can be analyzed as with equation (96.31), so the conclusions for equation (96.32) are similar to those for equation (96.31). A relative change in ρ(t)² can either decrease or increase the optimal payout ratio, all things being equal. The effect of non-stationarity in the firm's non-diversifiable risk would tend to be obliterated should both the systematic and the unsystematic components of total risk not be clearly identified in the expression for the optimal payout ratio. Even though the total risk of the firm is stationary (i.e., σ̇(t)²/σ(t)² is equal to zero), a change in the risk complexion of the firm could still conceivably occur because of an increase or decrease in the covariability of its earnings with the market. Equations (96.26) and (96.32) clearly identify the effect of such a change in the risk complexion of the firm on its optimal payout ratio.
An examination of equation (96.26) indicates that only when the firm's earnings are perfectly correlated with the market (i.e., ρ² = 1) does it not matter whether management arrives at its optimal payout ratio using the total variance concept of risk or the market concept of risk. In every other case, the optimal payout ratio followed by management using the total variance concept of risk would be an overestimate of the true optimal payout ratio for the firm based on the market concept of risk underlying capital asset pricing theory. Management may decide not to use the truly dynamic model and instead substitute an average of the long-run systematic risk of the firm; but for ρ̇²(t) > 0, because the initial average is higher than the true ρ²(t), management would pay out less or more in the form of dividends than is optimal. In other words, the payout ratio followed in the initial part of the planning horizon would be an overestimate or an underestimate of the optimal payout under truly dynamic specifications. Rozeff (1982) empirically shows a negative relationship between the β coefficient (systematic risk) and the payout level. The theoretical analysis in terms of equation (96.32) provides a more detailed analytical interpretation of his findings. The explanations of these results are similar to those discussed for the findings in the previous sections.

96.6.3 Case 3: Total risk and systematic risk

In our third case, we investigate the compounded effect of a simultaneous change in the total risk of the firm and a change in its decomposition into market-dependent and market-independent components. Taking the total differential of equation (96.26) with respect to σ̇(t)²/σ(t)² and ρ̇(t)²/ρ(t)², we obtain
d[D(t)/x̄(t)] = γ d[σ̇(t)²/σ(t)²] + γ d[ρ̇(t)²/ρ(t)²],  (96.33)

where

γ = (1 − h/(a + bĪ)) · [e^{(h−k)(T−t)} − 1]/(h − k).

As shown above, γ can be either negative or positive. Now, from equation (96.33), the greatest decrease or increase in the optimal payout ratio would obviously occur when both σ̇(t)² and ρ̇(t)² are positive. This implies that the total risk of the firm increases and, in addition, its relative decomposition into systematic and unsystematic components also changes, making the firm's earnings still more correlated with the market. Under this circumstance, the decrease or increase in the optimal payout
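The null-effect condition that follows from setting the total differential (96.33) to zero — equation (96.34) — can be verified with a minimal numeric sketch (function names and parameter values are our own illustrative assumptions):

```python
import math

def gamma(h, k, T_t, a_bI):
    """The common coefficient gamma in equation (96.33)."""
    return (1 - h / a_bI) * (math.exp((h - k) * T_t) - 1) / (h - k)

def d_payout(h, k, T_t, a_bI, d_sigma_rel, d_rho_rel):
    """Total differential (96.33):
    gamma * d[sigma_dot^2/sigma^2] + gamma * d[rho_dot^2/rho^2]."""
    g = gamma(h, k, T_t, a_bI)
    return g * d_sigma_rel + g * d_rho_rel

# Equation (96.34): offsetting relative changes leave the payout unchanged.
null = d_payout(0.20, 0.10, 5.0, 0.15, d_sigma_rel=0.02, d_rho_rel=-0.02)
```

With equal and opposite relative changes the two terms cancel exactly, illustrating the one-to-one trade-off stated in equation (96.34); an unmatched increase in risk for a high-growth firm (h > a + bĪ) instead moves the payout down, since γ is then negative.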
would now represent the compounded effect of both these changes. However, it is conceivable that although σ̇(t)² is positive, ρ̇(t)² is negative, tending to offset the decrease or increase in the optimal payout ratio resulting from the former. Alternatively, σ̇(t)² could be negative, indicating a reduction in the total risk of the firm, and may offset the increase in the optimal payout ratio resulting from a positive ρ̇(t)².

To what extent the inverse variations in the total risk and the risk complexion of the firm will offset each other's effects on the optimal payout ratio would, of course, depend upon the relative magnitudes of ρ̇(t)² and σ̇(t)². To see the precise trade-off between the two dynamic effects of σ̇(t)²/σ(t)² and ρ̇(t)²/ρ(t)² on the optimal payout ratio, let the total differential of equation (96.26), given in equation (96.33), be set equal to zero, yielding

d[σ̇(t)²/σ(t)²] = −d[ρ̇(t)²/ρ(t)²].  (96.34)
Equation (96.34) implies that the relative increase (or decrease) in σ(t)² has a one-to-one correspondence with the relative decrease (or increase) in ρ(t)²; equation (96.34) thus establishes the conditions on relative changes in ρ(t)² and σ(t)² that lead to a null effect on the optimal dividend payout ratio.

96.6.4 Case 4: No change in risk

Now, we consider the least dynamic situation, in which there are no changes in total risk or systematic risk, i.e., σ̇(t)² = 0 and ρ̇(t)² = 0. Under this circumstance, equation (96.26) reduces to
[D(t)/x̄(t)] = 1 − [h/(a + bĪ)] · [h e^{(h−k)(T−t)} − k]/(h − k).  (96.35)

Thus, when the firm's total risk and the covariability of its earnings with the market are assumed stationary, equation (96.35) indicates that a firm's optimal payout ratio is independent of its risk. Note that neither σ(t)² nor ρ(t)² now appears in the expression for the optimal payout ratio given in equation (96.35). These conclusions, like those of Wallingford (1972a, 1972b), for example, run counter to the intuitively appealing and well-accepted theory of finance emphasizing the relevance of risk for financial decision making.12 Our model clearly shows that the explanation for such unacceptable implications of the firm's total risk and its market-dependent and market-independent components for the firm's optimal payout policy lies, of course, in the totally unrealistic assumptions of stationarity underlying the derivation of such results, as illustrated in equation (96.35).
12. For example, Lintner (1963).
96.7 Empirical Evidence

A growing body of literature focuses on the determinants of optimal dividend payout policy. Rozeff (1982), Jagannathan et al. (2000), Grullon et al. (2002), Aivazian et al. (2003), Blau and Fuller (2008), and others empirically investigate the determination of dividend policy, but none of them has a solid theoretical model to support their findings. Based upon our theoretical model and its implications discussed in the foregoing sections, we develop the three testable hypotheses that follow.

H1: Firms generally reduce their dividend payouts when their growth rates increase.

The negative relationship between the payout ratio and the growth rate in our theoretical model implies that high-growth firms need to reduce the payout ratio and retain more earnings to build up "precautionary reserves," whereas low-growth firms are likely to be more mature and have already built up their reserves for flexibility considerations. Rozeff (1982), Fama and French (2001), Blau and Fuller (2008), and others argue that high-growth firms will have higher investment opportunities and tend to pay out less in dividends. Based upon flexibility concerns, we predict that high-growth firms pay lower dividends. This result is obtained when the risk factor is not explicitly considered.

Following equations (96.31) and (96.32), we theoretically find that the relationship between the payout ratio and risk can be either negative or positive, depending upon whether the growth rate is higher or lower than the rate of return on total assets. Based upon this finding, we develop two other hypotheses.

H2: The relationship between firms' dividend payouts and their risks is negative when their growth rates are higher than their rates of return on assets.

High-growth firms need to reduce the payout ratio and retain more earnings to build up "precautionary reserves," which become more important for a firm with volatile earnings over time.
For flexibility considerations, high-growth firms tend to retain more earnings when they face higher risk. This theoretical result is consistent with the flexibility hypothesis.

H3: The relationship between firms' dividend payouts and their risks is positive when their growth rates are lower than their rates of return on assets.

Low-growth firms are likely to be more mature and have most likely already built such reserves over time, and they probably do not need more earnings to
maintain their low-growth perspective and can afford to increase the payout (see Grullon et al., 2002). Because higher risk may involve a higher cost of capital and make the free cash flow problem worse, for free cash flow considerations, low-growth firms tend to pay more dividends when they face higher risk. This theoretical result is consistent with the free cash flow hypothesis.

96.7.1 Sample description

We collect firm information, including total assets, sales, net income, and dividend payouts, from Compustat. Stock prices, stock returns, share codes, and exchange codes are retrieved from the Center for Research in Security Prices (CRSP) files. The sample period is from 1969 to 2009. Only common stocks (SHRCD = 10, 11) and firms listed on NYSE, AMEX, or NASDAQ (EXCHCD = 1, 2, 3, 31, 32, 33) are included in our sample. We exclude utility services (SICH = 4900–4999) and financial institutions (SICH = 6000–6999).13 The sample includes those firm years with at least 5 years of data available to compute average payout ratios, growth rate, return on assets, beta, total risk, size, and book-to-market ratios. The payout ratio is measured as the ratio of the dividend payout to net income. The growth rate is the sustainable growth rate proposed by Higgins (1977). The beta coefficient and total risk are estimated by the market model over the previous 60 months. For the purpose of estimating their betas, firm years in our sample must have at least 60 consecutive previous monthly returns. To examine the optimal payout policy, only firm years with five consecutive dividend payouts are included in our sample.14 Because firm years with no dividend payout 1 year before (or after) might not start (or stop) their dividend payouts in the first (fourth) quarter of the year, we exclude firm years with no dividend payouts one year before or after, to ensure the dividend payout policy reflects the firm's full-year condition.
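The sample screens above can be sketched as a simple filter. The field names (SHRCD, EXCHCD, SICH) mirror the CRSP/Compustat codes cited in the text, but the helper function and example records are hypothetical:

```python
def keep_firm_year(rec):
    """Apply the sample screens described in the text (assumed field names):
    common stocks only, NYSE/AMEX/NASDAQ only, no utilities or financials."""
    if rec["SHRCD"] not in (10, 11):                 # common stocks only
        return False
    if rec["EXCHCD"] not in (1, 2, 3, 31, 32, 33):   # NYSE, AMEX, NASDAQ
        return False
    sic = rec["SICH"]
    if 4900 <= sic <= 4999:                          # utility services
        return False
    if 6000 <= sic <= 6999:                          # financial institutions
        return False
    return True

# Hypothetical firm-year records for illustration.
sample = [
    {"SHRCD": 10, "EXCHCD": 1, "SICH": 3711},  # kept: industrial common stock
    {"SHRCD": 10, "EXCHCD": 1, "SICH": 6020},  # dropped: bank
    {"SHRCD": 12, "EXCHCD": 1, "SICH": 3711},  # dropped: non-common share code
]
filtered = [r for r in sample if keep_firm_year(r)]
```

The remaining screens in the text (5 years of data, 60 prior monthly returns, five consecutive payouts) would be applied as additional per-firm time-series conditions on top of this row-level filter.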
Table 96.2 shows the summary statistics for 2,645 sample firms during the period from 1969 to 2009. Panel A of Table 96.2 lists the number of
13. We filter out financial institutions and utility firms based on the historical Standard Industrial Code (SIC) available from COMPUSTAT. When a firm's historical SIC is unavailable for a particular year, the next available historical SIC is applied instead. When a firm's historical SIC is unavailable for a particular year and all the years after, we use the current SIC from COMPUSTAT as a substitute.
14. To avoid creating large differences in dividend policy, managers, on one hand, partially adjust firms' payouts over several years to reduce the sudden impacts of changes in dividend policy; on the other hand, they use not only 1-year but also multiyear firm conditions to decide how much to pay out. In examining the optimal payout policy, we therefore use 5-year rolling averages for all variables.
firm-year observations for all sample firms, high-growth firms, and low-growth firms, respectively. High-growth firm years are those firm years that have 5-year average sustainable growth rates higher than their 5-year average rate of return on assets. Low-growth firm years are those with 5-year average sustainable growth rates lower than their 5-year average rate of return on assets. The sample size increases from 345 firms in 1969 to 1,203 firms in 1982, while declining to 610 firms by 2009. A total of 28,333 dividend-paying firm years are included in the sample. When classifying high-growth firms and

Table 96.2: Summary statistics of sample firm characteristics.
Panel A. Sample size

Year       All    Growth > ROA   Growth < ROA
1969        345        161            184
1970        360        175            185
1971        404        201            203
1972        513        269            244
1973        535        308            227
1974        572        371            201
1975        609        432            177
1976        650        486            164
1977        678        530            148
1978        711        553            158
1979        779        620            159
1980        764        636            128
1981        929        785            144
1982      1,203      1,003            200
1983      1,151        933            218
1984      1,067        832            235
1985      1,010        744            266
1986        958        669            289
1987        897        645            252
1988        847        615            232
1989        721        531            190
1990        690        522            168
1991        668        511            157
1992        653        494            159
1993        642        460            182
1994        655        479            176
1995        651        483            168
1996        693        530            163
1997        725        582            143
1998        743        620            123
1999        725        612            113
2000        709        607            102
2001        659        569             90
2002        599        503             96
2003        571        475             96
2004        525        433             92
2005        481        391             90
2006        510        430             80
2007        542        451             91
2008        579        470            109
2009        610        484            126
All years  28,333    21,065          6,728

Panel B. Descriptive statistics of characteristics of sample

                     Payout   Growth                       Total     Size
                     ratio    rate      ROA      Beta      risk      ($MM)    M/B
All sample (N = 28,333)
  Mean               0.3793   0.1039    0.0723   1.0301    0.0106    3,072    1.7940
  Median             0.3540   0.0886    0.0648   1.0251    0.0089      291    1.3539
  Stdev              0.1995   0.7444    0.0389   0.4272    0.0078   14,855    1.9479
High-growth firms (N = 21,065)
  Mean               0.3180   0.1233    0.0698   1.0624    0.0112    3,267    1.7951
  Median             0.2996   0.1002    0.0638   1.0581    0.0095      314    1.3757
  Stdev              0.1658   0.8060    0.0355   0.4352    0.0070   15,806    1.6496
Low-growth firms (N = 6,728)
  Mean               0.5762   0.0413    0.0800   0.9265    0.0087    2,447    1.7904
  Median             0.5542   0.0524    0.0692   0.9375    0.0071      229    1.3007
  Stdev              0.1690   0.4918    0.0476   0.3822    0.0099   11,250    2.6909

Notes: This table presents the descriptive statistics for the major characteristics of our sample firms. The sample includes those firms listed on NYSE, AMEX, and NASDAQ with at least 5 years of data available to compute average payout ratios, growth rate, return on assets, beta, total risk, size, and book-to-market ratios. All financial service operations and utility companies are excluded. Panel A lists the numbers of firm-year observations for all sample firms, high-growth firms, and low-growth firms, respectively, during the period between 1969 and 2009. High-growth firm years are defined as firm years with sustainable growth rates higher than their rates of return on assets. Low-growth firm years are defined as firm years with sustainable growth rates lower than their rates of return on assets. Panel B lists the mean, median, and standard deviation values of the 5-year averages of the payout ratio, growth rate, rate of return on assets, beta risk, total risk, size, and book-to-market ratio. The payout ratio is measured as the ratio of the dividend payout to earnings. The growth rate is the sustainable growth rate proposed by Higgins (1977). The beta coefficient and total risk are estimated by the market model over the previous 60 months. Size is defined as market capitalization, calculated as the closing price of the last trading day of June of that year times the shares outstanding at the end of June of that year.
low-growth firms relative to their return on assets, the proportion of high-growth firms increases over time. The proportion of firm years with a growth rate higher than the return on assets increases from less than 50% during the late 1960s and early 1970s to 80% in 2008. Panel B of Table 96.2 shows the 5-year moving averages of the mean, median, and standard deviation values for the measures of payout ratio, growth rate, rate of return, beta coefficient, total risk, market capitalization, and market-to-book ratio across all firm years in the sample. Among high-growth firms, the average growth rate is 12.33% and the average payout ratio is 31.80%; for low-growth firms, the average growth rate is 4.13% and the average payout ratio is 57.62%. High-growth firms undertake more beta risk and total risk, indicating that high-growth firms take on both more systematic risk and more unsystematic risk to pursue a higher rate of return.

96.7.2 Univariate analysis

To examine Hypothesis 1, we divide our sample into five groups by growth rate. As Table 96.3 Panel A indicates, the average (median) payout ratio is 48.64% (48.35%) for the lowest growth group and 22.13% (19.67%) for
the highest growth group. The "precautionary reserves" argument for flexibility concerns and Hypothesis 1 are therefore confirmed. We further divide our sample into five groups by beta risk and by total risk, as indicated in Panels B and C of Table 96.3. We observe a monotonic decrease in payout ratios when the beta risk or the total risk increases, which is consistent with the findings of Rozeff (1982), Fenn and Liang (2001), Grullon et al. (2002), Aivazian et al. (2003), and Blau and Fuller (2008), but these studies do not consider that the relationship between the payout ratio and risk may be altered by growth. To examine Hypotheses 2 and 3, we further divide our sample by two-way sorts on the growth rate and risks. High-growth groups are firm years with sustainable growth rates higher than their rates of return on total assets. Low-growth groups are firm years with sustainable growth rates lower than their rates of return on total assets. In Table 96.3 Panel B, among high-growth firms, the average (median) payout ratio decreases from 31.73% (30.10%) to 28.86% (25.88%) when the beta risk increases. Among low-growth firms, however, the average (median) payout ratio increases from 61.15% (58.64%) to 62.92% (60.90%) when the beta risk increases. Similar to Panel B, Panel C also shows a negative relationship between the payout ratio and total risk among high-growth firms and a positive relationship among low-growth firms. Above all, the static analysis results of Panels B and C support Hypotheses 2 and 3: the relationship between the payout ratio and risk depends upon the growth rate of the firm.
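The mechanics of the one-way quintile sorts behind Table 96.3 Panel A can be sketched as follows; the synthetic firm-years are constructed only to illustrate the procedure and deliberately embed a negative payout–growth relation (they are not the chapter's data):

```python
from statistics import mean

def quintile_sort(firms, key):
    """Sort firm-years by `key` and split into five equal-sized groups,
    a simplified stand-in for the chapter's one-way quintile sorts."""
    ranked = sorted(firms, key=lambda f: f[key])
    n = len(ranked) // 5
    return [ranked[i * n:(i + 1) * n] for i in range(5)]

# Synthetic firm-years in which payout falls with growth (illustrative only).
firms = [{"growth": g / 100, "payout": 0.50 - 0.3 * g / 100} for g in range(1, 26)]
groups = quintile_sort(firms, "growth")
payout_by_quintile = [round(mean(f["payout"] for f in g), 4) for g in groups]
```

By construction, the average payout declines monotonically from the lowest to the highest growth quintile, which is the qualitative pattern Panel A reports for the actual sample; the two-way sorts in Panels B and C would first split firm-years into high- and low-growth groups and then apply the same quintile sort on beta or total risk within each group.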
96.7.3 Multivariate analysis

To examine the relationship between the payout ratio and other financial variables, we propose fixed-effects models of the payout ratio as follows15:

ln[payout ratio_{i,t}/(1 − payout ratio_{i,t})] = α + β1 Risk_{i,t} + β2 D_{i,t}(g_{i,t} < c · ROA_{i,t}) · Risk_{i,t} + β3 Growth_{i,t} + β4 Risk_{i,t} × Growth_{i,t} + β5 ln(Size)_{i,t} + β6 ROA_{i,t} + e_{i,t}.  (96.36)
15. The dummy variable D_{i,t}(g_{i,t} < c · ROA_{i,t}) used in equation (96.36) implies that the relationship between the payout ratio and risk is nonlinear (a piecewise regression). In other words, the breakpoint of the structural change is at g_{i,t} = c · ROA_{i,t}. Based upon our theoretical model, we assume that c is equal to 1 in our empirical work.
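The variable construction in equation (96.36) — the logistic transformation of the payout ratio and the regressor row with the low-growth dummy D (c = 1) — can be sketched as below. The values are illustrative assumptions, and the estimation step itself (fixed-effects OLS) is omitted:

```python
import math

def logistic_transform(payout):
    """Dependent variable in equation (96.36): ln(p / (1 - p))."""
    return math.log(payout / (1 - payout))

def regressors(risk, growth, size, roa, c=1.0):
    """Build the regressor row of equation (96.36).  D = 1 when the firm's
    growth rate is below c * ROA (low growth), 0 otherwise; c = 1 as in the
    text.  Order: intercept, Risk, D*Risk, Growth, Risk*Growth, ln(Size), ROA."""
    D = 1.0 if growth < c * roa else 0.0
    return [1.0, risk, D * risk, growth, risk * growth, math.log(size), roa]

# Hypothetical high-growth firm-year: growth 12% above ROA of 7%, so D = 0.
row = regressors(risk=1.05, growth=0.12, size=300.0, roa=0.07)
```

The β2 term therefore contributes only for low-growth firm-years, which is what allows the slope on risk to differ on the two sides of the g = c · ROA breakpoint.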
Table 96.3: Payout ratios partitioned by growth rate and risks.

Panel A. Payout ratios partitioned by growth rate

                        Low growth                            High growth
                            1        2        3        4        5
All Sample    Mean      0.4864   0.3835   0.3272   0.2696   0.2213
              Median    0.4835   0.3795   0.3196   0.2556   0.1967

Panel B. Payout ratios partitioned by beta risk

                        Low beta                              High beta
                            1        2        3        4        5
All Sample    Mean      0.3912   0.3940   0.3963   0.3609   0.3371
              Median    0.3716   0.3765   0.3737   0.3328   0.3009
High-Growth   Mean      0.3173   0.3229   0.3204   0.3025   0.2886
Firms         Median    0.3027   0.3081   0.3028   0.2874   0.2588
Low-Growth    Mean      0.6115   0.6064   0.6182   0.6189   0.6292
Firms         Median    0.5864   0.5748   0.5947   0.6042   0.6090

Panel C. Payout ratios partitioned by total risk

                        Low total risk                        High total risk
                            1        2        3        4        5
All Sample    Mean      0.4274   0.4058   0.3618   0.3285   0.3041
              Median    0.4099   0.3808   0.3333   0.2899   0.2588
High-Growth   Mean      0.3499   0.3286   0.2957   0.2698   0.2531
Firms         Median    0.3381   0.3138   0.2745   0.2431   0.2190
Low-Growth    Mean      0.6078   0.6202   0.6269   0.6091   0.6115
Firms         Median    0.5732   0.5942   0.6062   0.5863   0.6025

Notes: This table presents the average and median payout ratios in groups partitioned by growth rate, beta risk, or total risk during the sample period from 1969 to 2009. Panel A reports the average and median payout ratios by a one-way sort on the growth rate. Panel B reports the average and median payout ratios by an independent two-way sort on the growth rate and beta. Panel C reports the average and median payout ratios by an independent two-way sort on the growth rate and total risk. High-growth firm years are defined as firm years with sustainable growth rates higher than their rates of return on assets. Low-growth firm years are defined as firm years with sustainable growth rates lower than their rates of return on assets.
In the regression, the dependent variable is the logistic transformation of the payout ratio. Independent variables include a risk measure (beta coefficient or total risk), the interaction of the dummy variable and the risk measure, the growth rate, the interaction of the risk measure and the growth rate, the log of size, and the rate of return on total assets.16 Based upon the theoretical model and its implications from Section 96.6, we assume that c is equal to 1. The dummy variable (D_{i,t}) is equal to 1 if a firm's 5-year average growth rate is less than its 5-year average rate of return on assets and 0 otherwise. This structure allows us to analyze the relationship between the payout ratio and the growth rate, and the relationship between the payout ratio and risk, under different growth rate levels.

Thompson (2010), Petersen (2009), and Cameron et al. (2006) have pointed out that standard errors of two-way fixed-effects estimates can be biased if two-dimensional clustering (clustering in the cross-sectional errors and clustering in the time-series errors) is not controlled for. Thompson (2010) and Boehmer et al. (2010) have empirically found that these clustering effects are not important for large samples. The robust standard errors are very similar to the standard errors for ordinary least squares (OLS), suggesting that the fixed effects and control variables remove most of the correlation present across observations. In addition, our dataset cannot be meaningfully applied to the clustering-effect model17; therefore, statistical inferences in this paper are conducted using fixed-effects standard errors.

Table 96.4 provides the results of fixed-effects regressions for 2,645 firms during the period 1969–2009. Models (1) and (2) show that the estimated coefficients of the growth rate are −0.03 with a t-statistic of −4.85 and −0.03 with a t-statistic of −4.68, respectively. Such significantly negative coefficients confirm Hypothesis 1, which states that high-growth firms will pay less in dividends for the consideration of flexibility. We also include an interaction term of risk and the growth rate in Models (3) and (4). The results of Models (3) and (4) also support Hypothesis 1.

Models (1)–(4) show that the relationship between the payout ratio and the risk is significantly negative. The results are similar to the findings of
16 Besides merely adding an interaction dummy as indicated in equation (96.36), we include an intercept dummy to capture the individual effect of the two groups. We also run regressions for high-growth firms and low-growth firms separately. Results from both models are qualitatively the same as those from equation (96.36) and also support Hypotheses 1–3.
17 Because our sample is an unbalanced panel, the clustering computer program cannot meaningfully estimate the variance components: the firm variance (V̂_firm), the time variance (V̂_time), and the heteroskedasticity-robust OLS variance (V̂_white).
[Table 96.4: Fixed-effects regressions of the payout ratio for 2,645 firms, 1969–2009. Models (1)–(8), with c = 1 imposed in Models (5)–(8); reported regressors: Total risk, D*Total risk, Growth*Total risk, Growth, ln(size), and ROA, with t-statistics in parentheses, together with the adjusted R² and the F-test p-value.]
[Handbook of Financial Econometrics, . . . (Vol. 3), Chapter 97: Sustainable Growth Rate, Optimal Growth Rate, and Optimal Payout Ratio, H.-Y. Chen et al.]

Table 97.1: Mean-reverting process of the optimal growth rate.
[Panels A–D report the optimal growth rate for t = 1–30 under eleven initial growth rates ranging from g0 = 5% to g0 = 95%. Panel A: ROE = 40%, λ = 0.95; Panel B: ROE = 40%, λ = 0.60; Panel C: ROE = 50%, λ = 0.95; Panel D: ROE = 50%, λ = 0.60.]
Notes: This table shows the mean-reverting process of the optimal growth rate for 30 years. Panel A presents the values of the optimal growth rate under different settings of the initial growth rate when the rate of return on equity is equal to 40% and the degree of market perfection is equal to 0.95. Panel B presents the values when the rate of return on equity is equal to 40% and the degree of market perfection is equal to 0.60. Panel C presents the values when the rate of return on equity is equal to 50% and the degree of market perfection is equal to 0.95. Panel D presents the values when the rate of return on equity is equal to 50% and the degree of market perfection is equal to 0.60.
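The mean-reverting paths reported in Table 97.1 can be reproduced from the logistic solution for the optimal growth rate, g*(t) = r̄/[1 − (1 − r̄/g0)e^{λr̄t/(λ−2)}]; a minimal sketch (assuming this is the closed form of equation (97.21), with parameter values taken from Panel A):

```python
import math

def optimal_growth_rate(t, g0, roe, lam):
    """Optimal growth rate g*(t) from the logistic mean-reverting solution;
    g0 = initial growth rate, roe = r-bar, lam = degree of market perfection."""
    return roe / (1.0 - (1.0 - roe / g0) * math.exp(lam * roe * t / (lam - 2.0)))

# Panel A of Table 97.1: ROE = 40%, lambda = 0.95, initial growth rate 95%
path = [optimal_growth_rate(t, 0.95, 0.40, 0.95) for t in range(1, 18)]
# first five values match Panel A's g0 = 95% column:
# 0.6702, 0.5561, 0.4972, 0.4630, 0.4419, converging toward ROE = 0.40
print([round(g, 4) for g in path[:5]])
```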
Figure 97.1: Sensitivity analysis of the optimal growth rate.
[Four panels plot the optimal growth rate against time (0–30 years) for initial growth rates ranging from 5% to 95%: (a) ROE = 40%, λ = 0.95; (b) ROE = 40%, λ = 0.60; (c) ROE = 50%, λ = 0.95; (d) ROE = 50%, λ = 0.60.]
Note: The figures show the sensitivity analysis of how the time horizon (t), the degree of market perfection (λ), the rate of return on equity (r̄/(1 − L)), and the initial growth rate (g0) affect the optimal growth rate. (a) presents the mean-reverting process assuming that the rate of return on equity is 40% and the degree of market perfection is 0.95. Each line shows the optimal growth rates over 30 years with different initial growth rates. (b) presents the mean-reverting process assuming that the rate of return on equity is 40% and the degree of market perfection is 0.60. (c) presents the mean-reverting process assuming that the rate of return on equity is 50% and the degree of market perfection is 0.95. (d) presents the mean-reverting process assuming that the rate of return on equity is 50% and the degree of market perfection is 0.60.
(iii) The greater the degree of imperfection in the market, as indicated by a lower value of λ (0 < λ < 1), the slower the speed with which the firm attains the steady-state optimal growth rate. One feasible reason is that the firm can issue additional equity only at a price lower than the current market price, as indicated by the value of λ < 1.

97.2.4.2 Case II: Degree of market perfection

Taking the derivative of the optimal growth rate with respect to the degree of market perfection, λ, we obtain equation (97.23):

\frac{\partial g^{*}(t)}{\partial \lambda} = \frac{\bar{r}\left(\frac{\bar{r}}{g_{0}}-1\right)\frac{2\bar{r}t}{(\lambda-2)^{2}}\,e^{\bar{r}\lambda t/(\lambda-2)}}{\left[1-\left(1-\frac{\bar{r}}{g_{0}}\right)e^{\bar{r}\lambda t/(\lambda-2)}\right]^{2}},  (97.23)

where

g^{*}(t) = \frac{\bar{r}}{1-\left(1-\frac{\bar{r}}{g_{0}}\right)e^{\lambda\bar{r}t/(\lambda-2)}}.

We find that the sign of equation (97.23) depends on the sign of (r̄/g0 − 1). If the initial growth rate is less than the rate of return on equity, equation (97.23) is positive; therefore, the optimal growth rate at time t tends to be closer to the rate of return on equity if the degree of market perfection increases. Comparing Figures 97.1(a) and (c) to Figures 97.1(b) and (d), we find that when the degree of market perfection is 0.95, the optimal growth rate converges to its target rate (the rate of return on equity) after 20 years; when the degree of market perfection is 0.60, the optimal growth rate does not converge to its target rate even after 30 years. Figure 97.2 shows the optimal growth rates for different rates of return on equity and degrees of market perfection when the initial growth rate is 5% and the time is 3. We find that the more perfect the market is, the faster the optimal growth rate adjusts; thus, the mean-reverting process of the optimal growth rate is faster if the market is more perfect.

97.2.4.3 Case III: Rate of return on equity

Taking the derivative of the optimal growth rate with respect to the rate of return on equity, r̄, we obtain equation (97.24):

\frac{\partial g^{*}(t)}{\partial \bar{r}} = \frac{g_{0}\left[g_{0}\left(1-e^{\lambda\bar{r}t/(\lambda-2)}\right)+\left(g_{0}-\bar{r}\right)\bar{r}\,\frac{\lambda t}{\lambda-2}\,e^{\lambda\bar{r}t/(\lambda-2)}\right]}{\left[g_{0}+\left(\bar{r}-g_{0}\right)e^{\lambda\bar{r}t/(\lambda-2)}\right]^{2}},  (97.24)
Figure 97.2: Optimal growth rate with respect to the rate of return on equity and the degree of market perfection. Note: The figure shows the optimal growth rates using different rates of return on equity and degrees of market perfection when the initial growth rate is 5% and the time is 3. The rates of return on equity range from 5% to 50%. The degrees of market perfection range from 0.05 to 1.
where g∗ (t) =
g0 r¯ . g0 − (¯ r − g0 ) eλ¯r t/(λ−2)
Because the degree of market perfection (λ) is between zero and one, is negative, and eλ¯r t/(λ−2) is between zero and one. The sign of equation (97.24) is therefore positive when the initial growth rate is less than the rate of return on equity. That is, in order to catch up to the higher target rate, a firm should increase its optimal growth rate. We further provide a sensitivity analysis to investigate the relationship between the change in the optimal growth rate and the change in the rate of return on equity. Panels A and C of Table 97.1 and Figures 97.1 (a) and (c) present the mean-reverting processes of the optimal growth rate when the rate of return on equity is 40%. Panels B and D of Table 97.1 and Figures 97.1 (b) and (d) present those when the rate of return on equity is 50%. We find that, holding the other factors constant, the optimal growth rate under a 50% rate of return λt (λ−2)
page 3434
July 6, 2020
15:56
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch97
Sustainable Growth Rate, Optimal Growth Rate, and Optimal Payout Ratio
page 3435
3435
on equity is higher than the optimal growth rate under a 40% rate of return on equity. Figure 97.2 shows that, under different degrees of market perfection, the optimal growth rates increase when the rate of return on equity increases; therefore, when the rate of return on equity increases, the optimal growth rates increase under different conditions of initial growth rates, time horizons, and degrees of market perfection.

97.2.4.4 Case IV: Initial growth rate

Taking the derivative of the optimal growth rate with respect to the initial growth rate, g0, we obtain equation (97.25):

\frac{\partial g^{*}(t)}{\partial g_{0}} = \frac{\left(\frac{\bar{r}}{g_{0}}\right)^{2}e^{\lambda\bar{r}t/(\lambda-2)}}{\left[1-\left(1-\frac{\bar{r}}{g_{0}}\right)e^{\lambda\bar{r}t/(\lambda-2)}\right]^{2}},  (97.25)

where

g^{*}(t) = \frac{\bar{r}}{1-\left(1-\frac{\bar{r}}{g_{0}}\right)e^{\lambda\bar{r}t/(\lambda-2)}}.
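The signs derived in Cases II–IV can be checked numerically; the sketch below applies central finite differences to the closed-form optimal growth rate (an illustration only; the parameter values, taken from the Figure 97.2 setting, are assumptions):

```python
import math

def g_star(t, g0, roe, lam):
    # closed-form optimal growth rate g*(t)
    return roe / (1.0 - (1.0 - roe / g0) * math.exp(lam * roe * t / (lam - 2.0)))

def fd(f, x, h=1e-6):
    # central finite-difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2.0 * h)

t, g0, roe, lam = 3.0, 0.05, 0.40, 0.95     # initial growth rate below ROE
d_lam = fd(lambda v: g_star(t, g0, roe, v), lam)   # equation (97.23): positive
d_roe = fd(lambda v: g_star(t, g0, v, lam), roe)   # equation (97.24): positive
d_g0  = fd(lambda v: g_star(t, v, roe, lam), g0)   # equation (97.25): positive
print(d_lam > 0, d_roe > 0, d_g0 > 0)
```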
We can easily determine that the sign of equation (97.25) is positive. From Table 97.1, we can also see that the optimal growth rate is higher if the initial growth rate is higher. Figure 97.1 shows that the lines of the optimal growth rate over time do not cross over each other, meaning that the mean-reverting process of the optimal growth rate under an initial growth rate far from the objective value cannot be faster than that under an initial growth rate closer to the objective value. From equation (97.21), we know that the optimal growth rate is not affected by the payout ratio. It is affected only by the time horizon (t), the degree of market perfection or imperfection (λ), the rate of return on equity (r̄), and the initial growth rate (g0). This might imply that the sustainable growth rate, rather than the optimal growth rate, is affected by the payout ratio. In Section 97.2.5, we will develop the optimal dividend payout policy in terms of the optimal growth rate.

97.2.5 Optimal dividend policy

Here, we address the problem of an optimum dividend policy for the firm. Using the model developed in the earlier sections, we maximize p(0) simultaneously with respect to the growth rate and the number of shares
outstanding. We can derive a general form for the optimal payout ratio as follows8:

\frac{\bar{D}(t)}{\bar{Y}(t)} = \left(1-\frac{g^{*}(t)}{\bar{r}(t)}\right)\cdot\left[1+e^{kt-\lambda\int_{0}^{t}g^{*}(s)ds}\,\frac{\lambda\dot{\sigma}^{2}(t)+\sigma^{2}(t)\,g^{*}(t)\left(\bar{r}-g^{*}(t)\right)+\sigma^{2}(t)\,\dot{g}^{*}(t)}{(2-\lambda)\,\sigma^{2}(t)^{\lambda}\left(\bar{r}-g^{*}(t)\right)^{3-\lambda}}\,W\right],  (97.26)

where W = \int_{t}^{T} e^{\lambda\int_{0}^{s}g^{*}(u)du-ks}\,\sigma^{2}(s)^{\lambda-1}\left(\bar{r}-g^{*}(s)\right)^{2-\lambda}ds.

Equation (97.26) implies that the optimal payout ratio of a firm is a function of the optimal growth rate g*(t), the change in the optimal growth rate ġ*(t), the rate of return on equity r̄, the degree of market perfection λ, the cost of capital k, the risk of the rate of return on equity σ²(t), and the change in the risk of the rate of return on equity σ̇²(t). Previous empirical studies find that a firm's dividend policy is related to the firm's risk, growth rate, and proxies of profitability, such as the rate of return on assets and the market-to-book ratio (e.g., Benartzi et al., 1997; DeAngelo et al., 2006; Denis and Osobov, 2008; Fama and French, 2001; Grullon et al., 2002; Rozeff, 1982). However, few studies have a solid theoretical model to support their findings. Our model fills this gap by showing the determinants of the optimal payout policy under a thorough and solid theoretical framework. Besides the optimal payout policy in the static state, we will discuss how stochastic growth affects the specification error associated with the expected dividend payout and the implication of a stochastic initial growth rate for the optimal growth rate.

97.3 Stochastic Growth Rate and Specification Error on Expected Dividend

We attempt to identify the possible error in the optimal dividend policy introduced by misspecification of the growth rate as deterministic. Lintner (1964) explicitly introduced uncertainty of growth in his valuation model. The uncertainty is shown to have two components: one derived from the uncertainty of the rate of return, and the second from the increase in this uncertainty as a linear function of futurity.
In the present analysis, we introduce a fully dynamic, time-variant growth rate to see clearly the impact of its stochasticity on the optimal dividend policy.

8 For the detailed derivation of equation (97.26), please see Appendix 97D.
With the stochastic rate of return on equity, r̃(t) ∼ N(r̄(t), σ²(t)), the asset size and the growth rate for any time interval also become stochastic. This can be seen more clearly by rewriting equation (97.3) explicitly in the growth form as follows:

\tilde{g}(t) = \frac{\dot{A}(t)}{A(t)} = b\,\tilde{r}(t) + \frac{\lambda\,\dot{n}(t)\,p(t)}{(1-L)A(t)},  (97.27)
where b is the retention rate and (1 − L)A(t) is the total equity at time t. Therefore, the growth rate of a firm is related not only to how much of its earnings it retains, but also to how many new shares it issues. If the growth rate in equation (97.27) also becomes stochastic, g̃(t) ∼ N(g(t), σ_g²(t)), a further element of stochasticity in g̃(t) is introduced by the stochastic nature of λ, indicating the uncertainty about the price at which new equity can be issued relative to p(t). Misspecification of the growth rate as deterministic in the earlier sections introduced an error in the optimal dividend policy, which we now proceed to identify. Reviewing equations (97.1) to (97.4) under a stochastic growth rate specification, we derive from equation (97.4) the expression in equation (97.28) as follows:
\tilde{d}(t) = \frac{\tilde{D}(t)}{n(t)} = \frac{\left[\tilde{r}(t)-\tilde{g}(t)\right]A(0)\,e^{\int_{0}^{t}\tilde{g}(s)ds}+\lambda\,\dot{n}(t)\,p(t)}{n(t)}.  (97.28)
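As a quick numerical check of equation (97.28): with no external financing (ṅ(t) = 0) and a constant growth rate g, the dividend per share grows at exactly e^{gt}, which is the Lintner-type reduction discussed below (a sketch; the parameter values are illustrative assumptions):

```python
import math

def dividend_per_share(t, A0, n0, r, g):
    # Equation (97.28) with n(t) = n0 constant (no external financing)
    # and a constant growth rate g, so e^{int_0^t g ds} = e^{g t}.
    return (r - g) * A0 * math.exp(g * t) / n0

A0, n0, r, g = 100.0, 10.0, 0.12, 0.05
d0 = dividend_per_share(0.0, A0, n0, r, g)
d5 = dividend_per_share(5.0, A0, n0, r, g)
# the reduction d(t) = d(0) * e^{g t} holds exactly
print(abs(d5 - d0 * math.exp(g * 5.0)) < 1e-12)  # True
```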
It may be interesting to compare this model with the internal growth models. Note that if external equity financing is not permissible, the growth rate in equation (97.27) is equal to the retention rate times the profitability rate, as in Lintner (1964) and Gordon (1963). Also note that, under the no-external-equity-financing assumption, equation (97.28) reduces to \tilde{d}(t)=\tilde{d}_{0}\,e^{\int_{0}^{t}\tilde{g}(s)ds}, which is Lintner's equation (97.8), with the only difference being that, in our case, the growth rate is an explicit function of time. To carry the analysis further under a time-variant stochastic growth rate, we derive the mean and variance of d̃(t) as follows:
E\left[\tilde{d}(t)\right] = \frac{\left(\bar{r}(t)-g(t)\right)A(0)\,e^{\int_{0}^{t}g(s)ds}+\lambda\,\dot{n}(t)\,p(t)}{n(t)} - \frac{\mathrm{Cov}\left(\tilde{r}(t),\,A(0)\,e^{\int_{0}^{t}\tilde{g}(s)ds}\right)-E\left[\tilde{\varepsilon}(t)\,e^{\int_{0}^{t}\tilde{g}(s)ds}\right]}{n(t)}  (97.29a)
and
\mathrm{Var}\left[\tilde{d}(t)\right] = \frac{A(0)^{2}\,\sigma^{2}(t)\,e^{2\int_{0}^{t}\tilde{g}(s)ds}}{n(t)^{2}} + \frac{A(0)^{2}\,\mathrm{Var}\left[\tilde{\varepsilon}(t)\,e^{\int_{0}^{t}\tilde{g}(s)ds}\right]}{n(t)^{2}},  (97.29b)
where ε̃(t) = g̃(t) − g(t). Comparing equations (97.29) and (97.7), we observe that both the mean and the variance of d̃(t) will be subject to specification error if the deterministic growth rate is inappropriately employed. Equation (97.29a) implies that the numerator in the equation for the optimal payout ratio has the following additional element:

\frac{-\mathrm{Cov}\left(\tilde{r}(t),\,A(0)\,e^{\int_{0}^{t}\tilde{g}(s)ds}\right)+E\left[\tilde{\varepsilon}(t)\,e^{\int_{0}^{t}\tilde{g}(s)ds}\right]}{n(t)}.  (97.30)

Now, note that the second term in the expression in equation (97.30) tends to be dominated more and more by the first term as the time t approaches infinity.9 Thus, the direction of the error introduced in the optimal payout ratio when the growth rate is arbitrarily assumed to be non-stochastic increasingly depends upon the size of the covariance term in equation (97.30). If the covariance between the rate of return on equity and the growth rate is positive, the optimal payout ratio under the stochastic growth rate assumption would be lower even if the rate of return for the firm increases over time. In other words, the payout ratio in equation (97.26) under the deterministic growth rate assumption is an overestimation of the optimal payout ratio. In contrast, if the covariance between the rate of return on equity and the growth rate is negative (that is, the higher the growth rate, the lower the rate of return on equity), the payout ratio derived in the preceding section is an underestimation of the optimal payout ratio.
9 From Astrom (2006), we also know that

\operatorname*{plim}_{s\to\infty}\,\tilde{\varepsilon}(s)\,e^{\int_{0}^{t}\tilde{g}(s)ds} = \operatorname*{plim}_{s\to\infty}\left[\tilde{\varepsilon}(s)\right]\,\operatorname*{plim}_{s\to\infty}\left[e^{\int_{0}^{t}\tilde{g}(s)ds}\right],  (A)

\operatorname*{plim}_{s\to\infty}\,e^{\int_{0}^{t}\tilde{g}(s)ds} = e^{\int_{0}^{t}g(s)ds}.  (B)

Substituting (B) into (A), we have

\operatorname*{plim}_{s\to\infty}\,\tilde{\varepsilon}(s)\,e^{\int_{0}^{t}\tilde{g}(s)ds} = e^{\int_{0}^{t}g(s)ds}\,\operatorname*{plim}_{s\to\infty}\left[\tilde{\varepsilon}(s)\right] = 0.
97.4 Empirical Evidence

Our theoretical model, developed in the foregoing sections, shows that a firm's optimal growth rate follows a mean-reverting process with a target rate equal to the rate of return on equity. In addition, the optimal payout ratio will be subject to a specification error if a firm's profitability and growth rate are both stochastic. We here develop the following testable hypotheses in support of the implications derived from our theoretical model.

H1: The firm's growth rate follows a mean-reverting process.

Equation (97.21) shows that the firm's optimal growth rate follows a logistic equation with a special characteristic, a mean-reverting process. This main hypothesis will be tested in three ways. First, in the comparative statics analysis, Table 97.1 and Figure 97.1 show that the firm's optimal growth rate will converge to the steady-state value, which is the firm's return on equity. That is,

H1a: There exists a target rate for the firm's growth rate, and the target rate is the firm's return on equity.

From the implications of equation (97.22), we expect the firm will adjust its growth rate to its target rate gradually. This partial adjustment of the growth rate is optimal.

H1b: The firm partially adjusts its growth rate toward the target rate.

The mean-reverting process of the logistic equation is not a linear process (see Figure 97.1). The adjustment speed of the firm's growth rate is fast in the beginning and slows down as the growth rate approaches the objective value.

H1c: The partial adjustment is fast in the early stage of the mean-reverting process.

Besides the optimal growth rate derived in our theoretical model, the optimal dividend payout ratio is determined simultaneously. When we further introduce a stochastic growth rate into our model, equation (97.29) shows that the dividend will be subject to specification error if a deterministic growth rate is inappropriately employed.
H2: The firm's dividend payout is negatively associated with the covariance between the firm's rate of return on equity and the firm's growth rate.

Because they ignore the covariance between the firm's rate of return on equity and the firm's growth rate, existing dividend models suffer from potential specification error. Equation (97.29) therefore indicates that, if
we introduce the covariance between the firm's rate of return on equity and the firm's growth rate into existing dividend models, the dividend models can be improved.

H2a: The covariance between the firm's rate of return on equity and the firm's growth rate is one of the key determinants of the dividend payout policy.

From Hypothesis 2, a firm's payout ratio may be altered by a change in the covariance between its profitability and its growth rate. In the extreme, a firm's decision of whether to pay a dividend at all will be affected by the covariance between its profitability and its growth rate. Due to the negative correlation between the payout ratio and the covariance term in the theoretical model, we hypothesize as follows:

H3: The firm tends to pay a dividend if the covariance between the firm's rate of return on equity and the firm's growth rate is lower.

In addition, we can investigate whether the covariance between the profitability and the growth rate can affect the decision to stop or initiate a dividend payment.

H3a: The firm tends to stop paying a dividend if the covariance between the firm's rate of return on equity and the firm's growth rate is higher.

H3b: The firm tends to start paying a dividend if the covariance between the firm's rate of return on equity and the firm's growth rate is lower.

97.4.1 Sample description

To test the hypotheses developed from our model, we collect firm information from Compustat, including total assets, sales, net income, and dividend payouts. We also collect stock prices, stock returns, share codes, and exchange codes from the Center for Research in Security Prices (CRSP) files. Our empirical study focuses on common stocks (SHRCD = 10, 11) and firms listed on the NYSE, AMEX, or NASDAQ (EXCHCD = 1, 2, 3, 31, 32, 33) during 1969 to 2011. We exclude utility services (SICH = 4900–4999) and financial institutions (SICH = 6000–6999) from our sample. Similar to Rozeff (1982), Jagannathan et al.
(2000), and Lee et al. (2011), we use five-year rolling averages of all variables to examine the firm's payout policy. Therefore, the sample only includes firm-years with at least 5 years of data available to compute average payout ratios, growth rates, return on assets, beta, size, and book-to-market ratios. The payout ratio is the ratio of the dividend payout to net income. The growth rate is the growth rate of total assets. The sales growth rate and the sustainable growth rate proposed by Higgins (1977) are
page 3440
July 6, 2020
15:56
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch97
Sustainable Growth Rate, Optimal Growth Rate, and Optimal Payout Ratio
page 3441
3441
also used in examining the hypotheses. The beta coefficient is estimated by the market model over the prior 60 months; firm-years in our sample must therefore have at least 60 consecutive previous monthly returns. We exclude firm-years with no dividend payout in the year before or after, to ensure that the dividend payout policy reflects the firm's full-year condition.

Table 97.2 presents the summary statistics for our sample firms between 1969 and 2011. Panel A shows that there are a total of 31,255 dividend-paying firm-years, including 13,136 high growth firm-years and 18,119 low growth firm-years. High growth firm-years are those with 5-year average total asset growth rates higher than their 5-year average rates of return on equity; low growth firm-years are those with 5-year average total asset growth lower than their 5-year average rate of return on equity. Similar to the samples presented by Fama and French (2001) and Lee et al. (2011), the number of dividend-paying firms increases from 1969 to the middle of the 1980s and subsequently declines through 2011. A decreasing payout ratio over the sample period can also be observed.

The definition of a high growth firm (low growth firm) in our theoretical model is based on the relative relationship between a firm's total asset growth and its return on equity. Therefore, a firm can be classified as high growth because it has high total asset growth or a low return on equity. Panel A shows that, on average, high growth firms have both a higher growth rate and a lower return on equity than low growth firms.

Panel B presents the five-year moving averages of the mean, median, and standard deviation of the payout ratio, growth rates, return on equity, beta coefficient, market capitalization, market-to-book ratio, and covariance between the growth rate and the return on equity across all firm-years in the sample.
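The variable construction just described — 5-year rolling averages of firm characteristics and market-model betas from the prior 60 monthly returns — can be sketched as follows. This is a minimal illustration on made-up data (the fundamentals and simulated returns are hypothetical), not the chapter's actual Compustat/CRSP processing:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# --- 5-year rolling averages for one firm (hypothetical fundamentals) ---
firm = pd.DataFrame({
    "year": range(2000, 2010),
    "dividends": [10, 11, 12, 12, 13, 14, 14, 15, 15, 16],
    "net_income": [40, 42, 41, 45, 48, 50, 49, 52, 55, 56],
})
firm["payout_ratio"] = firm["dividends"] / firm["net_income"]
# min_periods=5 mirrors the restriction to firm-years with at least
# five years of data available.
firm["payout_5y"] = firm["payout_ratio"].rolling(5, min_periods=5).mean()

# --- market-model beta over the prior 60 months (simulated returns) ---
T, true_beta = 60, 1.2
r_m = rng.normal(0.01, 0.04, T)                          # market return
r_i = 0.002 + true_beta * r_m + rng.normal(0, 0.05, T)   # stock return
X = np.column_stack([np.ones(T), r_m])                   # constant + market
alpha_hat, beta_hat = np.linalg.lstsq(X, r_i, rcond=None)[0]

print(firm[["year", "payout_5y"]].tail(3))
print("estimated beta:", round(beta_hat, 3))
```

With only 60 monthly observations the beta estimate is noisy, which is exactly why the later chapters' errors-in-variables concerns arise.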
For high growth firms, relative to low growth firms, we observe higher average growth rates and beta coefficients, and lower average payout ratios, returns on equity, market capitalizations, and market-to-book ratios. The lower mean and median values, together with the higher standard deviation, of the covariance between the growth rate and the return on equity indicate that the relationship between growth and profitability is positive for part of the sample firms and negative for the rest.

97.4.2 Mean-reverting process for the growth rate

To test the first hypothesis, we examine whether the growth rates of sample firms follow a mean-reverting process similar to the patterns shown in Figure 97.1.
Table 97.2: Summary statistics of sample firm characteristics.

Panel A. Sample size (columns 2–8: all sample firms; columns 9–12: high growth firms; columns 13–16: low growth firms; Cov×10³, Cov⁺×10³, and Cov⁻×10³ are medians of the covariance terms × 1,000)

| Year | OBS | Payout | gTA | ROE | Cov×10³ | Cov⁺×10³ | Cov⁻×10³ | OBS (High) | Payout (High) | gTA (High) | ROE (High) | OBS (Low) | Payout (Low) | gTA (Low) | ROE (Low) |
|------|-----|--------|-----|-----|---------|---------|---------|-----------|--------------|-----------|-----------|----------|-------------|----------|----------|
| 1969 | 542 | 0.4265 | 0.1705 | 0.1351 | 0.2350 | 0.5939 | −0.4525 | 278 | 0.3958 | 0.2174 | 0.1253 | 264 | 0.4589 | 0.1211 | 0.1455 |
| 1970 | 625 | 0.4244 | 0.1634 | 0.1338 | 0.2852 | 0.6311 | −0.4721 | 284 | 0.3917 | 0.2172 | 0.1242 | 341 | 0.4516 | 0.1187 | 0.1418 |
| 1971 | 882 | 0.4218 | 0.1502 | 0.1263 | 0.2829 | 0.6450 | −0.3555 | 370 | 0.3797 | 0.2125 | 0.1174 | 512 | 0.4522 | 0.1052 | 0.1327 |
| 1972 | 966 | 0.4076 | 0.1451 | 0.1225 | 0.3120 | 0.5969 | −0.3406 | 477 | 0.3774 | 0.1868 | 0.1108 | 489 | 0.4371 | 0.1044 | 0.1339 |
| 1973 | 1,059 | 0.3861 | 0.1377 | 0.1215 | 0.3534 | 0.6205 | −0.3223 | 574 | 0.3584 | 0.1682 | 0.1085 | 485 | 0.4189 | 0.1016 | 0.1370 |
| 1974 | 1,053 | 0.3700 | 0.1336 | 0.1233 | 0.4552 | 0.7337 | −0.3207 | 643 | 0.3549 | 0.1545 | 0.1124 | 410 | 0.3936 | 0.1008 | 0.1405 |
| 1975 | 1,066 | 0.3516 | 0.1307 | 0.1265 | 0.5247 | 0.7803 | −0.4118 | 626 | 0.3330 | 0.1528 | 0.1165 | 440 | 0.3780 | 0.0994 | 0.1407 |
| 1976 | 1,132 | 0.3293 | 0.1329 | 0.1306 | 0.5425 | 0.8286 | −0.3313 | 716 | 0.3078 | 0.1525 | 0.1218 | 416 | 0.3663 | 0.0992 | 0.1459 |
| 1977 | 1,146 | 0.3176 | 0.1356 | 0.1339 | 0.5652 | 0.8804 | −0.3969 | 704 | 0.2933 | 0.1564 | 0.1259 | 442 | 0.3563 | 0.1024 | 0.1465 |
| 1978 | 1,168 | 0.3139 | 0.1388 | 0.1366 | 0.5979 | 0.9327 | −0.3647 | 666 | 0.2898 | 0.1643 | 0.1280 | 502 | 0.3459 | 0.1051 | 0.1480 |
| 1979 | 1,290 | 0.3131 | 0.1427 | 0.1404 | 0.5963 | 0.9261 | −0.3183 | 656 | 0.2860 | 0.1726 | 0.1305 | 634 | 0.3411 | 0.1118 | 0.1506 |
| 1980 | 1,301 | 0.3089 | 0.1577 | 0.1445 | 0.6106 | 0.8911 | −0.3199 | 747 | 0.2870 | 0.1838 | 0.1352 | 554 | 0.3384 | 0.1226 | 0.1570 |
| 1981 | 1,265 | 0.3076 | 0.1627 | 0.1461 | 0.5452 | 0.8914 | −0.4376 | 663 | 0.2796 | 0.2001 | 0.1358 | 602 | 0.3384 | 0.1215 | 0.1575 |
| 1982 | 1,117 | 0.3121 | 0.1556 | 0.1448 | 0.5850 | 0.8728 | −0.4735 | 463 | 0.2769 | 0.2124 | 0.1352 | 654 | 0.3371 | 0.1153 | 0.1516 |
| 1983 | 1,037 | 0.3216 | 0.1456 | 0.1408 | 0.6258 | 1.0056 | −0.5177 | 349 | 0.2804 | 0.2133 | 0.1283 | 688 | 0.3426 | 0.1113 | 0.1471 |
| 1984 | 948 | 0.3286 | 0.1366 | 0.1377 | 0.5390 | 0.9844 | −0.5626 | 294 | 0.2690 | 0.2131 | 0.1239 | 654 | 0.3554 | 0.1022 | 0.1438 |
| 1985 | 844 | 0.3287 | 0.1392 | 0.1342 | 0.5160 | 1.0210 | −0.6298 | 240 | 0.2611 | 0.2345 | 0.1198 | 604 | 0.3556 | 0.1013 | 0.1400 |
| 1986 | 721 | 0.3317 | 0.1401 | 0.1328 | 0.5018 | 1.0522 | −0.4453 | 209 | 0.2648 | 0.2403 | 0.1204 | 512 | 0.3590 | 0.0992 | 0.1379 |
| 1987 | 715 | 0.3294 | 0.1546 | 0.1338 | 0.5057 | 1.0468 | −0.5653 | 276 | 0.2850 | 0.2326 | 0.1175 | 439 | 0.3573 | 0.1056 | 0.1440 |
| 1988 | 660 | 0.3244 | 0.1594 | 0.1377 | 0.3957 | 0.9576 | −0.5397 | 278 | 0.2862 | 0.2261 | 0.1207 | 382 | 0.3522 | 0.1109 | 0.1500 |
| 1989 | 623 | 0.3266 | 0.1589 | 0.1397 | 0.2779 | 0.8982 | −0.6446 | 244 | 0.2860 | 0.2268 | 0.1201 | 379 | 0.3527 | 0.1152 | 0.1523 |
| 1990 | 615 | 0.3284 | 0.1453 | 0.1403 | 0.2659 | 0.8479 | −0.5946 | 219 | 0.2671 | 0.2155 | 0.1166 | 396 | 0.3622 | 0.1065 | 0.1534 |
| 1991 | 571 | 0.3287 | 0.1295 | 0.1575 | 0.2710 | 0.7840 | −0.6762 | 191 | 0.2867 | 0.1863 | 0.1106 | 380 | 0.3498 | 0.1009 | 0.1811 |
| 1992 | 551 | 0.3329 | 0.1180 | 0.1603 | 0.2777 | 0.7672 | −0.7853 | 163 | 0.2769 | 0.1729 | 0.1095 | 388 | 0.3565 | 0.0949 | 0.1817 |
| 1993 | 530 | 0.3434 | 0.1127 | 0.1468 | 0.2885 | 0.7439 | −0.7512 | 139 | 0.2812 | 0.1774 | 0.1063 | 391 | 0.3656 | 0.0897 | 0.1612 |
| 1994 | 531 | 0.3459 | 0.1108 | 0.1446 | 0.2865 | 0.7525 | −0.6495 | 151 | 0.2772 | 0.1711 | 0.1090 | 380 | 0.3732 | 0.0869 | 0.1587 |
| 1995 | 543 | 0.3392 | 0.1177 | 0.1451 | 0.3031 | 0.7326 | −0.7828 | 173 | 0.2909 | 0.1918 | 0.1107 | 370 | 0.3618 | 0.0830 | 0.1611 |
| 1996 | 581 | 0.3259 | 0.1271 | 0.1473 | 0.3232 | 0.7563 | −0.8909 | 197 | 0.2708 | 0.2104 | 0.1176 | 384 | 0.3542 | 0.0843 | 0.1625 |
| 1997 | 605 | 0.3156 | 0.1351 | 0.1534 | 0.2990 | 0.8822 | −0.8388 | 223 | 0.2511 | 0.2215 | 0.1237 | 382 | 0.3533 | 0.0847 | 0.1707 |
| 1998 | 607 | 0.2987 | 0.1423 | 0.1637 | 0.2775 | 0.9772 | −0.8321 | 217 | 0.2310 | 0.2306 | 0.1252 | 390 | 0.3363 | 0.0931 | 0.1852 |
| 1999 | 567 | 0.2981 | 0.1445 | 0.1654 | 0.1679 | 1.0346 | −0.9170 | 186 | 0.2375 | 0.2435 | 0.1268 | 381 | 0.3277 | 0.0971 | 0.1841 |
| 2000 | 543 | 0.2923 | 0.1361 | 0.1918 | 0.1171 | 0.9921 | −1.1064 | 166 | 0.2459 | 0.2407 | 0.1306 | 377 | 0.3127 | 0.0916 | 0.2185 |
| 2001 | 448 | 0.2872 | 0.1251 | 0.1741 | 0.1980 | 1.0523 | −1.2135 | 105 | 0.2232 | 0.2325 | 0.1216 | 343 | 0.3067 | 0.0933 | 0.1900 |
| 2002 | 428 | 0.2867 | 0.1238 | 0.1794 | 0.1992 | 1.0553 | −1.3804 | 97 | 0.2237 | 0.2384 | 0.1202 | 331 | 0.3051 | 0.0911 | 0.1962 |
| 2003 | 446 | 0.2854 | 0.1220 | 0.1686 | 0.2906 | 1.0991 | −1.5649 | 111 | 0.2397 | 0.2190 | 0.1163 | 335 | 0.3005 | 0.0905 | 0.1854 |
| 2004 | 443 | 0.2838 | 0.1226 | 0.1603 | 0.3217 | 1.3396 | −1.6719 | 139 | 0.2426 | 0.1950 | 0.1120 | 304 | 0.3027 | 0.0899 | 0.1819 |
| 2005 | 448 | 0.2874 | 0.1256 | 0.1591 | 0.2498 | 1.2013 | −1.2998 | 140 | 0.2302 | 0.2184 | 0.1173 | 308 | 0.3134 | 0.0828 | 0.1770 |
| 2006 | 486 | 0.2836 | 0.1300 | 0.1602 | 0.2163 | 1.1834 | −0.9927 | 181 | 0.2380 | 0.2032 | 0.1197 | 305 | 0.3107 | 0.0856 | 0.1831 |
| 2007 | 489 | 0.2852 | 0.1352 | 0.1719 | 0.2317 | 1.1999 | −1.0102 | 196 | 0.2360 | 0.2031 | 0.1276 | 293 | 0.3181 | 0.0898 | 0.2016 |
| 2008 | 442 | 0.2892 | 0.1314 | 0.1851 | 0.2139 | 1.0955 | −0.9313 | 148 | 0.2269 | 0.2195 | 0.1358 | 294 | 0.3205 | 0.0871 | 0.2100 |
| 2009 | 413 | 0.3056 | 0.1130 | 0.1989 | 0.1183 | 0.8223 | −1.0137 | 82 | 0.2327 | 0.2464 | 0.1266 | 331 | 0.3237 | 0.0800 | 0.2169 |
| 2010 | 433 | 0.3195 | 0.1198 | 0.1988 | 0.1701 | 0.8769 | −1.3359 | 88 | 0.2426 | 0.2425 | 0.1144 | 345 | 0.3392 | 0.0885 | 0.2203 |
| 2011 | 375 | 0.3264 | 0.1136 | 0.1967 | 0.0976 | 1.0305 | −1.2635 | 67 | 0.2489 | 0.2458 | 0.1170 | 308 | 0.3433 | 0.0848 | 0.2140 |
| Total/Average | 31,255 | 0.3307 | 0.1393 | 0.1455 | 0.4115 | 0.8737 | −0.6221 | 13,136 | 0.2965 | 0.1930 | 0.1223 | 18,119 | 0.3554 | 0.1005 | 0.1622 |

Panel B. Descriptive statistics of characteristics of sample (All Sample: N = 31,255; High Growth Firms: N = 13,136; Low Growth Firms: N = 18,119)

| Variable | Mean (All) | Median (All) | Stdev (All) | Mean (High) | Median (High) | Stdev (High) | Mean (Low) | Median (Low) | Stdev (Low) |
|----------|-----------|--------------|-------------|-------------|---------------|--------------|------------|--------------|-------------|
| Payout Ratio | 0.3307 | 0.3172 | 0.1545 | 0.2965 | 0.2791 | 0.1514 | 0.3554 | 0.3466 | 0.1519 |
| gTA | 0.1393 | 0.1178 | 0.1095 | 0.1930 | 0.1632 | 0.1301 | 0.1005 | 0.0937 | 0.0699 |
| gSales | 0.1332 | 0.1164 | 0.1165 | 0.2048 | 0.1766 | 0.1390 | 0.0813 | 0.0812 | 0.0558 |
| gsus | 0.1129 | 0.1003 | 0.1878 | 0.1031 | 0.0937 | 0.0575 | 0.1201 | 0.1051 | 0.2415 |
| ROE | 0.1455 | 0.1306 | 0.1512 | 0.1223 | 0.1164 | 0.0452 | 0.1622 | 0.1425 | 0.1931 |
| Beta | 1.0352 | 1.0296 | 0.4136 | 1.1213 | 1.1118 | 0.4194 | 0.9803 | 0.9826 | 0.4004 |
| Size ($MM) | 3,036 | 209 | 16,068 | 1,138 | 140 | 5,946 | 4,357 | 295 | 20,217 |
| M/B | 1.8375 | 1.3802 | 1.7307 | 1.4979 | 1.1714 | 1.3836 | 2.0735 | 1.5687 | 1.9001 |
| Cov(gTA, ROE) | 0.0003 | 0.0004 | 0.0428 | 0.0009 | 0.0005 | 0.0082 | −0.0001 | 0.0004 | 0.0558 |
| Cov(gSales, ROE) | 0.0239 | 0.0007 | 1.6671 | 0.0017 | 0.0007 | 0.0439 | 0.0401 | 0.0007 | 2.1892 |
| Cov(gsus, ROE) | 0.0015 | 0.0007 | 0.0196 | 0.0020 | 0.0008 | 0.0066 | 0.0012 | 0.0006 | 0.0252 |

Notes: This table shows the descriptive statistics for the major characteristics of sample firms. The sample includes firms listed on NYSE, AMEX, or NASDAQ with at least 5 years of data available to compute 5-year average payout ratios, growth rates, return on equity, beta, size, book-to-market ratios, and covariance terms. Cov×10³ represents the median of the covariance between return on equity and growth rates times 1,000; Cov⁺×10³ (Cov⁻×10³) represents the median of the positive (negative) covariance terms times 1,000. Financial institutions and utility companies are excluded from the sample. Panel A lists the numbers of firm-years and the averages of 5-year average payout ratios, total asset growth, and return on equity for all sample firms, high growth firms, and low growth firms, respectively, between 1969 and 2011. High growth firm-years are defined as firm-years with 5-year average total asset growth higher than their 5-year average rates of return on equity; low growth firm-years are defined as firm-years with 5-year average total asset growth lower than their 5-year average rates of return on equity. Panel B lists the mean, median, and standard deviation of the 5-year averages of the payout ratio, growth rates, rate of return on equity, beta risk, size, market-to-book ratio, and covariance between return on equity and growth rates. The payout ratio is the ratio of the dividend payout to earnings. Growth rates are total asset growth (gTA), sales growth (gSales), and the sustainable growth rate (gsus) proposed by Higgins (1977). The beta coefficient is estimated by the market model over the prior 60 months. Size represents the market capitalization calculated as the closing price times shares outstanding on the last trading day of June of that year.
In each year, we group sample firms into six portfolios based on their growth levels, where the growth level is defined as total asset growth relative to the rate of return on equity. If a firm's growth level is larger than one, the firm is assigned to the high-growth group; if its growth level is less than one, it is assigned to the low-growth group. We further divide the high-growth group into three portfolios (P1, P2, and P3): firms in P1 (P2, P3) are those high-growth firms in the top 30% (middle 40%, bottom 30%) by growth level. The low-growth group is likewise divided into three portfolios (P4, P5, and P6): firms in P4 (P5, P6) are those low-growth firms in the top 30% (middle 40%, bottom 30%) by growth level. We then investigate how the average growth rate of each portfolio changes over the following 10 years. We also provide the average rate of return on equity for all sample firms in each year as the reference for the target rate in the mean-reverting process.10

Figure 97.3 presents the empirical evidence on the mean-reverting process of firms' growth rates in 1970, 1980, 1990, and 2000, respectively. We find that, in every period, firms with a high growth level adjust their growth rates down in succeeding years, while firms with a low growth level adjust their growth rates up. It takes about five years for firms' growth rates to reach the target rate, which is the average rate of return on equity for all sample firms. Figure 97.3 therefore shows that firms have a target rate for their growth and adjust toward it partially rather than immediately. Note that our theoretical model shows that such a partially adjusted growth rate is the optimal growth rate in each year. Thus, Figure 97.3 provides empirical evidence on Hypotheses 1a and 1b. One may argue that different firms have different rates of return on equity, and therefore have different target rates.
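The portfolio formation described above can be sketched as follows; the firm-year data here are randomly generated for illustration (hypothetical values), not the actual sample:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical firm-year cross-section: 5-year average total asset growth and ROE.
firms = pd.DataFrame({
    "g_ta": rng.uniform(0.02, 0.30, 300),
    "roe": rng.uniform(0.05, 0.25, 300),
})

# Growth level = total asset growth relative to ROE; > 1 means high growth.
firms["growth_level"] = firms["g_ta"] / firms["roe"]
high = firms[firms["growth_level"] > 1].copy()
low = firms[firms["growth_level"] <= 1].copy()

def split_30_40_30(df, labels):
    """Assign top 30% / middle 40% / bottom 30% by growth level.

    labels are ordered from highest to lowest growth level (e.g. P1, P2, P3).
    """
    q30, q70 = df["growth_level"].quantile([0.3, 0.7])
    bins = [-np.inf, q30, q70, np.inf]
    return pd.cut(df["growth_level"], bins=bins, labels=list(reversed(labels)))

high["portfolio"] = split_30_40_30(high, ["P1", "P2", "P3"])
low["portfolio"] = split_30_40_30(low, ["P4", "P5", "P6"])
print(high["portfolio"].value_counts().to_dict())
print(low["portfolio"].value_counts().to_dict())
```

Tracking each portfolio's average growth rate over the following 10 years would then reproduce the kind of convergence pattern shown in Figure 97.3.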
The model also shows that, in a steady state, the target rate should be set at the beginning rather than change over time. We therefore use a partial adjustment model allowing each firm to have its own target growth rate. Consider a situation in which firm i has a target rate ROE_{i,t} at time t and its current growth rate is g_{i,t}. There is a gap of ROE_{i,t} − g_{i,t} for firm i's growth rate to adjust. If ROE_{i,t} − g_{i,t} is positive, firm i should adjust its growth rate up in the following years to
10. Besides using the average rate of return on equity for all sample firms in each year as the reference for the target growth rate, we also use the equally-weighted and value-weighted industry average rates of return on equity in each year as the target rate. Results of the following analyses are similar and the main conclusions hold. To save space in this chapter, we only present results using the average rate of return on equity for all sample firms in each year as the target growth rate.
Figure 97.3: Mean-reverting process for firms' growth rates. (Panels (a)–(d) plot the average growth rates of portfolios P1 (high growth) through P6 (low growth) over the following 10 years, together with the average ROE as the target-rate reference, starting from 1970, 1980, 1990, and 2000, respectively.)

Notes: The figures show the mean-reverting process of firms' growth rates starting from 1970, 1980, 1990, and 2000, respectively. Sample firms are divided into six portfolios based on their growth levels; the growth level is defined as total asset growth relative to the rate of return on equity. If a firm's growth level is larger than one, the firm is assigned to the high-growth group; if its growth level is less than one, it is assigned to the low-growth group. The high-growth group is divided into three portfolios (P1, P2, and P3): firms in P1 (P2, P3) are high-growth firms in the top 30% (middle 40%, bottom 30%) by growth level. The low-growth group is likewise divided into three portfolios (P4, P5, and P6): firms in P4 (P5, P6) are low-growth firms in the top 30% (middle 40%, bottom 30%) by growth level. The average growth rates of each portfolio over the following 10 years are presented, and the average rate of return on equity for all sample firms in each year is presented as the reference for the target rate in the mean-reverting process.
catch up to its target rate. If ROE_{i,t} − g_{i,t} is negative, firm i should decrease its growth rate in the following years because it cannot sustain such high growth. Therefore, firm i's growth rate at time t + 1, g_{i,t+1}, will be a function of g_{i,t} and ROE_{i,t}:

g_{i,t+1} = g_{i,t} + β_{t,1}(ROE_{i,t} − g_{i,t}),
(97.31)
where β_{t,1} represents the adjustment level. If β_{t,1} is equal to one, firm i adjusts its growth rate to ROE_{i,t} in only one year. If β_{t,1} is less than one, firm i partially adjusts its growth rate toward ROE_{i,t} in the following year. Our theoretical model shows that the mean-reverting process of the growth rate is a partial rather than an immediate adjustment. Therefore, we expect β_{t,1} to be positive and less than one. In addition, the partial adjustment process may last for years and can be expressed as

g_{i,t+j} = g_{i,t} + β_{t,j}(ROE_{i,t} − g_{i,t}),
(97.32)
where i = 1, ..., n, j = 1, ..., T, and β_{t,j} is the adjustment level at time t + j. Here we use a partial adjustment model to investigate the mean-reverting process of firms' growth rates. The regressions can be expressed as

Δg_{i,t+j} = β_{t,j}(ROE_{i,t} − g_{i,t}) + e_{i,t+j},
(97.33)
where Δg_{i,t+j} = g_{i,t+j} − g_{i,t}, i = 1, ..., n, j = 1, ..., 10, and t = 1970, ..., 2010. At each time point, there are 10 cross-sectional regressions, and 10 estimates β̂_{t,j} can be obtained. The time-series averages of β̂_{t,j} are presented in Table 97.3. Panel A shows the time-series averages of the partial adjustment level for all sample firms. The value of 0.1299 for β̂_{t,1} indicates that a firm, on average, closes 12.99% of the gap between its initial growth rate and its target rate. The first five averages of the partial adjustment level are significantly positive, supporting the mean-reverting process of the growth rate (Hypothesis 1). This also indicates that firms do adjust their subsequent 5-year growth rates toward their target rate, which is the current rate of return on equity (Hypothesis 1a). The time-series averages of β̂_{t,j} are less than one, indicating that firms adjust their growth rates to the target rate gradually (Hypothesis 1b). A monotonically decreasing trend from β̂_{t,1} (12.99%) to β̂_{t,5} (10.82%) shows that the partial adjustment of the growth rate is faster in the early stage of the mean-reverting process (Hypothesis 1c). Panel B presents the time-series averages of the partial adjustment level for the six portfolios grouped by growth level. We find that portfolios P1, P2, P5, and P6 support the mean-reverting process of the growth rate (Hypothesis 1), the existence of a target growth rate (Hypothesis 1a), partial adjustment of the growth rate (Hypothesis 1b), and faster partial adjustment in the beginning stage (Hypothesis 1c). However, portfolios P3 and P4 do not fully support our theoretical model and hypotheses for the partial adjustment process. Especially for P4, the significantly negative values reject the existence of a mean-reverting process for the growth rate. One possible explanation is that firms in P3 and P4 have initial growth rates very close to
Table 97.3: Mean-reverting process.

Panel A. All sample firms

| | j=1 | j=2 | j=3 | j=4 | j=5 | j=6 | j=7 | j=8 | j=9 | j=10 |
|---|---|---|---|---|---|---|---|---|---|---|
| β̂t,j | 0.1299*** (10.11) | 0.1276*** (10.83) | 0.1230*** (11.35) | 0.1188*** (11.27) | 0.1082*** (9.20) | 0.0045 (0.89) | 0.0008 (0.18) | 0.0046 (0.76) | 0.0081 (1.28) | 0.0107* (1.91) |

Panel B. Portfolios grouped by growth level

| Portfolio | j=1 | j=2 | j=3 | j=4 | j=5 | j=6 | j=7 | j=8 | j=9 | j=10 |
|---|---|---|---|---|---|---|---|---|---|---|
| P1 (High Growth) | 0.2199*** (11.74) | 0.2072*** (11.59) | 0.1865*** (10.90) | 0.1791*** (10.22) | 0.1683*** (8.88) | 0.0164* (1.91) | 0.0220*** (2.67) | 0.0252** (2.37) | 0.0346*** (3.30) | 0.0314*** (3.03) |
| P2 | 0.1838*** (7.31) | 0.1715*** (6.43) | 0.2003*** (8.63) | 0.2149*** (7.99) | 0.2018*** (6.09) | 0.1067*** (3.52) | 0.0875*** (3.19) | 0.0596* (1.94) | 0.0498** (2.08) | 0.0342 (1.17) |
| P3 | 0.2939*** (4.30) | 0.3175*** (3.57) | 0.3234*** (4.08) | 0.3739*** (4.97) | 0.5283*** (6.90) | 0.3525*** (4.40) | 0.3624*** (4.14) | 0.2827*** (2.73) | 0.2451** (2.52) | 0.1456** (2.18) |
| P4 | −0.0654 (−1.08) | −0.1236** (−2.18) | −0.1755*** (−2.86) | −0.2217*** (−4.35) | −0.1798** (−2.25) | −0.2304** (−2.49) | −0.2085** (−2.28) | −0.1443* (−1.77) | −0.0230 (−0.28) | −0.0871 (−0.68) |
| P5 | 0.0574*** (2.99) | 0.0420** (2.09) | 0.0168 (0.85) | 0.0116 (0.57) | 0.0144 (0.74) | −0.0685** (−2.71) | −0.0569* (−1.94) | −0.0292 (−0.97) | −0.0222 (−0.68) | −0.0169 (−0.60) |
| P6 (Low Growth) | 0.0621*** (6.45) | 0.0680*** (6.31) | 0.0659*** (6.24) | 0.0616*** (4.72) | 0.0558*** (4.51) | 0.0006 (0.05) | −0.0125 (−1.04) | −0.0173 (−1.40) | −0.0236 (−1.57) | −0.0166 (−1.09) |

Notes: This table presents results of partial adjustment regressions investigating the mean-reverting process of firms' growth rates during 1970 to 2010. Regressions can be expressed as

Δg_{i,t+j} = β_{t,j}(ROE_{i,t} − g_{i,t}) + e_{i,t+j},

where Δg_{i,t+j} = g_{i,t+j} − g_{i,t}, i = 1, ..., n, j = 1, ..., 10, and t = 1970, ..., 2010. At each time point, 10 cross-sectional regressions are estimated and 10 β̂_{t,j} are obtained. Panel A shows the time-series averages of β̂_{t,j} for all sample firms. Panel B shows the time-series averages of β̂_{t,j} for six portfolios grouped by the growth level, which is the ratio of total asset growth to the rate of return on equity. The t-statistics are presented in parentheses. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
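The estimation behind Table 97.3 — yearly cross-sectional regressions of equation (97.33) through the origin, averaged over time — can be sketched on simulated data as follows (the true adjustment coefficient of 0.13 and all other parameter values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

n_firms, n_years, true_beta = 400, 30, 0.13
betas = []
for t in range(n_years):
    roe = rng.uniform(0.08, 0.20, n_firms)               # target rates ROE_{i,t}
    g0 = rng.uniform(0.00, 0.35, n_firms)                # current growth rates g_{i,t}
    gap = roe - g0                                       # ROE_{i,t} - g_{i,t}
    dg = true_beta * gap + rng.normal(0, 0.03, n_firms)  # Delta g_{i,t+j}
    # Cross-sectional OLS through the origin, as in equation (97.33).
    betas.append((gap @ dg) / (gap @ gap))

# Time-series average of the yearly cross-sectional slopes.
print(round(float(np.mean(betas)), 4))
```

Averaging the yearly slopes and computing their time-series t-statistic is the same two-step logic used to produce the entries of Table 97.3.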
their rates of return on equity, which are their target growth rates. This means that firms in P3 and P4 are already close to their optimal growth rates in the steady state, so they do not change their growth rates as aggressively as the firms in P1, P2, P5, and P6. In addition, because the gap between the initial growth rate and the target rate, |ROE_{i,t} − g_{i,t}|, is relatively small, a significantly negative β̂_{t,j} does not have much economic meaning. Overall, Figure 97.3 and Table 97.3 provide empirical evidence in support of our theoretical model and hypotheses depicting characteristics of the mean-reverting process of the growth rate.

97.4.3 Specification error and determinants of dividend payout model

To examine the effect of ignoring the covariance between the profitability and the growth rate on determining dividend payout policy, we use the fixed-effects model proposed by Lee et al. (2011) and modify it as follows:

ln[payout ratio_{i,t} / (1 − payout ratio_{i,t})] = α + β₁Cov_{i,t} + β₂Risk_{i,t} + β₃D_{i,t}(g_{i,t} < ROE_{i,t}) · Risk_{i,t} + β₄Growth_{i,t} + β₅Cost_{i,t} + β₆ln(Size)_{i,t} + β₇ROE_{i,t} + e_{i,t},
(97.34)
in which the dependent variable is the logistic transformation of the payout ratio and the independent variables are the covariance between the return on equity and the growth rate, beta risk, the dummy variable times beta risk, the growth rate, the cost of equity, the log of size, and the rate of return on equity. The dummy variable (D_{i,t}) is equal to 1 if a firm's 5-year average growth rate is less than its 5-year rate of return on equity, and 0 otherwise. The total asset growth, sales growth, and sustainable growth rate are used in turn to examine the dividend payout model, and both time and firm effects are considered. Table 97.4 presents results of fixed-effects regressions for 2,342 firms during the period between 1969 and 2011. In models (1)–(3), the significantly positive estimated coefficients of the interaction between the dummy variable and risk are consistent with the results of Lee et al. (2011): there is a structural change in the relationship between the payout ratio and risk in the dividend payout model. The structural change indicates that the payout policy of low growth firms follows the flexibility hypothesis, while the payout policy of high growth firms follows the free cash flow hypothesis. When we introduce the covariance between profitability and the growth rate into the regressions (models (4)–(6)), the structural change still holds.
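A minimal sketch of the dependent variable in equation (97.34), the logistic (log-odds) transformation of the payout ratio:

```python
import numpy as np

def logistic_transform(payout_ratio):
    # ln[p / (1 - p)]: maps payout ratios in (0, 1) onto the whole real
    # line, which makes a linear regression specification appropriate.
    p = np.asarray(payout_ratio, dtype=float)
    return np.log(p / (1.0 - p))

print(logistic_transform([0.25, 0.50, 0.75]))
```

The transformation is symmetric around a payout ratio of 0.5 and diverges as the ratio approaches 0 or 1, so firm-years with payout ratios at those boundaries cannot enter the regression.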
Table 97.4: Fixed-effects regressions using theoretical structural change point.

| Model | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Intercept | −0.2228 (−0.39) | −0.1422 (−0.25) | −0.1125 (−0.20) | −0.2269 (−0.40) | −0.1282 (−0.22) | −0.1064 (−0.19) |
| Cov(gTA, ROE) | | | | −0.0558*** (−2.07) | | |
| Cov(gSales, ROE) | | | | | −6.5212*** (−10.41) | |
| Cov(gsus, ROE) | | | | | | −0.0109*** (−4.88) |
| Beta | −0.0733 (−1.63) | −0.0884* (−1.96) | −0.0685 (−1.52) | −0.0727 (−1.62) | −0.0898** (−2.00) | −0.0690 (−1.53) |
| D*Beta | 0.2423*** (21.02) | 0.2462*** (21.36) | 0.2669*** (23.08) | 0.2423*** (21.02) | 0.2506*** (21.78) | 0.2645*** (22.87) |
| gTA | −0.6715*** (−15.48) | | | −0.6714*** (−15.48) | | |
| gSales | | −0.6353*** (−14.02) | | | −0.6008*** (−13.25) | |
| gsus | | | −0.0627*** (−3.12) | | | −0.0972*** (−4.57) |
| Cost of Equity | −30.3128*** (−4.23) | −29.0850*** (−4.05) | −33.0227*** (−4.58) | −30.4163*** (−4.24) | −27.2327*** (−3.80) | −32.9414*** (−4.57) |
| ln(Size) | −0.0277*** (−3.33) | −0.0370*** (−4.48) | −0.0431*** (−5.19) | −0.0272*** (−3.27) | −0.0401*** (−4.86) | −0.0437*** (−5.26) |
| ROE | 0.0761*** (2.97) | 0.0725*** (2.83) | 0.0308 (1.13) | 0.0756*** (2.95) | 0.0782*** (3.06) | |
| R² | 0.0815 | 0.0798 | 0.0721 | 0.0817 | 0.0842 | |

where c₁ > 0 and c₃ > 0, or equivalently,

P(t) = c₄ · e^{kt} / n(t)^λ,  where c₄ > 0.    (97A.12)

Finally, we have

p(t) = −(e^{kt} / n(t)^λ) ∫ H(t) n(t)^λ e^{−kt} dt.    (97A.13)

Changing from an indefinite integral to a definite integral, equation (97A.13) can be shown as

p(t) = (e^{kt} / n(t)^λ) ∫ₜᵀ H(s) n(s)^λ e^{−ks} ds,    (97A.14)

which is equation (97.16).
Appendix 97B: Derivation of Equation (97.20)

This appendix presents a detailed procedure for deriving equation (97.20). Following the Euler–Lagrange condition (see Chiang, 1984), we first take the first-order condition of equation (97.17) with respect to t, allowing only n(t) and g(t) to change over time, and set it equal to zero. We then obtain

∂p(0)/∂t |_{ġ(t)=0} = (λ − 1) n(t)^{λ−2} ṅ(t) [r(t) − g(t)] A(0) exp(∫₀ᵗ g(s) ds)
    − a A²(0) σ²(t) (λ − 2) n(t)^{λ−3} ṅ(t) exp(2 ∫₀ᵗ g(s) ds) = 0,    (97B.1)

∂p(0)/∂t |_{ṅ(t)=0} = n(t)^{λ−1} r(t) A(0) g(t) exp(∫₀ᵗ g(s) ds)
    − n(t)^{λ−1} A(0) ġ(t) exp(∫₀ᵗ g(s) ds)
    − n(t)^{λ−1} A(0) g(t)² exp(∫₀ᵗ g(s) ds)
    − 2a A²(0) σ²(t) n(t)^{λ−2} g(t) exp(2 ∫₀ᵗ g(s) ds) = 0.    (97B.2)

After rearranging equations (97B.1) and (97B.2), we can obtain equations (97.18) and (97.19):

n(t) = (λ − 2) a A(0) σ²(t) exp(∫₀ᵗ g(s) ds) / [(λ − 1)(r(t) − g(t))],    (97.18)

n(t) = 2a A(0) σ²(t) exp(∫₀ᵗ g(s) ds) / [r(t) − g(t) − ġ(t)/g(t)].    (97.19)

Therefore, from equations (97.18) and (97.19), we can derive the following well-known logistic differential equation:

λ g(t)² − λ r(t) g(t) + (2 − λ) ġ(t) = 0.    (97.20)
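The step from equations (97.18) and (97.19) to equation (97.20) can be spot-checked numerically: choose arbitrary (hypothetical) parameter values, solve (97.20) for ġ(t), and verify that the two expressions for n(t) then coincide:

```python
import numpy as np

rng = np.random.default_rng(3)

for _ in range(5):
    lam = rng.uniform(2.5, 5.0)                      # hypothetical lambda (not 1 or 2)
    r = rng.uniform(0.08, 0.20)                      # r(t)
    g = rng.uniform(0.01, 0.9 * r)                   # g(t) < r(t)
    c = rng.uniform(0.5, 2.0)                        # stands in for a*A(0)*sigma^2(t)*exp(int g)
    g_dot = (lam * r * g - lam * g**2) / (2 - lam)   # g_dot implied by (97.20)
    n1 = (lam - 2) * c / ((lam - 1) * (r - g))       # equation (97.18)
    n2 = 2 * c / (r - g - g_dot / g)                 # equation (97.19)
    assert abs(n1 - n2) <= 1e-12 * abs(n1)
print("(97.18) and (97.19) agree whenever g(t) satisfies (97.20)")
```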
Appendix 97C: Derivation of Equation (97.21)

This appendix presents a detailed derivation of the solution to equation (97.21). The differential equation (97.20) is a logistic equation (or Verhulst model), first published by Pierre Verhulst (1845, 1847). To solve it, we first rewrite equation (97.20) in the standard logistic form:

dg(t)/dt = A g(t) (1 − g(t)/B),    (97C.1)

where A = λr̄/(2 − λ) and B = r̄.

Using separation of variables to separate g and t:

A dt = dg(t) / [g(t)(1 − g(t)/B)] = dg(t)/g(t) + [dg(t)/B] / [1 − g(t)/B].    (97C.2)

We next integrate both sides of equation (97C.2):

At + C = ln(g(t)) − ln(1 − g(t)/B).    (97C.3)

Taking the exponential on both sides of equation (97C.3):

C · exp(At) = g(t) / [1 − g(t)/B].    (97C.4)

When t = 0, g(t) = g₀, and equation (97C.4) can be written as

C = g₀ / (1 − g₀/B) = B g₀ / (B − g₀).    (97C.5)

Substitute equation (97C.5) into equation (97C.4):

[B g₀ / (B − g₀)] · exp(At) = g(t) / [1 − g(t)/B].    (97C.6)

Solving for g(t):

g(t) = [B g₀/(B − g₀)] exp(At) / {1 + [g₀/(B − g₀)] exp(At)}
     = B g₀ exp(At) / [B − g₀ + g₀ exp(At)]
     = B g₀ / [g₀ + (B − g₀) exp(−At)].    (97C.7)

Substituting A = λr̄/(2 − λ) and B = r̄ into equation (97C.7), we finally get the solution of equation (97.20):

g(t) = r̄ / {1 + (r̄/g₀ − 1) exp[λ r̄ t / (λ − 2)]},

which is equation (97.21).
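The closed-form solution can be checked against a direct numerical integration of the logistic equation (97C.1); the parameter values below are hypothetical:

```python
import numpy as np

lam, r_bar, g0 = 1.5, 0.15, 0.05        # hypothetical parameter values
A = lam * r_bar / (2.0 - lam)           # A = lambda*r/(2 - lambda), positive here
B = r_bar                               # B = r_bar (the target rate)

def g_closed(t):
    # Equation (97C.7): g(t) = B*g0 / (g0 + (B - g0)*exp(-A*t)).
    return B * g0 / (g0 + (B - g0) * np.exp(-A * t))

# Forward-Euler integration of dg/dt = A*g*(1 - g/B), equation (97C.1).
dt, g = 1e-4, g0
for _ in range(int(2.0 / dt)):          # integrate up to t = 2
    g += dt * A * g * (1.0 - g / B)

print(round(g, 6), round(float(g_closed(2.0)), 6))
```

With these values A > 0, so the growth rate rises from g₀ toward the target rate B = r̄, which is the mean-reverting behavior documented in Figure 97.3.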
Appendix 97D: Derivation of Equation (97.26)

This appendix presents a detailed derivation of equation (97.26). To obtain the expression for the optimal n(t), given r̄(t) = r̄, we substitute equation (97.21) into equation (97.18) and obtain

n(t) = (2 − λ) a A(0) σ²(t) e^{∫₀ᵗ g*(s) ds} / [(1 − λ)(r̄ − g*(t))].    (97D.1)

From equations (97.15), (97.18), and (97.21), we have

H(t) = (1 − λ)(r̄ − g*(t))² / [a σ²(t) (2 − λ)²].    (97D.2)

Substituting equations (97D.1) and (97D.2) into equation (97.16), we obtain

p(t) = [(1 − λ) e^{kt} / (a (2 − λ)² n(t)^λ)] ∫ₜᵀ [n(s)^λ / σ²(s)] (r̄ − g*(s))² e^{−ks} ds,  or

p(t) = [(1 − λ)(r̄ − g*(t))^λ e^{kt − λ∫₀ᵗ g*(s) ds} / (a (2 − λ)² σ²(t)^λ)] W,    (97D.3)

where W = ∫ₜᵀ e^{λ∫₀ˢ g*(u) du − ks} σ²(s)^{λ−1} (r̄ − g*(s))^{2−λ} ds.

From equation (97D.1), we have

ṅ(t) = (2 − λ) a A(0) [σ̇²(t) + σ²(t) g*(t)] e^{∫₀ᵗ g*(s) ds} / [(1 − λ)(r̄ − g*(t))]
     + (2 − λ) a A(0) σ²(t) e^{∫₀ᵗ g*(s) ds} ġ*(t) / [(1 − λ)(r̄ − g*(t))²].    (97D.4)

From equations (97D.3) and (97D.4), we have the amount generated from the new equity issue:

ṅ(t) p(t) = {A(0) [(σ̇²(t) + σ²(t) g*(t))(r̄ − g*(t)) + σ²(t) ġ*(t)] / [σ²(t)^λ (2 − λ)]}
          · e^{kt − (λ−1)∫₀ᵗ g*(s) ds} (r̄ − g*(t))^{λ−2} W.    (97D.5)

From equations (97.1), (97.6), and (97D.5), we can obtain D̄(t) = n(t) d̄(t) and Ȳ(t) = r̄(t) A(0) e^{∫₀ᵗ g(s) ds}. Therefore, the optimal payout ratio can be written as

D̄(t)/Ȳ(t) = (1 − g*(t)/r̄(t))
           · {1 + e^{kt − λ∫₀ᵗ g*(s) ds} · λ[(σ̇²(t) + σ²(t) g*(t))(r̄ − g*(t)) + σ²(t) ġ*(t)] / [(2 − λ) σ²(t)^λ (r̄ − g*(t))^{3−λ}] · W},    (97.26)

where W = ∫ₜᵀ e^{λ∫₀ˢ g*(u) du − ks} σ²(s)^{λ−1} (r̄ − g*(s))^{2−λ} ds.
Chapter 98
Cross-Sectionally Correlated Measurement Errors in Two-Pass Regression Tests of Asset-Pricing Models∗ Thomas Gramespacher, Armin B¨anziger, and Norbert Hilber Contents 98.1 98.2
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . The Bias in Two-Pass Regression Tests . . . . . . . . . . . . 98.2.1 The model and derivation of the analytic expression 98.2.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . 98.3 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . 98.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 98A: Taylor Expansion of the Slope Estimator . . . . . Appendix 98B: Case of a Two-Parameter Covariance Matrix . . .
. . . . . . . . .
3466 3470 3470 3472 3475 3481 3482 3483 3486
Thomas Gramespacher Zurich University of Applied Sciences e-mail: [email protected] Armin B¨ anziger Zurich University of Applied Sciences e-mail: [email protected] Norbert Hilber Zurich University of Applied Sciences e-mail: [email protected] ∗ This chapter is an update and expansion of the paper “The bias in two-pass regression tests of asset-pricing models in presence of idiosyncratic errors with cross-sectional dependence,” Review of Pacific Basin Financial Markets and Policies, World Scientific Publishing Co. Pte. Ltd., vol. 22(02), pages 1–17, October 2019.
Appendix 98C: Expansion of the Slope Estimator of Richardson and Wu ... 3487
Appendix 98D: From Idiosyncratic Errors to Measurement Errors ......... 3488
Appendix 98E: Independence of the Errors in the Slope Estimate and the Sample Mean ... 3489
Abstract

It is well known that in simple linear regression, measurement errors in the explanatory variable lead to a downward bias in the OLS slope estimator. In two-pass regression tests of asset-pricing models, one is confronted with such measurement errors as the second-pass cross-sectional regression uses as explanatory variables imprecise estimates of asset betas extracted from the first-pass time-series regression. The slope estimator of the second-pass regression is used to get an estimate of the pricing model's factor risk-premium. Since the significance of this estimate is decisive for the validity of the model, knowledge of the properties of the slope estimator, in particular its bias, is crucial. First, we show that cross-sectional correlations in the idiosyncratic errors of the first-pass time-series regression lead to correlated measurement errors in the betas used in the second-pass cross-sectional regression. We then study the effect of correlated measurement errors on the bias of the OLS slope estimator. Using Taylor approximation, we develop an analytic expression for the bias in the slope estimator of the second-pass regression with a finite number of test assets N and a finite time-series sample size T. The bias is found to depend in a non-trivial way not only on the size and correlations of the measurement errors but also on the distribution of the true values of the explanatory variable (the betas). In fact, while the bias increases with the size of the errors, it decreases the more the errors are correlated. We illustrate and validate our result using a simulation approach based on empirical return data commonly used in asset-pricing tests. In particular, we show that correlations seen in empirical returns (e.g., due to industry effects in sorted portfolios) substantially suppress the bias.

Keywords: Asset pricing • CAPM • Errors in variables • Simulation • Idiosyncratic risk • Two-pass regression • Measurement error.
98.1 Introduction

Factor-based asset pricing models are of great importance as they can be used to determine an asset's appropriate risk-adjusted required rate of return. This is not only an important input for portfolio construction and performance measurement but also allows for an estimate of a firm's cost of equity capital (see, e.g., Jensen, 1968; Connor and Korajczyk, 1991, 2010; Schlueter and Sievers, 2014; Bianchi, Drew and Whittaker, 2016). The classical test of the validity of factor-based asset-pricing models is the two-pass regression approach. Black, Jensen and Scholes (1972) and Fama and MacBeth (1973)
were among the first to use this approach to test the capital asset pricing model (CAPM) of Sharpe (1964) and Lintner (1965). A more recent application of this approach to test a variety of asset pricing models is found in Kan et al. (2013). In this approach, two successive linear regressions are conducted, where the explanatory variables of the second-pass cross-sectional regression (CSR) are estimated in a first-pass time-series regression (TSR).1 Since the estimates obtained in the TSR are imprecise, a measurement error problem is introduced in the second-pass regression. Consequently, in order to be capable of drawing conclusions concerning the validity of the asset-pricing model, the impact of these errors in the explanatory variables has to be understood.
The problem of errors in the explanatory variables (EIV) leading to an attenuation bias in the OLS slope estimator has been known for a long time (Durbin, 1954). The relevance of the EIV problem in the two-pass regression approach in asset-pricing tests was recognized already in Black et al. (1972). A review of the early work related to the CAPM can be found in Jensen (1972). A more recent review of the capital asset pricing model and of the history of its tests can be found in Fama and French (2004). Most of the tests tried to reduce the EIV problem by using diversified portfolios to reduce the idiosyncratic risk compared to single assets and, therefore, measure betas more precisely. Although mitigated, the EIV-induced attenuation bias remains problematic in asset pricing tests. Therefore, several authors proposed corrections to the CSR slope estimator to address the EIV problem. Litzenberger and Ramaswamy (1979) use a weighted least squares version of the estimator. Wei et al. (1991) use the instrumental variables approach in regression tests of the arbitrage pricing theory. Shanken (1992) gives an asymptotic distribution of the slope estimator for the case of a modified two-pass procedure.
Kim (1995) derives a correction performed through a maximum likelihood estimation. Kim and Skoulakis (2018) employ the regression-calibration method to provide a correction method dealing with the EIV problem for the case of a modified two-pass regression approach with a large cross-section of test assets. In their work — as in many other studies — the asymptotic behavior (i.e., the probability limit) of the estimator for large cross-sections of test assets (large N) or large time series of return data (large T) is studied. In this chapter, we study the properties of the CSR slope estimator for a finite length return time series and for a
1 For the econometric details of this approach, see, e.g., Jagannathan et al. (2010).
finite number of test assets. Furthermore, we allow for arbitrary correlations among the test assets' idiosyncratic errors.
We start by considering a setting where asset returns obey a single-factor CAPM-like pricing model of the form:

E(r_i) = \gamma_0 + \gamma_1 \cdot \beta_i,   (98.1)
where i identifies the assets, E(r_i) is the asset's expected return, β_i is the asset's exposure to the single risk factor, and γ_1 is the factor's risk premium. The constant γ_0 is the expected return of a zero-beta asset. In the case of the Sharpe–Lintner version of the CAPM, γ_1 is the market risk-premium and the zero-beta return is the risk-free rate r_f.
In the two-pass regression approach, the cross-sectional regression tests the validity of the pricing equation (98.1) by regressing expected returns of N test assets (individual assets or portfolios of assets) on their β_i. The slope of this regression is used as an estimate of the factor's risk premium γ_1. A slope estimate that is not significantly positive would lead to a rejection of the pricing model. Unfortunately, neither the explained variables E(r_i) nor the β_i needed as explanatory variables in the CSR are directly observable; they have to be estimated. This is done using a set of N time series of empirical asset returns r_{it}. The assets' expected returns are usually estimated using the arithmetic average of the historic returns. The β_i are estimated by regression of asset excess returns against the time series of factor realizations f_t (in the case of the CAPM, the factor realizations are the excess market returns). The underlying assumption is that the time series of asset excess returns follows a market model of the form:

r_{it} = \alpha_i + \beta_i \cdot f_t + \varepsilon_{it}.   (98.2)
The N regression slopes are used as estimates of β_i. These slope estimates are subject to measurement error, since the market model contains the stochastic idiosyncratic errors ε_{it} and the length T of the time series of returns is limited. In practical asset-pricing tests, typically around 5–40 years of monthly return data are used (hence, it is not uncommon to have T as small as 60 months). The consequence is that in the second-pass CSR, we are dealing with the problem of errors in the explained and explanatory variables. While errors in the explained variables are often unproblematic, it is well known that in the case of simple linear regression, errors in the explanatory variable lead to a downward bias in the OLS estimator of the regression slope (see, e.g., Greene, 2011). In the context of the two-pass regression test, using the uncorrected downward-biased OLS slope estimate would lead to unjustified rejections of the asset-pricing model due to its apparent insignificance
(see, e.g., Kim, 2010). A better understanding of the size of the downward bias in the slope estimator is therefore crucial for sound decisions about the pricing model's validity.
In practical asset-pricing tests, the test assets are usually portfolios of individual assets and, consequently, are limited in number. For example, when using a set of the well-known Fama–French portfolios of individual assets sorted by some combination of company characteristics,2 in many cases one ends up with fewer than 50 test assets.3 Therefore, in order to be of practical relevance for such cases, we have to study the finite-sample properties of (the bias of) the slope estimator. Furthermore, the idiosyncratic errors ε_{it} of the N test assets are usually of different sizes (variances) and correlated over the cross-section of available assets.4 We show below that these properties translate directly into the variances and correlations of the measurement errors in the beta estimates, leading to measurement errors that are of different sizes and are correlated among each other. In other words, in the CSR of asset-pricing tests, we are dealing with a fully occupied N × N covariance matrix of the N measurement errors, whereas in most of the existing literature — as described above — the covariance matrix of the measurement errors is assumed to be proportional to the identity matrix (the proportionality factor σ² being the variance of an individual measurement error). Note that in our setup, increasing the sample size N by including more test assets would also increase the size of the covariance matrix of the measurement errors, introducing new variances and covariances and thus resulting in highly non-trivial effects on the bias of the slope estimator. We therefore study the bias for a fixed number N of test assets and a fixed length T of the time series of asset returns.
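The two passes described above can be sketched in a few lines of Python. The sketch below is illustrative only — all parameter values (the number of assets, the 0.5% monthly premium, the uncorrelated idiosyncratic errors) are made up for the example, not taken from the chapter's calibration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (stand-ins, not the chapter's data)
N, T = 25, 240                  # test assets, months of returns
gamma0, gamma1 = 0.0, 0.5       # zero-beta return and factor premium (% per month)
mu_f, sigma_f = 0.8, 4.4        # factor mean and volatility (% per month)
beta_true = np.linspace(0.8, 1.4, N)
sigma_eps = 1.5                 # idiosyncratic volatility (uncorrelated here)

# Market model (98.2), with alpha_i chosen so that E(r_i) = gamma0 + gamma1*beta_i
f = rng.normal(mu_f, sigma_f, T)
alpha = gamma0 + (gamma1 - mu_f) * beta_true
r = alpha[:, None] + beta_true[:, None] * f[None, :] \
    + rng.normal(0.0, sigma_eps, (N, T))

# First pass (TSR): OLS beta of each asset on the factor
fc = f - f.mean()
beta_hat = ((r - r.mean(axis=1, keepdims=True)) @ fc) / (fc @ fc)

# Second pass (CSR): regress average returns on estimated betas; the slope
# estimates gamma1, but is attenuated by the errors in beta_hat
r_bar = r.mean(axis=1)
gamma1_hat, gamma0_hat = np.polyfit(beta_hat, r_bar, 1)
print(gamma1_hat)               # close to, but on average below, gamma1
```

With uncorrelated errors, lengthening T shrinks the measurement errors in beta_hat and with them the downward bias in gamma1_hat, which is exactly the finite-sample effect studied in this chapter.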
In Section 98.2, we use a first-order Taylor approximation to derive an analytic expression for the bias in the OLS slope estimator in the presence of small but correlated measurement errors. We discuss the result and apply it to some special cases including the well-established case of uncorrelated errors. In Section 98.3, we use simulations to illustrate and validate our results. We show the impact and relevance of correlations found in empirical data on the bias in two-pass regression tests and illustrate the dependence of
2 See, e.g., Fama and French (2015).
3 Kim and Skoulakis (2018) study the case where the two-pass regression approach is conducted using a large cross-section of individual assets, i.e., the case where N → ∞.
4 See, e.g., Connor and Korajczyk (1988) for a discussion of the empirical relevance of non-diagonal covariance structures of the idiosyncratic errors.
the bias on the strength of the correlation. Finally, we conclude the chapter in Section 98.4.

98.2 The Bias in Two-Pass Regression Tests

98.2.1 The model and derivation of the analytic expression

We consider an economy where the assets' expected returns obey the single-factor pricing model (98.1). We use the two-pass regression approach to test this pricing model. The assets' expected return E(r_i) and β_i needed as explained and explanatory variables in the second-pass CSR are estimated using empirical return data. We assume that the time series of asset excess returns r_{it} is generated by the market model (98.2). The factors f_t are assumed to be i.i.d. normally distributed, i.e., f_t ∼ N(μ_f, σ_f²). In addition, we assume the errors and factors to be uncorrelated and the errors ε_t = (ε_{1t}, ..., ε_{Nt})' to be i.i.d. multivariate-normally distributed, ε_t ∼ N(0_N, Σ), with mean zero and cross-sectional covariance matrix

\Sigma = \begin{pmatrix} \sigma_1^2 & \cdots & \sigma_{1N} \\ \vdots & \ddots & \vdots \\ \sigma_{N1} & \cdots & \sigma_N^2 \end{pmatrix}.   (98.3)

In particular, we allow the idiosyncratic variances σ_i² of different assets to be of different sizes, and we allow for non-zero covariances σ_{ij}.
Let \hat{β}_i be the OLS slope estimator of asset i's beta found in a TSR of realized asset returns r_{it} on factor realizations f_t. We write it as the sum of the true but unknown parameter β_i and a measurement error term u_i:

\hat{\beta}_i = \beta_i + u_i.   (98.4)
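These slope errors u_i can be simulated directly from the market model. The sketch below — a toy three-asset setup with invented parameters — estimates their cross-sectional covariance by Monte Carlo, anticipating the analytic value derived in the next paragraph:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented parameters for a 3-asset toy market model
N, T, M = 3, 120, 5000
mu_f, sigma_f = 0.5, 4.0
beta = np.array([0.9, 1.0, 1.2])
A = rng.normal(size=(N, N))
Sigma = A @ A.T          # arbitrary positive-definite idiosyncratic covariance

u = np.empty((M, N))
for m in range(M):
    f = rng.normal(mu_f, sigma_f, T)
    eps = rng.multivariate_normal(np.zeros(N), Sigma, T).T        # N x T
    r = beta[:, None] * f[None, :] + eps
    fc = f - f.mean()
    beta_hat = ((r - r.mean(axis=1, keepdims=True)) @ fc) / (fc @ fc)  # TSR slopes
    u[m] = beta_hat - beta                                        # errors u_i

# Monte Carlo covariance of u versus the analytic value Sigma / (T * sigma_f^2)
print(np.cov(u.T))
print(Sigma / (T * sigma_f**2))
```

The two printed matrices agree closely, element by element: the idiosyncratic covariance structure reappears in the measurement errors, scaled down by T σ_f².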
Given our assumptions about the market model, equation (98.2), the measurement error u_i is a random variable with E[u_i] = 0, and for any pair of assets i, j we have5

\mathrm{Cov}(u_i, u_j) = E[u_i u_j] = \frac{\sigma_{ij}}{T \sigma_f^2},

5 Here, we use the law of iterated expectations E[u_i u_j] = E[E[u_i u_j | f_1, ..., f_T]] and the fact that factors and errors are uncorrelated so that E[\varepsilon_{it}\varepsilon_{jt} | f_1, ..., f_T] = E[\varepsilon_{it}\varepsilon_{jt}] = \sigma_{ij}. Details of the derivation can be found in Appendix 98D.
with σ_{ii} = σ_i². Using the covariance matrix of the errors (98.3), the covariance matrix Σ_u of the measurement errors can be written as

\Sigma_u = \frac{1}{T \sigma_f^2}\, \Sigma.   (98.5)
As seen in this equation, the covariance structure of the idiosyncratic errors Σ translates directly into the covariance structure of the measurement errors of \hat{β}_i.
The next step is the second-pass CSR. Ideally, in order to estimate the factor risk premium γ_1 of the pricing model (98.1), we would like to regress the assets' expected returns E(r_i) on their true exposures β_i. However, in practical asset-pricing tests, neither of those two values is known. Instead, we have to use estimates from the time series of available asset returns. The exposures β_i are estimated using \hat{β}_i from the TSRs, equation (98.4). The expected returns are estimated using the arithmetic average of the historical returns:

\bar{r}_i = \frac{1}{T} \sum_{t=1}^{T} r_{it} = E(r_i) + v_i,

with the error v_i := \bar{r}_i − E(r_i). Plugging β_i = \hat{β}_i − u_i and E(r_i) = \bar{r}_i − v_i into (98.1), we get \bar{r}_i − v_i = γ_0 + γ_1 · (\hat{β}_i − u_i), leading to the regression equation:

\bar{r}_i = \gamma_0 + \gamma_1 \cdot \hat{\beta}_i - \gamma_1 u_i + v_i = \gamma_0 + \gamma_1 \cdot \hat{\beta}_i + \eta_i.   (98.6)
In this regression, the residuals η_i := v_i − γ_1 u_i are correlated with the explanatory variable \hat{β}_i since both contain the measurement error u_i. Using the observed values (\hat{β}_i, \bar{r}_i) of the N test assets, the OLS estimator of the slope γ_1 is

\hat{\gamma}_1 = \frac{\sum_{i=1}^{N} (\hat{\beta}_i - \bar{\hat{\beta}})(\bar{r}_i - \bar{r})}{\sum_{i=1}^{N} (\hat{\beta}_i - \bar{\hat{\beta}})^2},   (98.7)

with the cross-sectional averages \bar{\hat{\beta}} = \frac{1}{N}\sum_{i=1}^{N} \hat{\beta}_i and \bar{r} = \frac{1}{N}\sum_{i=1}^{N} \bar{r}_i.
We are interested in the bias of the estimator \hat{γ}_1. Therefore, for the given number of assets, we compute the expected value E[\hat{γ}_1] using a Taylor approximation to lowest order in the size of the measurement errors. The details of this derivation can be found in Appendix 98A. Here, we state the
final result:

E[\hat{\gamma}_1] = E\!\left[\frac{\sum_{i=1}^{N} (\hat{\beta}_i - \bar{\hat{\beta}})(\bar{r}_i - \bar{r})}{\sum_{i=1}^{N} (\hat{\beta}_i - \bar{\hat{\beta}})^2}\right] \approx \gamma_1 \left(1 - \frac{\sigma_u^2}{\sigma_\beta^2} + \frac{2\, W_\Sigma}{(N-1)\,\sigma_\beta^2}\right),   (98.8)

with the cross-sectional variation of the measurement errors:

\sigma_u^2 := \frac{1}{T \sigma_f^2}\left\{\frac{1}{N}\sum_{i=1}^{N} \sigma_i^2 - \frac{1}{N(N-1)}\sum_{\substack{i,j=1 \\ i \neq j}}^{N} \sigma_{ij}\right\},   (98.9)

the cross-sectional variance of the population parameters:

\sigma_\beta^2 := \frac{1}{N-1}\sum_{i=1}^{N} (\beta_i - \bar{\beta})^2 = \frac{1}{N-1}\,\beta'\beta,   (98.10)

and the Rayleigh quotient:

W_\Sigma := \frac{\beta' \Sigma_u \beta}{\beta'\beta}.   (98.11)

The vector β is defined as β := (β_1 − \bar{β}, ..., β_N − \bar{β})'.
Since our focus is on the bias of the slope estimator, we rearrange (98.8) and get the following for the relative bias:

\frac{E(\hat{\gamma}_1) - \gamma_1}{\gamma_1} \approx -\frac{\sigma_u^2}{\sigma_\beta^2} + \frac{2\, W_\Sigma}{(N-1)\,\sigma_\beta^2}.   (98.12)
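Equations (98.9)–(98.12) translate directly into a small routine. The helper below is our own illustrative implementation — the function name and interface are not from the chapter:

```python
import numpy as np

def predicted_relative_bias(Sigma, beta, T, sigma_f2):
    """Approximate relative bias (E[gamma1_hat] - gamma1)/gamma1 of the CSR
    slope estimator, following equations (98.9)-(98.12)."""
    beta = np.asarray(beta, dtype=float)
    N = len(beta)
    Sigma_u = np.asarray(Sigma, dtype=float) / (T * sigma_f2)   # eq. (98.5)
    avg_var = np.trace(Sigma_u) / N                   # average error variance
    off = Sigma_u - np.diag(np.diag(Sigma_u))
    avg_cov = off.sum() / (N * (N - 1))               # average error covariance
    sigma_u2 = avg_var - avg_cov                      # eq. (98.9)
    b = beta - beta.mean()                            # demeaned betas
    sigma_b2 = (b @ b) / (N - 1)                      # eq. (98.10)
    W = (b @ Sigma_u @ b) / (b @ b)                   # eq. (98.11)
    return -sigma_u2 / sigma_b2 + 2 * W / ((N - 1) * sigma_b2)   # eq. (98.12)
```

For an equicorrelated Σ with σ_ij = ρσ², the routine reproduces the closed-form special case derived in the discussion that follows.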
Equations (98.8) and (98.12) are the central results of this chapter. They show that the bias of the slope estimator, i.e., the bias in the estimate of the pricing model's factor risk premium, depends on the covariance structure of the idiosyncratic errors and, via W_Σ, also on the distribution of the test assets' true β_i. However, the errors v_i in the explained variable have no impact on the bias, since they are uncorrelated with the errors u_i in the explanatory variable.

98.2.2 Discussion

In (98.12), we clearly see the downward bias in the slope estimator generated by the first term (σ_u²/σ_β²). The larger the cross-sectional variation of the measurement errors σ_u², the larger the bias in the slope estimate. As seen in (98.9), the cross-sectional variation of the measurement errors σ_u² is the difference of the average variance and the average covariance of the individual
measurement errors. That is, large measurement errors (a large average variance) lead to a large bias, whereas strong (positive) correlations (a large average covariance) reduce the cross-sectional variation of the errors and thus reduce the downward bias. This is easily understood intuitively: as the errors in \hat{β}_i get larger, we lose more and more information about the true values of the explanatory variables β_i in (98.1). This causes the expected value of the slope estimator of the CSR to become smaller and smaller, approaching zero as the measurement errors (the noise) get large compared to the variation of the true parameters β_i (the signal).6 This is the well-known attenuation effect in the slope estimator in the presence of errors in the explanatory variables. The opposite is true for correlations in the measurement errors: large correlations lead to a small bias. This might be surprising at first sight. If the errors are increasingly correlated, they tend to distort the true values β_i in the same direction, so that the measured set of \hat{β}_i is merely a shifted version of the true set of β_i. However, shifting the explanatory variables in the CSR by about the same amount and in the same direction does not substantially change the regression slope, no matter how large the shift is.7
The interpretation of the second term in (98.12), 2W_Σ/((N − 1)σ_β²), is less intuitive. It is always positive and thus reduces the bias introduced by the first term. Compared to the first term, it contains an additional factor 1/(N − 1), so that it will vanish in the large-N limit. In the small-sample case, the details of the covariance matrix Σ_u and the dispersion of the parameters β_i decide about the size and importance of the second term compared to the first one.
Qualitatively, we can see W_Σ as a weighted average of the variances and covariances of the measurement errors, where the weighting is via the deviations of the assets' (true) betas from their average value \bar{β}. As such, this term reveals that correlations of the errors of assets with betas that are far apart from each other are more important for the exact size of the bias than correlations of assets with betas that are close together.
In order to get a better understanding of the bias as a function of the strength of the correlations, we consider the special case where the covariance matrix of the idiosyncratic errors (98.3) is parametrized by only two values, namely, the variance σ² (supposed to be equal for all assets, σ_i² = σ²)

6 Note that our approximation of the bias (98.12) is only valid for measurement errors that are small compared to the variation of the true parameters. So, the limit of very large errors leading to a zero slope in the CSR is not visible in (98.8) or (98.12).
7 The intercept of the regression line certainly would be affected by such a shift. But, here, we focus only on the slope of the regression.
and the correlation coefficient ρ (supposed to be equal for all pairs of assets, σ_{ij} = ρσ²). In this case, the cross-sectional variation of the measurement errors σ_u² and the Rayleigh quotient give the same expression8:

\sigma_u^2 = W_\Sigma = \frac{\sigma^2 (1-\rho)}{T \sigma_f^2}.   (98.13)

Finally, plugging (98.13) into (98.12), the relative bias of the slope estimator is

\frac{E(\hat{\gamma}_1) - \gamma_1}{\gamma_1} \approx -\frac{\sigma^2 (1-\rho)}{T \sigma_f^2\, \sigma_\beta^2} \left(1 - \frac{2}{N-1}\right).   (98.14)

Details of the derivation of (98.13) and (98.14) are given in Appendix 98B. In these equations, we clearly see the above-described impact of correlated measurement errors on their cross-sectional variation and on the bias in the slope estimator: the larger the correlation, the smaller their cross-sectional variation σ_u² and the smaller the bias in the slope estimator. In the extreme case of perfect positive correlation, ρ = 1, all measurement errors are exactly identical, so that the cross-sectional variation of the errors as well as the bias in the slope estimator vanish altogether. Note that this does not mean that the betas are measured without error, but rather that the measurement errors in all \hat{β}_i are exactly identical, so that the estimates \hat{β}_i are simply a shifted version of the true parameters β_i.
As already mentioned, the case of uncorrelated errors, i.e., ρ = 0, with equal variance σ² has been extensively studied in the literature. In this case, plugging ρ = 0 into (98.13) and (98.14), the cross-sectional variation of the measurement errors is identical to the variance of an individual measurement error, i.e., σ_u² = σ²/(T σ_f²), and the relative bias of the slope estimator is

\frac{E(\hat{\gamma}_1) - \gamma_1}{\gamma_1} \approx -\frac{\sigma^2}{T \sigma_f^2\, \sigma_\beta^2} \left(1 - \frac{2}{N-1}\right).   (98.15)
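To get a feel for the magnitudes in (98.14), the toy evaluation below sweeps the common correlation ρ. The parameter values are made-up monthly figures merely inspired by the kind of calibration used later in the chapter:

```python
# Relative bias in the equicorrelated case, equation (98.14).
# Parameter values are invented monthly figures (sigma ~ idiosyncratic vol,
# sigma_f ~ factor vol, sigma_beta ~ dispersion of true betas).
N, T = 25, 60
sigma, sigma_f, sigma_beta = 0.03, 0.044, 0.142

for rho in (0.0, 0.5, 0.9, 1.0):
    bias = -(sigma**2 * (1 - rho)) / (T * sigma_f**2 * sigma_beta**2) \
           * (1 - 2 / (N - 1))
    print(f"rho = {rho:3.1f}  relative bias ~ {bias:+.3f}")
```

With these numbers, the predicted attenuation falls from roughly 35% at ρ = 0 linearly down to zero at ρ = 1 — the suppression effect that the simulations below quantify on empirical data.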
The small-sample properties of the bias in the OLS slope estimator in the presence of uncorrelated measurement errors have already been studied by Richardson and Wu (1970). In their paper, they give a series expansion

8 Due to the special shape of the covariance matrix of the errors, we have \beta'\Sigma_u\beta = \beta'\beta \cdot \sigma^2(1-\rho)/(T\sigma_f^2) and thus W_\Sigma = \beta'\Sigma_u\beta / \beta'\beta becomes totally independent of the β_i. Intuitively, since all variances and covariances are equal, the exact structure of the distribution of β_i does not matter anymore.
of the relative bias of the slope estimator in powers of 1/(N − 1):

\frac{E(\hat{\gamma}_1) - \gamma_1}{\gamma_1} = -\frac{1}{1+\tau}\left[1 - \frac{1}{N-1}\,\frac{2\tau^2}{(1+\tau)^2}\right] + O\!\left(\frac{1}{(N-1)^2}\right),   (98.16)
where 1/τ = σ²/(T σ_f² σ_β²). If we keep in this expression only terms up to first order in 1/τ, i.e., small measurement errors σ²/(T σ_f²) compared to the overall variation of the true parameters σ_β², their result exactly reproduces our special case (98.15).9 Thus, on one hand, our main result (98.12) generalizes the result of Richardson and Wu in that we allow for arbitrary variances and cross-sectional correlations of the measurement errors. On the other hand, our result is a special case of Richardson and Wu in that it is only valid for sufficiently small measurement errors, whereas their formula is valid for measurement errors of any size. In the following section, we will show that in a typical application of the two-pass regression test of a pricing model using empirical data, the measurement errors are sufficiently small so that our formula (98.12) gives a good approximation for the bias. However, in such a test, the measurement errors are typically correlated, so that, applied to real data, our result gives considerably better predictions of the bias than what one would get by neglecting the correlations.10

98.3 Simulations

Our previous analysis showed that the bias of the slope estimator depends on the structure of the correlation matrix of the idiosyncratic errors, Σ, and on the distribution of the true parameters β_i. In this section, we illustrate the relevance of our result in the context of actual tests of pricing models. We do this using a simulation approach where we calibrate the parameters Σ, β_i, and σ_f² to values derived from real empirical data. On that account, we need the empirical time series of returns for a number of assets that are typically used to test pricing models. Choosing an arbitrary set of individual securities would lead to a particular error matrix Σ and thus finally to a particular effect on the bias, as seen in (98.12).
Picking different sets of assets could lead to completely different error matrices Σ and, hence, to potentially very different effects on the bias. Instead of arbitrarily choosing

9 The details of this calculation can be found in Appendix 98C.
10 In pricing tests as sketched here, the measurement errors can be reduced in certain limits by increasing the length T of the time series used to estimate the asset betas. However, it is far more difficult, if not impossible, to find (sufficiently many) uncorrelated test assets.
a set of test assets, we use the well-known 25 Fama–French portfolios formed on size and book-to-market that are commonly used in actual asset-pricing tests. Even though taking portfolios instead of individual securities reduces the measurement error problem, we can still demonstrate the effects on the bias predicted in formula (98.12) in real empirical data. We base our calculations on the most recent 40 years of monthly returns of these portfolios (August 1978–July 2018), which are available from Kenneth French’s website.11 Table 98.1 shows the summary statistics of these portfolios. The annualized average excess return of the market is 8.27% with an annualized standard deviation of 15.23%. The excess market return plays the role of the factor ft in the market model (98.2). Therefore, the empirical standard deviation of the monthly market returns will later be used as the numeric value for the model parameter σf , i.e., in our simulation, we set σf = 4.40%. The annualized mean excess returns of the 25 portfolios range from 2.51% to 13.79%, with annualized standard deviations ranging from 15.01% to 26.99%. Particularly important are the portfolio betas that we derive by regression of the time series of portfolio excess returns on market excess returns. The resulting portfolio betas range from 0.86 to 1.39 with a standard deviation of σβ = 0.1420. This set of portfolio betas will be used as our second set of model parameters, the true betas βi of the test assets. To get the covariance matrix of the idiosyncratic returns Σ, we use the portfolio betas to split the portfolios’ total variance into systematic variance βi2 σf2 and idiosyncratic variance σi2 := V ar(εi ). The standard deviation of the monthly idiosyncratic returns σi of the 25 Fama–French portfolios ranges from 1.56% to 4.82%. The square of these values will be used as the diagonal elements of the model parameter Σ. 
The off-diagonal elements of the matrix Σ, the cross-sectional covariances of the idiosyncratic returns, are especially interesting for our analysis. For any pair of assets, the idiosyncratic covariance can be calculated as σ_{ij} := Cov(ε_i, ε_j) = Cov(r_i, r_j) − β_i β_j σ_f². Figure 98.1 shows the distribution of the empirically found cross-sectional correlations ρ_{ij} = σ_{ij}/(σ_i σ_j) between any two portfolios i and j. The correlations are fairly uniformly distributed, ranging from the lowest observed correlation of −0.60 to the highest observed correlation of +0.90. Nearly three quarters of all pairs of portfolios show a positive correlation leading to
11 See http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
Table 98.1: Descriptive statistics of the 25 Fama–French portfolios.

Portfolio     Mean monthly  Annualized mean  Monthly std.   Annualized std.  Portfolio  Monthly idiosyncratic
              return (%)    return (%)       deviation (%)  deviation (%)    beta β_i   volatility σ_i (%)
SMALL/LoBM    0.21           2.51            7.79           26.99            1.392      4.823
ME1/BM2       0.85          10.71            6.72           23.27            1.190      4.212
ME1/BM3       0.84          10.57            5.62           19.48            1.051      3.203
ME1/BM4       1.05          13.32            5.33           18.47            0.951      3.311
SMALL/HiBM    1.07          13.65            5.57           19.31            0.998      3.436
ME2/BM1       0.59           7.28            7.06           24.44            1.376      3.632
ME2/BM2       0.89          11.22            5.74           19.89            1.135      2.837
ME2/BM3       0.93          11.77            5.17           17.91            1.011      2.639
ME2/BM4       0.96          12.19            5.00           17.32            0.964      2.656
ME2/BM5       0.95          11.98            5.83           20.19            1.085      3.349
ME3/BM1       0.65           8.10            6.46           22.39            1.293      3.075
ME3/BM2       0.90          11.32            5.30           18.36            1.094      2.224
ME3/BM3       0.82          10.23            4.85           16.79            0.981      2.209
ME3/BM4       0.91          11.47            4.79           16.61            0.934      2.474
ME3/BM5       1.08          13.79            5.45           18.89            1.022      3.093
ME4/BM1       0.81          10.21            5.85           20.28            1.219      2.353
ME4/BM2       0.81          10.15            5.02           17.38            1.060      1.856
ME4/BM3       0.77           9.66            4.94           17.13            1.008      2.196
ME4/BM4       0.84          10.60            4.60           15.92            0.915      2.219
ME4/BM5       0.85          10.69            5.45           18.87            1.041      2.955
BIG/LoBM      0.66           8.28            4.55           15.75            0.972      1.556
ME5/BM2       0.73           9.10            4.46           15.45            0.948      1.594
ME5/BM3       0.70           8.75            4.33           15.01            0.864      2.084
ME5/BM4       0.54           6.71            4.68           16.21            0.884      2.603
BIG/HiBM      0.78           9.83            5.49           19.00            0.986      3.363
Market        0.66           8.27            4.40           15.23

Notes: Descriptive statistics of the monthly excess returns from August 1978 to July 2018 of the 25 value-weight Fama–French portfolios formed on size and book-to-market. The portfolio label "MEi/BMj" identifies the portfolio consisting of assets belonging to the i-th quintile with respect to size and to the j-th quintile with respect to book-to-market ratio. The market portfolio is the value-weight average excess return of all CRSP firms listed on the NYSE, AMEX, or NASDAQ. All return data have been retrieved from Kenneth French's website. Portfolio beta β_i is determined by the regression of portfolio excess returns on market excess returns. The idiosyncratic volatility σ_i is determined by splitting up the portfolio's total variance into systematic and idiosyncratic variance and taking the square root.
an overall positive average correlation. The estimates of σi2 and σij are used to construct our last model parameter, the error matrix Σ. In addition to Σ, βi , and σf2 , which we calibrate as described above to the corresponding empirical properties of the 25 Fama–French portfolios, we choose the parameters of the pricing model (98.1) to be γ0 = 0 and γ1 = 0.5. This means that in our simulation, the factor risk premium is assumed to be 0.5% per month
Figure 98.1: Distribution of the idiosyncratic correlations of the 25 Fama–French portfolios.
Notes: Shown is the frequency distribution of the correlations of the idiosyncratic returns between any two of the 25 Fama–French portfolios formed on size and book-to-market, using the most recent 40 years of monthly return data (August 1978–July 2018).
or 6% per year (i.e., a typical forward-looking estimate of the market risk premium), and the return of a zero-beta asset is assumed to be 0. We start our simulation by generating the measured betas β̂ᵢ (the results of the TSR) as the sum of the true population parameters βᵢ given above plus simulated measurement errors uᵢ. We simulate the measurement errors by drawing M = 50,000 sets of N = 25 measurement errors (u₁, …, u₂₅) from a multivariate normal distribution with mean zero and covariance (1/(Tσf²))Σ. The length T of the time series of returns used in actual pricing tests typically ranges from a few dozen months to a few dozen years. We therefore run simulations for T = 30, 60, 120, 240, and 480 months. In Figure 98.2, we compare the bias of the slope estimator predicted by (98.12) to its true value, which is calculated as the average over the M simulated individual slope estimates. Each individual estimate γ̂₁ is calculated as the OLS slope estimate of the cross-sectional regression of the assets' expected returns, which we set according to the pricing model E(rᵢ) = γ₀ + γ₁βᵢ, on the assets' simulated measured betas β̂ᵢ. For sufficiently small measurement errors, i.e., if the TSR is sufficiently long, Figure 98.2 shows good agreement of our result with the true bias represented by the simulation
Cross-Sectionally Correlated Measurement Errors in Two-Pass Regression Tests
Figure 98.2: Relative bias as a function of the length of the time-series regression.

[Figure: "Predicted versus actual bias" — relative bias −(E(γ̂₁) − γ₁)/γ₁ (vertical axis, 0–25%) plotted against the length T of the time-series regression (horizontal axis, 0–500 months), with three curves: Simulation Average, Equation (98.12), and Richardson & Wu.]
Notes: For different lengths of the time-series regression (first pass), the true relative bias (E(γ̂₁)/γ₁ − 1) of the OLS slope estimator (determined as a simulation average and represented in the figure by diamonds) is compared to our prediction calculated using equation (98.12) (bottom line) and to the value that would be predicted if one neglected correlations of the measurement errors (top line). As the length of the time-series regression increases, the measurement errors and the relative bias diminish. For any length of the time-series regression, neglecting the correlations of the measurement errors would lead to a far too large prediction of the actual bias.
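The simulation just described, together with the prediction from equation (98.12), can be sketched compactly in Python. The parameter values below (betas, error covariance matrix, factor variance) are illustrative stand-ins, not the calibrated Fama–French values.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, T = 25, 50_000, 120
gamma0, gamma1 = 0.0, 0.5              # pricing model (98.1)
sigma_f2 = 20.0                        # assumed factor variance (%^2 per month)
beta = rng.normal(1.0, 0.15, N)        # stand-in for the true portfolio betas
A = rng.normal(size=(N, N))
Sigma = A @ A.T / N + 2.0 * np.eye(N)  # stand-in idiosyncratic error matrix

mu = gamma0 + gamma1 * beta            # expected returns set by the model
cov_u = Sigma / (T * sigma_f2)         # covariance of the measurement errors, (98.5)

# Second pass, repeated M times: regress mu on the simulated measured betas.
u = rng.multivariate_normal(np.zeros(N), cov_u, size=M)
bh = beta + u                                        # measured betas
bh_c = bh - bh.mean(axis=1, keepdims=True)
g1_hat = (bh_c @ (mu - mu.mean())) / (bh_c ** 2).sum(axis=1)
bias_sim = g1_hat.mean() / gamma1 - 1.0

# Prediction from equation (98.12).
b = beta - beta.mean()
sigma_b2 = b @ b / (N - 1)
sigma_u2 = (np.trace(cov_u) / N
            - (cov_u.sum() - np.trace(cov_u)) / (N * (N - 1)))   # (98.9)
W = b @ cov_u @ b / (b @ b)                                      # (98.11)
bias_pred = -sigma_u2 / sigma_b2 + 2 * W / ((N - 1) * sigma_b2)  # (98.12)

print(f"simulated: {bias_sim:+.4f}   predicted (98.12): {bias_pred:+.4f}")
```

With errors small relative to the cross-sectional spread of the betas, the simulated and predicted values should agree closely, mirroring the right-hand side of Figure 98.2.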
average.12 Good agreement is found for T ≳ 120, corresponding to about 10 or more years of empirical return data. For T = 120, the cross-sectional variation of the measurement errors (98.9) and the Rayleigh quotient (98.11) are σᵤ = 0.0511 (resulting in σᵤ²/σβ² = 0.1295) and WΣ^{1/2} = 0.1403 (resulting in 2WΣ/((N − 1)σβ²) = 0.0814); i.e., even though of lower order in 1/(N − 1), for the chosen set of parameters the first and second term of (98.12) are of similar size. The second term leads to a considerable reduction of the bias introduced by the first term. To illustrate the influence of the covariance structure of the errors on the bias, we calculate the value of the bias using the formula of Richardson and Wu (98.16), which neglects correlations altogether. In addition, in their formula, the variances of all error terms σᵢ² are identical. We choose the variance to be equal to the average of the idiosyncratic variances

12 The simulation average itself is also stochastic due to the limited number of simulated values. Using M = 50,000 simulations, for T = 120, the standard error of the simulation average is roughly 0.004, leading to a very small error in the relative bias of the slope estimator (≈0.2%).
of the 25 Fama–French portfolios. Using the average σ² = 8.413, the formula of Richardson and Wu (98.16) predicts for all T a far too large bias, as is clearly seen in Figure 98.2. Next, we illustrate the impact of the strength of the correlations on the bias. Instead of arbitrarily varying all N(N − 1)/2 non-diagonal elements of the covariance matrix Σ, we use a simplified version of the covariance matrix parametrized by only two values: the variance σ² (set equal for all assets) and the correlation ρ (set equal for all pairs of assets). As before, the variance σ² = 8.413 is chosen as the average of the idiosyncratic variances of the 25 Fama–French portfolios, while the correlation ρ is varied between ρ = 0 (uncorrelated errors) and ρ = 1 (perfectly positively correlated errors). We choose the length of the time series to be T = 60. All other parameters are left as chosen in the previous section. Once the parameters are set, the simulation is conducted as described above. Figure 98.3 shows the relative bias as a function of the correlation ρ. The simulation average (the true bias of the slope estimator (98.7)) is compared to our approximate result (98.14). The larger the correlations, the better is

Figure 98.3: Relative bias as a function of the correlation of the measurement errors.

[Figure: "2-Parameter Covariance Matrix, T = 60" — relative bias −(E(γ̂₁) − γ₁)/γ₁ (vertical axis, 0–35%) plotted against the correlation ρ (horizontal axis, 0–1), with three curves: Simulation Average, Equation (98.12), and Richardson & Wu.]

Notes: The relative bias of the OLS slope estimator (E(γ̂₁)/γ₁ − 1) is shown as a function of the correlation of the measurement errors ρ (assumed to be identical for all pairs of assets). The true value (determined as a simulation average and represented in the figure by diamonds) is compared to the value predicted by our formula, equation (98.12). The thin horizontal line shows the relative bias predicted by a formula that neglects correlations. Our formula shows good agreement with the true bias for any correlation, whereas a prediction that neglects correlations naturally gives reasonable results only for very small or vanishing correlations.
our approximation, and for ρ = 1, it gives the accurate result of zero bias. If correlations get weaker and approach zero, the gap between the true (simulated) bias and our approximation opens. This is due to the fact that for the short time series of returns (T = 60), the noise (the variance of the measurement errors, σ²/(Tσf²) ≈ 0.0073) is quite large compared to the signal (the variance of the true parameters, σβ² = 0.0202). For longer time series, i.e., for smaller measurement errors and thus a larger signal-to-noise ratio, our approximation would give even better results than shown here. Neglecting the correlations altogether using the result of Richardson and Wu (98.16), we get an accurate value of the bias in the case of zero correlations. Their formula gives accurate results even in the case of large measurement errors. However, introducing correlations between the errors has a significant effect on the bias, so that neglecting them no longer gives reasonable predictions of the bias. This is clearly seen in Figure 98.3 in the gap opening between the true bias and the prediction of Richardson and Wu as the strength of the correlation increases. The importance of correlations in practical asset-pricing tests has already been seen in Figure 98.2, where the bias was largely suppressed due to correlations. The tendency for positive correlations in the error terms of the 25 Fama–French portfolios has already been indicated above. Defining the average correlation as the ratio between the average covariance and the average variance, ρ̄ := σ̄ᵢⱼ/σ̄ᵢ² = 0.2804 clearly demonstrates the tendency for rather strong positive correlations in the idiosyncratic returns of the 25 Fama–French portfolios.13

98.4 Conclusion

We study the effect of cross-sectionally correlated idiosyncratic errors on the estimate of a pricing model's factor risk premium, which one gets using a two-pass regression approach.
Cross-sectionally correlated idiosyncratic errors in asset returns lead to correlated measurement errors in the asset betas used as explanatory variables in the second-pass cross-sectional regression. It is well known that uncorrelated measurement errors lead to a downward bias in a regression's OLS slope estimate. In the two-pass regression approach, the slope estimate in the second-pass regression serves as an estimate of the pricing model's factor risk premium. Using Taylor approximation, we

13 An average correlation of approximately 0.3 has also been found by Kim (1995) for the case of NYSE/AMEX securities grouped into 25 portfolios. More recently, Pollet and Wilson (2010) found an average correlation of 0.237 among the 500 largest CRSP securities.
derive an analytic expression for the bias of the slope estimator in the presence of small, but possibly correlated, measurement errors. While our result accords with the known fact that larger measurement errors cause a stronger attenuation bias in the slope estimate, it reveals the new effect that positive correlations in the measurement errors reduce the bias. An intuitive explanation is that in the presence of positive correlations, the errors in the explanatory variables tend to go in the same direction and thus shift, rather than disturb, the explanatory variables. We show that the cross-sectional correlations found in empirical returns of assets typically used in pricing tests have a substantial impact on the bias in the estimate of the factor risk premium. In fact, using simulations parametrized by correlations found in empirical returns, we find good agreement between our analytical prediction of the bias and the actual (true) bias determined as a simulation average. Neglecting correlations, however, would lead to a substantial overestimation of the bias. Thus, our result not only helps to better understand the impact of correlations qualitatively but also shows very good quantitative agreement with results found in simulations based on empirical data.
Bibliography

R. J. Bianchi, M. E. Drew, and T. Whittaker (2016). The Predictive Performance of Asset Pricing Models: Evidence from the Australian Securities Exchange. Review of Pacific Basin Financial Markets and Policies, 19, 1650023.
F. Black, M. C. Jensen, and M. Scholes (1972). The Capital Asset Pricing Model: Some Empirical Tests. In M. C. Jensen (ed.), Studies in the Theory of Capital Markets, New York: Praeger.
G. Connor and R. A. Korajczyk (1988). Risk and Return in an Equilibrium APT: Application of a New Test Methodology. Journal of Financial Economics, 21, 255–289.
G. Connor and R. A. Korajczyk (1991). The Attributes, Behavior, and Performance of U.S. Mutual Funds. Review of Quantitative Finance and Accounting, 1, 5–26.
G. Connor and R. A. Korajczyk (2010). Factor Models of Asset Returns. In R. Cont (ed.), Encyclopedia of Quantitative Finance. Chichester: Wiley.
J. Durbin (1954). Errors in Variables. Review of the International Statistical Institute, 22, 23–32.
E. F. Fama and K. R. French (2004). The Capital Asset Pricing Model: Theory and Evidence. Journal of Economic Perspectives, 18, 25–46.
E. F. Fama and K. R. French (2015). A Five-Factor Asset Pricing Model. Journal of Financial Economics, 116, 1–22.
E. F. Fama and J. MacBeth (1973). Risk, Return, and Equilibrium: Empirical Tests. Journal of Political Economy, 81, 607–636.
W. H. Greene (2011). Econometric Analysis, 7th International ed. New York: Pearson.
R. Jagannathan, G. Skoulakis, and Z. Wang (2010). The Analysis of the Cross-Section of Security Returns. In Y. Aït-Sahalia and L. Hansen (eds.), Handbook of Financial Econometrics: Applications. Amsterdam: Elsevier, pp. 73–114.
M. C. Jensen (1968). The Performance of Mutual Funds in the Period 1945–1964. The Journal of Finance, 23, 389–416.
M. C. Jensen (1972). Capital Markets: Theory and Evidence. The Bell Journal of Economics and Management Science, 3, 357–398.
R. Kan, C. Robotti, and J. Shanken (2013). Pricing Model Performance and the Two-Pass Cross-Sectional Regression Methodology. The Journal of Finance, 68, 2617–2649.
D. Kim (1995). The Errors in Variables Problem in the Cross-Section of Expected Stock Returns. The Journal of Finance, 50, 1605–1634.
D. Kim (2010). Issues Related to the Errors-in-Variables Problems in Asset Pricing Tests. In C. Lee and J. Lee (eds.), Handbook of Quantitative Finance and Risk Management, New York: Springer, pp. 1091–1108.
S. Kim and G. Skoulakis (2018). Ex-Post Risk Premia Estimation and Asset Pricing Tests Using Large Cross Sections: The Regression-Calibration Approach. Journal of Econometrics, 204, 159–188.
J. Lintner (1965). The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets. Review of Economics and Statistics, 47, 13–37.
R. H. Litzenberger and K. Ramaswamy (1979). The Effect of Personal Taxes and Dividends on Capital Asset Prices: Theory and Empirical Evidence. Journal of Financial Economics, 7, 163–196.
J. M. Pollet and M. Wilson (2010). Average Correlation and Stock Market Returns. Journal of Financial Economics, 96, 364–380.
D. H. Richardson and D. Wu (1970). Least Squares and Grouping Method Estimators in the Errors in Variables Model. Journal of the American Statistical Association, 65, 724–748.
T. Schlueter and S. Sievers (2014). Determinants of Market Beta: The Impacts of Firm-Specific Accounting Figures and Market Conditions. Review of Quantitative Finance and Accounting, 42, 535–570.
J. Shanken (1992). On the Estimation of Beta-Pricing Models. Review of Financial Studies, 5, 1–33.
W. F. Sharpe (1964). Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk. The Journal of Finance, 19, 425–442.
K. C. J. Wei, C. F. Lee, and A. H. Chen (1991). Multivariate Regression Tests of the Arbitrage Pricing Theory: The Instrumental Variables Approach. Review of Quantitative Finance and Accounting, 1, 191–208.
Appendix 98A: Taylor Expansion of the Slope Estimator

Here, details are given of the derivation of (98.8), the expected value of the OLS slope estimator (98.7). We rewrite the OLS estimator

$$\hat{\gamma}_1 = \frac{\hat{\beta}'\bar{r}}{\hat{\beta}'\hat{\beta}} = f(\hat{\beta}, \bar{r})$$

as a function of the vectors $\hat{\beta} := \beta + u$ with $\beta := (\beta_1 - \bar{\beta}, \ldots, \beta_N - \bar{\beta})'$ and $u := (u_1 - \bar{u}, \ldots, u_N - \bar{u})'$, and $\bar{r} := E(r) + v$ with $E(r) := (E(r_1) - \overline{E(r)}, \ldots, E(r_N) - \overline{E(r)})'$ and $v := (v_1 - \bar{v}, \ldots, v_N - \bar{v})'$. Note that $E(r)$ and $\beta$ are vectors of (known or unknown) parameters, whereas $u$ and $v$ are vectors
of random variables.14 In our case, $u$ as well as $v$ are (multivariate) normally distributed with $E[u] = E[v] = 0$. The measurement errors $u_i$ in the betas and $v_j$ in the mean returns are uncorrelated, i.e., $E[u_i v_j] = 0$.15

We expand $f(\hat{\beta}, \bar{r})$ in a Taylor series around the expected values $E(\hat{\beta}) = \beta$ and $E(\bar{r}) = E(r)$ up to quadratic terms in the errors $u$ and $v$ and get

$$f(\hat{\beta}, \bar{r}) = f(\beta, E(r)) + (\nabla_{\hat{\beta}} f) \cdot u + (\nabla_{\bar{r}} f) \cdot v + \frac{1}{2}\left[ u' H_{\hat{\beta}\hat{\beta}}\, u + u' H_{\hat{\beta}\bar{r}}\, v + v' H_{\bar{r}\hat{\beta}}\, u + v' H_{\bar{r}\bar{r}}\, v \right] + \cdots$$

with the Hessian matrix $H_{\hat{\beta}\hat{\beta}}$ containing the second-order derivatives with respect to $\hat{\beta}$, $H_{\hat{\beta}\bar{r}}$ and $H_{\bar{r}\hat{\beta}}$ containing the mixed derivatives, and $H_{\bar{r}\bar{r}}$ containing the second-order derivatives with respect to $\bar{r}$. However, since $f(\hat{\beta}, \bar{r})$ is linear in $\bar{r}$, this last term vanishes, $H_{\bar{r}\bar{r}} = 0$. Taking the expected value $E[f(\hat{\beta}, \bar{r})]$, not only do the first-order terms vanish, but, due to the independence of the errors $u$ and $v$, both terms with mixed derivatives vanish as well. Since the errors $u$ are (multivariate) normally distributed, the Taylor series converges if the errors are sufficiently small, i.e., if the variance of the errors $u$ is small compared to the variance of the true parameters $\beta$. We cut off the series after the second-order term and get

$$E[f(\hat{\beta}, \bar{r})] \approx f(\beta, E(r)) + \frac{1}{2}\, E[u' H_{\hat{\beta}\hat{\beta}}\, u].$$

Using the pricing model (98.1), we can write

$$E(r) = \begin{pmatrix} E(r_1) - \overline{E(r)} \\ \vdots \\ E(r_N) - \overline{E(r)} \end{pmatrix} = \begin{pmatrix} \gamma_0 + \gamma_1\beta_1 - (\gamma_0 + \gamma_1\bar{\beta}) \\ \vdots \\ \gamma_0 + \gamma_1\beta_N - (\gamma_0 + \gamma_1\bar{\beta}) \end{pmatrix} = \begin{pmatrix} \gamma_1(\beta_1 - \bar{\beta}) \\ \vdots \\ \gamma_1(\beta_N - \bar{\beta}) \end{pmatrix} = \gamma_1 \beta,$$

14 The above-defined vectors are centered, i.e., their elements measure the deviation of the values of individual assets from the corresponding cross-sectional average.
15 Using a time series of returns generated by the market model (98.2), it can be shown that the deviation of the sample mean return from its expected value, $v_i := \bar{r}_i - E(r_i)$, is uncorrelated with the deviation of the OLS slope estimate from the true parameter, $u_i := \hat{\beta}_i - \beta_i$. In Appendix 98E, we give a detailed derivation of this observation.
so that we have $f(\beta, E(r)) = \beta' E(r)/\beta'\beta = \gamma_1$. For the second term, we have to find the second-order derivatives in the Hessian matrix $H_{\hat{\beta}\hat{\beta}}$. The derivative of $f(\hat{\beta}, \bar{r})$ with respect to $\hat{\beta}_i$ is

$$\frac{\partial f}{\partial \hat{\beta}_i} = \frac{\bar{r}_i}{\hat{\beta}'\hat{\beta}} - 2\,\frac{\hat{\beta}'\bar{r}}{(\hat{\beta}'\hat{\beta})^2}\,\hat{\beta}_i.$$

The second-order derivative of $f(\hat{\beta}, \bar{r})$ with respect to $\hat{\beta}_i$ and $\hat{\beta}_j$ is

$$\frac{\partial^2 f}{\partial \hat{\beta}_i\, \partial \hat{\beta}_j} = -2\,\frac{\hat{\beta}'\bar{r}}{(\hat{\beta}'\hat{\beta})^2}\,\delta_{ij} - \frac{2}{(\hat{\beta}'\hat{\beta})^2}\,(\hat{\beta}_j \bar{r}_i + \hat{\beta}_i \bar{r}_j) + 8\,\frac{\hat{\beta}'\bar{r}}{(\hat{\beta}'\hat{\beta})^3}\,\hat{\beta}_i \hat{\beta}_j,$$

where $\delta_{ij} = 1$ if $i = j$ and zero otherwise. Evaluating the second-order derivatives at the point $\hat{\beta} = \beta$ and $\bar{r} = E(r)$, we find the matrix elements of the Hessian matrix:

$$\{H_{\hat{\beta}\hat{\beta}}\}_{ij} = \left.\frac{\partial^2 f}{\partial \hat{\beta}_i\, \partial \hat{\beta}_j}\right|_{\hat{\beta}=\beta,\ \bar{r}=E(r)} = -2\gamma_1\,\frac{\delta_{ij}}{\beta'\beta} + 4\gamma_1\,\frac{(\beta_i - \bar{\beta})(\beta_j - \bar{\beta})}{(\beta'\beta)^2},$$

where we used $E(r_i) - \overline{E(r)} = \gamma_1(\beta_i - \bar{\beta})$ and $\beta' E(r)/\beta'\beta = \gamma_1$. In matrix notation, we can write

$$H_{\hat{\beta}\hat{\beta}} = -2\gamma_1\,\frac{1}{\beta'\beta}\,I + 4\gamma_1\,\frac{1}{(\beta'\beta)^2}\,\beta\beta',$$

where $I$ is the identity matrix and $\beta\beta'$ is the matrix created by multiplication of the column vector $\beta$ with the row vector $\beta'$ (outer product). We finally arrive at the approximation of the expected value of the slope estimator:

$$E[\hat{\gamma}_1] \approx \gamma_1 \left( 1 - \frac{E[u'u]}{\beta'\beta} + 2\,\frac{\beta'\, E[uu']\,\beta}{(\beta'\beta)^2} \right).$$

Note that in the second term, we used the fact that we can rearrange

$$u'\beta\beta' u = \sum_{ij} (\beta_i - \bar{\beta})(\beta_j - \bar{\beta})(u_i - \bar{u})(u_j - \bar{u}) = \beta'\, uu'\, \beta.$$
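The analytic Hessian just derived can be verified against central finite differences of $f(\hat{\beta}, \bar{r}) = \hat{\beta}'\bar{r}/(\hat{\beta}'\hat{\beta})$ evaluated at $\hat{\beta} = \beta$, $\bar{r} = E(r)$. The values of N, γ₁, and β below are arbitrary test inputs.

```python
import numpy as np

rng = np.random.default_rng(1)
N, gamma1 = 6, 0.5
beta = rng.normal(1.0, 0.3, N)
beta -= beta.mean()                  # centered vector of true betas
Er = gamma1 * beta                   # E(r) = gamma1 * beta under the pricing model

def f(b):
    # slope estimator as a function of beta_hat, with r_bar fixed at E(r)
    return (b @ Er) / (b @ b)

# Analytic Hessian at beta_hat = beta:
# {H}_ij = -2*gamma1*delta_ij/(b'b) + 4*gamma1*b_i*b_j/(b'b)^2
bb = beta @ beta
H_analytic = -2 * gamma1 * np.eye(N) / bb + 4 * gamma1 * np.outer(beta, beta) / bb ** 2

# Central finite differences for the mixed second derivatives.
h = 1e-4
H_fd = np.zeros((N, N))
E = np.eye(N)
for i in range(N):
    for j in range(N):
        H_fd[i, j] = (f(beta + h * E[i] + h * E[j]) - f(beta + h * E[i] - h * E[j])
                      - f(beta - h * E[i] + h * E[j]) + f(beta - h * E[i] - h * E[j])) / (4 * h * h)

print(np.abs(H_fd - H_analytic).max())   # finite-difference error, expected to be small
```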
The expected value $E[u'u]$ is a scalar:

$$E[u'u] = E\left[\sum_{i=1}^N (u_i - \bar{u})^2\right] = E\left[\frac{N-1}{N}\sum_{i=1}^N u_i^2 - \frac{1}{N}\sum_{\substack{i,j=1 \\ i \neq j}}^N u_i u_j\right]$$

$$= (N-1)\left(\frac{1}{N}\sum_{i=1}^N E[u_i^2] - \frac{1}{N(N-1)}\sum_{\substack{i,j=1 \\ i \neq j}}^N E[u_i u_j]\right)$$

$$= (N-1)\,\frac{1}{T\sigma_f^2}\left(\frac{1}{N}\sum_{i=1}^N \sigma_i^2 - \frac{1}{N(N-1)}\sum_{\substack{i,j=1 \\ i \neq j}}^N \sigma_{ij}\right) = (N-1)\,\sigma_u^2,$$

with the definition of the cross-sectional variation of the measurement errors $\sigma_u^2$ given in (98.9). Note that $\sigma_u^2$ is simply the difference between the average variance and the average covariance of the measurement errors $u_i$. We call $\sigma_u^2$ the cross-sectional variation of the measurement errors. In the second term, we rewrite the $N \times N$ matrix $uu'$ using $\tilde{u} := (u_1, \ldots, u_N)' = u + \bar{u}\,1_N$ to get

$$uu' = \tilde{u}\tilde{u}' - \bar{u}\,(1_N \tilde{u}' + \tilde{u}\,1_N') + \bar{u}^2\, 1_N 1_N',$$

and the expected value

$$E[uu'] = E[\tilde{u}\tilde{u}'] - 1_N\, E[\bar{u}\tilde{u}'] - E[\bar{u}\tilde{u}]\,1_N' + E[\bar{u}^2]\,1_N 1_N'.$$

In the expression $\beta'\, E[uu']\,\beta$, only the first term containing $E[\tilde{u}\tilde{u}'] = \Sigma_u$ persists, since $\beta' 1_N = 1_N'\beta = \sum_{i=1}^N (\beta_i - \bar{\beta}) = 0$. Putting it all together, we arrive at the final result:

$$E[\hat{\gamma}_1] \approx \gamma_1 \left( 1 - \frac{\sigma_u^2}{\sigma_\beta^2} + \frac{2}{N-1}\,\frac{W_\Sigma}{\sigma_\beta^2} \right),$$

where $\sigma_\beta^2 := \beta'\beta/(N-1)$ and $W_\Sigma := \beta'\Sigma_u\beta/\beta'\beta$, as defined in (98.10) and (98.11).
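The two expectation values used above, $E[u'u]$ and $\beta' E[uu']\beta$, can be verified by direct simulation: draw raw errors $\tilde{u}$ from an arbitrary positive-definite $\Sigma_u$, center them, and compare Monte Carlo averages against the analytic values. All numbers below are arbitrary test inputs.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 10, 200_000
A = rng.normal(size=(N, N))
Sigma_u = A @ A.T / N + np.eye(N)      # arbitrary positive-definite Sigma_u
beta = rng.normal(1.0, 0.2, N)
b = beta - beta.mean()                 # centered beta vector

u_raw = rng.multivariate_normal(np.zeros(N), Sigma_u, size=M)   # tilde-u draws
u = u_raw - u_raw.mean(axis=1, keepdims=True)                   # centered: u_i - u_bar

# E[u'u] = (N-1)*sigma_u^2 with sigma_u^2 = average variance - average covariance:
avg_var = np.trace(Sigma_u) / N
avg_cov = (Sigma_u.sum() - np.trace(Sigma_u)) / (N * (N - 1))
print((u ** 2).sum(axis=1).mean(), (N - 1) * (avg_var - avg_cov))

# beta' E[uu'] beta = beta' Sigma_u beta (the 1_N terms drop out since beta'1_N = 0):
print(((u @ b) ** 2).mean(), b @ Sigma_u @ b)
```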
Appendix 98B: Case of a Two-Parameter Covariance Matrix

We derive the expression for the bias in the slope estimator (98.13) for the special case discussed in Section 98.2.2, where the variances of the idiosyncratic errors of all assets are equal to $\sigma^2$ and the covariance between any two assets is equal to $\rho\sigma^2$. In this special case, the cross-sectional variation
of the measurement errors is

$$\sigma_u^2 = \frac{1}{T\sigma_f^2}\left\{\frac{1}{N}\sum_{i=1}^N \sigma^2 - \frac{1}{N(N-1)}\sum_{\substack{i,j=1 \\ i \neq j}}^N \rho\sigma^2\right\} = \frac{1}{T\sigma_f^2}\,\sigma^2(1-\rho).$$

To evaluate the Rayleigh quotient (98.11), we first write the covariance matrix of the measurement errors (98.5) as

$$\Sigma_u = \frac{1}{T\sigma_f^2}\,\Sigma = \frac{1}{T\sigma_f^2}\,\{\sigma^2 I + \rho\sigma^2 (J - I)\},$$

where $I$ is the identity matrix and $J$ is the matrix of ones. Since $\beta = (\beta_1 - \bar{\beta}, \ldots, \beta_N - \bar{\beta})'$ is a centered vector, we have $\beta' J \beta = 0$ and hence

$$\beta'\, \Sigma_u\, \beta = \frac{\sigma^2}{T\sigma_f^2}\,\beta'\beta\,(1-\rho),$$

so that the Rayleigh quotient (98.11) becomes

$$W_\Sigma = \frac{\beta'\, \Sigma_u\, \beta}{\beta'\beta} = \frac{1}{T\sigma_f^2}\,\sigma^2(1-\rho).$$

Finally, plugging in the above-derived expressions for $\sigma_u^2$ and $W_\Sigma$ into (98.12), we get the following for the relative bias of the slope estimator:

$$\frac{E(\hat{\gamma}_1) - \gamma_1}{\gamma_1} \approx -\frac{\sigma_u^2}{\sigma_\beta^2} + \frac{2}{N-1}\,\frac{W_\Sigma}{\sigma_\beta^2} = -\frac{\sigma^2(1-\rho)}{T\sigma_f^2\,\sigma_\beta^2}\left(1 - \frac{2}{N-1}\right).$$

Appendix 98C: Expansion of the Slope Estimator of Richardson and Wu

Here, we expand the expression for the bias in the slope estimator (98.16), as given by Richardson and Wu (1970), to lowest order in the ratio of the measurement error $\sigma^2/(T\sigma_f^2)$ (the noise) to the variance of the true parameters $\sigma_\beta^2$ (the signal). Keeping in (98.16) only terms up to order $1/(N-1)$, we have

$$\frac{E(\hat{\gamma}_1) - \gamma_1}{\gamma_1} \approx -\frac{1}{1+\tau}\left(1 - \frac{2}{N-1}\,\frac{\tau^2}{(1+\tau)^2}\right),$$

where $1/\tau = \sigma^2/(T\sigma_f^2\sigma_\beta^2)$. To expand this expression in orders of $1/\tau$, we first rewrite it:

$$\frac{E(\hat{\gamma}_1) - \gamma_1}{\gamma_1} \approx -\frac{1}{\tau}\cdot\frac{1}{1+1/\tau}\left(1 - \frac{2}{N-1}\,\frac{1}{(1+1/\tau)^2}\right).$$
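Both of the closed forms above lend themselves to quick machine checks. The sketch below (a) plugs the two-parameter matrix Σ = σ²[(1 − ρ)I + ρJ] into the general bias formula (98.12) numerically and compares it with the closed form, and (b) expands the rewritten Richardson–Wu expression symbolically in the small parameter 1/τ. All numerical values are arbitrary test inputs.

```python
import numpy as np
import sympy as sp

# (a) Numerical check of the two-parameter case against the general formula (98.12)
N, T, sf2, s2, rho = 25, 60, 20.0, 8.413, 0.3
beta = np.linspace(0.8, 1.4, N)
b = beta - beta.mean()
sb2 = b @ b / (N - 1)

Sigma_u = s2 * ((1 - rho) * np.eye(N) + rho * np.ones((N, N))) / (T * sf2)
avg_var = np.trace(Sigma_u) / N
avg_cov = (Sigma_u.sum() - np.trace(Sigma_u)) / (N * (N - 1))
sigma_u2 = avg_var - avg_cov                       # cross-sectional variation (98.9)
W = b @ Sigma_u @ b / (b @ b)                      # Rayleigh quotient (98.11)
bias_general = -sigma_u2 / sb2 + 2 * W / ((N - 1) * sb2)
bias_closed = -(s2 * (1 - rho) / (T * sf2 * sb2)) * (1 - 2 / (N - 1))
assert np.isclose(bias_general, bias_closed)

# (b) Symbolic expansion of the Richardson-Wu bias; x stands for 1/tau
x, n = sp.symbols('x n', positive=True)
bias_rw = -x / (1 + x) * (1 - 2 / (n - 1) / (1 + x) ** 2)
leading = sp.series(bias_rw, x, 0, 2).removeO()    # keep terms up to order x
print("leading order:", sp.simplify(leading))      # matches -x*(1 - 2/(n-1))
```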
Keeping only terms of lowest order in $1/\tau$ yields

$$\frac{E(\hat{\gamma}_1) - \gamma_1}{\gamma_1} \approx -\frac{1}{\tau}\left(1 - \frac{2}{N-1}\right) = -\frac{\sigma^2}{T\sigma_f^2\,\sigma_\beta^2}\left(1 - \frac{2}{N-1}\right),$$

which exactly reproduces our equation (98.15).

Appendix 98D: From Idiosyncratic Errors to Measurement Errors

Here, we give the details of the derivation of the covariance matrix of the measurement errors (98.5). The OLS slope estimator $\hat{\beta}_i$ of the time-series regression (98.2) can be written, cf. (98.4), as $\hat{\beta}_i = \beta_i + u_i$, with the measurement error term

$$u_i = \frac{\sum_t f_t \varepsilon_{it} - \frac{1}{T}\sum_t f_t \sum_t \varepsilon_{it}}{\sum_t f_t^2 - \frac{1}{T}\left(\sum_t f_t\right)^2},$$

where the summations are over times $t = 1, \ldots, T$. Since the expected value of the idiosyncratic errors $\varepsilon_{it}$ is zero and idiosyncratic errors and factors are assumed to be uncorrelated, it is immediately obvious that, given the factors $f_t$, the expected value of the measurement errors $u_i$ is zero, so that we have $E(u_i) = 0$ and $\mathrm{Cov}(u_i, u_j) = E(u_i u_j)$. We expand as follows:

$$u_i u_j = \frac{\left(\sum_t f_t \varepsilon_{it} - \frac{1}{T}\sum_t f_t \sum_t \varepsilon_{it}\right)\left(\sum_t f_t \varepsilon_{jt} - \frac{1}{T}\sum_t f_t \sum_t \varepsilon_{jt}\right)}{\left(\sum_t f_t^2 - \frac{1}{T}\left(\sum_t f_t\right)^2\right)^2}$$

$$= \frac{\sum_{t,t'} f_t f_{t'}\, \varepsilon_{it}\varepsilon_{jt'} - \frac{1}{T}\left(\sum_t f_t\right)\left(\sum_{t,t'} f_t\, \varepsilon_{it}\varepsilon_{jt'} + \sum_{t,t'} f_t\, \varepsilon_{jt}\varepsilon_{it'}\right) + \frac{1}{T^2}\left(\sum_t f_t\right)^2 \sum_{t,t'} \varepsilon_{it}\varepsilon_{jt'}}{\left(\sum_t f_t^2 - \frac{1}{T}\left(\sum_t f_t\right)^2\right)^2}.$$

Given the factors $f_t$, the only random variables in this expression are the residuals $\varepsilon_{it}$. In addition, according to our assumption (98.3), the residuals are i.i.d. over time with covariance structure:

$$E(\varepsilon_{it}\varepsilon_{jt'}) = \begin{cases} \sigma_{ij} & \text{if } t = t', \\ 0 & \text{otherwise.} \end{cases}$$
Using the symmetry $\sigma_{ij} = \sigma_{ji}$ and the abbreviation $T\sigma_f^2 := \sum_t f_t^2 - T^{-1}\left(\sum_t f_t\right)^2$, we get

$$E(u_i u_j) = \frac{\sigma_{ij}\sum_t f_t^2 - \frac{1}{T}(\sigma_{ij} + \sigma_{ji})\left(\sum_t f_t\right)^2 + \frac{1}{T}\sigma_{ij}\left(\sum_t f_t\right)^2}{\left(\sum_t f_t^2 - \frac{1}{T}\left(\sum_t f_t\right)^2\right)^2} = \frac{\sigma_{ij}}{T\sigma_f^2}.$$

Appendix 98E: Independence of the Errors in the Slope Estimate and the Sample Mean

Consider a time series of returns $r_{it}$ generated by the market model (98.2). The deviation of the sample average of returns from its true expected value is

$$v_j := \bar{r}_j - E(r_j) = \beta_j\,\frac{1}{T}\sum_t \left(f_t - E(f_t)\right) + \frac{1}{T}\sum_t \varepsilon_{jt}.$$

The deviation of the slope estimate from its true value is (cf. Appendix 98D)

$$u_i = \frac{\sum_t f_t \varepsilon_{it} - \frac{1}{T}\sum_t f_t \sum_t \varepsilon_{it}}{\sum_t f_t^2 - \frac{1}{T}\left(\sum_t f_t\right)^2}.$$

By assumption, the expected value of the idiosyncratic errors $\varepsilon_{it}$ is zero, so that the expected value of both deviations is zero as well; hence $\mathrm{Cov}(u_i, v_j) = E(u_i v_j)$. Furthermore, in $E(u_i v_j)$, all terms linear in $\varepsilon_{it}$ vanish. Given the factors, we have

$$E(u_i v_j) = \frac{\frac{1}{T}\sum_{t,t'} f_t\, E(\varepsilon_{it}\varepsilon_{jt'}) - \frac{1}{T^2}\sum_t f_t \sum_{t,t'} E(\varepsilon_{it}\varepsilon_{jt'})}{\sum_t f_t^2 - \frac{1}{T}\left(\sum_t f_t\right)^2}.$$

Using the fact that $E(\varepsilon_{it}\varepsilon_{jt'}) = \sigma_{ij}$ if $t = t'$ and zero otherwise, the expression simplifies to

$$E(u_i v_j) = \frac{\frac{1}{T}\,\sigma_{ij}\sum_t f_t - \frac{1}{T}\,\sigma_{ij}\sum_t f_t}{\sum_t f_t^2 - \frac{1}{T}\left(\sum_t f_t\right)^2} = 0,$$

thus showing that the deviations $u_i$ and $v_j$ are uncorrelated.
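Both appendix results — $E(u_i u_j) = \sigma_{ij}/(T\sigma_f^2)$ from 98D and $\mathrm{Cov}(u_i, v_j) = 0$ from 98E — can be checked by brute-force simulation of the market model. The factor series and the 2 × 2 error covariance below are arbitrary test values; with the factors held fixed across draws, the $\beta_j$ term in $v_j$ is a constant and drops out of the covariance.

```python
import numpy as np

rng = np.random.default_rng(2)
T, M = 60, 50_000                     # time-series length, Monte Carlo draws
sigma = np.array([[4.0, 1.2],
                  [1.2, 3.0]])        # idiosyncratic covariance of two assets
f = rng.normal(0.5, 4.0, T)           # one fixed realization of the factor
Tsf2 = f @ f - f.sum() ** 2 / T       # T*sigma_f^2 as defined in the appendix

# Simulate the residuals of the market model and collect, per draw,
# the OLS slope errors u_i and the random part of the mean-return errors v_i.
eps = rng.multivariate_normal(np.zeros(2), sigma, size=(M, T))   # shape (M, T, 2)
fc = f - f.mean()
u = np.einsum('t,mti->mi', fc, eps) / Tsf2    # slope-estimate errors
v = eps.mean(axis=1)                          # mean-return errors (eps part only)

print(np.cov(u[:, 0], u[:, 1])[0, 1] * Tsf2)  # should be close to sigma[0,1] = 1.2
print(np.corrcoef(u[:, 0], v[:, 1])[0, 1])    # should be close to 0
```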
Chapter 99
Asset Pricing with Disequilibrium Price Adjustment: Theory and Empirical Evidence∗

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

Chiung-Min Tsai
The Central Bank of China
e-mail: [email protected]

Alice C. Lee
Center for PBBEF Research
e-mail: alice.fi[email protected]

∗This chapter is a reprint of the paper "Asset pricing with disequilibrium price adjustment: theory and empirical evidence," which was published in Quantitative Finance, Vol. 13, Issue 2, pp. 227–239 (2011).

Contents
99.1 Introduction . . . . . . . . . . 3492
99.2 Development of Multiperiod Dynamic Asset Pricing Model . . . . . . . . . . 3494
    99.2.1 The demand function for capital assets . . . . . . . . . . 3494
    99.2.2 Supply function of securities . . . . . . . . . . 3495
99.3 Development of Disequilibrium Model for Asset Pricing . . . . . . . . . . 3497
99.4 Alternative Methods of Estimating Asset Pricing Model with Disequilibrium Effect . . . . . . . . . . 3499
    99.4.1 Estimation methods and hypothesis of testing price adjustment process . . . . . . . . . . 3499
    99.4.2 Two-stage least squares (2SLS) estimator . . . . . . . . . . 3500
    99.4.3 Maximum likelihood estimator . . . . . . . . . . 3501
    99.4.4 Testing of the price adjustment process . . . . . . . . . . 3502
    99.4.5 Estimating disequilibrium adjustment parameter, γ . . . . . . . . . . 3502
99.5 Data Description and Testing the Existence of Price Adjustment Process . . . . . . . . . . 3503
    99.5.1 Data description . . . . . . . . . . 3503
    99.5.2 International equity markets — country indices . . . . . . . . . . 3504
    99.5.3 United States equity markets . . . . . . . . . . 3505
    99.5.4 Testing the existence of the price adjustment process . . . . . . . . . . 3508
99.6 Summary and Concluding Remarks . . . . . . . . . . 3511
Bibliography . . . . . . . . . . 3511
Appendix 99.A: Estimation of the Disequilibrium Model . . . . . . . . . . 3513
Appendix 99.B: Structural Coefficient Estimates . . . . . . . . . . 3515
Abstract

Breeden (1979), Grinols (1984), and Cox et al. (1985) describe the importance of the supply side for capital asset pricing. Black (1976) derives a dynamic, multi-period CAPM, integrating endogenous demand and supply. However, this theoretically elegant model has never been empirically tested for its implications in dynamic asset pricing. We first review and theoretically extend Black's CAPM to allow for a price adjustment process. We then derive the disequilibrium model for asset pricing in terms of the disequilibrium framework developed by Fair and Jaffee (1972), Amemiya (1974), Quandt (1988), and others. We discuss two methods of estimating an asset pricing model with a disequilibrium price adjustment effect. Finally, using price per share, dividend per share, and outstanding-shares data, we test for the existence of a price disequilibrium adjustment process with international index data and US equity data. We find that there exists a disequilibrium price adjustment process in our empirical data. Our results support Lo and Wang's (2000) finding that trading volume is one of the important factors in determining capital asset pricing.

Keywords Multiperiod dynamic CAPM • Demand function • Supply function • Disequilibrium model • Disequilibrium effect • Two-stage least squares (2SLS) estimator • Maximum likelihood estimator.
99.1 Introduction

Breeden (1979), Grinols (1984), and Cox et al. (1985) describe the importance of the supply side for capital asset pricing. Grinols focuses on describing market optimality and supply decisions that guide firms in incomplete markets in the absence of investor unanimity. Cox et al. study a restricted technology to explicitly solve their model in reduced form. Cheng and Grauer (1980) have used unit price theory to show that the static CAPM can be written in terms of price per share and dividend per share. Black (1976) extends
the traditional static CAPM derived by Sharpe (1964), Lintner (1965), and Mossin (1966) by explicitly allowing for the endogenous supply effect of risky securities in a dynamic asset pricing model.1 Black concludes that if the supply of a risky asset is responsive to its price, large price changes will be spread over time, as specified by the dynamic capital asset pricing model. One important implication of Black's model is that the efficient market hypothesis holds only if the supply of securities is fixed and independent of current prices. In short, Black's dynamic CAPM adopts an endogenous supply effect of risky securities by setting supply equal to demand in equilibrium. Lee and Gweon (1986) and Lee et al. (2009) extend Black's framework to allow time-varying dividend payments and then test for the existence of a supply effect in the situation of market equilibrium. Their results reject the null hypothesis of no supply effect in the US domestic stock market. Campbell et al. (1993) and Lo and Wang (2000) have studied the relationship between aggregate stock market trading volume and the serial correlation of daily stock returns. Campbell et al. (1993) find that a stock price decline on a high-volume day is more likely than a stock price decline on a low-volume day. They propose that trading volume changes when random shifts in the stock demand of non-informational traders are accommodated by the risk-averse market makers. Lo and Wang (2000) derive an intertemporal CAPM (ICAPM) by defining preference for wealth, instead of consumption, introducing three state variables into the exponential terms of the investor's preference, as we do in this paper. This state-dependent utility function allows us to capture the dynamic nature of the investment decision without explicitly solving a dynamic optimization problem. Thus, the marginal utility of wealth depends not only on the dividend of the portfolio but also on future state variables.
That is, this dependence forces investors to care about future market conditions when choosing their portfolios. In equilibrium, this model also implies that an investor's utility depends not only on his wealth but also on the stock payoffs directly. This "market spirit," in their terminology, affects investors' demand for the stocks. Black (1976), Lee and Gweon (1986), Lee et al. (2009), and Lo and Wang (2000) develop models by using either outstanding shares or trading volumes as variables to connect the decisions in two different periods, unlike the consumption-based CAPM, which uses consumption or macroeconomic information. Thus, the information on quantities demanded and supplied can
This dynamic asset pricing model is different from Merton’s (1973) intertemporal asset pricing model in two key aspects. First, Black’s model is derived in the form of simultaneous equations. Second, Black’s model is derived in terms of price change, and Merton’s model is derived in terms of rates of return.
July 6, 2020
15:56
3494
Handbook of Financial Econometrics,. . . (Vol. 3)
9.61in x 6.69in
b3568-v3-ch99
C. F. Lee, C.-M. Tsai & A. C. Lee
now play a role in determining the asset price. This suggests a wealth-based model as an alternative approach to investigating the intertemporal CAPM.2 The paper is structured as follows. In Section 99.2, a simultaneous equation system of asset pricing is constructed through a multi-period equation to represent the dynamic relationship between the supply of and demand for capital assets. The derivation of the disequilibrium model for asset pricing is presented in Section 99.3. Section 99.4 discusses alternative methods of estimating the disequilibrium asset pricing model. Section 99.5 describes the three sets of data used in the empirical studies and presents the empirical findings for the hypotheses and tests constructed in the previous sections. Our summary and concluding remarks are presented in Section 99.6.

99.2 Development of Multiperiod Dynamic Asset Pricing Model

Black (1976) generalizes the static wealth-based CAPM by explicitly allowing for the endogenous supply effect of risky securities. The demand for securities is based on the well-known portfolio models of James Tobin (1958) and Harry Markowitz (1959). However, Black further assumes a quadratic cost of changing short-term capital structure under a long-run optimality condition. Lee and Gweon (1986) and Lee et al. (2009) modify and extend Black's framework to allow time-varying dividends and then test the existence of the supply effect under two assumptions that differ from Black's: (1) our model allows for time-varying dividends, unlike Black's assumption of constant dividends; and (2) our model allows only unanticipated random shocks on the supply side.3 Using these assumptions, we first derive the demand function for capital assets, and then we derive the supply function of securities. Next, we solve the demand and supply schedules simultaneously to reexamine the price adjustment behavior when the market is in disequilibrium.
99.2.1 The demand function for capital assets The demand equation for the assets is derived under the standard assumptions of the CAPM. An investor’s objective is to maximize his/her expected 2
It should be noted that Lo and Wang’s model does not explicitly introduce the supply equation in asset pricing determination. Also, one can identify the hedging portfolio using volume data in the Lo and Wang model setting. 3 Black (1976) allows the existence of both anticipated and unanticipated shocks. In other words, our assumption is more restrictive than Black’s assumption.
Asset Pricing with Disequilibrium Price Adjustment
utility in terms of the negative exponential function of wealth:

U = a − h e^{−bWt+1},  (99.1)
where the terminal wealth Wt+1 = Wt(1 + Rt); Wt is initial wealth; and Rt is the rate of return on the portfolio. The parameters a, b, and h are assumed to be constants. Assuming that returns are normally distributed, Lee and Gweon (1986) and Lee et al. (2009) derive the following demand function for the optimal portfolio:

qt+1 = b^{-1}S^{-1}(xt+1 − r*Pt).  (99.2)
The definitions of the variables used in equation (99.2) are as follows: qt+1 = (q1,t+1, q2,t+1, ..., qN,t+1)′, where qj,t+1 is the number of units of security j held after the investor reconstructs his portfolio. S = E[(Xt+1 − xt+1)(Xt+1 − xt+1)′] is the covariance matrix of returns of the risky securities, where Xt+1 = (X1,t+1, X2,t+1, ..., XN,t+1)′, with Xj,t+1 = Pj,t+1 − Pj,t + Dj,t+1 the dollar return on the jth of the N marketable risky securities, and xt+1 = (x1,t+1, x2,t+1, ..., xN,t+1)′ = EtPt+1 − Pt + EtDt+1 the vector of expected returns. r* is the risk-free rate. [Note: Pt = (P1,t, P2,t, ..., PN,t)′, where Pj,t is the price of security j at time t, and Dt+1 is the vector of dividends or coupons on the N securities at time t + 1.]

Under the assumption of homogeneous expectations, that is, all investors hold the same probability beliefs about future returns, the aggregate demand for risky securities can be obtained by summing over the m investors:

Qt+1 = Σ_{k=1}^{m} q^k_{t+1} = cS^{-1}[EtPt+1 − (1 + r*)Pt + EtDt+1],  (99.3)

where c = Σ_{k=1}^{m} (b^k)^{-1}. In the standard CAPM, the supply of securities is fixed, denoted Q*. Equation (99.3) can then be rearranged as Pt = (1/r*)(xt+1 − c^{-1}SQ*), where c^{-1} is the market price of risk. In fact, this equation is similar to Lintner's (1965) well-known capital asset pricing equation.

99.2.2 Supply function of securities

In this section we derive an endogenous supply side for the model and present the resulting hypotheses, which mainly concern market imperfections. For example, the existence of taxes causes firms to borrow more, since interest expense is tax-deductible. The penalties for changing contractual payments (i.e., direct and indirect bankruptcy costs) are material in magnitude, so
the value of the firm would be reduced if the firm increased its borrowing. Another imperfection is the prohibition of short sales of some securities.4 The costs generated by market imperfections reduce the value of a firm, so a firm has an incentive to minimize them. Three further assumptions are made here: first, a firm cannot issue a risk-free security; second, the adjustment costs of changing capital structure are quadratic; and third, the firm is not seeking to raise new funds from the market. It is assumed that a solution to the optimal capital structure exists and that the firm has to determine the optimal level of additional investment. The one-period objective of firm i is to achieve the minimum cost of capital, with adjustment costs involved in changing the quantity vector Qi,t+1:

Min  EtD′i,t+1 Qi,t+1 + (1/2)(ΔQ′i,t+1 Ai ΔQi,t+1),  (99.4)

subject to P′i,t ΔQi,t+1 = 0, where Ai is an ni × ni positive definite matrix of coefficients measuring the assumed quadratic costs of adjustment. If these costs are high enough, firms tend to stop raising new funds or retiring old securities. The solution to equation (99.4) is

ΔQi,t+1 = Ai^{-1}(λi Pi,t − EtDi,t+1),  (99.5)
where λi is the scalar Lagrange multiplier. Aggregating equation (99.5) over the N firms, the supply function is given by

ΔQt+1 = A^{-1}(BPt − EtDt+1),  (99.6)

where

A^{-1} = diag(A1^{-1}, A2^{-1}, ..., AN^{-1}),  B = diag(λ1 I, λ2 I, ..., λN I),  and  Q = (Q′1, Q′2, ..., Q′N)′.

4 Theories as to why taxes and penalties affect capital structure were first proposed by Modigliani and Miller (1958) and later by Miller (1977). Another market imperfection, the prohibition on short sales of securities, can generate "shadow risk premiums" and thus provide further incentives for firms to reduce the cost of capital by diversifying their securities.
Equation (99.6) implies that a lower price for a security will increase the amount of that security retired. In other words, the amount of each security newly issued is positively related to its own price and negatively related to its required return and to the prices of other securities. Equations (99.3) and (99.6) will be used in the following section to develop a disequilibrium system for our empirical analyses in Section 99.5.
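As a numerical sketch of the two schedules, the demand function (99.3) and the supply function (99.6) can be evaluated for a hypothetical two-security market; every parameter value below is made up for illustration and does not come from the chapter's data.

```python
import numpy as np

# Hypothetical two-asset evaluation of demand (99.3) and supply (99.6).
c = 2.0                                   # aggregate risk tolerance, c = sum_k (b^k)^-1
S = np.array([[0.04, 0.01],               # covariance matrix of dollar returns
              [0.01, 0.09]])
r_star = 0.01                             # risk-free rate
P_t = np.array([10.0, 20.0])              # current prices P_t
EP_next = np.array([10.6, 21.0])          # E_t P_{t+1}
ED_next = np.array([0.2, 0.3])            # E_t D_{t+1}

# Demand (99.3): Q_{t+1} = c S^{-1} [E_t P_{t+1} - (1 + r*) P_t + E_t D_{t+1}]
excess = EP_next - (1.0 + r_star) * P_t + ED_next
Q_demand = c * np.linalg.solve(S, excess)

# Supply change (99.6): dQ_{t+1} = A^{-1} (B P_t - E_t D_{t+1}),
# with diagonal adjustment-cost matrix A and B = diag(lambda_i I).
A = np.diag([5.0, 8.0])                   # quadratic adjustment-cost coefficients
B = np.diag([0.02, 0.02])                 # Lagrange multipliers lambda_i
dQ_supply = np.linalg.solve(A, B @ P_t - ED_next)

print(Q_demand)    # units demanded of each security
print(dQ_supply)   # net new issues (negative values mean retirement)
```

The demand vector rises with the expected excess dollar return and falls with risk (through S), while the supply response is damped by the adjustment-cost matrix A, exactly the comparative statics described in the text.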
99.3 Development of Disequilibrium Model for Asset Pricing

The demand and supply functions derived in the previous section can be used to derive a simultaneous equation model for the disequilibrium case. Disequilibrium models have a very long history: all partial adjustment models are, in fact, disequilibrium models. Much of the literature concerning the structure of disequilibrium markets focuses on the commercial loan market and the labor market. For the commercial loan market, the structural disequilibrium is frequently due to government credit rationing for economic policy purposes; for the labor market, it is frequently due to rigid wages. The theory of credit rationing was first developed by Jaffee (1971) for the commercial loan market. One reason for credit rationing is the existence of bankruptcy costs, as proposed by Miller (1977). Given that bankruptcy costs arise when firms fail, banks choose a lower amount of loan offerings than they would if there were no bankruptcy costs. As a result, some firms will not receive loans regardless of the rate they are willing to pay. In this section, we develop a model and methodology similar to those used for commercial loan markets. Early studies of the disequilibrium model of commercial loan markets include Jaffee (1971), Maddala and Nelson (1974), Sealey (1979), and Nehls and Schmidt (2003); the latter use a disequilibrium methodology similar to Sealey's to evaluate whether loans are constrained by demand or supply. In fact, one can see the disequilibrium model as a special case of the simultaneous equation model. Thus, a similar demand and supply schedule is derived here and solved simultaneously to reexamine the price adjustment behavior under the assumption that the market is in disequilibrium. All disequilibrium models share the feature that prices do not fully adjust to the market-clearing level.

The model used throughout this section is a basic model first proposed by Fair and Jaffee (1972) and Amemiya (1974)
and modified as model C in Quandt (1988). This model consists of the following equations5:

QDt = α1Pt + β1XD,t + μt,  (99.7)

QSt = α2Pt + β2XS,t + υt,  (99.8)

Qt = min(QDt, QSt),  (99.9)

ΔPt = Pt − Pt−1 = γ(QDt − QSt),  (99.10)
where QDt and QSt are the quantities of securities demanded and supplied, respectively; Qt is the actual (observed) quantity of securities in the market; Pt is the observed price of securities; XD,t and XS,t are vectors of exogenous or predetermined variables, including the lagged price Pt−1; α1 and α2 are unknown parameters on Pt; β1 and β2 are vectors of unknown parameters on the exogenous variables; γ is an unknown positive scalar parameter; and μt and υt are disturbance terms, assumed to be jointly normal and independent over time with distributions N(0, σμ²) and N(0, συ²), respectively. The difficulty lies in estimating α1, α2, β1, β2, γ, σμ², and συ² with observations of XD,t, XS,t, Qt, and Pt for t = 1, 2, ..., T. Some assumptions must be made about the relationships among Qt, QDt, QSt, and the price adjustment process. A basic assumption is reflected in equation (99.9), which shows that when demand exceeds supply, the observed quantity lies on the supply schedule, and the market is characterized by conditions of excess demand. This assumption is often referred to as voluntary exchange: in the presence of excess demand, sellers cannot be forced to supply more than they wish to supply, and in the presence of excess supply, purchasers cannot be forced to buy more than they wish to buy. Another assumption in this model is that the price adjustment is proportional to excess demand, as shown by the last equation (99.10) of the system. In addition, the model is assumed to be identified by different sets of exogenous variables (i.e., XD,t and XS,t). Clearly, the equation system (99.7)–(99.10) is a special case of the simultaneous equation model. Equations (99.7) and (99.8) form an identified equation system; therefore, from these two equations, we can consistently estimate α1, α2, β1, β2, γ, σμ², and συ² by simultaneous-equation methods.
Since we have equation (99.9), we need to introduce equation (99.10) into the system for the estimation of α1, α2, β1, β2, γ, σμ²,
5
There are four major models and some alternative specifications for modeling disequilibrium (see Quandt, 1988; the time-period notation there differs slightly from the models here).
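A short simulation of the system (99.7)–(99.10) illustrates how the observed quantity switches between the two schedules under the voluntary-exchange rule. All parameter values are hypothetical, and the price update is a recursive approximation (in the model, ΔPt and excess demand are determined simultaneously):

```python
import numpy as np

# Simulate the disequilibrium model C of (99.7)-(99.10) with made-up
# parameters; the observed Q_t is the short side of the market (99.9).
rng = np.random.default_rng(0)
alpha1, alpha2 = -0.5, 0.8          # price coefficients in demand / supply
beta1, beta2 = 1.0, 1.0             # coefficients on exogenous X_D, X_S
gamma = 0.2                         # price-adjustment speed in (99.10)

T = 200
XD = rng.normal(5.0, 1.0, T)
XS = rng.normal(5.0, 1.0, T)
P = np.empty(T + 1)
P[0] = 1.0
Q = np.empty(T)

for t in range(T):
    QD = alpha1 * P[t] + beta1 * XD[t] + rng.normal(0.0, 0.1)
    QS = alpha2 * P[t] + beta2 * XS[t] + rng.normal(0.0, 0.1)
    Q[t] = min(QD, QS)                    # voluntary exchange: short side trades
    P[t + 1] = P[t] + gamma * (QD - QS)   # price rises under excess demand
```

With γ > 0 but finite, prices chase the market-clearing level without fully reaching it within a period, which is exactly the partial-adjustment feature all disequilibrium models share.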
and συ². However, one primary problem exists in this disequilibrium model: QDt and QSt are not observable in the absence of the market-clearing condition.

The last topic of this section is to incorporate the demand and supply schedules developed in the previous section into the disequilibrium equation system. The demand and supply schedules in equations (99.3) and (99.6) can be restated as equations (99.11) and (99.12), which become part of the disequilibrium system6:

QDt+1 = cS^{-1}EtPt+1 − cS^{-1}(1 + r*)Pt + cS^{-1}EtDt+1 + μ1t,  (99.11)

QSt+1 = QSt + A^{-1}BPt − A^{-1}EtDt+1 + μ2t,  (99.12)

Qt+1 = min(QDt+1, QSt+1),  (99.13)

ΔPt+1 = γ(QDt+1 − QSt+1).  (99.14)

From the above equation system, it is clear that some conditions in equations (99.11)–(99.14) differ from the basic disequilibrium equation system, particularly the QSt term in the supply schedule. These problems are dealt with, before the empirical studies, by imposing further assumptions or by using alternative econometric specifications, as described in the following section.

99.4 Alternative Methods of Estimating Asset Pricing Model with Disequilibrium Effect

In this section, we first reformulate the disequilibrium asset pricing model required for the empirical study. Then we discuss alternative methods of estimating and testing the price adjustment process in capital asset pricing.

99.4.1 Estimation methods and hypothesis of testing price adjustment process

To estimate α1, α2, β1, β2, γ, σμ², and συ² with observations of XD,t, XS,t, Qt, and Pt for t = 1, 2, ..., T in equations (99.7)–(99.10), it is clear that ordinary least squares will produce inconsistent estimators. Following Amemiya (1974) and Quandt (1988), we discuss two estimation methods to obtain
6
While there is a slight difference in the notation of the time period, the essence of the model remains the same.
consistent estimators. The first is the two-stage least squares (2SLS) estimator, and the other is the maximum likelihood estimator (MLE). One can reformulate the above model by distinguishing periods of rising prices (ΔPt > 0) from periods of falling prices (ΔPt < 0). In a period of rising prices there is excess demand, so Qt equals QSt, and the supply function (99.8) can be estimated using the observed quantity Qt as the dependent variable. The disequilibrium system (99.7)–(99.10) can thus be reformulated and summarized as the following equations:

Qt = α1Pt + β1XD,t − (1/γ)ΔPt+ + μt,  (99.15a)

Qt = α2Pt + β2XS,t − (1/γ)ΔPt− + υt,  (99.15b)

where

ΔPt+ = ΔPt if ΔPt > 0, and 0 otherwise;
ΔPt− = −ΔPt if ΔPt < 0, and 0 otherwise.

The two estimation methods (2SLS and MLE) for this reformulated model are described in the following subsections.

99.4.2 Two-stage least squares (2SLS) estimator

The equation system (99.15) contains the jointly dependent variables Qt, Pt, ΔPt+, and ΔPt−. The parameters of the modified model can be consistently estimated by the conventional two-stage least squares method, which proceeds in two stages. In the first stage, regress ΔPt+ and ΔPt− on all the exogenous variables XD,t and XS,t to obtain the fitted values ΔP̂t+ and ΔP̂t−; in the second stage, regress Qt on Pt, XD,t, and ΔP̂t+ in equation (99.15a), and regress Qt on Pt, XS,t, and ΔP̂t− in equation (99.15b). However, the estimators α̂1, α̂2, β̂1, and β̂2 are not asymptotically efficient in this model, though they can be consistent if the predictions of the endogenous variables are used for all observations.7
Amemiya shows that the two-stage least squares (2SLS) estimators proposed by Fair and Jaffee are not consistent, since the expected value of the error in the first equation, E(μt), is not zero for t belonging to period B, or, in Quandt's terms, the plim of X′aμa/T is not zero (see Amemiya, 1974; Quandt, 1988).
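The mechanics of the two stages can be sketched on simulated data. This is only a mechanical illustration of the procedure for the demand equation (99.15a): the data-generating process and every parameter value below are hypothetical, and no claim is made about the estimator's sampling properties.

```python
import numpy as np

# Mechanical sketch of the 2SLS steps for (99.15a) on simulated data.
rng = np.random.default_rng(1)
T = 500
XD = rng.normal(size=T)                                 # demand-side exogenous variable
XS = rng.normal(size=T)                                 # supply-side exogenous variable
P = rng.normal(size=T)                                  # stand-in for observed prices
dP = 0.5 * XD - 0.5 * XS + rng.normal(0.0, 0.5, T)      # stand-in for price changes
dP_plus = np.where(dP > 0, dP, 0.0)                     # the regressor dP_t^+
Q = -0.5 * P + 1.0 * XD - 2.0 * dP_plus + rng.normal(0.0, 0.1, T)

def ols(y, X):
    # Least-squares coefficients via numpy's lstsq.
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Stage 1: regress dP^+ on all exogenous variables (constant, XD, XS).
Z = np.column_stack([np.ones(T), XD, XS])
dP_plus_hat = Z @ ols(dP_plus, Z)

# Stage 2: demand equation (99.15a) with the fitted dP^+ as regressor.
X2 = np.column_stack([np.ones(T), P, XD, dP_plus_hat])
coef = ols(Q, X2)      # [const, alpha1-hat, beta1-hat, estimate of -(1/gamma)]
```

The stage-1 projection replaces the endogenous price-change regressor with its fitted value, which is the essence of the procedure; the inefficiency noted in the text arises because nothing forces the same γ into both equations.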
The reasons are, first, that no restriction is imposed to force the same γ to appear in both equations and, second, that ΔPt+ and ΔPt− are not, strictly speaking, linear functions of XD,t and XS,t.

99.4.3 Maximum likelihood estimator

Quandt (1988) employs an appropriately formulated full-information maximum likelihood technique. He suggests that one can use the log-likelihood function in equation (99.16) to find the ML estimator:

L = T log|α2 − α1 + 1/γ| − T log 2π − (T/2) log|Ω| − (1/2) Σ_{t=1}^{T} u′t Ω^{-1} ut,  (99.16)

where

ut = ( Qt − α1Pt − β1XD,t + (1/γ)ΔPt+,  Qt − α2Pt − β2XS,t + (1/γ)ΔPt− )′.

On the other hand, Amemiya (1974) shows the following iterative method of obtaining the maximum likelihood estimators. He suggests that they can be obtained by solving the following equations simultaneously, where A denotes the set of periods with ΔPt ≤ 0 and B the set of periods with ΔPt > 0:

α1,MLE = α1,LS,  β1,MLE = β1,LS,  (99.17a)

α2,MLE = α2,LS,  β2,MLE = β2,LS,  (99.17b)

σμ² = (1/T)[ Σ_{t∈A} (Qt − α1Pt − β1XD,t)² + Σ_{t∈B} (Qt + (1/γ)ΔPt − α1Pt − β1XD,t)² ],  (99.17c)

συ² = (1/T)[ Σ_{t∈B} (Qt − α2Pt − β2XS,t)² + Σ_{t∈A} (Qt − (1/γ)ΔPt − α2Pt − β2XS,t)² ],  (99.17d)

Tγ + (1/συ²) Σ_{t∈A} (Qt − (1/γ)ΔPt − α2Pt − β2XS,t)ΔPt − (1/σμ²) Σ_{t∈B} (Qt + (1/γ)ΔPt − α1Pt − β1XD,t)ΔPt = 0.  (99.17e)

That is, given γ, the ML estimators of α and β are the same as the LS estimators applied to equation (99.15). The equations for σμ² and συ² (equations (99.17c) and (99.17d)) are the residual sums of squares of equations (99.15a) and (99.15b), given γ, divided by T, as for the usual ML estimators. Equation (99.17e) is a quadratic function in γ. Following
Amemiya's suggestion, one can solve for the above parameters by the following iterative procedure:

Step 1: Use the 2SLS estimates of α, β, σμ², and συ² as the initial estimates.
Step 2: Substitute α̂, β̂, σ̂μ², and σ̂υ² into (99.17e) and solve for the positive root of γ, γ̂.
Step 3: Use γ̂ in (99.15) to obtain least-squares estimates of α, β, σμ², and συ².
Step 4: Repeat Steps 2 and 3 until the solutions converge.

99.4.4 Testing of the price adjustment process

In this paper, we are most interested in the market adjustment parameter, γ; compared with the equilibrium model, it is the parameter of central interest in the disequilibrium model. In continuous time, the limiting values of γ are zero and infinity. If γ = 0, there is no price adjustment in response to excess demand; if γ is infinite, adjustment is instantaneous. In other words, if one assumes price rigidity in response to excess demand, the value of γ should equal zero. The most important test of the disequilibrium model is therefore a test of the hypothesis that the price adjustment parameter is zero. The null hypothesis can be stated as H0: γ = 0 vs. H1: γ ≠ 0. This hypothesis is tested empirically in the following section.

99.4.5 Estimating the disequilibrium adjustment parameter, γ

The last task of this section is to incorporate the demand and supply schedules developed earlier into the disequilibrium equation system. Clearly, some conditions of the demand and supply schedules in equations (99.11) and (99.12) differ from those of the basic disequilibrium equation system (e.g., the QSt term in the supply schedule and the price expectation term in the demand function).
Since the purpose of this study is to understand price adjustment in response to excess demand, some assumptions on the right-hand-side variables are needed to modify the equation system. First, the original model derives the supply schedule based on the assumption that there exist quadratic costs
to retire or issue securities; that is, there is a quadratic cost on ΔQSt. This cost is assumed to apply to deviations from the long-run trend of the quantities issued; thus, the QSt on the right-hand side of equation (99.12) is treated as a constant. The second assumption is that prices follow a random walk, and the last is that the expected adjustment in dividends can be forecast from an adaptive-expectations model using past dividend and earnings information. The quantity observation, Qt, for each index can be obtained from capitalization data. As a result, the disequilibrium equation system (99.11)–(99.14) can be restated as follows:

QDt = α1Pt−1 + α2Pt + α3Dt + u1t,  (99.18)

QSt = Q + β1Pt−1 + β2Dt + u2t,  (99.19)

Qt = min(QDt, QSt),  (99.20)

ΔPt = Pt − Pt−1 = γ(QDt − QSt).  (99.21)
In order to estimate the disequilibrium model described in equations (99.18)–(99.21), we follow the method presented in the previous section, that is, the method proposed by Amemiya (1974) and Quandt (1988). The procedures for estimating the disequilibrium model are discussed in the appendix.

99.5 Data Description and Testing the Existence of Price Adjustment Process

Now that we have our disequilibrium asset pricing model for empirical study, we test the price adjustment mechanism by examining the market adjustment parameter, γ, as stated in the previous section. First, in this section, we describe our empirical data.
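Before turning to the data, the iterative procedure of Steps 1–4 can be sketched in skeleton form. The two functions below are toy stand-ins (not the actual estimator): `ls_step` mimics Step 3 (least-squares estimates for a fixed γ) and `gamma_step` mimics Step 2 (the positive root of a quadratic in γ, as in (99.17e)).

```python
import numpy as np

# Skeleton of the iterative ML procedure (Steps 1-4 of Section 99.4).
# ls_step and gamma_step are illustrative placeholders only.

def ls_step(gamma):
    # Pretend the LS estimates depend smoothly on the current gamma.
    return 1.0 / (1.0 + gamma)

def gamma_step(theta):
    # Positive root of a toy quadratic a*g**2 + b*g + c = 0 whose
    # coefficients depend on the current parameter estimate theta.
    a, b, c = 1.0, theta, -2.0
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

gamma = 0.5                        # Step 1: initial (e.g., 2SLS) value
for _ in range(100):               # Steps 2-4: iterate until convergence
    theta = ls_step(gamma)
    gamma_new = gamma_step(theta)
    if abs(gamma_new - gamma) < 1e-10:
        break
    gamma = gamma_new
```

The fixed point of this back-and-forth between the LS step and the quadratic-root step is the structure of the Amemiya iteration; in the real estimator, the quadratic's coefficients are built from the residuals of (99.15a)–(99.15b).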
99.5.2 International equity markets — country indices

The data come from two different sources. One is the Global Financial Data (GFD) databases of the Rutgers Libraries, and the second is the MSCI (Morgan Stanley Capital International, Inc.) equity indices. We focus mainly on the Global Financial Data, with the MSCI indices used for some comparisons; both datasets are used to perform the Granger-causality test. The monthly GFD dataset for February 1988–March 2004 consists of the index, dividend yield, price–earnings ratio, and capitalization for each equity market. Sixteen country indices and two world indices are used in the empirical study, as listed in Table 99.1. For all country indices, dividends and earnings are converted into US dollars. The exchange rate data also come from Global Financial Data.

In Table 99.2, Panel A shows the first four moments of monthly returns and the Jarque–Bera statistics for testing normality for the two world indices and the seven indices of the G7 countries, and Panel B provides the same summary information for the indices of nine emerging markets. As the means and standard deviations of the monthly returns show, the emerging markets tend to be more volatile than the developed markets, though they may offer the opportunity of higher returns. The average monthly variance of return in

Table 99.1: World indices and country indices list.

I. World Indices
WI     World index: FT-Actuaries World $ Index (w/GFD extension)
WIXUS  World index excluding US

II. Country Indices
AG  Argentina: Buenos Aires SE General Index (IVBNG)
BZ  Brazil: Bolsa de Valores de Sao Paulo (Bovespa) (BVSPD)
CD  Canada: S&P/TSX 300 Composite Index (GSPTSED)
FR  France: Paris CAC-40 Index (FCHID)
GM  Germany: Deutscher Aktienindex (DAX) (GDAXD)
IT  Italy: Banca Commerciale Italiana General Index (BCIID)
HK  Hong Kong: Hang Seng Composite Index (HSID)
JP  Japan: Nikkei 225 Stock Average (N225D)
MA  Malaysia: KLSE Composite (KLSED)
MX  Mexico: SE Indice de Precios y Cotizaciones (IPC) (MXXD)
SG  Singapore: Straits-Times Index (STID)
KO  South Korea: Korea SE Stock Price Index (KOSPI) (KS11D)
TW  Taiwan: SE Capitalization Weighted Index (TWIID)
TL  Thailand: SET General Index (SETID)
UK  United Kingdom: Financial Times-SE 100 Index (FTSED)
US  United States: S&P 500 Composite (SPXD)
Table 99.2: Summary statistics of monthly return.^a,b

Panel A: G7 and World indices

Country      Mean      Std. dev.  Skewness  Kurtosis  Jarque–Bera
WI           0.0051    0.0425     −0.3499   3.3425    4.7547
WI excl. US  0.0032    0.0484     −0.1327   3.2027    0.8738
CD           0.0064    0.0510     −0.6210   4.7660    36.515∗∗
FR           0.0083    0.0556     −0.1130   3.1032    0.4831
GM           0.0074    0.0645     −0.3523   4.9452    33.528∗∗
IT           0.0054    0.0700     0.2333    3.1085    1.7985
JP           −0.00036  0.0690     0.3745    3.5108    6.4386∗
UK           0.0056    0.0474     0.2142    3.0592    1.4647
US           0.0083    0.0426     −0.3903   3.3795    5.9019

Panel B: Emerging markets

AG           0.0248    0.1762     1.9069    10.984    613.29∗∗
BZ           0.0243    0.1716     0.4387    6.6138    108.33∗∗
HK           0.0102    0.0819     0.0819    4.7521    26.490∗∗
KO           0.0084    0.1210     1.2450    8.6968    302.79∗∗
MA           0.0084    0.0969     0.5779    7.4591    166.22∗∗
MX           0.0179    0.0979     −0.4652   4.0340    15.155∗∗
SG           0.0072    0.0746     −0.0235   4.8485    26.784∗∗
TW           0.0092    0.1192     0.4763    4.0947    16.495∗∗
TL           0.0074    0.1223     0.2184    4.5271    19.763∗∗

Notes: ^a The monthly returns from February 1988 to March 2004 for international markets. ^b ∗ and ∗∗ denote statistical significance at the 5% and 1% levels, respectively.
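The Jarque–Bera statistics reported in Table 99.2 follow the usual formula JB = (T/6)[S² + (K − 3)²/4], which is asymptotically χ²(2) under normality. A quick sketch (the sample size T = 194 months is our assumption for February 1988–March 2004, not a figure quoted in the chapter):

```python
# Jarque-Bera normality statistic from sample skewness S and kurtosis K:
# JB = T/6 * (S**2 + (K - 3)**2 / 4), asymptotically chi-square(2).

def jarque_bera(T, skew, kurt):
    """Compute the JB statistic from precomputed sample moments."""
    return T / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

# Example with the WI row of Table 99.2 (T = 194 is an assumption).
jb_wi = jarque_bera(194, -0.3499, 3.3425)
# The chi-square(2) 5% critical value is 5.99, so a value of this size
# would not reject normality, consistent with the unstarred WI entry.
```

The heavily starred Panel B entries reflect the large skewness and excess kurtosis of the emerging-market return series.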
emerging markets is 0.166, while the average of monthly variance of return in developed countries is 0.042.

99.5.3 United States equity markets

A total of 300 companies are selected from the S&P 500 and grouped into 10 portfolios by their payout ratios, with an equal number of 30 companies in each portfolio. The data are obtained from the COMPUSTAT North America industrial quarterly data and run from the first quarter of 1981 to the last quarter of 2002. The companies selected satisfy two criteria. First, the company appears on the S&P 500 at some time during 1981–2002. Second, the company must have complete data available — including price, dividend, earnings per share, and shares outstanding — during the 88 quarters (22 years). Firms are eliminated from the sample list if either
[Figure 99.1: Comparison of S&P500 and market portfolio. The plot shows the S&P500 index and the value-weighted market index M of the selected firms over 1982–2002.]
their reported earnings are not positive for most periods or their reported dividends are zero. Around 314 firms remain after these adjustments. Finally, after excluding the seven companies with the highest and the seven with the lowest average payout ratios, the remaining 300 firms are grouped into 10 portfolios by payout ratio, with 30 companies in each portfolio. Figure 99.1 compares the S&P 500 index with the value-weighted index (M) of the 300 selected firms: the two series track each other closely before the third quarter of 1999 but follow noticeably different paths thereafter. To group these 300 firms, the payout ratio for each firm in each year is determined by dividing the sum of four quarters' dividends by the sum of four quarters' earnings; the yearly ratios are then averaged over the 22-year period. The 30 firms with the highest payout ratios comprise portfolio one, and so on. The value-weighted averages of the price, dividend, and earnings of each portfolio are then computed. Characteristics and summary statistics of these 10 portfolios are presented in Tables 99.3 and 99.4, respectively. Table 99.3 presents the return, payout ratio, size, and beta of the 10 portfolios. Some inverse relationship between mean return and payout ratio appears to exist; however, the relationship between payout ratio and beta is less clear. This finding is similar to that of Fama and French (1992). Table 99.4 shows the first four moments of quarterly returns of the market portfolio and the 10 portfolios. The coefficients of skewness and kurtosis and the Jarque–Bera statistics show that one cannot reject the hypothesis that the log
Table 99.3: Characteristics of 10 portfolios.

Portfolio^a  Return^b  Payout^c  Size (000)  Beta
1            0.0351    0.7831    193,051     0.7028
2            0.0316    0.7372    358,168     0.8878
3            0.0381    0.5700    332,240     0.8776
4            0.0343    0.5522    141,496     1.0541
5            0.0410    0.5025    475,874     1.1481
6            0.0362    0.4578    267,429     1.0545
7            0.0431    0.3944    196,265     1.1850
8            0.0336    0.3593    243,459     1.0092
9            0.0382    0.2907    211,769     0.9487
10           0.0454    0.1381    284,600     1.1007

Notes: ^a The 30 firms with the highest payout ratio comprise portfolio one, and so on. ^b The price, dividend, and earnings of each portfolio are computed as value-weighted averages of the 30 firms in the same category. ^c The payout ratio for each firm in each year is found by dividing the sum of four quarters' dividends by the sum of four quarters' earnings; the yearly ratios are then computed from the quarterly data over the 22-year period.
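The portfolio-formation rule described above can be sketched as follows. The dividend and earnings arrays are synthetic stand-ins for the COMPUSTAT data, so the numbers carry no empirical meaning; only the grouping mechanics are illustrated.

```python
import numpy as np

# Sketch of the grouping rule: average each firm's yearly payout ratio
# (sum of 4 quarters' dividends over sum of 4 quarters' earnings) across
# 22 years, rank the firms, and split them into 10 portfolios of 30.
rng = np.random.default_rng(2)
n_firms, n_years = 300, 22
div = rng.uniform(0.1, 1.0, (n_firms, n_years, 4))    # quarterly dividends (synthetic)
eps = rng.uniform(0.5, 2.0, (n_firms, n_years, 4))    # quarterly earnings (synthetic)

yearly_payout = div.sum(axis=2) / eps.sum(axis=2)     # firm x year payout ratios
avg_payout = yearly_payout.mean(axis=1)               # 22-year average per firm

order = np.argsort(-avg_payout)                       # highest payout first
portfolios = order.reshape(10, 30)                    # row 0 = portfolio 1, etc.
```

Sorting in descending order makes the first row of `portfolios` the 30 highest-payout firms, matching the convention in note a of Table 99.3.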
Table 99.4: Summary statistics of portfolio quarterly returns.^a

Portfolio         Mean (quarterly)  Std. dev. (quarterly)  Skewness  Kurtosis  Jarque–Bera^b
Market portfolio  0.0364            0.0710                 −0.4604   3.9742    6.5142∗
Portfolio 1       0.0351            0.0683                 −0.5612   3.8010    6.8925∗
Portfolio 2       0.0316            0.0766                 −1.1123   5.5480    41.470∗∗
Portfolio 3       0.0381            0.0768                 −0.3302   2.8459    1.6672∗
Portfolio 4       0.0343            0.0853                 −0.1320   3.3064    0.5928
Portfolio 5       0.0410            0.0876                 −0.4370   3.8062    5.1251
Portfolio 6       0.0362            0.0837                 −0.2638   3.6861    2.7153
Portfolio 7       0.0431            0.0919                 −0.1902   3.3274    0.9132
Portfolio 8       0.0336            0.0906                 0.2798    3.3290    1.5276
Portfolio 9       0.0382            0.0791                 −0.2949   3.8571    3.9236
Portfolio 10      0.0454            0.0985                 −0.0154   2.8371    0.0996

Notes: ^a Quarterly returns from 1981:Q1 to 2002:Q4 are calculated. ^b ∗ and ∗∗ denote statistical significance at the 5% and 1% levels, respectively.
returns of most portfolios are normal. The kurtosis statistics for most sample portfolios are close to three, which indicates that heavy tails are not an issue. Additionally, the Jarque–Bera statistics show that the hypothesis of a Gaussian distribution cannot be rejected for most portfolios. It seems to
C. F. Lee, C.-M. Tsai & A. C. Lee
be unnecessary to consider the problem of heteroskedasticity in estimating the domestic stock market model if quarterly data are used. Finally, we use quarterly data on the 30 Dow Jones companies to test for the existence of a disequilibrium adjustment process in asset pricing. The sample period of this data set is from the first quarter of 1981 to the fourth quarter of 2002.

99.5.4 Testing the existence of the price adjustment process

The maximum likelihood estimators are computed from the derivation in Section 99.4. First, the two-stage least squares (2SLS) approach is used to find initial values for the estimates; the maximum likelihood estimates are then obtained from the calculation of the log-likelihood function described in equation (99.16). The results for the 16 country indexes are summarized in Table 99.5. In 15 out of 16 cases, the maximum likelihood estimate of γ is significantly different from zero at the 1% significance level. The results indicate some, but much less than complete, price adjustment during each month. The results for the 10 portfolios are summarized in Table 99.6. There are six portfolios, including the market portfolio, with a maximum likelihood

Table 99.5:
Price adjustment factor, γ, for 16 international indices.^b

Country        γ (MLE)   Std. deviation   z-statistic
Canada^a       1.5092    0.1704           8.859
France         2.5655    0.370            6.931
Italy^a        0.3383    0.0585           5.786
Japan          0.0016    0.0003           5.171
Germany        2.3242    0.5849           3.974
UK^a           3.2916    0.7396           4.451
US             0.2404    0.0557           4.307
Argentina^a    0.2194    0.0107           20.609
Brazil^a       0.0024    0.0005           5.069
Hong Kong^a    0.8342    0.2421           3.446
Malaysia       0.5675    0.1144           4.962
Mexico         0.3407    0.1014           3.362
Singapore^a    0.0917    0.0939           0.9758
S. Korea       0.0282    0.0042           6.6679
Taiwan         1.8893    0.3827           4.9371
Thailand       0.0710    0.0194           3.6622

Notes: ^a Sample periods used are other than 1988:2–2004:3. ^b Null hypothesis: if there is no price adjustment mechanism in response to an excess demand, the value of γ will be equal to zero (H0: γ = 0).
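The z-statistics in Table 99.5 are simply the maximum likelihood estimate of γ divided by its standard deviation, which is how the test of H0: γ = 0 is carried out. A one-line check for Canada:

```python
# z-statistic for H0: gamma = 0 is the MLE divided by its standard deviation.
gamma_mle, std = 1.5092, 0.1704   # Canada, from Table 99.5
z = gamma_mle / std               # agrees with the tabulated 8.859 up to rounding
```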
Table 99.6: Price adjustment factor, γ, for 10 portfolios from the S&P 500.^a,b

                   γ (MLE)   Std. deviation   z-statistic   p-value
Market portfolio   0.0021    0.0007           2.8773        0.0040
Portfolio 1        0.0474    0.0158           3.0086        0.0026
Portfolio 2        0.0178    0.0058           3.0280        0.0025
Portfolio 3        0.0169    0.0113           1.5028        0.1329
Portfolio 4        0.0476    0.0142           3.3560        0.0008
Portfolio 5        0.0340    0.0155           2.1867        0.0288
Portfolio 6        0.0244    0.0197           1.2349        0.2169
Portfolio 7        0.0200    0.0073           2.7182        0.0066
Portfolio 8        0.0431    0.0284           1.5171        0.1292
Portfolio 9        0.0088    0.0098           0.9016        0.3673
Portfolio 10       0.0129    0.0078           1.6514        0.0987

Notes: ^a Sample periods used are other than 1981:Q1–2002:Q4. ^b Null hypothesis: if there is no price adjustment mechanism in response to an excess demand, the value of γ will be equal to zero (H0: γ = 0).
estimate of γ significantly different from zero. For example, for portfolios 1, 2, 4, and 7, γ is significantly different from zero at the 1% significance level; portfolio 5 and the market portfolio are significant at the 5% level, and portfolio 10 at the 10% level. We cannot reject the null hypothesis that γ equals zero for four portfolios: 3, 6, 8, and 9. The results imply some, but less than complete, price adjustment during each quarter in the US stock markets. Table 99.7 shows the results for the companies listed in the Dow Jones Index. The price adjustment factor is significantly different from zero at the 5% level in 22 out of 28 companies. On average, an individual company has a higher estimated value of γ than an individual portfolio, and an individual portfolio has a higher value than the market portfolio. For example, IBM's γ is 0.0308, which indicates that an excess demand of 32.47 million shares is required to cause a change in the price of one dollar, whereas 476 million shares are required to cause a one-unit price change for the market portfolio, since its γ is only 0.0021.⁸ The estimates of the demand and supply elasticities and other structural parameters can be found in Appendix 99.B. From the information in Appendix 99.B, we can conclude that the model derived in this paper performs fairly well in the empirical study.
⁸According to equation (99.16), ΔP_t = γ(Q^D_{t+1} − Q^S_{t+1}), so the amount of excess demand needed to cause a one-dollar change in price can be calculated as 1/0.0308.
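The reciprocal interpretation of γ in the footnote above is a one-line calculation:

```python
# Excess demand (in millions of shares) needed to move the price by one dollar
# is 1/gamma, per Delta P_t = gamma * (Q^D_{t+1} - Q^S_{t+1}).

def shares_per_dollar(gamma):
    return 1.0 / gamma

ibm = shares_per_dollar(0.0308)      # roughly 32.47 million shares
market = shares_per_dollar(0.0021)   # roughly 476 million shares
```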
Table 99.7: Price adjustment factor, γ, Dow Jones 30.^a,b,c

Company                       γ (MLE)   Std. deviation   z-statistic   p-value
Alcoa Inc.                    0.0559    0.0285           1.9622        0.0497
Altria Group Inc.             0.0118    0.0057           2.0696        0.0385
American Express              0.0264    0.0176           1.4936        0.1353
AT&T                          0.0587    0.0220           2.6650        0.0077
Boeing Co.                    0.0357    0.0090           3.9307        0.0001
Citigroup Inc.                0.0169    0.0113           1.5028        0.1329
Caterpillar Inc.              0.1328    0.0750           1.7705        0.0766
Disney Co.                    0.0367    0.0301           1.2212        0.2220
Honeywell Inc.                0.0258    0.0097           2.6717        0.0075
JP Morgan Chase Co.           0.0248    0.0073           3.3799        0.0007
Coca Cola Co.                 0.0131    0.0045           2.8895        0.0039
Du Pont                       0.0223    0.0084           2.6680        0.0076
Eastman Kodak Co.             0.0707    0.0377           1.8763        0.0616
GE                            0.0080    0.0020           4.0130        0.0000
GM                            0.0343    0.0121           2.8474        0.0044
Home Depot Inc.               0.0317    0.0161           1.9630        0.0496
HP                            0.0170    0.0071           2.3924        0.0167
IBM                           0.0308    0.0095           3.2365        0.0012
Int'l Paper Co.               0.0393    0.0205           1.9165        0.0503
Exxon                         0.0014    0.0003           4.0822        0.0000
Johnson & Johnson             0.0105    0.0023           4.4941        0.0000
McDonald's                    0.0129    0.0038           3.4029        0.0007
3M                            0.0564    0.0131           4.3081        0.0000
Merck & Co.                   0.0156    0.0060           2.5954        0.0094
Procter & Gamble Co.          0.0222    0.0063           3.5219        0.0004
SBC Communication Inc.        0.0051    0.0020           2.5754        0.0100
United Technologies Corp.     0.0588    0.0217           2.7074        0.0068
Wal-Mart                      0.0360    0.0096           3.7343        0.0002

Notes: ^a Sample periods used are other than 1981:Q1–2002:Q4. ^b Null hypothesis: if there is no price adjustment mechanism in response to an excess demand, the value of γ will be equal to zero (H0: γ = 0). ^c Microsoft and Intel are not in the list since their dividends paid were trivial during the period analyzed here.
In this section, we used three kinds of data to test the existence of the disequilibrium adjustment process in terms of the disequilibrium model defined in equations (99.7)–(99.10). We found that a disequilibrium adjustment process exists for the international indexes, the 10 portfolios from the S&P 500, and the 30 companies of the Dow Jones index. Lee et al. (2009) found that a supply effect exists in the asset pricing determination process. The existence of the supply effect and the disequilibrium price adjustment process is important for investigating asset pricing in security analysis and portfolio management. First, these results imply that
the market efficiency hypothesis is questionable. Second, these results imply that technical analysis can be useful for security analysis. Finally, this information can be useful for either the Fed or the SEC in regulating the security industry.

99.6 Summary and Concluding Remarks

In this paper, we first theoretically review and extend Black's CAPM to allow for a price adjustment process. Next, we derive the disequilibrium model for asset pricing in terms of the disequilibrium models developed by Fair and Jaffee (1972), Amemiya (1974), Quandt (1988), and others. MLE and two-stage least squares (2SLS) are our two methods of estimating the asset pricing model with a disequilibrium price adjustment effect. Using three data sets of price per share, dividend per share, and volume, we test for the existence of the price disequilibrium adjustment process with international index data, US equity data, and the 30 firms of the Dow Jones Index. We find that a disequilibrium price adjustment process exists. Our results support Lo and Wang's (2000) findings that trading volume is one of the important factors in determining capital asset pricing.

Harvey et al. (2016) and Harvey (2017) reviewed asset pricing tests over the last 49 years and concluded that empirical asset pricing tests have not been very successful. In addition, they suggest using a t-value of three instead of two for testing asset pricing models. Gu et al. (2018) use machine learning methods to test the CAPM; their methods can be combined with the method discussed in this chapter to perform further tests of the CAPM.

Bibliography

T. Amemiya (1974). A Note on a Fair and Jaffee Model. Econometrica, 42, 759–762.
S. W. Black (1976). Rational Response to Shocks in a Dynamic Model of Capital Asset Pricing. American Economic Review, 66, 767–779.
F. Black, M. Jensen, and M. Scholes (1972). The Capital Asset Pricing Model: Some Empirical Tests. In Jensen, M.
(ed.), Studies in the Theory of Capital Markets. New York: Praeger, 79–121.
D. T. Breeden (1979). An Intertemporal Asset Pricing Model with Stochastic Consumption and Investment Opportunities. Journal of Financial Economics, 7, 265–296.
M. J. Brennan, A. W. Wang, and Y. Xia (2004). Estimation and Test of a Simple Model of Intertemporal Capital Asset Pricing. Journal of Finance, 59, 1743–1775.
J. Y. Campbell (1993). Intertemporal Asset Pricing without Consumption Data. American Economic Review, 83, 487–512.
J. Y. Campbell and J. H. Cochrane (1999). By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior. Journal of Political Economy, 107, 205–250.
J. Y. Campbell, S. J. Grossman, and J. Wang (1993). Trading Volume and Serial Correlation in Stock Returns. Quarterly Journal of Economics, 108, 905–939.
P. L. Cheng and R. R. Grauer (1980). An Alternative Test of the Capital Asset Pricing Model. The American Economic Review, 70, 660–671.
J. H. Cochrane (2001). Asset Pricing. Princeton: Princeton University Press.
J. C. Cox, J. E. Ingersoll, and S. A. Ross (1985). An Intertemporal General Equilibrium Model of Asset Prices. Econometrica, 53, 363–384.
R. C. Fair and D. M. Jaffee (1972). Methods of Estimation for Markets in Disequilibrium. Econometrica, 40, 497–514.
E. Fama and K. R. French (1992). The Cross-Section of Expected Stock Returns. Journal of Finance, 47, 427–465.
E. Fama and K. R. French (1993). Common Risk Factors in the Returns on Stocks and Bonds. Journal of Financial Economics, 33, 3–56.
E. Fama and K. R. French (1996). Multifactor Explanations of Asset Pricing Anomalies. Journal of Finance, 51, 55–84.
E. Fama and K. R. French (2000). Disappearing Dividends: Changing Firm Characteristics or Lower Propensity to Pay? Journal of Financial Economics, 60, 3–43.
S. M. Goldfeld and R. E. Quandt (1974). Estimation in a Disequilibrium Model and the Value of Information. Research Memorandum 169, Econometric Research Program, Princeton University.
E. L. Grinols (1984). Production and Risk Leveling in the Intertemporal Capital Asset Pricing Model. The Journal of Finance, 39(5), 1571–1595.
S. Gu, B. Kelly, and D. Xiu (2018). Empirical Asset Pricing via Machine Learning. Working paper, University of Chicago.
S. C. Gweon (1985). Rational Expectation, Supply Effect, and Stock Price Adjustment Process: A Simultaneous Equations System Approach. Ph.D. Dissertation, University of Illinois.
C. R. Harvey, Y. Liu, and H. Zhu (2016). . . . and the Cross-Section of Expected Returns. The Review of Financial Studies, 29(1), 5–68.
C. R. Harvey (2017). Presidential Address: The Scientific Outlook in Financial Economics. The Journal of Finance, 72, 1399–1440.
D. M. Jaffee (1971). Credit Rationing and the Commercial Loan Market. New York: John Wiley and Sons.
G. G. Judge, W. E. Griffiths, and R. C. Hill (1985). The Theory and Practice of Econometrics, 2nd ed. New York: John Wiley & Sons.
C. F. Lee and S. C. Gweon (1986). Rational Expectation, Supply Effect and Stock Price Adjustment. Econometrica Society, 1985.
C. F. Lee, C.-M. Tsai, and A. C. Lee (2009). A Dynamic CAPM with Supply Effect: Theory and Empirical Results. Quarterly Review of Economics and Finance, forthcoming.
C. F. Lee, C. Wu, and M. Djarraya (1987). A Further Empirical Investigation of the Dividend Adjustment Process. Journal of Econometrics, 35, 267–285.
J. Lintner (1965). The Valuation of Risky Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets. Review of Economics and Statistics, 47, 13–37.
A. W. Lo and J. Wang (2000). Trading Volume: Definitions, Data Analysis, and Implications of Portfolio Theory. Review of Financial Studies, 13, 257–300.
G. S. Maddala and F. D. Nelson (1974). Maximum Likelihood Methods for Models of Markets in Disequilibrium. Econometrica, 42, 1013–1030.
H. Markowitz (1959). Portfolio Selection: Efficient Diversification of Investments. New York: John Wiley.
R. C. Merton (1973). An Intertemporal Capital Asset Pricing Model. Econometrica, 41, 867–887.
J. Mossin (1966). Equilibrium in a Capital Asset Market. Econometrica, 34, 768–783.
H. Nehls and T. Schmidt (2003). Credit Crunch in Germany? Discussion Paper No. 6, Rheinisch-Westfälisches Institut für Wirtschaftsforschung.
R. E. Quandt (1988). The Econometrics of Disequilibrium. New York: Basil Blackwell.
S. A. Ross (1976). The Arbitrage Theory of Capital Asset Pricing. Journal of Economic Theory, 13, 341–360.
C. W. Sealey, Jr. (1979). Credit Rationing in the Commercial Loan Market: Estimates of a Structural Model under Conditions of Disequilibrium. Journal of Finance, 34, 689–702.
W. Sharpe (1964). Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk. Journal of Finance, 19, 425–442.
Y. C. Shih, S. S. Chen, C. F. Lee, and P. J. Chen (2014). The Evolution of Capital Asset Pricing Models. Review of Quantitative Finance and Accounting, 42(3), 415–448.
Appendix 99.A: Estimation of the Disequilibrium Model

From Section 99.4, the disequilibrium equation system of equations (99.18)–(99.21) can be reformulated as

    Q_t = α₁P_{t−1} + α₂P_t + α₃D_t − (1/γ)ΔP_t⁺ + u_{1t},   (99A.1a)
    Q_t = Q̄ + β₁P_{t−1} + β₂D_t − (1/γ)ΔP_t⁻ + u_{2t},   (99A.1b)

where

    ΔP_t⁺ = ΔP_t if ΔP_t > 0, and 0 otherwise,
    ΔP_t⁻ = −ΔP_t if ΔP_t < 0, and 0 otherwise,

and, from equation (99.16), we can derive the log-likelihood function for this empirical study; that is, we need to estimate the following equations simultaneously:

    L = T log|β₁ − α₁ + 1/γ| − T log(2π) − T log(σ_{u1}) − T log(σ_{u2}) − (1/2)[Σ_t u²_{1t}/σ²_{u1} + Σ_t u²_{2t}/σ²_{u2}],   (99A.2)

    u_{1t} = Q_t − α₁P_{t−1} − α₂P_t − α₃D_t + (1/γ)ΔP_t⁺,   (99A.3)
    u_{2t} = Q_t − Q̄ − β₁P_{t−1} − β₂D_t + (1/γ)ΔP_t⁻.   (99A.4)
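The likelihood (99A.2) with residuals (99A.3)–(99A.4) translates directly into code. The sketch below is illustrative only (the chapter's actual estimation uses the EViews Marquardt optimizer); the function name, parameter ordering, and series layout are assumptions, with `P` carrying one extra leading observation so that `P[t]` plays the role of P_{t−1}.

```python
import math

def disequilibrium_loglik(params, Q, P, D, dP):
    """Log-likelihood (99A.2) with residuals (99A.3)-(99A.4).
    params = (a1, a2, a3, Qbar, b1, b2, gamma, s_u1, s_u2);
    Q, D, dP have length T; P has length T + 1 (one extra lag)."""
    a1, a2, a3, Qbar, b1, b2, gamma, s1, s2 = params
    T = len(Q)
    L = (T * math.log(abs(b1 - a1 + 1.0 / gamma))
         - T * math.log(2 * math.pi) - T * math.log(s1) - T * math.log(s2))
    for t in range(T):
        dp_pos = dP[t] if dP[t] > 0 else 0.0     # Delta P_t^+
        dp_neg = -dP[t] if dP[t] < 0 else 0.0    # Delta P_t^-
        u1 = Q[t] - a1 * P[t] - a2 * P[t + 1] - a3 * D[t] + dp_pos / gamma
        u2 = Q[t] - Qbar - b1 * P[t] - b2 * D[t] + dp_neg / gamma
        L -= 0.5 * (u1 ** 2 / s1 ** 2 + u2 ** 2 / s2 ** 2)
    return L
```

This objective would then be maximized over (α, β, γ, σ) starting from the 2SLS initial values, mirroring the two-step procedure described below.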
The procedures and the related code in the EViews package are as follows.

Step 1: Use the 2SLS estimates of α, β, σ²_μ, and σ²_υ in equation (99A.1) as the initial estimates.
Step 2: Substitute α̂, β̂, σ̂²_μ, and σ̂²_υ into equations (99A.2)–(99A.4) and solve for the MLE of α̂, β̂, σ̂²_μ, σ̂²_υ, and γ̂ simultaneously.

For Step 2, we use the Marquardt procedure implemented in the EViews package to find the ML estimates of α, β, σ²_μ, and σ²_υ in equations (99A.2)–(99A.4). The order of evaluation is set to evaluate the specification by observation. The tolerance level of convergence, tol, is set to 1e-5.
Code:

    ' Assume zero correlation between demand and supply error
    ' Define delta(p)+ and delta(p)- in (A.1)
    series dp_pos = (d(p1)>0)*d(p1)
    series dp_neg = (d(p1)<0)*(-d(p1))

In order to show the limiting result that the binomial option pricing formula converges to the continuous version of the Black–Scholes option pricing formula, we assume that h represents the elapsed time between successive stock price changes. Thus, if t is the fixed length of calendar time to expiration and n is the total number of periods, each of length h, then h = t/n. As the trading frequency increases, h gets closer to zero; h → 0 is equivalent to n → ∞. r̂ is one plus the interest rate over a trading period of length h. We not only want r̂ to depend on n, but want it to depend on n in a particular way, so that as n changes the total return r̂ⁿ remains the same. Denoting by r one plus the rate over a fixed unit of calendar time, the total return over time t should be rᵗ. Then we have the following equation:

    r̂ⁿ = rᵗ   (102.24)
for any choice of n. Therefore, r̂ = r^(t/n). Let S* be the stock price at the end of the nth period, with initial price S. If there are j upward moves, then the generalized expression is

    log(S*/S) = j log u + (n − j) log d = j log(u/d) + n log d.   (102.25)

Therefore, j is the realization of a binomial random variable with success probability p. We have the expectation of log(S*/S) as

    E[log(S*/S)] = [p log(u/d) + log d] n ≡ μ̃n   (102.26)

and its variance

    var[log(S*/S)] = [log(u/d)]² p(1 − p) n ≡ σ̃²n.   (102.27)
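The moment formulas (102.26)–(102.27) follow from j being binomial; they can be confirmed by brute-force summation over the binomial distribution of j. The parameter values below are arbitrary illustrative choices:

```python
import math
from math import comb

# Arbitrary one-period factors and success probability (illustrative only).
u, d, p, n = 1.02, 0.99, 0.55, 25

# Brute force: log(S*/S) = j*log(u/d) + n*log(d), per eq. (102.25),
# weighted by the binomial probabilities of j up moves.
weights = [comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)]
vals = [j * math.log(u / d) + n * math.log(d) for j in range(n + 1)]
mean_bf = sum(w * v for w, v in zip(weights, vals))
var_bf = sum(w * (v - mean_bf) ** 2 for w, v in zip(weights, vals))

# Closed forms (102.26)-(102.27).
mean_cf = (p * math.log(u / d) + math.log(d)) * n
var_cf = math.log(u / d) ** 2 * p * (1 - p) * n
```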
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch102
C. F. Lee, Y. Chen & J. Lee
We are considering dividing up the original time period t into many shorter subperiods of length h so that t = nh. Our procedure calls for making n larger while keeping the original time period t fixed. As n → ∞, we would at least like the mean and variance of the continuously compounded rate of return of the assumed stock price movement to coincide with those of the actual stock price. Label the actual empirical values of μ̃n and σ̃²n as μt and σ²t, respectively. Then we want to choose u, d, and p so that μ̃n → μt and σ̃²n → σ²t as n → ∞. A little algebra shows that we can accomplish this by letting

    u = e^(σ√(t/n)),   d = e^(−σ√(t/n)),   p = 1/2 + (1/2)(μ/σ)√(t/n).   (102.28)

At this point, in order to proceed further, we need the Lyapunov condition of the central limit theorem (Ash and Doleans-Dade, 1999; Billingsley, 2008).

Lyapunov's Condition. Suppose X₁, X₂, . . . are independent and uniformly bounded with E(X_i) = 0; let Y_n = X₁ + · · · + X_n and s_n² = E(Y_n²) = Var(Y_n). If lim_{n→∞} (1/s_n^(2+δ)) Σ_{k=1}^{n} E|X_k|^(2+δ) = 0 for some δ > 0, then the distribution of Y_n/s_n converges to the standard normal distribution as n → ∞.

Theorem. If

    [p|log u − μ̃|³ + (1 − p)|log d − μ̃|³] / (σ̃³√n) → 0 as n → ∞,   (102.29)

then

    Pr[(log(S*/S) − μ̃n)/(σ̃√n) ≤ z] → N(z),   (102.30)

where N(z) is the cumulative standard normal distribution function.

Proof. Since

    p|log u − μ̃|³ = p|log u − p log(u/d) − log d|³ = p(1 − p)³[log(u/d)]³

and

    (1 − p)|log d − μ̃|³ = (1 − p)|log d − p log(u/d) − log d|³ = p³(1 − p)[log(u/d)]³,

we have

    p|log u − μ̃|³ + (1 − p)|log d − μ̃|³ = p(1 − p)[(1 − p)² + p²][log(u/d)]³.
Alternative Methods to Derive Option Pricing Models
Thus

    [p|log u − μ̃|³ + (1 − p)|log d − μ̃|³] / (σ̃³√n)
    = p(1 − p)[(1 − p)² + p²][log(u/d)]³ / {[√(p(1 − p)) log(u/d)]³ √n}
    = [(1 − p)² + p²] / √(np(1 − p)).

Recall that p = (r̂ − d)/(u − d); with r̂ = r^(t/n), u = e^(σ√(t/n)), and d = e^(−σ√(t/n)), we have

    p = [e^((t/n)log r) − e^(−σ√(t/n))] / [e^(σ√(t/n)) − e^(−σ√(t/n))]
      = {1 + (t/n)log r − [1 − σ√(t/n) + (1/2)σ²(t/n)] + O(n^(−3/2))} / {[1 + σ√(t/n)] − [1 − σ√(t/n)] + O(n^(−3/2))}
      = 1/2 + (1/2)[(log r − (1/2)σ²)/σ]√(t/n) + O(n^(−1)).

Therefore, [(1 − p)² + p²]/√(np(1 − p)) → 0 as n → ∞. Hence the condition for the theorem to hold as stated in equation (102.29) is satisfied. Note that condition (102.29) is a special case of Lyapunov's condition with δ = 1.

Next, we show that the binomial option pricing model given in equation (102.23) indeed coincides with the Black–Scholes option pricing formula; there are already apparent similarities in equation (102.23). In order to show the limiting result, we need to show that, as n → ∞, B₁(a; n, p′) → N(x) and B₂(a; n, p) → N(x − σ√t). In this section we will only show the second convergence result, as the same argument holds for the first. From the definition of B₂(a; n, p), it is clear that

    1 − B₂(a; n, p) = Pr(j ≤ a − 1) = Pr[(j − np)/√(np(1 − p)) ≤ (a − 1 − np)/√(np(1 − p))].   (102.31)
Recall that we consider a stock that moves from S to uS with probability p and to dS with probability 1 − p. The mean and variance of the continuously compounded rate of return for this stock are μ̃_p and σ̃_p², where

    μ̃_p = p log(u/d) + log d   and   σ̃_p² = [log(u/d)]² p(1 − p).   (102.32)

From equation (102.25) and the definitions of μ̃_p and σ̃_p², we have

    (j − np)/√(np(1 − p)) = [log(S*/S) − μ̃_p n]/(σ̃_p √n).   (102.33)

Also, from the binomial option pricing formula we have

    a − 1 = log(X/(S dⁿ))/log(u/d) − ε = [log(X/S) − n log d]/log(u/d) − ε,   (102.34)

where ε is a real number between 0 and 1. From the definitions of μ̃_p and σ̃_p², it is easy to show that

    (a − 1 − np)/√(np(1 − p)) = [log(X/S) − μ̃_p n − ε log(u/d)]/(σ̃_p √n).   (102.35)

Thus from equation (102.31) we have

    1 − B₂(a; n, p) = Pr{[log(S*/S) − μ̃_p n]/(σ̃_p √n) ≤ [log(X/S) − μ̃_p n − ε log(u/d)]/(σ̃_p √n)}.   (102.36)

We have checked the condition given by equation (102.29) in order to apply the central limit theorem. In addition, we have to evaluate μ̃_p n, σ̃_p² n, and log(u/d) as n → ∞. μ̃_p n → (log r − (1/2)σ²)t, which can be derived from the property of the lognormal distribution that log E(S*/S) = μ_p t + (1/2)σ²t; also σ̃_p² n → σ²t and E(S*/S) = [pu + (1 − p)d]ⁿ = r̂ⁿ = rᵗ. It is also clear that ε log(u/d) → 0. Hence, in order to evaluate the asymptotic probability in equation (102.30), we have

    [log(X/S) − μ̃_p n − ε log(u/d)]/(σ̃_p √n) → z = [log(X/S) − (log r − (1/2)σ²)t]/(σ√t).   (102.37)

Using the fact that 1 − N(z) = N(−z), we have, as n → ∞, B₂(a; n, p) → N(−z) = N(x − σ√t), where x = log(S/(X r^(−t)))/(σ√t) + (1/2)σ√t. A similar argument holds for B₁(a; n, p′), and hence we have completed the proof that the binomial option
pricing formula as given in equation (102.23) includes the Black–Scholes option pricing formula as a limiting case.

Lyapunov's condition requires that X₁, X₂, . . . be independent and uniformly bounded with E(X_i) = 0, Y_n = X₁ + · · · + X_n, and s_n² = E(Y_n²) = Var(Y_n). However, rates of return are generally not independent over time and are not necessarily uniformly bounded as the condition requires. This is a potential limitation of the proof by Cox et al. (1979). The derivation method proposed by Rendleman and Bartter (1979), which is discussed in the next section, is not as restrictive as the proof discussed in this section.

102.4.2 Rendleman and Bartter method

In Rendleman and Bartter (1979), a stock price can either advance or decline during the next period. Let H_T⁺ and H_T⁻ represent the returns per dollar invested in the stock if the price rises (the + state) or falls (the − state), respectively, from time T − 1 to time T (the maturity of the option), and let W_T⁺ and W_T⁻ be the corresponding end-of-period values of the option. Letting R be the riskless interest rate, they showed that the price of the option can be represented in the recursive form

    W_{T−1} = [W_T⁺(1 + R − H_T⁻) + W_T⁻(H_T⁺ − 1 − R)] / [(H_T⁺ − H_T⁻)(1 + R)].   (102.38)

Equation (102.38) can be applied at any time T − 1 to determine the price of the option as a function of its value at time T.⁷ By using recursive substitution as discussed in Section 102.2.1, they derived the binomial option pricing model given in equation (102.39)⁸:

    W₀ = S₀ B₁(a; T, ϕ) − [X/(1 + R)^T] B₂(a; T, φ),   (102.39)
where the pseudo probabilities ϕ and φ are defined as

    ϕ = (1 + R − H⁻)H⁺ / [(1 + R)(H⁺ − H⁻)],   (102.40)

    φ = (1 + R − H⁻) / (H⁺ − H⁻).   (102.41)

⁷Note that the notation T used here is the number of periods rather than calendar time.
⁸Note that some of the variables used in this section differ from those used in Sections 102.2.1 and 102.4.1.
Note that φ and ϕ are identical to p and p′, which are defined as p = (r − d)/(u − d) and p′ ≡ (u/r)p in Section 102.2.1. a denotes the minimum integer value of i for which S₀(H⁺)ⁱ(H⁻)^(T−i) > X will be satisfied. This value is given by⁹

    a = 1 + INT[(ln(X/S₀) − T ln H⁻) / (ln H⁺ − ln H⁻)],   (102.42)

where INT[·] is the integer operator. B₁(a; T, ϕ) and B₂(a; T, φ) are cumulative binomial probabilities: the probability that the number of successes falls between a and T in T trials, where ϕ and φ are the respective probabilities of a success in one trial. In each period, the stock price rises with probability θ. We assume that the distribution of returns generated after T periods follows a log-binomial distribution. Then the mean of the stock price return is

    μ = T[h⁺θ + h⁻(1 − θ)] = T[(h⁺ − h⁻)θ + h⁻],   (102.43)

and the variance of the stock price return is

    σ² = T(h⁺ − h⁻)²θ(1 − θ),   (102.44)

where θ is the probability that the price of the stock will rise, and

    h⁺ = ln(H⁺),   (102.45)
    h⁻ = ln(H⁻).   (102.46)
Note that Cox et al. (1979) assume a log-binomial distribution with mean μt and variance σ²t. Apparently, Rendleman and Bartter (1979) assumed that t = 1; therefore, the Black–Scholes model derived by them is not exactly identical to the original Black–Scholes model. The implied values of H⁺ and H⁻ are then determined by solving equations (102.43)–(102.46),
⁹We first solve the equality S₀(H⁺)ⁱ(H⁻)^(T−i) = X. This yields i = [ln(X/S₀) − T ln(H⁻)] / (ln H⁺ − ln H⁻). To get a, the minimum integer value of i for which S₀(H⁺)ⁱ(H⁻)^(T−i) > X will be satisfied, we take a = 1 + INT[(ln(X/S₀) − T ln(H⁻)) / (ln H⁺ − ln H⁻)].
> X will be satisfied, we should the minimum integer value of i for which S0 H + H − note a as ln(X/S0 ) − T ln(H − ) . a = 1 + INT ln H + − ln H −
page 3594
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch102
Alternative Methods to Derive Option Pricing Models
shown as equations (102.46) and (102.47), respectively. √ (1 − θ) , H + = exp μ/T + (σ/ T ) θ √ θ − . H = exp μ/T − (σ/ T ) (1 − θ)
3595
(102.47)
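The closed-form price (102.39)–(102.42) can be verified in code against the recursion (102.38): applying the recursion backward through the two-state lattice must reproduce the closed form exactly, since ϕ and φ are constructed to decompose the discounted risk-neutral expectation. The inputs below (S₀ = 100, X = 100, H⁺ = 1.1, H⁻ = 0.95, R = 0.02, T = 5) are arbitrary illustrative values:

```python
import math
from math import comb

def rb_closed_form(S0, X, Hp, Hm, R, T):
    """W0 = S0*B1(a;T,phi) - X/(1+R)^T * B2(a;T,varphi), eqs. (102.39)-(102.42)."""
    varphi = (1 + R - Hm) / (Hp - Hm)          # eq. (102.41)
    phi = varphi * Hp / (1 + R)                # eq. (102.40)
    a = 1 + int((math.log(X / S0) - T * math.log(Hm))
                / (math.log(Hp) - math.log(Hm)))
    tail = lambda q: sum(comb(T, j) * q ** j * (1 - q) ** (T - j)
                         for j in range(a, T + 1))   # B(a; T, q)
    return S0 * tail(phi) - X / (1 + R) ** T * tail(varphi)

def rb_recursion(S0, X, Hp, Hm, R, T):
    """Backward application of the recursion (102.38) from terminal payoffs."""
    W = [max(S0 * Hp ** j * Hm ** (T - j) - X, 0.0) for j in range(T + 1)]
    for _ in range(T):
        W = [(W[j + 1] * (1 + R - Hm) + W[j] * (Hp - 1 - R))
             / ((Hp - Hm) * (1 + R)) for j in range(len(W) - 1)]
    return W[0]
```

The two routines agree to machine precision, which is a useful check on the formula for a and on the pseudo probabilities.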
(102.48)
As T becomes larger, the cumulative binomial density function can be approximated by the cumulative normal density function. When T → ∞, the approximation will be exact, and equation (102.38) evolves to equation (102.48).10 W0 ∼ S0 N (Z1 , Z1 ) −
X N (Z2 , Z2 ). (1 + R)T
(102.49)
In this equation, N (Z, Z ) is the probability that a random variable from a standard normal distribution will take on values between a lower limit Z and an upper limit Z . According to the property of binomial probability distribution function, we have Z1 = Z2 =
a − Tϕ
,
Z1 =
a − Tφ , T φ(1 − φ)
Z2 =
T ϕ(1 − ϕ)
T − Tϕ T ϕ(1 − ϕ)
,
T − Tφ . T φ(1 − φ)
Thus, the price of option when the two-state process evolves continuously is presented as X N lim Z2 , lim Z2 . W0 = S0 N lim Z1 , lim Z1 − T →∞ T →∞ T →∞ T →∞ limT →∞ (1 + R)T (102.50) Let 1 + R = er/T reflect the continuous compounding of interest, then limT →∞ (1 + R)T = er . It is obvious that limT →∞ Z1 = limT →∞ Z2 = ∞, therefore, all that needs to be determined is limT →∞ Z1 and limT →∞ Z2 in the derivation of the two-state model under a continuous time case. Substituting H + and H − in equations (102.46) and (102.47) into equation (102.41), 10
In Appendix 102, we will use de Moivre–Laplace theorem to show that the best fit between the binomial and normal distributions occurs when the binomial probability (or pseudo probability in this case) is 1/2. In addition, we also present the Excel program in Appendix 102 to do some sensitive analysis about the precision of using binomial OPM to approximate the Black–Scholes OPM.
we have

    a = 1 + INT[(ln(X/S₀) − μ + σ√T √(θ/(1 − θ))) / (σ/√(Tθ(1 − θ)))].

Then equation (102.51) follows:

    Z₁ = (a − Tϕ)/√(Tϕ(1 − ϕ))
       = {1 + INT[(ln(X/S₀) − μ + σ√T √(θ/(1 − θ))) / (σ/√(Tθ(1 − θ)))] − Tϕ} / √(Tϕ(1 − ϕ)).   (102.51)
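The substitution yielding the expression for a above can be checked numerically: evaluating (102.42) with H⁺ and H⁻ from (102.47)–(102.48) must match the closed form in terms of μ, σ, and θ, since T ln H⁻ = μ − σ√T √(θ/(1 − θ)) and ln H⁺ − ln H⁻ = σ/√(Tθ(1 − θ)). The parameter values below are arbitrary:

```python
import math

# Arbitrary illustrative inputs.
mu, sigma, theta, T = 0.1, 0.3, 0.6, 12
S0, X = 100.0, 105.0

Hp = math.exp(mu / T + (sigma / math.sqrt(T)) * math.sqrt((1 - theta) / theta))
Hm = math.exp(mu / T - (sigma / math.sqrt(T)) * math.sqrt(theta / (1 - theta)))

# Argument of INT[.] in eq. (102.42)...
arg1 = (math.log(X / S0) - T * math.log(Hm)) / (math.log(Hp) - math.log(Hm))
# ...and in the substituted form used in the text.
arg2 = ((math.log(X / S0) - mu
         + sigma * math.sqrt(T) * math.sqrt(theta / (1 - theta)))
        / (sigma / math.sqrt(T * theta * (1 - theta))))
```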
In the limit, the term 1 + INT[·] simplifies to [·]. Therefore, Z₁ can be restated as

    Z₁ ≈ [(ln(X/S₀) − μ)/σ] · √(θ(1 − θ))/√(ϕ(1 − ϕ)) + √T(θ − ϕ)/√(ϕ(1 − ϕ)).   (102.52)

Substituting H⁺ and H⁻ from equations (102.47) and (102.48) and 1 + R = e^(r/T) into equation (102.40), we have

    ϕ = (1 + R − H⁻)H⁺ / [(1 + R)(H⁺ − H⁻)]
      = [e^(r/T) − e^(μ/T − (σ/√T)√(θ/(1−θ)))] e^(μ/T + (σ/√T)√((1−θ)/θ))
        / {e^(r/T) [e^(μ/T + (σ/√T)√((1−θ)/θ)) − e^(μ/T − (σ/√T)√(θ/(1−θ)))]}.   (102.53)

Now, we expand in a Taylor series¹¹ in 1/√T and obtain

    ϕ = [(σ/√T)√(θ/(1−θ)) − (μ − r)/T + O(1/√T)] / [(σ/√T)(√((1−θ)/θ) + √(θ/(1−θ))) + O(1/√T)],   (102.54)

where O(1/√T) denotes a function tending to zero more rapidly than 1/√T. It can be shown that

    lim_{T→∞} ϕ = √(θ/(1−θ)) / [√((1−θ)/θ) + √(θ/(1−θ))] = √(θ/(1−θ)) · √(θ(1−θ)) = θ,   (102.55)

since √((1−θ)/θ) + √(θ/(1−θ)) = [(1 − θ) + θ]/√(θ(1−θ)) = 1/√(θ(1−θ)). Similarly, we have

    √T(θ − ϕ) = √T {θ − [e^(r/T) − e^(μ/T − (σ/√T)√(θ/(1−θ)))] e^(μ/T + (σ/√T)√((1−θ)/θ))
                 / (e^(r/T) [e^(μ/T + (σ/√T)√((1−θ)/θ)) − e^(μ/T − (σ/√T)√(θ/(1−θ)))])}.   (102.56)

We also expand (102.56) in a Taylor series in 1/√T. Writing a = (σ/√T)√(θ/(1−θ)) and b = (σ/√T)√((1−θ)/θ), so that ϕ = [e^b − e^((μ−r)/T + b − a)]/(e^b − e^(−a)), and noting that θb = (1 − θ)a = (σ/√T)√(θ(1−θ)), θb² + (1 − θ)a² = σ²/T, and ab = σ²/T, the numerator of θ − ϕ reduces to

    θ(e^b − e^(−a)) − e^b + e^((μ−r)/T + b − a)
      = [θb − (1 − θ)a] + [θb²/2 + (1 − θ)a²/2 − ab] + (μ − r)/T + O(T^(−3/2))
      = (μ − r − σ²/2)/T + O(T^(−3/2)),   (102.57)

while its denominator is e^b − e^(−a) = (σ/√T)/√(θ(1−θ)) + O(1/T). Therefore, we have

    lim_{T→∞} √T(θ − ϕ) = lim_{T→∞} √T · [(μ − r − σ²/2)/T] / [(σ/√T)/√(θ(1−θ))]
                       = (μ − r − σ²/2)√(θ(1−θ))/σ.   (102.58)

Now, substituting lim_{T→∞} ϕ for ϕ and lim_{T→∞} √T(θ − ϕ) for √T(θ − ϕ) in equation (102.52), equation (102.59) holds:

    lim_{T→∞} Z₁ = (ln(X/S₀) − μ)/σ + (μ − r − σ²/2)/σ = [ln(X/S₀) − r − σ²/2]/σ.   (102.59)

Similarly, we can also prove that

    lim_{T→∞} Z₂ = [ln(X/S₀) − r + σ²/2]/σ.   (102.60)

¹¹Using the Taylor expansion, e^x = 1 + x + x²/2! + O(x³).
Equation (102.60) is not exactly identical to the original Black–Scholes model because of the assumed log-binomial distribution with mean μ and
page 3598
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
Alternative Methods to Derive Option Pricing Models
b3568-v4-ch102
3599
variance σ². If they assume a log-binomial distribution with mean μt and variance σ²t, then d₁ and d₂ should be rewritten as

    d₁ = [ln(S₀/X) + (r + σ²/2)t]/(σ√t),
    d₂ = d₁ − σ√t.

Lee and Lin (2010) have theoretically compared these two derivation methods. Based upon (i) mathematical and probability theory knowledge, (ii) assumptions, and (iii) advantages and disadvantages, the comparison results are listed in Table 102.2. The main difference in assumptions between the two approaches is as follows. Under the Cox et al. (1979) method, the stock price's increase and decrease factors are expressed as u = e^(σ√(t/n)) and d = e^(−σ√(t/n)), respectively, which implies that the restraint ud = 1 holds. Under the Rendleman and Bartter (1979) method, the increase and decrease factors are H⁺ = exp[μ/T + (σ/√T)√((1 − θ)/θ)] and H⁻ = exp[μ/T − (σ/√T)√(θ/(1 − θ))], respectively. In the Rendleman and Bartter (1979) settings, the time to maturity is set to "1". As the number of periods T → ∞, the expressions are similar to those of the Cox et al. (1979) method, but they still carry the "adjustment factors" √((1 − θ)/θ) and √(θ/(1 − θ)) multiplying σ/√T in the exponents of the increase and decrease factors, and under the Rendleman and Bartter (1979) method H⁺H⁻ ≠ 1 in general. Hence, as we indicate in Table 102.2, the Cox et al. method is easy to follow if one has advanced-level knowledge of probability theory, but the assumptions on the model parameters make its applications limited. On the other hand, the Rendleman and Bartter model is intuitive and does not require higher-level knowledge of probability theory; however, the derivation is more complicated and tedious. In Appendix B, we show that the best fit between the binomial and normal distributions occurs when the binomial probability is 0.5.
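Both contrasts above can be checked numerically. The sketch below uses arbitrary inputs (with r defined as one plus the annual rate, as in the text): it confirms that u·d = 1 under the Cox et al. parameterization while H⁺H⁻ ≠ 1 in general under Rendleman and Bartter's, and that the Cox et al. binomial price approaches the Black–Scholes value as the number of periods grows.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def crr_call(S, X, t, sigma, r, n):
    """European call via backward induction on the Cox et al. lattice:
    u = exp(sigma*sqrt(t/n)), d = 1/u (so u*d = 1), r_hat = r**(t/n)."""
    u = math.exp(sigma * math.sqrt(t / n))
    d = 1.0 / u
    r_hat = r ** (t / n)                 # one plus the per-period rate
    p = (r_hat - d) / (u - d)            # pseudo probability
    V = [max(S * u ** j * d ** (n - j) - X, 0.0) for j in range(n + 1)]
    for _ in range(n):
        V = [(p * V[j + 1] + (1 - p) * V[j]) / r_hat for j in range(len(V) - 1)]
    return V[0]

def bs_call(S, X, t, sigma, r):
    """Black-Scholes with r as one plus the rate per unit of calendar time."""
    x = (math.log(S / (X * r ** -t)) / (sigma * math.sqrt(t))
         + 0.5 * sigma * math.sqrt(t))
    return S * norm_cdf(x) - X * r ** -t * norm_cdf(x - sigma * math.sqrt(t))

# (i) u*d = 1 for Cox et al.; Hp*Hm != 1 in general for Rendleman-Bartter.
sigma, t, n = 0.2, 1.0, 500
u = math.exp(sigma * math.sqrt(t / n)); d = 1.0 / u
mu, theta, T = 0.08, 0.6, 100
Hp = math.exp(mu / T + (sigma / math.sqrt(T)) * math.sqrt((1 - theta) / theta))
Hm = math.exp(mu / T - (sigma / math.sqrt(T)) * math.sqrt(theta / (1 - theta)))
```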
102.5 Lognormal Distribution Approach to Derive Black–Scholes Model¹²

To derive the option pricing model in terms of the lognormal distribution, we begin by assuming that the stock price follows a lognormal distribution

¹² The presentation and derivation in this section follow Garven (1986) and Lee et al. (2013a, 2013b).
C. F. Lee, Y. Chen & J. Lee

Table 102.2: Comparison between Rendleman and Bartter's and Cox et al.'s approaches.

Mathematical and probability theory knowledge:
- Rendleman and Bartter (1979): basic algebra, Taylor expansion, binomial theorem, central limit theorem, properties of the binomial distribution.
- Cox et al. (1979): basic algebra, Taylor expansion, binomial theorem, central limit theorem, properties of the binomial distribution, Lyapunov's condition.

Assumption:
- Rendleman and Bartter (1979): the mean and variance of the logarithmic returns of the stock are held constant over the life of the option.
- Cox et al. (1979): the stock follows a binomial process from one period to the next; it can only go up by a factor of u with probability p or down by a factor of d with probability 1 − p. In order to apply the central limit theorem, u, d, and p need to be chosen appropriately.

Advantage and disadvantage:
- Rendleman and Bartter (1979): (1) readers who have undergraduate-level training in mathematics and probability theory can follow this approach; (2) the approach is intuitive, but the derivation is more complicated and tedious.
- Cox et al. (1979): (1) readers who have advanced-level knowledge of probability theory can follow this approach, but for those who do not, it may be difficult to follow; (2) the assumptions on the parameters u, d, and p make this approach more restricted.
(Lee et al., 2013b). Denote the current stock price by S and the stock price at the end of the tth period by S_t. Then S_t/S_{t-1} = \exp(K_t) is a random variable with a lognormal distribution, where K_t, the rate of return in the tth period, is assumed to be a normally distributed random variable. Assume each K_t has the same expected value μ_k and variance σ_k². Then K_1 + K_2 + ⋯ + K_T is a normal random variable with expected value Tμ_k and variance Tσ_k².

Property of the lognormal distribution. If a continuous random variable y is normally distributed, then the continuous variable x defined in equation (102.62) is lognormally distributed:

x = e^y.  (102.62)
If the variable y has mean μ and variance σ², then the mean μ_x and variance σ_x² of the variable x are, respectively,

\mu_x = e^{\mu + \frac{1}{2}\sigma^2},  (102.63)

\sigma_x^2 = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right).  (102.64)
Following this property, we can then define the expected value of S_T/S = \exp(K_1 + K_2 + ⋯ + K_T) as

E\!\left(\frac{S_T}{S}\right) = \exp\left(T\mu_k + \frac{T\sigma_k^2}{2}\right).  (102.65)

Under the assumption of a risk-neutral investor, the expected return E(S_T/S) is assumed to be e^{rT} (where r is the riskless rate of interest). In other words, the following equality holds:

\mu_k = r - \frac{\sigma_k^2}{2}.  (102.66)
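The risk-neutral condition can be checked in one line: plugging μ_k = r − σ_k²/2 from equation (102.66) into equation (102.65) must return e^{rT}. A minimal sketch (the values of r, σ_k, and T are illustrative only):

```python
import math

r, sigma_k, T = 0.05, 0.2, 3.0
mu_k = r - 0.5 * sigma_k**2                        # equation (102.66)
gross = math.exp(T * mu_k + 0.5 * T * sigma_k**2)  # equation (102.65): E(S_T / S)
print(gross, math.exp(r * T))                      # the two coincide
```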
The call option price C can be determined by discounting the expected value of the terminal option price by the risk-free rate:

C = e^{-rT}\,E[\max(S_T - X, 0)].  (102.67)

Note that in equation (102.67),

\max(S_T - X, 0) = S_T - X for S_T > X, and 0 otherwise,

where T is the time to expiration and X is the exercise price. Let x = S_T/S have a lognormal distribution. Then we have

C = e^{-rT}\,E[\max(S_T - X, 0)]
  = e^{-rT}\int_{X/S}^{\infty} S\left(x - \frac{X}{S}\right) g(x)\,dx
  = e^{-rT}S\int_{X/S}^{\infty} x\,g(x)\,dx - e^{-rT}X\int_{X/S}^{\infty} g(x)\,dx,  (102.68)

where g(x) is the probability density function of x = S_T/S. Here, we will use properties of the normal distribution, the lognormal distribution, and their mutual relations to derive the Black–Scholes model. We
continue with the variable settings in equation (102.62), where y is normally distributed and x is lognormally distributed. The PDF of x is

f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^2}(\ln x - \mu)^2\right], \quad x > 0.  (102.69)

The PDF of y can be defined as

f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^2}(y - \mu)^2\right], \quad -\infty < y < \infty.  (102.70)
By comparing the PDF of the normal distribution with the PDF of the lognormal distribution, we know that

f(x) = \frac{f(y)}{x}.  (102.71)

In addition, it can be shown that¹³

dx = x\,dy.  (102.72)

The CDF of the lognormal distribution can be defined as

\int_a^{\infty} f(x)\,dx.  (102.73)
If we transform the variable x in equation (102.73) into the variable y, then the upper and lower limits of integration for the new variable are ∞ and ln a, respectively. Then the CDF of the lognormal distribution can be written in terms of the CDF of the normal distribution as

\int_a^{\infty} f(x)\,dx = \int_{\ln a}^{\infty} \frac{f(y)}{x}\,x\,dy = \int_{\ln a}^{\infty} f(y)\,dy.  (102.74)

We can rewrite equation (102.74) in standard normal distribution form by substituting the variable:

\int_a^{\infty} f(x)\,dx = \int_{\ln a}^{\infty} f(y)\,dy = N(d),  (102.75)

where d = \frac{\mu - \ln a}{\sigma}. Similarly, the mean of a lognormal variable can be defined as

\int_0^{\infty} x f(x)\,dx = e^{\mu + \frac{1}{2}\sigma^2}.  (102.76)

¹³ Since x = e^y, we have dx = d(e^y) = e^y\,dy = x\,dy.
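Equation (102.75) can be verified numerically: the lognormal tail probability ∫_a^∞ f(x) dx should equal N((μ − ln a)/σ). The sketch below uses a crude midpoint rule and expresses N(·) through the error function (the parameter values are arbitrary):

```python
import math

def norm_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_pdf(x, mu, sigma):
    # equation (102.69) with y = ln x
    return math.exp(-(math.log(x) - mu)**2 / (2.0 * sigma**2)) / (x * sigma * math.sqrt(2.0 * math.pi))

mu, sigma, a = 0.03, 0.2, 0.9
n, upper = 200_000, 50.0                 # crude truncation of the infinite upper limit
h = (upper - a) / n
tail = sum(lognormal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(n)) * h
d = (mu - math.log(a)) / sigma           # as defined below equation (102.75)
print(tail, norm_cdf(d))                 # the two agree to several decimals
```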
If the lower bound a is greater than 0, then the partial mean of x can be shown to be¹⁴

\int_a^{\infty} x f(x)\,dx = \int_{\ln a}^{\infty} f(y) e^y\,dy = e^{\mu + \sigma^2/2} N(d),  (102.77)

where d = \frac{\mu - \ln a}{\sigma} + \sigma. Substituting μ = r − σ²/2 and a = X/S into equation (102.75), we obtain

\int_{X/S}^{\infty} g(x)\,dx = N(d_2),  (102.78)

where d_2 = \frac{r - \frac{1}{2}\sigma^2 - \ln(X/S)}{\sigma}. Similarly, substituting μ = r − σ²/2 and a = X/S into equation (102.77), we obtain

\int_{X/S}^{\infty} x\,g(x)\,dx = e^r N(d_1),  (102.79)

where d_1 = \frac{r - \frac{1}{2}\sigma^2 - \ln(X/S)}{\sigma} + \sigma. Substituting equations (102.78) and (102.79) into equation (102.68), we obtain equation (102.80), which is identical to the Black–Scholes formula:

C = S N(d_1) - X e^{-rT} N(d_2),
d_1 = \frac{\ln(S/X) + \left(r + \frac{1}{2}\sigma^2\right)T}{\sigma\sqrt{T}},
d_2 = \frac{\ln(S/X) + \left(r - \frac{1}{2}\sigma^2\right)T}{\sigma\sqrt{T}} = d_1 - \sigma\sqrt{T}.  (102.80)
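Equation (102.80) translates directly into code; the sketch below is a minimal implementation, with the normal CDF written via the error function:

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_call(S, X, r, sigma, T):
    """European call price from equation (102.80)."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - X * math.exp(-r * T) * norm_cdf(d2)

c = black_scholes_call(S=100, X=95, r=0.05, sigma=0.2, T=0.5)
print(c)
```

As a sanity check, the price always exceeds the lower bound max(S − Xe^{−rT}, 0) and increases with σ.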
In this section, we show that the Black–Scholes model can be derived by differential and integral calculus without using stochastic calculus. However, it should be noted that we assume risk neutrality instead of risk aversion in this derivation.

¹⁴ The second equality is obtained by substituting the PDF of the normal distribution into \int_{\ln a}^{\infty} f(y)e^y\,dy and performing the appropriate manipulation.

102.6 Using Stochastic Calculus to Derive Black–Scholes Model

Black and Scholes (1973) have used two alternative approaches to derive the well-known stochastic differential equation defined in
equation (102.81)¹⁵:

\frac{1}{2}\sigma^2 S^2 C_{SS}(t,S) + r S C_S(t,S) - r C(t,S) + C_t(t,S) = 0,  (102.81)

where t is the passage of time; S is the stock price, a function of time t; C(t,S) is the call price, a function of time t and stock price S; C_t(t,S) is the first-order partial derivative of C(t,S) with respect to t; C_S(t,S) is the first-order partial derivative of C(t,S) with respect to S; C_{SS}(t,S) is the second-order partial derivative of C(t,S) with respect to S; r is the risk-free interest rate; and σ is the stock volatility. We can rewrite it more simply, as shown in equation (102.82):

\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} = rC,  (102.82)

where \frac{\partial C}{\partial t} = C_t(t,S), \frac{\partial C}{\partial S} = C_S(t,S), and \frac{\partial^2 C}{\partial S^2} = C_{SS}(t,S) in equation (102.81). To derive the Black–Scholes model, we need to solve this differential equation under the boundary condition

C(S,T) = S - X if S ≥ X, and 0 otherwise,  (102.83)

where T is the maturity date of the option and X is the exercise price. By introducing boundary constraints and making variable substitutions, they obtained a differential equation, which is the heat-transfer equation
¹⁵ Black and Scholes used two alternative methods to derive this equation. In addition, a careful derivation of this equation can be found in Chapter 27 of Lee et al. (2013a), which was written by Professor A. G. Malliaris, Loyola University of Chicago. Beck (1993) proposed an alternative way to derive this equation and raised questions about the methods used by Black and Scholes. In the summary of his paper, he mentions that the traditional derivation of the Black–Scholes formula is mathematically unsatisfactory: the hedge portfolio is not a hedge portfolio, since it is neither self-financing nor riskless; due to compensating inconsistencies, the final result obtained is nevertheless correct. In his paper, these inconsistencies, which abound in the literature, are pointed out, and an alternative, more rigorous derivation avoiding these problems is presented.
in physics (Joshi, 2003). They used the Fourier transformation to solve the heat-transfer equation under the boundary condition and finally obtained the solution. Here we will demonstrate the main procedure for obtaining the heat-transfer equation and then derive the closed-form solution under the boundary condition. Let Z = \ln S. Using the chain rule of partial derivatives, the following equations hold:

\frac{\partial C}{\partial S} = \frac{\partial C}{\partial Z}\frac{\partial Z}{\partial S} = \frac{\partial C}{\partial Z}\frac{1}{S},  (102.84)

\frac{\partial^2 C}{\partial S^2} = \frac{\partial}{\partial S}\left(\frac{\partial C}{\partial Z}\frac{1}{S}\right) = \frac{\partial^2 C}{\partial Z^2}\frac{1}{S^2} - \frac{\partial C}{\partial Z}\frac{1}{S^2}.  (102.85)

Then equation (102.82) becomes equation (102.86):

\frac{\partial C}{\partial t} + \left(r - \frac{1}{2}\sigma^2\right)\frac{\partial C}{\partial Z} + \frac{1}{2}\sigma^2\frac{\partial^2 C}{\partial Z^2} = rC.  (102.86)
Let τ = T − t. Then

\frac{\partial C}{\partial \tau} - \left(r - \frac{1}{2}\sigma^2\right)\frac{\partial C}{\partial Z} - \frac{1}{2}\sigma^2\frac{\partial^2 C}{\partial Z^2} = -rC.  (102.87)
Let D = e^{rτ}C, i.e., C = e^{-rτ}D, and redefine the three partial derivatives in equation (102.87). We have

\frac{\partial C}{\partial \tau} = -re^{-r\tau}D + e^{-r\tau}\frac{\partial D}{\partial \tau} = -rC + e^{-r\tau}\frac{\partial D}{\partial \tau},  (102.88)

\frac{\partial C}{\partial Z} = e^{-r\tau}\frac{\partial D}{\partial Z},  (102.89)

\frac{\partial^2 C}{\partial Z^2} = e^{-r\tau}\frac{\partial^2 D}{\partial Z^2}.  (102.90)
If we substitute equations (102.88)–(102.90) into equation (102.87), we obtain

\frac{\partial D}{\partial \tau} - \left(r - \frac{1}{2}\sigma^2\right)\frac{\partial D}{\partial Z} - \frac{1}{2}\sigma^2\frac{\partial^2 D}{\partial Z^2} = 0.  (102.91)

We introduce a new variable Y to replace Z:

Y = \ln\frac{S}{X} + \left(r - \frac{1}{2}\sigma^2\right)\tau = Z + \left(r - \frac{1}{2}\sigma^2\right)\tau - \ln X.  (102.92)
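Before completing the change of variables, it is worth checking numerically that the closed-form price in equation (102.80) really does satisfy the PDE (102.81). The sketch below evaluates the PDE residual with central finite differences (the evaluation point S = 105, t = 0.3 and the step size h are arbitrary choices):

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

X, r, sigma, T = 100.0, 0.05, 0.2, 1.0

def call(S, t):
    # Black-Scholes call as a function of stock price S and calendar time t
    tau = T - t
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)

S, t, h = 105.0, 0.3, 1e-3
C    = call(S, t)
C_t  = (call(S, t + h) - call(S, t - h)) / (2 * h)          # central difference in t
C_S  = (call(S + h, t) - call(S - h, t)) / (2 * h)          # central difference in S
C_SS = (call(S + h, t) - 2 * C + call(S - h, t)) / h**2     # second difference in S
# left-hand side of equation (102.81); it should vanish
residual = 0.5 * sigma**2 * S**2 * C_SS + r * S * C_S - r * C + C_t
print(residual)  # close to zero
```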
Since D = e^{rτ}C is a function of Z and τ, we write it explicitly as D^Z(Z, τ). Equation (102.92) implies that D is also a function of Y and τ. We define D^Z(Z, τ) and D^Y(Y, τ) as follows:

D^Z(Z, \tau) = D^Y\!\left(Z + \left(r - \tfrac{1}{2}\sigma^2\right)\tau - \ln X,\ \tau\right),  (102.93)

D^Y(Y, \tau) = D^Z\!\left(Y - \left(r - \tfrac{1}{2}\sigma^2\right)\tau + \ln X,\ \tau\right).  (102.94)

Taking the partial derivative of D^Z(Z, τ) with respect to Z, we obtain

\frac{\partial D^Z(Z,\tau)}{\partial Z} = \frac{\partial D^Y(Y,\tau)}{\partial Y}\frac{\partial Y}{\partial Z} = \frac{\partial D^Y(Y,\tau)}{\partial Y}.  (102.95)

Similarly, we have

\frac{\partial^2 D^Z(Z,\tau)}{\partial Z^2} = \frac{\partial^2 D^Y(Y,\tau)}{\partial Y^2},  (102.96)

\frac{\partial D^Z(Z,\tau)}{\partial \tau} = \frac{\partial D^Y(Y,\tau)}{\partial Y}\frac{\partial Y}{\partial \tau} + \frac{\partial D^Y(Y,\tau)}{\partial \tau} = \frac{\partial D^Y(Y,\tau)}{\partial Y}\frac{\partial}{\partial \tau}\left[Z + \left(r - \tfrac{1}{2}\sigma^2\right)\tau - \ln X\right] + \frac{\partial D^Y(Y,\tau)}{\partial \tau} = \left(r - \tfrac{1}{2}\sigma^2\right)\frac{\partial D^Y(Y,\tau)}{\partial Y} + \frac{\partial D^Y(Y,\tau)}{\partial \tau}.  (102.97)

Substituting equations (102.95)–(102.97) into equation (102.91), we get

\frac{\partial D^Y}{\partial \tau} - \frac{1}{2}\sigma^2\frac{\partial^2 D^Y}{\partial Y^2} = 0.  (102.98)
Equation (102.98) is nearly the heat-transfer equation used by Black and Scholes. Let u = \frac{2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right)Y and v = \frac{2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right)^2\tau, and write D(u, v) for the function of u and v:

\frac{\partial D^Y(Y,\tau)}{\partial Y} = \frac{\partial D(u,v)}{\partial u}\frac{\partial u}{\partial Y} = \frac{2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right)\frac{\partial D(u,v)}{\partial u},  (102.99)

\frac{\partial^2 D^Y(Y,\tau)}{\partial Y^2} = \left[\frac{2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right)\right]^2 \frac{\partial^2 D(u,v)}{\partial u^2},  (102.100)
\frac{\partial D^Y(Y,\tau)}{\partial \tau} = \frac{\partial D(u,v)}{\partial u}\frac{\partial u}{\partial \tau} + \frac{\partial D(u,v)}{\partial v}\frac{\partial v}{\partial \tau} = \frac{\partial D(u,v)}{\partial v}\frac{\partial v}{\partial \tau} = \frac{2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right)^2 \frac{\partial D(u,v)}{\partial v}.  (102.101)

We finally reach the heat-transfer equation derived by Black and Scholes when we substitute equations (102.99)–(102.101) into equation (102.98):

\frac{\partial D(u,v)}{\partial v} = \frac{\partial^2 D(u,v)}{\partial u^2}.  (102.102)

Equation (102.102) is identical to equation (10) of Black and Scholes (1973). In terms of the Black–Scholes notation, equation (102.102) can be written as y_2 = y_{11}.
Now, we need to find a function D(u, v) that satisfies both the boundary condition and the partial differential equation shown in equation (102.102).¹⁶ The general solution given in Churchill (1963) is as follows: if

v_t(x, t) = k\,v_{xx}(x, t), \quad -\infty < x < \infty,\ t > 0,  (102.103)

v(x, 0) = f(x), \quad -\infty < x < \infty,  (102.104)

then the general solution for v(x, t) is¹⁷

v(x, t) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} f\!\left(x + 2\eta\sqrt{kt}\right) e^{-\eta^2}\, d\eta.  (102.105)
In our notation, D(u, v) = v(x, t) and k = 1, which makes equation (102.103) equivalent to the partial differential equation shown in equation (102.102). Moreover, we have the boundary condition

C(S, T) = S - X if S ≥ X, and 0 otherwise.

At the maturity date, t = T and v = 0, so D(u, 0) = C(S, T). The function f(u) must be determined so that equation (102.104) is satisfied. Black and Scholes

¹⁶ The following procedure is closely related to Kutner (1988); we therefore strongly suggest that readers consult his paper.
¹⁷ The solution is obtained as an application of the general Fourier integral. See Churchill (1963, pp. 154–155) for more details.
choose

f(u) = X\left(e^{u\left(\frac{1}{2}\sigma^2\right)/\left(r - \frac{1}{2}\sigma^2\right)} - 1\right) if u ≥ 0, and 0 if u < 0.  (102.106)

Note that, when t = T, u = \frac{2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right)\ln(S/X), so we have

D(u, 0) = X\left(e^{\ln(S/X)} - 1\right) = S - X if u ≥ 0, and 0 if u < 0,  (102.107)

which is identical to the boundary condition. Therefore, the chosen f(u) makes equation (102.104) hold. Now that equations (102.103) and (102.104) are satisfied, the solution to the differential equation is given by¹⁸

D(u, v) = \frac{1}{\sqrt{\pi}} \int_{-u/2\sqrt{v}}^{\infty} X\left(e^{(u + 2\eta\sqrt{v})\left(\frac{1}{2}\sigma^2\right)/\left(r - \frac{1}{2}\sigma^2\right)} - 1\right) e^{-\eta^2}\, d\eta.  (102.108)

Let \eta = q/\sqrt{2}, and substitute it and C = e^{-r\tau}D into equation (102.108); then we have

C(S, t) = e^{-r\tau} \frac{1}{\sqrt{2\pi}} \int_{-u/\sqrt{2v}}^{\infty} X\left(e^{(u + q\sqrt{2v})\left(\frac{1}{2}\sigma^2\right)/\left(r - \frac{1}{2}\sigma^2\right)} - 1\right) e^{-q^2/2}\, dq.  (102.109)

Note that -u/\sqrt{2v} = -\frac{\ln(S/X) + \left(r - \frac{1}{2}\sigma^2\right)\tau}{\sigma\sqrt{\tau}} = -d_2. Therefore, equation (102.109) can be evolved into

C(S, t) = Xe^{-r\tau} \frac{1}{\sqrt{2\pi}} \int_{-d_2}^{\infty} e^{\left[(u + q\sqrt{2v})\left(\frac{1}{2}\sigma^2\right)/\left(r - \frac{1}{2}\sigma^2\right)\right] - q^2/2}\, dq - Xe^{-r\tau} \frac{1}{\sqrt{2\pi}} \int_{-d_2}^{\infty} e^{-q^2/2}\, dq.  (102.110)

We first observe the second term in equation (102.110). Recall that the cumulative standard normal distribution function is defined as

N(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt.  (102.111)

Therefore, the second term of equation (102.110) is

Xe^{-r\tau} \frac{1}{\sqrt{2\pi}} \int_{-d_2}^{\infty} e^{-q^2/2}\, dq = Xe^{-r\tau}\left(1 - N(-d_2)\right) = Xe^{-r\tau} N(d_2).  (102.112)

¹⁸ The lower limit exists since f(u) = 0 if u < 0. Therefore, we require u + 2\eta\sqrt{v} \ge 0, i.e., \eta \ge -u/(2\sqrt{v}).
Deriving the first term in equation (102.110) is more tedious and difficult. Recall the expressions for u and v: u = \frac{2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right)\left[\ln(S/X) + \left(r - \frac{1}{2}\sigma^2\right)\tau\right] and v = \frac{2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right)^2\tau. Therefore, equations (102.113) and (102.114) hold:

u\left(\tfrac{1}{2}\sigma^2\right)\Big/\left(r - \tfrac{1}{2}\sigma^2\right) = \ln(S/X) + \left(r - \tfrac{1}{2}\sigma^2\right)\tau,  (102.113)

q\sqrt{2v}\left(\tfrac{1}{2}\sigma^2\right)\Big/\left(r - \tfrac{1}{2}\sigma^2\right) = q\sigma\sqrt{\tau}.  (102.114)

Therefore, the first term in equation (102.110) is

Xe^{-r\tau} \frac{1}{\sqrt{2\pi}} \int_{-d_2}^{\infty} e^{\left[(u + q\sqrt{2v})\left(\frac{1}{2}\sigma^2\right)/\left(r - \frac{1}{2}\sigma^2\right)\right] - q^2/2}\, dq
= Xe^{-r\tau} e^{\ln(S/X)} e^{r\tau} \frac{1}{\sqrt{2\pi}} \int_{-d_2}^{\infty} e^{-\frac{1}{2}\left(q^2 - 2q\sigma\sqrt{\tau} + \sigma^2\tau\right)}\, dq
= S \frac{1}{\sqrt{2\pi}} \int_{-d_2}^{\infty} e^{-\frac{1}{2}\left(q - \sigma\sqrt{\tau}\right)^2}\, dq.  (102.115)

Here we again apply a variable substitution. Let q' = q - \sigma\sqrt{\tau}; then dq' = dq, and equation (102.115) evolves to

S \frac{1}{\sqrt{2\pi}} \int_{-d_2 - \sigma\sqrt{\tau}}^{\infty} e^{-\frac{1}{2}q'^2}\, dq'.  (102.116)

Let d_1 = d_2 + \sigma\sqrt{\tau}; then we obtain

S \frac{1}{\sqrt{2\pi}} \int_{-d_1}^{\infty} e^{-\frac{1}{2}q'^2}\, dq' = S\left(1 - N(-d_1)\right) = S N(d_1).  (102.117)

Finally, combining the first and second terms in equation (102.110), simplified by equations (102.117) and (102.112) respectively, we reach the Black–Scholes formula:

C(S, t) = S N(d_1) - Xe^{-r\tau} N(d_2),  (102.118)

where

d_1 = \frac{\ln(S/X) + \left(r + \frac{1}{2}\sigma^2\right)\tau}{\sigma\sqrt{\tau}}, \qquad
d_2 = \frac{\ln(S/X) + \left(r - \frac{1}{2}\sigma^2\right)\tau}{\sigma\sqrt{\tau}} = d_1 - \sigma\sqrt{\tau}, \qquad
\tau = T - t.  (102.119)
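As a numerical cross-check of the derivation, the integral representation can be evaluated directly and compared with the closed form (102.118). The sketch below applies a simple midpoint rule to the integrand of the pre-split integral (it assumes r > σ²/2 so the change of variables is sign-unambiguous; parameter values are arbitrary):

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

S, X, r, sigma, tau = 100.0, 95.0, 0.05, 0.2, 0.5
u = (2 / sigma**2) * (r - 0.5 * sigma**2) * (math.log(S / X) + (r - 0.5 * sigma**2) * tau)
v = (2 / sigma**2) * (r - 0.5 * sigma**2)**2 * tau
d2 = u / math.sqrt(2 * v)   # = [ln(S/X) + (r - sigma^2/2) tau] / (sigma sqrt(tau))

def integrand(q):
    # bracketed integrand X(e^a - 1)e^{-q^2/2} before splitting into two terms
    a = (u + q * math.sqrt(2 * v)) * (0.5 * sigma**2) / (r - 0.5 * sigma**2)
    return (math.exp(a) - 1.0) * math.exp(-0.5 * q * q)

n, upper = 200_000, 10.0    # crude truncation of the infinite upper limit
h = (upper + d2) / n
integral = sum(integrand(-d2 + (i + 0.5) * h) for i in range(n)) * h
C_quad = X * math.exp(-r * tau) * integral / math.sqrt(2.0 * math.pi)

d1 = d2 + sigma * math.sqrt(tau)
C_closed = S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)
print(C_quad, C_closed)     # the two agree to several decimals
```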
102.7 Summary and Concluding Remarks

In this paper, we have reviewed three alternative approaches to deriving option pricing models. We have discussed in detail how the binomial model can be used to derive the Black–Scholes model. In addition, we have shown how an Excel program based on decision trees can be used to demonstrate empirically how the binomial model converges to the Black–Scholes model as the number of observations approaches infinity. Under an assumption of risk neutrality, we show that the Black–Scholes formula can be derived using only differential and integral calculus and a basic knowledge of normal and lognormal distributions. In Appendix 102A, we use the de Moivre–Laplace theorem to prove that the best fit between the binomial and normal distributions occurs when the binomial probability is ½. Overall, this paper can help statisticians and mathematicians better understand how alternative methods can be used to derive the Black–Scholes option pricing model.
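The convergence described above can be demonstrated in a few lines: a Cox–Ross–Rubinstein binomial price approaches the Black–Scholes price as the number of periods n grows (a minimal sketch with illustrative parameters):

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_call(S, X, r, sigma, T):
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    return S * norm_cdf(d1) - X * math.exp(-r * T) * norm_cdf(d1 - sigma * math.sqrt(T))

def crr_call(S, X, r, sigma, T, n):
    """European call from an n-period Cox-Ross-Rubinstein tree."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt)); d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    prob, price = (1.0 - p)**n, 0.0          # binomial weight of k = 0 up-moves
    for k in range(n + 1):
        if k > 0:                            # update C(n,k) p^k (1-p)^(n-k) recursively
            prob *= p / (1.0 - p) * (n - k + 1) / k
        price += prob * max(S * u**k * d**(n - k) - X, 0.0)
    return math.exp(-r * T) * price

S, X, r, sigma, T = 100.0, 95.0, 0.05, 0.2, 0.5
for n in (10, 100, 1000):
    print(n, crr_call(S, X, r, sigma, T, n))
print("Black-Scholes:", bs_call(S, X, r, sigma, T))
```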
Bibliography

Amin, K.I. and Jarrow, R.A. (1992). Pricing Options on Risky Assets in a Stochastic Interest Rate Economy. Mathematical Finance 2(4), 217–237.
Amin, K.I. and Ng, V.K. (1993). Option Valuation with Systematic Stochastic Volatility. Journal of Finance 48(3), 881–910.
Ash, R.B. and Doleans-Dade, C. (1999). Probability and Measure Theory, 2nd edn. Academic Press.
Bailey, W. and Stulz, R.M. (1989). The Pricing of Stock Index Options in a General Equilibrium Model. Journal of Financial and Quantitative Analysis 24(1), 1–12.
Bakshi, G.S. and Chen, Z. (1997a). An Alternative Valuation Model for Contingent Claims. Journal of Financial Economics 44(1), 123–165.
Bakshi, G.S. and Chen, Z. (1997b). Equilibrium Valuation of Foreign Exchange Claims. Journal of Finance 52(2), 799–826.
Bakshi, G., Cao, C., and Chen, Z. (1997). Empirical Performance of Alternative Option Pricing Models. Journal of Finance 52(5), 2003–2049.
Bates, D.S. (1991). The Crash of '87: Was It Expected? The Evidence from Options Markets. Journal of Finance 46(3), 1009–1044.
Bates, D.S. (1996). Jumps and Stochastic Volatility: Exchange Rate Processes Implicit in Deutsche Mark Options. Review of Financial Studies 9(1), 69–107.
Beck, T.M. (1993). Black–Scholes Revisited: Some Important Details. Financial Review 28(1), 77–90.
Beckers, S. (1980). The Constant Elasticity of Variance Model and Its Implications for Option Pricing. Journal of Finance 35(3), 661–673.
Benninga, S. and Czaczkes, B. (2000). Financial Modeling. MIT Press.
Billingsley, P. (2008). Probability and Measure, 3rd edn. John Wiley & Sons.
Black, F. and Scholes, M. (1973). The Pricing of Options and Corporate Liabilities. Journal of Political Economy 81(3), 637–654.
Buetow, G.W. and Albert, J.D. (1998). The Pricing of Embedded Options in Real Estate Lease Contracts. Journal of Real Estate Research 15(3), 253–266.
Cakici, N. and Topyan, K. (2000). The GARCH Option Pricing Model: A Lattice Approach. Journal of Computational Finance 3(4), 71–85.
Carr, P. and Wu, L. (2004). Time-Changed Lévy Processes and Option Pricing. Journal of Financial Economics 71(1), 113–141.
Chen, R.R. and Palmon, O. (2005). A Non-parametric Option Pricing Model: Theory and Empirical Evidence. Review of Quantitative Finance and Accounting 24(2), 115–134.
Chen, R.R., Lee, C.F., and Lee, H.H. (2009). Empirical Performance of the Constant Elasticity Variance Option Pricing Model. Review of Pacific Basin Financial Markets and Policies 12(2), 177–217.
Churchill, R.V. (1963). Fourier Series and Boundary Value Problems, 2nd edn. McGraw-Hill.
Costabile, M., Leccadito, A., Massabò, I., and Russo, E. (2014). A Reduced Lattice Model for Option Pricing under Regime-Switching. Review of Quantitative Finance and Accounting 42(4), 667–690.
Cox, J.C. and Ross, S.A. (1976). The Valuation of Options for Alternative Stochastic Processes. Journal of Financial Economics 3(1), 145–166.
Cox, J.C., Ross, S.A. and Rubinstein, M. (1979). Option Pricing: A Simplified Approach. Journal of Financial Economics 7(3), 229–263.
Davydov, D. and Linetsky, V. (2001). Pricing and Hedging Path-Dependent Options under the CEV Process. Management Science 47(7), 949–965.
Duan, J.C. (1995). The GARCH Option Pricing Model. Mathematical Finance 5(1), 13–32.
Garven, J.R. (1986). A Pedagogic Note on the Derivation of the Black–Scholes Option Pricing Formula. Financial Review 21(2), 337–348.
Geman, H., Madan, D.B. and Yor, M. (2001). Time Changes for Lévy Processes. Mathematical Finance 11(1), 79–96.
Grenadier, S.R. (1995). Valuing Lease Contracts: A Real-Options Approach. Journal of Financial Economics 38(3), 297–331.
Heston, S.L. (1993). A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options. Review of Financial Studies 6(2), 327–343.
Heston, S.L. and Nandi, S. (2000). A Closed-Form GARCH Option Valuation Model. Review of Financial Studies 13(3), 585–625.
Hillegeist, S.A., Keating, E.K., Cram, D.P. and Lundstedt, K.G. (2004). Assessing the Probability of Bankruptcy. Review of Accounting Studies 9(1), 5–34.
Hull, J. and White, A. (1987). The Pricing of Options on Assets with Stochastic Volatilities. Journal of Finance 42(2), 281–300.
Hull, J.C. (2014). Options, Futures, and Other Derivatives, 9th edn. Prentice-Hall.
Joshi, M.S. (2003). The Concepts and Practice of Mathematical Finance, Vol. 1. Cambridge University Press.
Kou, S.G. (2002). A Jump-Diffusion Model for Option Pricing. Management Science 48(8), 1086–1101.
Kou, S.G. and Wang, H. (2004). Option Pricing under a Double Exponential Jump Diffusion Model. Management Science 50(9), 1178–1192.
Kutner, G.W. (1988). Black–Scholes Revisited: Some Important Details. Financial Review 23(1), 95–104.
Lee, C.F., Wu, T.P. and Chen, R.R. (2004). The Constant Elasticity of Variance Models: New Evidence from S&P 500 Index Options. Review of Pacific Basin Financial Markets and Policies 7(2), 173–190.
Lee, C.-F. and Lin, C.S.-M. (2010). Two Alternative Binomial Option Pricing Model Approaches to Derive Black–Scholes Option Pricing Model. In Handbook of Quantitative Finance and Risk Management. Springer, pp. 409–419.
Lee, C.-F., Finnerty, J., Lee, J., Lee, A.C. and Wort, D. (2013a). Security Analysis, Portfolio Management, and Financial Derivatives. World Scientific.
Lee, C.-F., Lee, J. and Lee, A.C. (2013b). Statistics for Business and Financial Economics, 3rd edn. Springer.
Lee, J.C. (2001). Using Microsoft Excel and Decision Trees to Demonstrate the Binomial Option Pricing Model. Advances in Investment Analysis and Portfolio Management 8, 303–329.
Lee, J.C., Lee, C.F. and Wei, K.J. (1991). Binomial Option Pricing with Stochastic Parameters: A Beta Distribution Approach. Review of Quantitative Finance and Accounting 1(4), 435–448.
Lin, C.H., Lin, S.K. and Wu, A.C. (2014). Foreign Exchange Option Pricing in the Currency Cycle with Jump Risks. Review of Quantitative Finance and Accounting 44(4), 1–35.
Madan, D.B., Carr, P.P. and Chang, E.C. (1998). The Variance Gamma Process and Option Pricing. European Finance Review 2(1), 79–105.
Marcus, A.J. and Shaked, I. (1984). The Valuation of FDIC Deposit Insurance Using Option-Pricing Estimates. Journal of Money, Credit and Banking 16(4), 446–460.
Melino, A. and Turnbull, S.M. (1990). Pricing Foreign Currency Options with Stochastic Volatility. Journal of Econometrics 45(1), 239–265.
Melino, A. and Turnbull, S.M. (1995). Misspecification and the Pricing and Hedging of Long-Term Foreign Currency Options. Journal of International Money and Finance 14(3), 373–393.
Merton, R.C. (1973). Theory of Rational Option Pricing. The Bell Journal of Economics and Management Science 4(1), 141–183.
Merton, R.C. (1977). An Analytic Derivation of the Cost of Deposit Insurance and Loan Guarantees: An Application of Modern Option Pricing Theory. Journal of Banking and Finance 1(1), 3–11.
Merton, R.C. (1978). On the Cost of Deposit Insurance When There are Surveillance Costs. Journal of Business 51(3), 439–452.
Psychoyios, D., Dotsis, G. and Markellos, R.N. (2010). A Jump Diffusion Model for VIX Volatility Options and Futures. Review of Quantitative Finance and Accounting 35(3), 245–269.
Rendleman, R.J. and Bartter, B.J. (1979). Two-State Option Pricing. Journal of Finance 34(5), 1093–1110.
Rubinstein, M. (1994). Implied Binomial Trees. Journal of Finance 49(3), 771–818.
Scott, L.O. (1987). Option Pricing When the Variance Changes Randomly: Theory, Estimation, and an Application. Journal of Financial and Quantitative Analysis 22(4), 419–438.
Scott, L.O. (1997). Pricing Stock Options in a Jump-Diffusion Model with Stochastic Volatility and Interest Rates: Applications of Fourier Inversion Methods. Mathematical Finance 7(4), 413–426.
Smith Jr., C.W. (1976). Option Pricing: A Review. Journal of Financial Economics 3(1), 3–51.
Stein, E.M. and Stein, J.C. (1991). Stock Price Distributions with Stochastic Volatility: An Analytic Approach. Review of Financial Studies 4(4), 727–752.
Wiggins, J.B. (1987). Option Values Under Stochastic Volatility: Theory and Empirical Estimates. Journal of Financial Economics 19(2), 351–372.
Williams, J.T. (1991). Real Estate Development as an Option. The Journal of Real Estate Finance and Economics 4(2), 191–208.
Wu, C.C. (2006). The GARCH Option Pricing Model: A Modification of Lattice Approach. Review of Quantitative Finance and Accounting 26(1), 55–66.
Appendix 102A The Relationship Between the Binomial Distribution and the Normal Distribution

In this appendix, we use the de Moivre–Laplace theorem to prove that the best fit between the binomial and normal distributions occurs when the binomial probability is ½.

de Moivre–Laplace Theorem. As n grows larger and approaches infinity, for k in the neighborhood of np we can approximate

\binom{n}{k} p^k q^{n-k} \simeq \frac{1}{\sqrt{2\pi npq}}\, e^{-\frac{(k - np)^2}{2npq}}, \qquad p + q = 1,\ p, q > 0.  (102A.1)
Proof. According to Stirling’s approximation (or Stirling’s formula) for factorials approximation, we can replace the factorial of large number n with the following: √ (102A.2) n! nn e−n 2πn as n → ∞. n k n−k can be approximated as shown in the following proceThen p q k dures: √ n! nn e−n 2πn n k n−k k n−k √ √ pk q n−k p q = p q k −k n−k −k k k!(n − k)! k e 2πk(n − k) e 2πk −k −(n−k) k n n−k = . (102A.3) 2πk(n − k) np nq Let x =
k−np √ npq ,
we obtain
−k k n n − k −(n−k) 2πk(n − k) np nq −k −(n−k) n q p 1+x = 1−x 2πk(n − k) np nq −k −(n−k) n−1 q p = 1+x 1−x k (n−k) np nq 2π n n −k −(n−k) n−1 q p 1+x = . 1−x k k np nq 2π n 1 − n (102A.4)
As k → np, we get k/n → p. Then equation (102A.4) can be approximated as

\simeq \frac{1}{\sqrt{2\pi npq}} \left(1 + x\sqrt{\frac{q}{np}}\right)^{-k} \left(1 - x\sqrt{\frac{p}{nq}}\right)^{-(n-k)}
= \frac{1}{\sqrt{2\pi npq}} \exp\left\{\ln\left(1 + x\sqrt{\frac{q}{np}}\right)^{-k} + \ln\left(1 - x\sqrt{\frac{p}{nq}}\right)^{-(n-k)}\right\}
= \frac{1}{\sqrt{2\pi npq}} \exp\left\{-k \ln\left(1 + x\sqrt{\frac{q}{np}}\right) - (n-k)\ln\left(1 - x\sqrt{\frac{p}{nq}}\right)\right\}
= \frac{1}{\sqrt{2\pi npq}} \exp\left\{-(np + x\sqrt{npq})\ln\left(1 + x\sqrt{\frac{q}{np}}\right) - (nq - x\sqrt{npq})\ln\left(1 - x\sqrt{\frac{p}{nq}}\right)\right\}.  (102A.5)

We now consider the term in the exponential function, i.e.,

-(np + x\sqrt{npq})\ln\left(1 + x\sqrt{\frac{q}{np}}\right) - (nq - x\sqrt{npq})\ln\left(1 - x\sqrt{\frac{p}{nq}}\right).  (102A.6)

Here we use the Taylor series expansions of the functions ln(1 ± x):

\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} + o(x^3),
\ln(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} + o(x^3).
Then we expand equation (102A.6) with respect to x and obtain

-(np + x\sqrt{npq})\left(x\sqrt{\frac{q}{np}} - \frac{x^2 q}{2np} + \frac{x^3 q^{3/2}}{3 n^{3/2} p^{3/2}} + o(x^3)\right)
- (nq - x\sqrt{npq})\left(-x\sqrt{\frac{p}{nq}} - \frac{x^2 p}{2nq} - \frac{x^3 p^{3/2}}{3 n^{3/2} q^{3/2}} + o(x^3)\right)

= -\left(x\sqrt{npq} + \frac{1}{2}x^2 q - \frac{x^3 p^{-1/2} q^{3/2}}{6\sqrt{n}}\right) - \left(-x\sqrt{npq} + \frac{1}{2}x^2 p + \frac{x^3 p^{3/2} q^{-1/2}}{6\sqrt{n}}\right) + o(x^3)

= -\frac{1}{2}x^2 (p + q) - \frac{x^3 (p^2 - q^2)}{6\sqrt{npq}} + o(x^3).  (102A.7)
Since p + q = 1, when we ignore the higher orders of x, equation (102A.7) can be approximated simply by

-\frac{1}{2}x^2 - \frac{x^3 (p - q)}{6\sqrt{npq}}.  (102A.8)

Replacing the exponent in equation (102A.5) by equation (102A.8), we obtain

\binom{n}{k} p^k q^{n-k} \simeq \frac{1}{\sqrt{2\pi npq}} \exp\left(-\frac{1}{2}x^2 - \frac{x^3 (p - q)}{6\sqrt{npq}}\right).  (102A.9)

Although the term -\frac{x^3 (p - q)}{6\sqrt{npq}} \to 0 as n \to \infty, it is exactly zero if and only if p = q. Under this condition, \binom{n}{k} p^k q^{n-k} \simeq \frac{1}{\sqrt{2\pi npq}}\exp\left(-\frac{1}{2}x^2\right) = \frac{1}{\sqrt{2\pi npq}}\exp\left[-\frac{(k - np)^2}{2npq}\right]. Thus, it is shown that the best fit between the binomial and normal distributions occurs when p = q = ½. If p ≠ q, then there exists the additional term -\frac{x^3 (p - q)}{6\sqrt{npq}}. It is obvious that \sqrt{pq} = \sqrt{p(1 - p)} reaches its maximum if and only if p = q = ½. Therefore, when n is fixed, the larger the difference between p and q, the larger the absolute value of the additional term. This implies that the magnitude of the difference between p and q is an important factor in making the approximation to the normal distribution less precise. We use the following figures to demonstrate how the absolute difference between p and q affects the precision of using the binomial distribution to approximate the normal distribution. From Figures 102A.1 and 102A.2, we find that when p ≠ q, this magnitude does affect the estimated continuous distribution, as indicated by the solid curves. For example, when n = 30, the solid curve for p = 0.9 is very different from that for p = 0.5; in other words, when p = 0.9, the solid curve is not as close to the normal curve as when p = 0.5. If we
[Figure 102A.1: Binomial distributions to approximate normal distributions (n = 30). Panels: (a) n = 30, p = 0.5, q = 0.5; (b) n = 30, p = 0.7, q = 0.3; (c) n = 30, p = 0.9, q = 0.1.]
increase n from 30 to 100, the solid curve for p = 0.9 is less different from the solid curve for p = 0.5. In sum, both the magnitude of n and that of p affect how well the normal distribution approximates the binomial distribution. From equations (102.15a) and (102.16) in the text, we can define the binomial OPM and the Black–Scholes OPM as follows:

C = S B_1(a; n, p') - \frac{X}{r^n} B_2(a; n, p),  (102.15a)

C = S N(d_1) - X e^{-rT} N(d_2).  (102.16)

Both Cox et al. and Rendleman and Bartter tried to show that the binomial cumulative functions of equation (102.15a) will converge to the normal
[Figure 102A.2: Binomial distributions to approximate normal distributions (n = 100). Panels: (a) n = 100, p = 0.5, q = 0.5; (b) n = 100, p = 0.7, q = 0.3; (c) n = 100, p = 0.9, q = 0.1.]
cumulative function of equation (102.16) when n approaches infinity. In this appendix, we have shown mathematically and graphically that the relative magnitude of p and q is an important factor in determining this approximation when n is constant. In addition, we have also demonstrated that the size of n affects the precision of this approximation process.
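The appendix's conclusion is easy to reproduce numerically: compare the exact binomial probabilities with the right-hand side of equation (102A.1) and watch the maximum error grow as p moves away from ½ (a minimal sketch):

```python
import math

def normal_approx(n, k, p):
    # right-hand side of equation (102A.1)
    q = 1.0 - p
    return math.exp(-(k - n * p)**2 / (2.0 * n * p * q)) / math.sqrt(2.0 * math.pi * n * p * q)

def max_abs_error(n, p):
    # worst deviation of the normal approximation from the exact binomial pmf
    return max(abs(math.comb(n, k) * p**k * (1 - p)**(n - k) - normal_approx(n, k, p))
               for k in range(n + 1))

for p in (0.5, 0.7, 0.9):
    print(p, max_abs_error(30, p))   # the error is smallest at p = 0.5
```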
Chapter 103
Option Price and Stock Market Momentum in China∗
Jianping Li, Yanzhen Yao, Yibing Chen and Cheng Few Lee

Contents
103.1 Introduction . . . 3620
103.2 SSE 50 ETF Option . . . 3623
103.3 Methodology . . . 3624
  103.3.1 Put–call parity . . . 3625
  103.3.2 Implied volatility spread . . . 3626
  103.3.3 Implied volatility estimation . . . 3628
  103.3.4 Momentum factor calculation . . . 3630
103.4 Empirical Study . . . 3631
  103.4.1 Data . . . 3631
Jianping Li
Chinese Academy of Sciences
e-mail: [email protected]

Yanzhen Yao
Chinese Academy of Sciences
e-mail: [email protected]

Yibing Chen
National Council for Social Security Fund
e-mail: [email protected]

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]
∗
This chapter is an update and expansion of the paper “Option prices and stock market momentum: Evidence from China,” which was published in Quantitative Finance, Vol. 18, Issue 9, pp. 1517–1529, 2018.
  103.4.2 Implied volatility spread and past stock returns . . . 3633
  103.4.3 Source of price pressure . . . 3638
103.5 Conclusions . . . 3641
Bibliography . . . 3642
Appendix 103A Exact Form of Black–Scholes Option Pricing Model . . . 3644
Appendix 103B MATLAB Approach to Estimate Implied Volatility . . . 3644
Appendix 103C Upper and Lower Bounds for European Index Options . . . 3645
Abstract
Option prices tend to be correlated with past stock market returns due to market imperfections. This chapter discusses this issue in the Chinese derivatives market. An implied volatility spread based on pairs of options is constructed to measure the price pressure in the option market. By regressing the implied volatility spread on past stock returns, we find that past stock returns exert a strong influence on the pricing of index options. Specifically, SSE 50 ETF calls are significantly overvalued relative to SSE 50 ETF puts after stock price increases, and vice versa. Moreover, we empirically validate that momentum effects in the underlying stock market are responsible for the price pressure. These findings are both economically and statistically significant and have important implications.

Keywords Option price • Implied volatility spread • Past stock returns • Stock market momentum • Price pressure • Momentum factor.
103.1 Introduction

Assuming a frictionless market free of arbitrage, the standard and celebrated option pricing model, the Black–Scholes model (hereafter B–S model, see Appendix A) proposed by Black and Scholes (1973), has gained wide popularity among market participants. As tested by a spectrum of empirical studies on the option pricing model (Cox and Ross 1976; Galai 1977; MacBeth and Merville 1979; Jerbi 2015), options are priced by disallowing arbitrage opportunities. When pricing a European option, only the following five factors are needed: the strike price, the current market price of the underlying stock, the risk-free interest rate, the remaining life of the option, and the volatility. In addition, no relation should exist between option prices and past stock market momentum. However, due to market imperfections, including indivisibilities of securities, unknown volatility, the existence of transaction costs, and rebalancing only at discrete intervals, additional factors such as investors’ risk aversion and perceptions about the fluctuation
page 3620
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
Option Price and Stock Market Momentum in China
b3568-v4-ch103
3621
in the underlying asset market are likely to have an influence (Figlewski 1989). Investors’ risk aversion and perceptions about the underlying stock market can be altered by past stock market performance. Thus, option prices in the real options market may be related to past stock market performance or, in other words, to the momentum of the underlying stock. The momentum of the underlying stock can be affected by investors’ expectations about future stock returns, investors’ risk aversion, and investors’ preferences regarding the higher moments of past stock prices. For instance, Lo and MacKinlay (1988), as well as Campbell et al. (1992), empirically found that the stock index showed strong positive autocorrelation of returns, suggesting that future stock returns will also be greater than average if past stock returns are positive. These researchers subsequently demonstrated that momentum may be caused by autocorrelation in returns (Lo and MacKinlay 1990). Hence, in imperfect markets, option prices can be affected by market momentum through expectations about future stock returns, which depend on past stock returns. Investors’ risk aversion and expectations about the higher moments of stock prices are related to recent stock market movements, as well as to the demand for an option. Theoretical models based on investors’ preferences for skewness in stock returns have been developed by many researchers (Kraus and Litzenberger 1976; Singleton and Wingender 1986). Investors generally prefer to observe the skewness in past return distributions and find it appealing when they observe a stock’s price chart, which causes their demand for options to be altered as their expectations about the higher moments of stock prices change (Barberis et al. 2016). In academia, the first important contribution to empirically clarify the influence of past stock market momentum on option prices was made by Amin et al. (2004).
They documented strong evidence of systematic price pressure in the options market based on Standard and Poor’s 100 Index option (OEX option) prices. However, recent contributions by academics and practitioners on this issue are very rare. Furthermore, to the best of our knowledge, there is no academic research on the relationship between the options market and the underlying stock market in China. Therefore, in this paper, our objective is to examine the effects of past stock market momentum on SSE 50 ETF option prices. The SSE 50 ETF option was listed on February 9th, 2015 and is the only exchange-traded ETF option in the Chinese derivatives market thus far. In fact, although an extensive body of literature has revealed the existence and significance of momentum effects in the stock markets of different countries, such as
the US (Jegadeesh and Titman 1993; Grinblatt et al. 1995), Korea (Choe et al. 1999), Finland (Grinblatt and Keloharju 2001), and Canada (Liew and Vassalou 2000), theoretical and empirical studies on the relation between past stock market momentum and option pricing are lacking. To address this issue, we first seek deviations from the celebrated put–call parity via the average difference in implied volatility, or “implied volatility spread”, between call and put options with identical strike price and expiration date, to measure the relative values of call options to put options. Then, in accordance with Amin et al. (2004), we analyze the implied volatility spread as a function of past stock returns to investigate the relation between past stock returns and option prices in China. To determine the effects of stock momentum on option prices, we adopt the momentum return proposed in Carhart (1997), WML, as the momentum factor. Instead of using the difference between the monthly returns on diversified portfolios of the winners and losers of the last year, we apply the difference between the daily returns of diversified portfolios of the winners and losers of the last month. Our contributions mainly incorporate two aspects. First, this study is the first to empirically investigate the relationship between option prices and past underlying stock index returns in China. Second, our results show a positive relation between the implied volatility spread and past stock returns, suggesting that past stock returns do exert an influence on option prices; this also validates the findings of a previous study (Amin et al. 2004). Our empirical findings have significant implications in four respects. First, we provide strong evidence that past stock returns exert an important influence on option prices. The regression analysis shows a positive influence of past stock returns on the implied volatility spread.
A positive implied volatility spread indicates that calls are relatively overvalued compared with puts, and a negative implied volatility spread suggests the opposite. Consequently, positive past stock returns increase the price of call options, while negative past stock returns increase the price of put options. Second, our results show that the implied volatility spread is an effective and applicable indicator for identifying price pressure1 in the options market. Variations in the implied volatility spread as a function of past stock returns carry more information because they not only provide separate information about volatilities in the options market but also identify the direction of price pressure in the options market. 1
Price pressure in the stock market has been researched for a long time; in the options market, however, it remains a relatively novel idea, examined by academic scholars such as Jegadeesh and Titman (1993), Pan and Poteshman (2006) and Ni et al. (2008).
Third, our results demonstrate that strong momentum returns exist in the Chinese stock market; this potentially explains the source of price pressure in the options market. Fourth, our findings have practical investment implications. Grinblatt et al. (1995) empirically documented that 77% of investors are “momentum investors” who prefer to buy past “winners” and sell past “losers”. If the stock price increases, the price of the call option will increase according to the standard pricing models; thus, a long position in a call option becomes more expensive. For momentum investors, bets on increases in the future stock market will be costlier to implement with long calls, while covered call writing appears to be more profitable. This paper is organized as follows. Section 103.2 describes the characteristics of the SSE 50 ETF options. The methodology is presented in Section 103.3. In Section 103.4, we empirically examine the relation between option prices and past stock market momentum and further test the momentum effects in the Chinese stock market. Section 103.5 concludes the paper.

103.2 SSE 50 ETF Option

Options trading has developed dramatically since the inception of the first worldwide options exchange, the Chicago Board Options Exchange (CBOE), in 1973. However, only call options were traded initially; put options were not permitted to be traded on five different exchanges until 1977. The advent of put options trading on the exchanges is both practically and economically significant since the put–call parity developed by Stoll (1969) demonstrates that a deterministic parity exists between put and call prices. An option can be written on various underlying securities, such as stocks, futures, indexes, and ETFs. Although the ETF (Exchange Traded Fund) option was first exchange-traded in 1998, its trading volume has soared.
According to the WFE (World Federation of Exchanges), ETF option trading volume was 1.46 billion contracts in 2013, representing 15.54% of the global exchange-traded volume, second only to stock options and index options. In other words, ETF options are playing an increasingly important role in the global securities market. After nearly two decades of exploration in the Chinese financial market, the first ETF option, the SSE 50 ETF option, was listed on the Shanghai Stock Exchange (SSE) on February 9th, 2015. The SSE 50 ETF completely replicates the SSE 50 Index, which is a sample of 50 representative stocks with large market size and strong liquidity in the Shanghai securities market. The SSE 50 ETF option, incorporating both puts and calls, is European and thus can be exercised only on the expiration date.
Table 103.1: Strike prices and the spacing for SSE 50 ETF options.

Stock price (CNY)    Spacing (CNY)
(0, 3]               0.05
(3, 5]               0.1
(5, 10]              0.25
(10, 20]             0.5
(20, 50]             1
(50, 100]            2.5
(100, ∞)             5
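The spacing rule in Table 103.1 is a step function of the ETF price. A minimal sketch follows; the function name and bracket encoding are ours, not an exchange API.

```python
def strike_spacing(price_cny: float) -> float:
    """Strike-price spacing for SSE 50 ETF options per Table 103.1.
    Each bracket is (lower, upper]: the upper bound is included."""
    brackets = [(3, 0.05), (5, 0.1), (10, 0.25), (20, 0.5),
                (50, 1.0), (100, 2.5)]
    for upper, spacing in brackets:
        if price_cny <= upper:
            return spacing
    return 5.0  # prices above 100 CNY

# Example: a 2.8 CNY ETF price falls in (0, 3], so strikes are 0.05 CNY apart
```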
Each SSE 50 ETF option contract is written on 10,000 units of the SSE 50 ETF and defined by three features: premium, strike price and expiration date. First, the premium, the only nonstandardized variable, is the transaction price of the option, which depends on the strike price, expiration date, and the risk-free interest rate. Second, the SSE chooses the strike prices at which options can be written. The strike prices are spaced 0.05 CNY, 0.1 CNY, 0.25 CNY, 0.5 CNY, 1 CNY, 2.5 CNY and 5 CNY apart. For example, the spacing is 0.05 CNY when the stock price is at or below 3 CNY. Table 103.1 presents the detailed strike prices and the corresponding spacing. The third feature of an option is the month in which the expiration date occurs. Usually, stock options are on a January, February, or March cycle. However, the expiration months of the SSE 50 ETF option include the current month, the next month and the subsequent two season months. The precise expiration date is the fourth Wednesday of the expiration month. Consider a March call SSE 50 ETF option as an example. At the beginning of March, options are traded with expiration dates in March, April, June, and September, while at the end of March, they are traded with expiration dates in April, May, June, and September. As soon as one SSE 50 ETF option reaches expiration, trading in another is started. Additional elements include the exercise date and the trading date. For example, an investor with a long position in an SSE 50 ETF option usually has to instruct a broker to exercise the option by 3:30 p.m. Central Time on that Wednesday. The broker then has to complete the paperwork notifying the exchange that an exercise is to occur by 3:00 p.m. the next day.

103.3 Methodology

In this section, we introduce our methodology to investigate the relation between past stock market momentum and option prices. First, we
present a brief description of the celebrated put–call parity. Then, we introduce how the difference between pairs of calls and puts with identical expiration date and strike price, also referred to as the “implied volatility spread”, is calculated to measure the relative values of calls to puts and thus identify the price pressure in the options market. The estimation method for implied volatility is also introduced.

103.3.1 Put–call parity

According to the classical put–call parity proposed by Stoll (1969), the prices of calls and puts will always maintain the parity over time based on the no-arbitrage principle in perfect markets. A review of the parity is as follows.

Notation
c and p — prices of a European call and put, respectively, with the same strike price and expiration date;
S0 — current market price of the underlying stock;
ST — stock price on the expiration date;
K — strike price;
PV(K) — present value of an amount of cash equal to K;
σ — volatility;
c^BS(σ) and p^BS(σ) — prices of a European call and put option, respectively, calculated using the Black–Scholes pricing model (Black and Scholes 1973) given the value of the volatility;
IV^call and IV^put — volatilities implied by call and put option prices, respectively, observed in the market;
c^BS(IV^call) and p^BS(IV^put) — prices of a European call and put option, respectively, calculated with the B–S pricing model based on the value of the implied volatility.

Under perfect markets free of arbitrage, consider the following two portfolios:
Portfolio A: one European call option plus a zero-coupon bond that provides a payoff of K at time T;
Portfolio B: one European put option plus one share of the corresponding stock.
Both portfolios are worth max(ST, K) at expiration.
The options are European and cannot be exercised prior to the expiration date. Thus, both portfolios must have identical values today, or else an arbitrage opportunity exists. The put–call parity follows:

c − p = S0 − PV(K).  (103.1)

Given an assumed volatility value σ, the celebrated B–S pricing model (Black and Scholes 1973) satisfies the parity, namely:

∀σ > 0,  c^BS(σ) + PV(K) = p^BS(σ) + S0.  (103.2)

Theoretically, the prices of calls and puts derived from the B–S pricing model given the implied volatilities satisfy the following equalities:

c^BS(IV^call) = c,  p^BS(IV^put) = p.  (103.3)

Combining equations (103.2) and (103.3), we thus obtain

IV^call = IV^put.  (103.4)
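The parity relations above can be verified numerically with any Black–Scholes implementation. A self-contained sketch follows; the ETF-scale parameter values are hypothetical, chosen by us for illustration.

```python
import math

def bs_prices(S0, K, r, T, sigma):
    """Black-Scholes European call and put prices for one volatility value."""
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    call = S0 * N(d1) - K * math.exp(-r * T) * N(d2)
    put = K * math.exp(-r * T) * N(-d2) - S0 * N(-d1)
    return call, put

# Hypothetical SSE 50 ETF-scale inputs: S0 = 2.5 CNY, K = 2.6 CNY, 3 months
S0, K, r, T = 2.5, 2.6, 0.03, 0.25
pv_K = K * math.exp(-r * T)
# Parity c + PV(K) = p + S0 holds for every sigma (equation (103.2))
gaps = [abs((c + pv_K) - (p + S0))
        for sigma in (0.1, 0.2, 0.4)
        for c, p in [bs_prices(S0, K, r, T, sigma)]]
```

The gap between the two portfolio values is zero (up to floating-point error) for every volatility tried, which is exactly the statement of equation (103.2).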
Equation (103.4) means that the Black–Scholes implied volatilities of each pair of call and put options must be identical. Previous empirical studies (Evnine and Rudd 1985; Ball and Torous 1986; Bodurtha and Courtadon 1986; Followill and Helms 1990; Nisbet 1992) concluded that the put–call parity generally held up very well. However, in imperfect markets, deviations from the parity appear due to short-sale restrictions (Lamont and Thaler 2003; Ofek and Richardson 2003; Ofek et al. 2004), transaction costs (Klemkosky and Resnick 1979, 1980), early exercise involving American options (Black and Scholes 1973), and inequality between lending rates and borrowing rates (Brenner and Galai 1986). Consequently, to an extent, option prices can deviate from the put–call parity without creating arbitrage opportunities.

103.3.2 Implied volatility spread

A practical indicator to measure deviations from put–call parity is the difference in implied volatilities between pairs of call and put options (Figlewski and Webb 1993; Amin et al. 2004; Cremers and Weinbaum 2010). Similarly, we use the average difference in implied volatilities, referred to as the “implied volatility spread”, between pairs of options with identical strike price and expiration date to measure the price pressure in the options market. Specifically, in the Chinese options market, the implied
volatility spread indicator is constructed only upon the SSE 50 ETF options. The notation and formula are presented below:

Notation
Nt — number of valid pairs of call and put options on day t;
w_j,t — weight of the jth pair of call and put options with identical strike price and expiration date;
IV^call_j,t and IV^put_j,t — implied volatilities for each pair of call and put options on day t (the estimation approach is set forth in the next subsection).

Therefore, the implied volatility spread (IVS) is computed as

IVS_t = IV^calls_t − IV^puts_t = Σ_{j=1}^{Nt} w_j,t (IV^call_j,t − IV^put_j,t).  (103.5)
Empirically, on each day, for the given set of call and put options, the implied volatility spread is calculated by weighting the implied volatility difference of each pair of options by its average trading volume across all the valid call and put options. Therefore, option pairs for which either the call or the put has a trading volume of zero or a bid price of zero are eliminated. High call-implied volatilities relative to put-implied volatilities indicate that calls are more expensive than puts, while high put-implied volatilities relative to call-implied volatilities indicate the opposite. If the implied volatility spread increases after the stock market rises and decreases after the stock market declines, we conclude that the price of call options increases after the stock market rises and the price of put options increases after the stock market declines. In other words, past stock market momentum affects the pricing of options. To test these issues, we regress the implied volatility spread on past underlying asset returns, namely, the SSE 50 ETF returns. Since the SSE 50 ETF completely replicates the SSE 50 Index, we utilize the past returns of the SSE 50 Index instead. If the implied volatility spread is indeed a function of past stock returns, and the coefficient is positive, this indicates strong evidence of price pressure on calls and puts. Specifically, positive past stock returns lead to deviations from the put–call parity by increasing call option prices. Similarly, negative past stock returns cause deviations from the parity by increasing put option prices. Moreover, the effects of past returns over different time periods on the pricing of options are tested, respectively. Therefore, our results can be not only statistically significant but also economically significant.
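Given per-pair implied volatilities and trading volumes, the spread of equation (103.5) is a volume-weighted average. A sketch with made-up numbers follows; the data layout and variable names are ours.

```python
def implied_vol_spread(pairs):
    """Equation (103.5): volume-weighted average of call-minus-put
    implied volatilities over the valid option pairs of one day.
    `pairs` holds (iv_call, iv_put, volume) tuples; pairs with zero
    volume or a zero bid are assumed to be filtered out already."""
    total_volume = sum(v for _, _, v in pairs)
    return sum(v * (ivc - ivp) for ivc, ivp, v in pairs) / total_volume

# Made-up day: three matched pairs (IV_call, IV_put, average trading volume)
day = [(0.32, 0.30, 500), (0.28, 0.29, 300), (0.35, 0.31, 200)]
ivs = implied_vol_spread(day)  # positive => calls relatively expensive
```

A positive value of the spread flags calls as relatively expensive; a negative value flags puts, which is the directional reading used in the regressions above.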
103.3.3 Implied volatility estimation

103.3.3.1 Existing estimation methods

Implied volatility estimation is critical for the implied volatility spread calculation. Since the B–S pricing model is a nonlinear equation, an explicit analytic solution for the implied volatility is not available in the literature (except for at-the-money calls, one of the three kinds of options classified by the relationship between the strike price and the market price (Hull 2014)). Therefore, researchers and practitioners have solved for it implicitly. Research on implied volatility estimation falls into two categories: numerical search methods and closed-form derivation methods. Numerical search methods attempt to find an approximate solution for the implied volatility that makes the theoretical option value equal to or very close to the market-observed option price. These methods do not provide a closed-form solution for the estimated implied volatility and need iterative algorithms to approximate the final solution. Latane and Rendleman (1976) conducted one of the first studies on the derivation and use of the implied volatility. They argued that although it is impossible to solve the Black–Scholes equation directly, one can use numerical search to closely approximate the standard deviation implied by a given option price. Their procedure finds an implied standard deviation that makes the theoretical option value fall within ±0.001 of the observed actual call price, using a trial-and-error method. Manaster and Koehler (1982) used the Newton–Raphson method to provide an iterative algorithm for implied volatility estimation. They claimed that, starting with their chosen initial point, the implied variance estimate converges monotonically and quadratically. Closed-form derivation methods follow two different routes to calculate analytical solutions for the implied volatility: either an inverse function or a Taylor expansion. Lai et al.
(1992) used the inverse function of the normal distribution to derive a closed-form solution for the implied volatility in terms of the partial derivatives ∂C/∂S and ∂C/∂X and other observable variables: the time to maturity T and the risk-free rate r. They argued that, according to Merton (1973), the Black–Scholes model exhibits homogeneity of degree one in the stock price and exercise price; therefore, the two partial derivatives ∂C/∂S and ∂C/∂X can be estimated by running a linear multiple regression of the call price on the stock price and exercise price. There were also studies applying Taylor expansions of different orders to calculate analytical solutions for the implied volatility. Brenner and Subrahmanyam (1988) applied a first-order Taylor series expansion to the
cumulative normal function at zero in the Black–Scholes option pricing model. Although simple, their method can only be used to estimate the implied standard deviation from at-the-money, or at least not too far in- or out-of-the-money, options. To allow for the deviation between the underlying asset price and the present value of the exercise price, Corrado and Miller (1996) expanded the cumulative normal function at zero to the first-order term in the Black–Scholes option pricing model to derive a quadratic equation for the implied volatility. Chance (1996) made use of second-order Taylor expansions of the cumulative normal distribution and developed a generalized formula that can also be implemented when options are in-the-money or out-of-the-money. Experiments in Chance (1996) suggest that this correction of Brenner and Subrahmanyam’s formula gives the correct solution for the implied variance. Li (2005) also followed Brenner and Subrahmanyam’s work, expanded the expression to the third-order term, and solved for the implied volatility with a cubic equation. Since Li included the third-order term in the Taylor expansion of the cumulative normal distribution in his derivation, his formula provides a consistently more accurate estimate of the true implied volatility than previous studies.

103.3.3.2 Newton–Raphson method

As illustrated above, each of the two types of methods for estimating the implied volatility has its own merits. In this paper, we estimate the implied volatility using a Newton–Raphson search procedure similar to that suggested by Manaster and Koehler (1982).
For each individual option, the implied volatility can be obtained by first choosing an initial estimate σ0; then, equation (103.6) is used to iterate towards the correct value:

c^M_j,t − c^T_j,t(σ0) = (∂c^T_j,t/∂σ)|σ0 (σ − σ0) + e_j,t,  (103.6)

where c^M_j,t is the market price of call option j at time t, σ the true or actual implied standard deviation, σ0 the initial estimate of the implied standard deviation, c^T_j,t(σ0) the theoretical price of call option j at time t given σ = σ0, (∂c^T_j,t/∂σ)|σ0 the partial derivative of the call option price with respect to the standard deviation σ at σ = σ0, and e_j,t the error term. The partial derivative of the call option price with respect to the standard deviation, ∂c^T_j,t/∂σ, from the B–S model is

∂c^T_j,t/∂σ = Xe^(−rτ) √τ N′(d1) = Xe^(−rτ) (√τ/√(2π)) e^(−d1²/2).  (103.7)
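The Newton–Raphson iteration of equation (103.6) can be sketched as follows. We use the standard Black–Scholes vega S√τφ(d1) as the derivative, all parameter values are hypothetical, and the chapter's own MATLAB procedure is the one in Appendix B.

```python
import math

def bs_call(S, K, r, tau, sigma):
    """Black-Scholes price of a European call."""
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * N(d1) - K * math.exp(-r * tau) * N(d2)

def bs_vega(S, K, r, tau, sigma):
    """dC/dsigma in the standard Black-Scholes form S*sqrt(tau)*phi(d1)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return S * math.sqrt(tau) * math.exp(-0.5 * d1**2) / math.sqrt(2.0 * math.pi)

def implied_vol(c_market, S, K, r, tau, sigma0=0.2, tol=1e-5, max_iter=100):
    """Newton-Raphson search: update sigma until the relative change
    falls below `tol`, in the spirit of equation (103.8)."""
    sigma = sigma0
    for _ in range(max_iter):
        diff = bs_call(S, K, r, tau, sigma) - c_market
        sigma_next = sigma - diff / bs_vega(S, K, r, tau, sigma)
        if abs(sigma_next - sigma) / sigma < tol:
            return sigma_next
        sigma = sigma_next
    return sigma

# Round-trip check: price a call at sigma = 0.25, then recover that sigma
c_obs = bs_call(2.5, 2.6, 0.03, 0.25, 0.25)
iv = implied_vol(c_obs, 2.5, 2.6, 0.03, 0.25)
```

On each pass, the price error divided by vega gives the volatility correction, so a handful of iterations typically suffices for well-behaved options.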
This partial derivative is also called the Vega of the option, denoting the rate of change of the value of the option with respect to the volatility of the underlying asset. The iteration proceeds by reinitializing σ0 to equal σ1 at each successive stage until an acceptable tolerance level is attained. The tolerance level used is

|(σ1 − σ0)/σ0| < 0.001.  (103.8)

For each option transaction, the iteration procedure is conducted until the implied volatility has converged, and the predicted option price is finally equal to the observed price.2 The detailed estimation procedure using the optimization technique in MATLAB is presented in Appendix B.

103.3.4 Momentum factor calculation

For the purpose of confirming the momentum return pattern and investigating the relationship between stock market momentum and option prices, we adopt the momentum return, WML, defined as the difference between the monthly returns on diversified portfolios of the winners and losers of the last year, as the momentum factor. The momentum factor was originally used in Carhart (1997)’s four-factor model and has since been commonly used in empirical applications, most notably to detail momentum patterns in average stock returns, as in Fama and French (2012). Specifically, WML is calculated for portfolios constructed from 2 × 3 sorts on size and momentum (Carhart 1997; Fama and French 2012). First, at the end of December of each year y, the stocks in a region are assigned to two size portfolios based on each stock’s market cap. Large stocks are those in the top a% of market cap, and small stocks are those in the bottom (100 − a)%. The choice of the value of a is flexible. In Fama and French (2012), big stocks are those in the top 90% of market cap for the region, and small stocks are those in the bottom 10%. For North America, 90% of market cap corresponds roughly to the NYSE median, used to define small and big stocks in Fama and French (1993).
This sorting determines the size portfolios for the next year. The same stocks are then allocated in an independent
2 The tolerance level is set at 0.001%, which means that we consider the procedure converged if the estimated price is within 0.001% of the observed price. In fact, most options eventually converge. A small number of options that did not converge are removed from the sample, for instance, options whose observed prices do not satisfy the inequalities for the upper and lower bounds.
sort to three momentum portfolios based on the breakpoints for the bottom 30%, middle 40% and top 30% of the lagged momentum return, as in Fama and French (2012). The value-weighted returns for each portfolio are computed from December of year y to December of year y + 1. Winners are those in the top 30% of lagged momentum, neutral stocks are those in the middle 40%, and losers are those in the bottom 30%. The intersection of the independent 2 × 3 sorts produces six value-weight size–momentum portfolios: SL, SN, SW, BL, BN, and BW, where S and B indicate “Small” or “Big” and L, N, and W indicate “Loser”, “Neutral”, and “Winner” (bottom 30%, middle 40%, and top 30% of the lagged momentum return). The momentum factor, WML(t), is then the equal-weight average of the daily returns on the high lagged momentum stocks (winners) minus the low lagged momentum stocks (losers). In essence, WML is long past winners and short past losers (Drew et al. 2003). That is,
1 (SW + BW − SL − BL). 2
(103.9)
103.4 Empirical Study

103.4.1 Data

Our data set incorporates three parts. The first part is the Wind transactions data set for call and put options on the SSE 50 ETF traded on the Shanghai Stock Exchange for the period February 9, 2015, to May 25, 2016, covering 316 trading days. For each trading day, the data record contains the option’s type (call or put), its transaction price, strike price, expiration date, and trading volume. The second part, the daily returns of the SSE 50 Index and its constituent stocks, is also collected from Wind. The third part, the accounting data including the daily market cap of the 50 constituent stocks of the SSE 50 Index, is primarily from CSMAR and is supplemented by Wind. Both databases are leading integrated providers of financial data in China. The entire sample period spans December 2014 to May 2016. Our goal is to explore the relationship between past stock market returns and option prices. Although certain data, particularly for the large constituent stocks, are available earlier, the SSE 50 ETF options began listing on the Shanghai Stock Exchange on February 9th, 2015. The cost is therefore a short sample period. During the sample period, 320 matched call–put pairs are covered by matching all options (both delisted and listed) based on the strike
page 3631
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch103
J. Li et al.
3632
Table 103.2: Characteristics of SSE 50 ETF options for the period February 9th, 2015, through May 26th, 2016 (number of matched trades).

Days to maturity    Total    Maximum per day    Average per day
1–39                  127         24                  8.25
40–79                  76         28                 10.75
80–119                 35         20                 10.35
120 and more           82         50                 31.99
All options           320        122                 61.34
price and the expiration date. Option pairs for which either the call or the put has a trading volume of zero or a bid price of zero are omitted. In addition, the option price must lie within the upper and lower bounds for option prices; otherwise, implied volatilities cannot be calculated (see Appendix 103C for the upper and lower bounds for European index options). Table 103.2 shows the characteristics of the 320 matched call–put options included in the empirical analysis. Our overall sample contains, on average, 61.34 matched option-pair trades per day. For the shortest-maturity options, trades average 8.25 per day, increasing to 31.99 for the longest-maturity options.

Since options with different strike prices and maturities may have different liquidities, we divide the full sample into subsamples by liquidity. The trading volume of an option is an appropriate proxy for its liquidity: the larger the trading volume, the higher the liquidity. Because SSE 50 ETF options only began exchange trading on February 9th, 2015, the number of matched option pairs is limited, so the median trading volume is chosen as the liquidity breakpoint for the 320 pairs of matched options. Options in the top 50% of trading volume are labeled the high-liquidity subsample, and those in the bottom 50% the low-liquidity subsample. These subsamples are used in our subsequent empirical analysis. Note that the liquidity breakpoint is not the same for each trading day: the median trading volume differs every day because the distribution of trading volume across the 320 matched options varies daily, and the set of traded options itself changes from day to day because of different maturities.

The constituent stocks list released on May 26th, 2016, is used for the lagged momentum return calculation. Although the constituent stocks list of the SSE 50 Index is adjusted semiannually, only two adjustments were made during the sample period, and the adjustment ratio is generally no more than 10%.

103.4.2 Implied volatility spread and past stock returns

103.4.2.1 Implied volatility spread

We begin by calculating an implied volatility for each transaction on the SSE 50 ETF options in our sample data set using the numerical search method. For convenience, the remaining maturity is counted in calendar days and then annualized; we manually add 1 calendar day to each option so that no option has a maturity of zero. Using the average trading volume of each call–put pair, across all valid pairs in each subsample on a given day, as weights, we calculate the weighted average difference between call and put implied volatilities, referred to as the "implied volatility spread". Hence, two daily implied volatility spread series, for the high-liquidity and low-liquidity subsamples, are obtained, covering the 316-day sample period. A positive implied volatility spread, namely, a call-implied volatility above the put-implied volatility, indicates that calls are overvalued relative to puts; similarly, a negative implied volatility spread indicates that puts are overvalued relative to calls. Both the implied volatility spread series of the high-liquidity options and the closing price of the SSE 50 Index are presented in Figure 103.1.
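The daily spread construction just described can be sketched in a few lines of pure Python. This is an illustrative fragment, not the authors' code: the function name and pair layout are assumptions, with each matched pair weighted by the average of its call and put trading volumes:

```python
def implied_vol_spread(pairs):
    """Volume-weighted average of (call IV - put IV) over one day's matched pairs.

    pairs: iterable of (iv_call, iv_put, vol_call, vol_put) tuples, one per
    matched call-put pair with identical strike price and expiration date.
    """
    # Weight each pair by the average of its call and put trading volumes.
    weights = [(vc + vp) / 2.0 for _, _, vc, vp in pairs]
    diffs = [ivc - ivp for ivc, ivp, _, _ in pairs]
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, diffs)) / total

# Two hypothetical matched pairs on one day: a positive result means calls
# are priced rich relative to puts, a negative result the reverse.
spread = implied_vol_spread([(0.30, 0.25, 2, 2), (0.20, 0.28, 1, 1)])
```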
Figure 103.1: Daily implied volatility spread for the high-liquidity SSE 50 ETF option and daily closing price curve for the SSE 50 Index during the sample period.
Figure 103.2: Daily implied volatility spread for the low-liquidity SSE 50 ETF option and daily closing price curve for the SSE 50 Index during the sample period.
Figure 103.2 shows the implied volatility spread series of the low-liquidity options and the closing price of the underlying stock index; the maximum and minimum values are marked in both figures. Figures 103.1 and 103.2 provide several important insights. First, there is an interesting difference between the implied volatility spread series of the two subsamples: the fluctuations in Figure 103.2 are generally larger than those in Figure 103.1 over the entire sample period except from June to September 2015. We attribute this exception to the impact of the 2015 Chinese stock market crash. Second, the implied volatility spread fluctuates far more frequently and intensely than the index price. The implied volatility spread curve has intense ups and downs, while the index price curve is much milder: the percentage change from the maximum to the minimum is 44.97% for the index price, while it reaches 413.19% and 389.36% for the two implied volatility spread series. This phenomenon is primarily due to the great leverage inherent in financial derivatives such as options; a small change in the underlying asset price is strongly amplified in option prices. Most significantly, the maximum and minimum of the two implied volatility spread series appear later than those of the stock index price. Specifically, the maximums lag by 11 days, and the minimums lag by 8 days and 7 days, respectively. Therefore, our results, to a degree, provide evidence for the market momentum hypothesis, which predicts that call prices tend to be relatively high following large stock price increases, and put prices tend to be relatively high following large stock price declines. Below, we further study the relationship between option prices and past stock returns.

103.4.2.2 Relation with past stock returns

We begin by documenting summary statistics for the daily implied volatility spread series of both subsamples, as shown in Table 103.3.

Table 103.3: Summary statistics of the implied volatility spread of the high-liquidity subsample and the low-liquidity subsample.

Implied volatility spread    High-liquidity subsample    Low-liquidity subsample
Mean (%)                          −8.12                       −7.93
Standard deviation (%)            14.52                       14.72
Median (%)                        −6.47                       −9.47
Minimum (%)                      −94.96                      −79.72
Maximum (%)                       30.32                       27.55
ρ1                                 0.82                        0.86
ρ2                                 0.14                        0.22
ρ3                                 0.08                        0.20
Sample size                          79                          79

For both subsamples, the mean and median of the implied volatility spread are small and negative; a negative implied volatility spread means that the put-implied volatility exceeds the call-implied volatility. The standard deviations of the two series are nearly identical. The maximums equal 30.32% and 27.55%, while the minimums equal −94.96% and −79.72%. Table 103.3 also presents the partial autocorrelation coefficients of the two daily series. Both exhibit significantly positive partial serial correlations: the first-order partial autocorrelation coefficients are 0.82 and 0.86 for the high-liquidity subsample and the low-liquidity subsample, respectively, and the coefficients decrease as the order increases, with the second-order and third-order coefficients being much smaller. Intuitively, the large, positive first-order partial autocorrelation coefficients indicate that the implied volatility spread follows a slow-moving diffusion process, which suggests that the innovations in the implied volatility
spread arise from continuing price pressure on options, either calls or puts. The positive serial correlation also makes it less likely that the deviations from put–call parity measured by the implied volatility spread arise from temporary measurement errors or from asynchronous trading between the options market and the underlying stock market (Amin et al., 2004).

The slow-moving feature of the implied volatility spread series prevents us from using ordinary least squares (OLS) regressions directly, or else the residuals would exhibit strong autocorrelation, biasing the variance estimates of the regression coefficients. As judged by the Box–Pierce statistics, we find that an autoregressive error model of order 1, namely AR(1), is sufficient to eliminate the correlation structure of the residuals. The model is as follows (Amin et al., 2004):

IVSt = α0 + α1 Rt−k,t−1 + A(L)εt. (103.10)

Here IVSt is the implied volatility spread at day t, α0 the constant, α1 the slope parameter, Rt−k,t−1 the value-weighted market return from calendar date t − k to t − 1, A(L) = 1/(1 − φ1 L) the AR(1) error filter, where L is the lag operator, and εt white noise.

According to the market momentum hypothesis, call option prices increase following an increase in the stock market, and put option prices increase following a decline in the stock market. Hence, the implied volatility spread is predicted to be positively related to past stock returns. In our tests, past stock returns are computed using the preceding 2–20 weeks (10–100 days) of returns on the SSE 50 Index.

Table 103.4 shows the regression results of the implied volatility spread of the high-liquidity subsample on past stock returns over the past 2–20 weeks for the entire sample period; Table 103.5 reports the same regressions for the implied volatility spread of the low-liquidity subsample.

Table 103.4: Regression of the daily implied volatility spread (high-liquidity subsample) on past SSE 50 Index returns during the sample period.

k (days)    α0                      α1                     R2
10          −0.0829 (−9.9980)***    0.3311 (2.6347)***     0.0223
20          −0.0849 (−10.4618)***   0.5057 (6.0987)***     0.1120
30          −0.0849 (−10.7621)***   0.5330 (8.7905)***     0.2133
40          −0.0856 (−10.7619)***   0.4858 (9.4614)***     0.2456
60          −0.0784 (−10.2218)***   0.5521 (12.8465)***    0.3929
80          −0.0709 (−7.3202)***    0.5042 (9.2142)***     0.2654
100         −0.1184 (−8.6227)***    0.1274 (1.5824)        0.1150

Notes: Values in parentheses are t-statistics. *** denotes significance at the 1% level.

Table 103.5: Regression of the daily implied volatility spread (low-liquidity subsample) on past SSE 50 Index returns during the sample period.

k (days)    α0                      α1                     R2
10          −0.0818 (−9.6895)***    0.2171 (1.6972)*       0.0094
20          −0.0847 (−10.1044)***   0.3948 (4.6088)***     0.0672
30          −0.0856 (−10.4202)***   0.4573 (7.2404)***     0.1554
40          −0.0877 (−10.8354)***   0.4536 (8.6788)***     0.2150
60          −0.0843 (−11.0950)***   0.5234 (12.2889)***    0.3719
80          −0.074 (−9.0910)***     0.5730 (12.4485)***    0.3974
100         −0.1061 (−9.9534)***    0.3436 (5.4951)***     0.1231

Notes: Values in parentheses are t-statistics. *** and * denote significance at the 1% and 10% levels, respectively.

As observed, both implied volatility spread series produce similar results. First, the relation between the implied volatility spread and past stock returns is positive and significant: α1 is positive, and the corresponding t-statistics are significant. This finding provides direct and strong evidence for the market momentum hypothesis. In addition, as k increases, the value of α1 and the corresponding t-statistic first increase and then decrease. There are also two differences. For the high-liquidity subsample, α1 achieves its maximum at k = 60, while for the low-liquidity subsample the maximum is achieved at k = 80. This result indicates that the past 60-day SSE 50 Index return exerts the most significant positive influence on the prices of high-liquidity SSE 50 ETF options, while the past 80-day SSE 50 Index return exerts the most influence for low-liquidity SSE 50 ETF options. Moreover, the high-liquidity subsample generally yields larger regression coefficients, suggesting that the influence of past stock returns on high-liquidity option prices is relatively larger than that on low-liquidity option prices.
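The AR(1)-error regression in equation (103.10) can be estimated in several ways; the fragment below is a minimal two-step Cochrane–Orcutt sketch in pure Python. It is an illustrative stand-in, not the authors' estimation code (in practice a packaged routine such as statsmodels' GLSAR would typically be used), with x playing the role of the past k-day index return and y the daily implied volatility spread:

```python
def ols(x, y):
    """Closed-form simple OLS of y on a constant and x: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def ar1_error_regression(x, y):
    """Two-step Cochrane-Orcutt for y_t = a0 + a1*x_t + u_t, u_t = phi*u_{t-1} + e_t.

    Step 1: OLS, then estimate phi from the lag-1 residual autocorrelation.
    Step 2: OLS on the quasi-differenced data y_t - phi*y_{t-1}, x_t - phi*x_{t-1}.
    Returns (a0, a1, phi).
    """
    a0, a1 = ols(x, y)
    u = [yi - a0 - a1 * xi for xi, yi in zip(x, y)]
    phi = sum(u[t] * u[t - 1] for t in range(1, len(u))) / sum(ui * ui for ui in u[:-1])
    ys = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
    xs = [x[t] - phi * x[t - 1] for t in range(1, len(x))]
    c0, a1_star = ols(xs, ys)
    return c0 / (1.0 - phi), a1_star, phi  # undo the quasi-difference on the intercept
```

On simulated data with a true slope of 0.5 and an AR(1) disturbance, the second-step slope recovers the true coefficient far better than plain OLS residual inference would suggest.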
103.4.3.2 Momentum return pattern We begin by capturing the momentum pattern in the average returns among the constituent stocks of the SSE 50 Index by adopting the momentum return, WML, as the momentum factor. WML was originally used in Carhart (1997)’s four-factor model and then commonly used in the empirical application, most notably to detail momentum patterns in average stock returns, such as in Fama and French (2012). Instead of using the difference between the monthly returns on diversified portfolios of the winners and losers of the
past year, we apply the difference between the daily returns on diversified portfolios of the winners and losers of the last month, denoted WML; this choice facilitates the subsequent time-series regression. Specifically, WML is calculated from portfolios constructed from 2 × 3 sorts on size and momentum (Carhart, 1997; Fama and French, 2012). First, at the end of December 2014 and December 2015, all constituent stocks are assigned to two size portfolios based on each stock's market cap: large stocks are those in the top 50% of market cap, and small stocks are those in the bottom 50%. These sorts determine the size portfolios for 2015 and 2016, respectively. The same stocks are then allocated, in an independent sort, to three momentum portfolios based on the breakpoints for the bottom 30%, middle 40%, and top 30% of the lagged momentum return. For each stock at the end of day t, the lagged momentum return is its cumulative return from t − 20 to t − 1 (in China, each month includes approximately 21 working days, and skipping the sorting day is standard in momentum tests). Winners are the stocks in the top 30% of lagged momentum, neutrals are those in the middle 40%, and losers are those in the bottom 30%. The intersection of the independent 2 × 3 sorts produces six value-weight size–momentum portfolios: SL, SN, SW, BL, BN, and BW. We compute the daily value-weight return for each portfolio at day t + 1. The momentum factor, WML(t), is then the equal-weight average of the daily returns on the high lagged momentum stocks (winners) minus the low lagged momentum stocks (losers), as in equation (103.9).

Based on the portfolios formed from the 2 × 3 sorts on size and momentum, we document the summary statistics for WML in Table 103.6.

Table 103.6: Summary statistics for the momentum factor: February 9th, 2015–May 26th, 2016, 316 days.

              WMLS     WMLB     WML      WMLS−B
Mean (%)      20.58    15.81    36.40     4.77
Std dev (%)   10.91     7.47    16.10     9.50
t-mean         7.55     8.47     9.04     2.01

Notes: WMLS is the difference between the daily returns of the diversified portfolios of the winners and losers of the last month among small stocks; WMLB is the same difference among large stocks; WML is the equal-weight average of WMLS and WMLB; and WMLS−B is the difference between WMLS and WMLB. All returns are in CNY. Mean and Std dev denote the mean and standard deviation of the return, and t-mean is the ratio of the mean to its standard error.

It is
found that momentum returns are strong among the constituent stocks of the SSE 50 Index. As shown in Table 103.6, the average WML return reaches 36.40% (t = 9.04). The average WML return is 20.58% (t = 7.55) for small stocks and 15.81% (t = 8.47) for large stocks, and the difference between the average WML returns for small and large stocks exceeds two standard errors. This finding provides support for our subsequent time-series regression.

103.4.3.3 Momentum effects on option prices

We test the market momentum hypothesis with a regression analysis using the momentum factor, WML, which is the difference between the daily returns of the diversified portfolios of winners and losers of the last month among all constituent stocks of the SSE 50 Index, as the independent variable. For the dependent variable, we use the implied volatility spread series of the high-liquidity and low-liquidity subsamples. The regression model is

IVSt = β0 + β1 WMLt + εt, (103.11)

where IVSt is the implied volatility spread at day t, β0 the constant, β1 the coefficient of the momentum factor, WMLt the momentum factor at day t, and εt the residual.

Table 103.7: Regression results of the implied volatility spread on the momentum return during the sample period: IVSt = β0 + β1 WMLt + εt.

        High liquidity       Low liquidity
β0      −0.11 (−5.54)***     −0.21 (−11)***
β1       0.08 (1.65)**        0.35 (7.40)***
R2       0.008                0.15

Note: Values in parentheses are t-statistics. ** and *** represent significance at the 5% and 1% levels, respectively.

The regression results in Table 103.7 indicate that stock market momentum carries a positive and significant coefficient against the implied volatility spread of both the high-liquidity and the low-liquidity subsample, suggesting that momentum in the underlying equity market directly and significantly affects the pricing of index options. Moreover, β1 is 0.35 for low-liquidity options, larger than the 0.08 for high-liquidity options, and the t-statistics reported in parentheses show that both coefficients are statistically significant. This suggests that the underlying stock momentum exerts a greater influence on options with low liquidity than on those with high liquidity. Furthermore, an interesting finding from Table 103.7 is that the R2 for the high-liquidity group is markedly lower than that for the low-liquidity group. A potential explanation is related to option maturity: compared to low-liquidity options, high-liquidity options usually have shorter maturities and therefore exist in the options market for a shorter period, so past stock momentum has less time to affect them and occasionally cannot influence them in time. Therefore,
the momentum effects on the low-liquidity options group are much stronger than those on the high-liquidity options group.

103.5 Conclusions

In this chapter, we investigate the relation between option prices and past stock market returns in China. Using SSE 50 ETF option prices, we construct the implied volatility spread of high-liquidity and low-liquidity pairs of call and put options with identical expiration dates and strike prices to measure the value of call options relative to put options. Our empirical results provide strong evidence that past stock returns influence the pricing of both high-liquidity and low-liquidity index options: positive past stock returns increase the prices of call options, while negative past stock returns increase the prices of put options. Specifically, the past 60-day and 80-day stock returns exert the most significant positive influence on high-liquidity and low-liquidity option prices, respectively. These findings are both economically and statistically significant.

Using the momentum factor, WML, proposed by Carhart (1997), we further investigate the underlying stock momentum as the source of price pressure in the options market. The regression results show that the momentum effects in the underlying stock exert a greater influence on low-liquidity options than on high-liquidity options. This finding somewhat validates the market momentum hypothesis, which predicts that call prices tend to be relatively high following large stock price increases, and put prices tend to be relatively high following large stock price declines.
Bibliography

Amin, K., Coval, J.D. and Seyhun, H.N. (2004). Index Option Prices and Stock Market Momentum. Journal of Business 77, 835–874.
Ball, C.A. and Torous, W.N. (1986). Futures Options and the Volatility of Futures Prices. Journal of Finance 41, 857–870.
Barberis, N., Mukherjee, A. and Wang, B. (2016). Prospect Theory and Stock Returns: An Empirical Test. Review of Financial Studies 29, 3068–3107.
Black, F. and Scholes, M. (1973). The Pricing of Options and Corporate Liabilities. Journal of Political Economy 81, 637–654.
Bodurtha, J.N. and Courtadon, G.R. (1986). Efficiency Tests of the Foreign Currency Options Market. Journal of Finance 41, 151–162.
Brenner, M. and Galai, D. (1986). Implied Interest Rates. Journal of Business 59, 493–507.
Brenner, M. and Subrahmanyam, M.G. (1988). A Simple Formula to Compute the Implied Standard Deviation. Financial Analysts Journal 44, 80–83.
Campbell, J.Y., Grossman, S.J. and Wang, J. (1992). Trading Volume and Serial Correlation in Stock Returns. Working Paper, National Bureau of Economic Research.
Carhart, M.M. (1997). On Persistence in Mutual Fund Performance. Journal of Finance 52, 57–82.
Chance, D.M. (1996). A Generalized Simple Formula to Compute the Implied Volatility. Financial Review 31, 859–867.
Choe, H., Kho, B.C. and Stulz, R.M. (1999). Do Foreign Investors Destabilize Stock Markets? The Korean Experience in 1997. Journal of Financial Economics 54, 227–264.
Conrad, J. and Kaul, G. (1998). An Anatomy of Trading Strategies. Review of Financial Studies 11, 489–519.
Cox, J.C. and Ross, S.A. (1976). A Survey of Some New Results in Financial Option Pricing Theory. Journal of Finance 31, 383–402.
Cremers, M. and Weinbaum, D. (2010). Deviations from Put–Call Parity and Stock Return Predictability. Journal of Financial and Quantitative Analysis 45, 335–367.
Drew, M.E., Naughton, T. and Veeraraghavan, M. (2003). Firm Size, Book-to-Market Equity and Security Returns: Evidence from the Shanghai Stock Exchange. Australian Journal of Management 28, 119–139.
Evnine, J. and Rudd, A. (1985). Index Options: The Early Evidence. Journal of Finance 40, 743–756.
Fama, E.F. and French, K.R. (1993). Common Risk Factors in the Returns on Stocks and Bonds. Journal of Financial Economics 33, 3–56.
Fama, E.F. and French, K.R. (2012). Size, Value, and Momentum in International Stock Returns. Journal of Financial Economics 105, 457–472.
Figlewski, S. (1989). Options Arbitrage in Imperfect Markets. Journal of Finance 44, 1289–1311.
Figlewski, S. and Webb, G.P. (1993). Options, Short Sales, and Market Completeness. Journal of Finance 48, 761–777.
Followill, R.A. and Helms, B.P. (1990). Put–Call–Futures Parity and Arbitrage Opportunity in the Market for Options on Gold Futures Contracts. Journal of Futures Markets 10, 339–352.
Galai, D. (1977). Tests of Market Efficiency of the Chicago Board Options Exchange. Journal of Business 50, 167–197.
Grinblatt, M. and Keloharju, M. (2001). How Distance, Language, and Culture Influence Stockholdings and Trades. Journal of Finance 56, 1053–1073.
Grinblatt, M., Titman, S. and Wermers, R. (1995). Momentum Investment Strategies, Portfolio Performance, and Herding: A Study of Mutual Fund Behavior. American Economic Review 85, 1088–1105.
Hull, J.C. (2014). Options, Futures and Other Derivatives. Pearson Education, Inc., New Jersey.
Jegadeesh, N. and Titman, S. (1993). Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency. Journal of Finance 48, 65–91.
Jegadeesh, N. and Titman, S. (1995). Overreaction, Delayed Reaction, and Contrarian Profits. Review of Financial Studies 8, 973–993.
Jerbi, Y. (2015). A New Closed-Form Solution as an Extension of the Black–Scholes Formula Allowing Smile Curve Plotting. Quantitative Finance 12, 2041–2052.
Klemkosky, R.C. and Resnick, B.G. (1979). Put–Call Parity and Market Efficiency. Journal of Finance 34, 1141–1155.
Klemkosky, R.C. and Resnick, B.G. (1980). An Ex-Ante Analysis of Put–Call Parity. Journal of Financial Economics 8, 363–378.
Kraus, A. and Litzenberger, R.H. (1976). Skewness Preference and Valuation of Risk Assets. Journal of Finance 31, 1085–1100.
Lai, T.Y., Lee, C.F. and Tucker, A.L. (1992). An Alternative Method for Obtaining the Implied Standard Deviation. Journal of Financial Engineering 1, 369–375.
Lamont, O. and Thaler, R. (2003). Can the Market Add and Subtract? Mispricing in Tech Stock Carve-Outs. Journal of Political Economy 111, 227–268.
Latané, H.A. and Rendleman, R.J. (1976). Standard Deviation of Stock Price Ratios Implied by Option Prices. Journal of Finance 31, 369–381.
Lewellen, J. (2002). Momentum and Autocorrelation in Stock Returns. Review of Financial Studies 15, 533–563.
Li, S. (2005). A New Formula for Computing Implied Volatility. Applied Mathematics and Computation 170, 611–625.
Liew, J. and Vassalou, M. (2000). Can Book-to-Market, Size and Momentum be Risk Factors that Predict Economic Growth? Journal of Financial Economics 7, 221–245.
Lo, A.W. and MacKinlay, A.C. (1988). Stock Market Prices do not Follow Random Walks: Evidence from a Simple Specification Test. Review of Financial Studies 1, 41–66.
Lo, A.W. and MacKinlay, A.C. (1990). When are Contrarian Profits Due to Stock Market Overreaction? Review of Financial Studies 3, 175–205.
MacBeth, J.D. and Merville, L.J. (1979). An Empirical Examination of the Black–Scholes Call Option Pricing Model. Journal of Finance 34, 1173–1186.
Manaster, S. and Koehler, G. (1982). The Calculation of Implied Variances from the Black–Scholes Model: A Note. Journal of Finance 37, 227–230.
Nisbet, M. (1992). Put–Call Parity Theory and an Empirical Test of the Efficiency of the London Traded Options Market. Journal of Banking and Finance 16, 381–403.
Ofek, E. and Richardson, M. (2003). Dotcom Mania: The Rise and Fall of Internet Stock Prices. Journal of Finance 58, 1113–1137.
Ofek, E., Richardson, M. and Whitelaw, R.F. (2004). Limited Arbitrage and Short Sales Restrictions: Evidence from the Options Markets. Journal of Financial Economics 74, 305–342.
Singleton, J.C. and Wingender, J. (1986). Skewness Persistence in Common Stock Returns. Journal of Financial and Quantitative Analysis 21, 335–341.
Stoll, H.R. (1969). The Relationship between Put and Call Option Prices. Journal of Finance 24, 801–824.
Appendix 103A Exact Form of Black–Scholes Option Pricing Model

The well-known Black–Scholes option pricing model was proposed by Black and Scholes (1973) and soon gained great popularity among practitioners and academic researchers. The exact form of the model is provided below:

c = S0 N(d1) − Ke−rT N(d2), (103A.1)
p = Ke−rT N(−d2) − S0 N(−d1), (103A.2)

where

d1 = [ln(S0/K) + (r + σ²/2)T] / (σ√T),
d2 = d1 − σ√T,

S0 is the current market price of the underlying stock, K the exercise price, r the risk-free interest rate, T the remaining life of the option, σ the volatility of the underlying stock's return, and N(·) the cumulative standard normal distribution function. The implied volatility estimation is based on the B–S pricing model.

Appendix 103B MATLAB Approach to Estimate Implied Volatility

The MATLAB Financial Toolbox provides a function, blsimpv, to search for the implied volatility. The algorithm used in blsimpv is Newton's method, as in the procedure described in equation (103.6): it minimizes the difference between the observed market option value and the theoretical value of the B–S model, iterating on the implied volatility estimate until the tolerance level is attained. The complete syntax of blsimpv is as follows:

Volatility = blsimpv(Price, Strike, Rate, Time, Value, Limit, Yield, Tolerance, Class).

The command with default settings is as follows:

Volatility = blsimpv(Price, Strike, Rate, Time, Value).

There are nine inputs in total, of which the last four are optional. Detailed explanations of the inputs are as follows:

Inputs:
Price — Current market price of the underlying asset.
Strike — Strike (i.e., exercise) price of the option.
Rate — Annualized continuously compounded risk-free rate of return over the life of the option, expressed as a positive decimal number.
Time — Time to expiration of the option, expressed in years.
Value — Price (i.e., value) of a European option from which the implied volatility of the underlying asset is derived. Optional Inputs: Limit — Positive scalar representing the upper bound of the implied volatility search interval. If empty or missing, the default is 10, or 1000% per annum. Yield — Annualized continuously compounded yield of the underlying asset over the life of the option, expressed as a decimal number. For example, this yield could represent the dividend yield and foreign risk-free interest rate for options written on stock indices and currencies, respectively. If empty or missing, the default is zero. Tolerance — Positive scalar implied volatility termination tolerance. If empty or missing, the default is 1e–6. Class — Option class (i.e., whether a call or put) indicating the option type from which the implied volatility is derived. This class may be either a logical indicator or a cell array of characters. To specify call options, set Class = true or Class = {‘Call’}; to specify put options, set Class = false or Class = {‘Put’}. If empty or missing, the default is a call option. Output: Volatility — Implied volatility of the underlying asset derived from European option prices, expressed as a decimal number. If no solution can be found, a NaN (i.e., Not-a-Number) is returned. Example: Consider a European call option trading at $5 with an exercise price of $95 and 3 months until expiration. Assume that the underlying stock pays 5% annual dividends and that it is trading at $90 at this moment; in addition, the risk-free rate is 3% per annum. Under these conditions, the command used in Matlab will be either of the following: Volatility = blsimpv(90, 95, 0.03, 0.25, 5, [ ], 0.05, [ ], {‘Call’}), Volatility = blsimpv(90, 95, 0.03, 0.25, 5, [ ], 0.05, [ ], true). Note that this function provided by MATLAB’s toolbox can only estimate the implied volatility from a single option. 
To estimate implied variances for more than one option, the user needs to write his or her own program.

Appendix 103C Upper and Lower Bounds for European Index Options

The upper and lower bounds for option prices do not rely on any particular assumptions about the factors in the B–S pricing model. If an option price is
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
J. Li et al.
above the upper bound or below the lower bound, then there are profitable opportunities for arbitrageurs. Specifically, a European call option provides the holder the right to buy one share of a stock for a certain price. Regardless of what occurs, the option can never be worth more than the stock. Hence, the current stock price is an upper bound on the call option price: c ≤ S0 .
(103C.1)
A European put option provides the holder the right to sell one share of a stock for K. Regardless of how low the stock price becomes, we know that, at maturity, the option cannot be worth more than K. It follows that the put cannot be worth more than the present value of K today: p ≤ Ke−rT .
(103C.2)
Regarding the lower bound for the price of a European call option, let us consider the following two portfolios:

Portfolio A: one European call option plus an amount of cash equal to Ke−rT
Portfolio B: one share

In portfolio A, the cash, if it is invested at the risk-free interest rate, will grow to K in time T . If ST > K, the call option is exercised at maturity, and portfolio A is worth ST . If ST < K, the call option expires worthless and the portfolio is worth K. Hence, at time T , portfolio A is worth: max(ST , K).
(103C.3)
Portfolio B is worth ST at time T . Hence, portfolio A is always worth as much as, and can be worth more than, the value of portfolio B at the option’s maturity. It follows that in the absence of arbitrage opportunities, this must also be true today. Hence, c ≥ S0 − Ke−rT .
(103C.4)
To obtain the lower bound for a European put option, we consider the following two portfolios:

Portfolio C: one European put option plus one share
Portfolio D: an amount of cash equal to Ke−rT

If ST < K, then the option in portfolio C is exercised at the option maturity, and the portfolio becomes worth K. If ST > K, then the put
option expires worthless, and the portfolio is worth ST at this time. Hence, portfolio C is worth max(ST , K) in time T . Assuming the cash is invested at the risk-free interest rate, portfolio D is worth K at time T . Hence, portfolio C is always worth as much as, and can occasionally be worth more than, portfolio D in time T . It follows that, in the absence of arbitrage opportunities, portfolio C must be worth at least as much as portfolio D today. Hence: p ≥ Ke−rT − S0 .
(103C.5)
Therefore, we obtain the upper and lower bounds for European calls and puts, respectively: S0 − Ke−rT ≤ c ≤ S0 ,
(103C.6)
Ke−rT − S0 ≤ p ≤ Ke−rT .
(103C.7)
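These bounds are easy to monitor in code. The following Python sketch (hypothetical helper names, not from the chapter) computes the bounds (103C.6) and (103C.7) and flags quotes that fall outside them; the lower bounds are floored at zero since option prices cannot be negative.

```python
import math

def call_bounds(S0, K, r, T):
    """No-arbitrage [lower, upper] bounds for a European call, eq. (103C.6)."""
    lower = max(S0 - K * math.exp(-r * T), 0.0)
    upper = S0
    return lower, upper

def put_bounds(S0, K, r, T):
    """No-arbitrage [lower, upper] bounds for a European put, eq. (103C.7)."""
    lower = max(K * math.exp(-r * T) - S0, 0.0)
    upper = K * math.exp(-r * T)
    return lower, upper

def violates_bounds(price, bounds):
    """True if a quoted price lies outside [lower, upper], i.e., admits arbitrage."""
    lower, upper = bounds
    return price < lower or price > upper
```

With the numbers of the earlier example (S0 = 90, K = 95, r = 3%, T = 0.25), the call bounds are [0, 90] and the put's lower bound is Ke−rT − S0 ≈ 4.29, so a $5 call quote violates nothing while a $95 quote would.
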
Chapter 104
Advancement of Optimal Portfolio Models with Short-Sales and Transaction Costs: Methodology and Effectiveness

Wan-Jiun Paul Chiou and Jing-Rung Yu

Contents
104.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3650
104.2 Constructing Portfolios with Short-Sales and Transaction Costs . . . 3653
104.2.1 Mean–variance model . . . . . . . . . . . . . . . . . . . . . . . 3654
104.2.2 Mean–absolute deviation model . . . . . . . . . . . . . . . . . . 3655
104.2.3 Linearized value-at-risk model . . . . . . . . . . . . . . . . . 3656
104.2.4 Conditional value-at-risk model . . . . . . . . . . . . . . . . . 3657
104.2.5 Omega ratio model . . . . . . . . . . . . . . . . . . . . . . . . 3658
104.3 Data and Performance Measures . . . . . . . . . . . . . . . . . . . . 3659
104.4 Empirical Results . . . . . . . . . . . . . . . . . . . . . . . . . . 3662
104.4.1 Ex ante performance . . . . . . . . . . . . . . . . . . . . . . . 3662
104.4.2 Realized performance . . . . . . . . . . . . . . . . . . . . . . 3663
104.4.3 Portfolio structure . . . . . . . . . . . . . . . . . . . . . . . 3670
104.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3672
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3673

Wan-Jiun Paul Chiou
Northeastern University
e-mail: [email protected]

Jing-Rung Yu
National Chi-Nan University
e-mail: [email protected]
Abstract

This chapter presents advancements of several widely applied portfolio models to ensure flexibility in their applications: the mean–variance (MV), mean–absolute deviation (MAD), linearized value-at-risk (LVaR), conditional value-at-risk (CVaR), and Omega models. We include short-sales and transaction costs in modeling portfolios and further investigate their effectiveness. Using daily data on international ETFs over 15 years, we generate the results of the rebalancing portfolios. The empirical findings show that the MV, MAD, and Omega models yield higher realized returns, with lower portfolio diversity, than the LVaR and CVaR models. The outperformance of these risk–return-based models over the downside-risk-focused models comes from more efficient asset allocation, not merely the saving of transaction costs.

Keywords: Portfolio selection • Conditional value-at-risk model • Value-at-risk model • Omega model • Transaction costs • Short selling.
104.1 Introduction

How to construct an optimal portfolio that achieves a certain objective in managing assets is a core issue of modern finance. The analytical framework of Markowitz (1952) provides the foundation for portfolio modeling; however, several issues challenge its application. What are the economic values of exercising optimal portfolio strategies, whose real-world usefulness is frequently questioned? How does the design of the investment objectives and/or the measure of risk affect conclusions regarding portfolio performance? Given the complexity of solving a huge number of quadratic programs, is an optimal portfolio that uses simplified estimation less effective than one that uses full estimation of the variance–covariance matrix? These issues are critical both in academia and on Wall Street in determining how these models should be used in practice. In this chapter, we advance various portfolio models by incorporating trading costs and short-selling and, furthermore, evaluate their rebalancing performance over time. This chapter is useful to asset management professionals in the following respects. First, we modify optimal portfolio models that descend from the mean–variance (MV) framework and examine their performance. Since portfolio management cannot be separated from risk management, it is critical to evaluate how different measures of portfolio risk affect performance. We present the mean–variance (MV) model of Markowitz (1952), the mean–absolute deviation (MAD) model of Konno and Yamazaki (1991), the linearized value-at-risk (LVaR) model of Yu, Chiou, and Mu (2015), the conditional value-at-risk (CVaR) model of Rockafellar, Uryasev, and Zabarankin (2006), and the Omega model of Kapsos et al. (2014b). The latter four methods
represent modifications addressing the various issues that complicate and bias the implementation of the original MV model. For instance, the VaR has been widely used by financial institutions, while the CVaR is regarded as an advance for risk and portfolio management. Previous studies such as Angelelli, Mansini, and Speranza (2008) compare the performance of the MAD and CVaR models but do not consider other related portfolio models. By comparing various models, we provide insightful analysis for selecting portfolio models according to the scenario. Furthermore, this chapter incorporates trading costs and allows short-sales to ensure the feasibility of the strategies and the usefulness of the findings. Ignoring transaction costs and limiting short selling can reduce computational complexity but hinder the use of portfolio models in the finance industry. We also apply the multiple objective programming of Yu, Chiou, and Mu (2015), Yu, Chiou, and Yang (2017), and Yu and Lee (2011) to the portfolio selection. Strategies that consider these different scenarios help ensure the comparability of the results. This chapter also presents portfolio models that are suitable for managing a large number of assets. The growing variance–covariance matrix adds computational complexity to MV-like models. Konno and Yamazaki (1991) propose the mean–absolute deviation (MAD) model, in which risk is defined as the mean of the absolute value of the deviation in return. To enhance computational efficiency, Simaan (1997) and Angelelli et al. (2008) suggest that the MAD model can be linearized without the need to calculate the covariance matrix. Gotoh and Takeda (2012), on the other hand, define risk as the value of return below a benchmark. Accordingly, in the Omega model, only lower partial moments (LPMs) serve as the portfolio risk.
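To make the notion concrete, an empirical lower partial moment is simply the average (powered) shortfall of returns below a threshold. A minimal Python sketch (the function name is ours, for illustration):

```python
def lower_partial_moment(returns, tau=0.0, order=1):
    """Empirical lower partial moment of a return sample.

    Averages (tau - r)^order over the sample, counting only the
    observations that fall below the threshold tau.
    """
    n = len(returns)
    return sum(max(tau - r, 0.0) ** order for r in returns) / n
```

The first-order LPM (order = 1) is the mean shortfall used by the Omega ratio; the second-order LPM is a downside analogue of variance.
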
The Omega ratio measures the ratio between the upside profit (portfolio returns above the threshold value τ) and the downside losses (Kapsos et al., 2014a). One of the attractive properties of the Omega ratio is that it does not require a normality assumption to assess portfolio performance. Finally, the results using different performance measures allow us to evaluate the portfolio strategies from different aspects. We follow Li, Sarkar, and Wang (2003) and Woodside-Oriakhi, Lucas, and Beasley (2013) and focus on how these models can be applied in the finance industry. Our findings on ex post performance from rebalancing the portfolios provide a comprehensive perspective on how investors with different investment objectives, whose points of view also represent the various motivations for diversifying investments, may select models.
Conventional portfolio models do not soundly respond to the demands of risk management. The linear relationship between the portfolio return and the returns of its component securities challenges ex post performance. DeMiguel et al. (2009) suggest that no portfolio model consistently yields significantly higher out-of-sample performance than the naïve diversification strategy (the 1/N portfolio). The MV framework models historical volatility but not future potential downside risk. To address this issue, the value-at-risk (VaR) model is applied to estimate the loss that the investor can tolerate at a given confidence level (Jorion, 2000). We apply the linearized VaR model suggested by Yu, Chiou, and Mu (2015) in constructing portfolios. Their model is an improvement of the mixed integer linear model proposed by Benati and Rizzi (2007). The conditional value-at-risk (CVaR) model improves on the VaR in the issues that affect its application, including the bias caused by return discontinuities (Rockafellar and Uryasev, 2000), multiple local optima (Mausser and Rosen, 1999), and the lack of sub-additivity. In addition, the CVaR is less sensitive to the shape of the loss distribution and, more importantly, respects the properties of a coherent risk measure and is convex in nature (Rockafellar, Uryasev, and Zabarankin, 2006). Rockafellar and Uryasev (2002) document that the CVaR can quantify downside risk more precisely than traditional MV models since the CVaR models asymmetry, or fat tails, in the asset return distribution. Since the CVaR can be linearized and yields the global optimum, it demonstrates better theoretical properties for risk management than the VaR. We also include the Omega model developed by Kapsos et al. (2014a) in this chapter. Like CVaR-related models, the Omega model is free from assumptions on the return distribution; it considers both the upside profit and the downside loss, which are defined relative to the return threshold (τ).
The investor may adjust the preference coefficient between upside profit and downside loss in the portfolio optimization. This chapter synthesizes the linearized method suggested by Kapsos et al. (2014b) in constructing the optimal portfolio. Both the short-selling mechanism and the transaction costs, including those of purchasing, selling, short selling, and repurchasing short-sales, are modeled in making the portfolio decision. To ensure the feasibility of the results, we consider the impact of short-sales and trading costs in portfolio construction. Green and Hollifield (1992), Jagannathan and Ma (2003), and Kwan (1997) document that adding weight constraints can improve a portfolio's risk–return trade-off. Angel et al. (2003) suggest that short selling provides investors a chance to arbitrage during market downturns, while it can cause an increase
in portfolio return volatility. Yu and Lee (2011) optimize the proportion of short selling while accounting for portfolio risks. A reasonable design of portfolio short-sales increases the effectiveness of diversification across assets. Regarding trading expenses, Atkinson and Mokkhavesa (2004) suggest that portfolio rebalancing frequency and scale are larger than they should be if transaction costs are not considered. Chen and Yang (2017) document that ignoring transaction costs can result in frequent trades and increased risk exposure. Any statistically significant outperformance of optimized strategies is, however, found to dissipate for large transaction costs (Carroll et al., 2017). Purchasing or selling assets to rebalance the tracking portfolio incurs transaction costs. Strub and Baumann (2018) consider the trade-off between transaction costs and similarity in terms of normalized value development for multiperiod rebalancing. We integrate Woodside-Oriakhi et al. (2013) and Yu, Chiou, and Liu (2017) in rebalancing the portfolio while considering these issues to improve feasibility. This chapter studies the performance of various portfolio models using daily data over 15 years, covering both bear and bull markets. We use naive diversification, the equally weighted (1/N) portfolio of DeMiguel et al. (2009), as the benchmark against which to evaluate the performance of the various portfolio models. Our time-series empirical results confirm the ex ante benefits of risk-model portfolios. We find that the models that consider both risk and return, such as the MV, MAD, and Omega models, outperform the downside-risk-focused models, such as the LVaR and CVaR models, in terms of realized return. The LVaR and CVaR models, however, yield lower volatility in returns. The higher profitability of the three models comes from better portfolio modeling, not merely the savings in transaction costs. The structure of the rest of the chapter is as follows.
Section 104.2 describes the models and their empirical applications. Section 104.3 presents the data and how we evaluate the effectiveness of the models. Section 104.4 reports the major empirical results. Section 104.5 concludes.

104.2 Constructing Portfolios with Short-Sales and Transaction Costs

We consider time-rolling portfolio rebalancing in which short-sales are allowed and trading fees are included. When the expected returns of all available assets are negative, prohibiting short-sales may not generate a solution if the expected return of the portfolio is set to be non-negative. Allowing short-sales improves portfolio efficiency and helps to fashion flexible
asset allocation strategies, particularly during periods of market downturn (e.g., Kwan, 1997). White (1990) and Angel, Christophe, and Ferri (2003) also suggest that short-selling grants speculation opportunities and may decrease portfolio volatility. A recent study by Carroll, Conlon, Cotter, and Salvador (2017) documents that the outperformance of optimized strategies dissipates for large transaction costs. Strub and Baumann (2018) find a trade-off between transaction costs and similarity in terms of normalized value development for multiperiod rebalancing. Trading costs and short-selling are modeled when we rebalance the portfolios. Given that there can be more than one objective, we apply a simple weighted method for multiple objective programming.

104.2.1 Mean–variance model

The objective function incorporates the short-selling portion of the portfolio and the trading costs in the minimization of portfolio variance. Let σ_ij be the covariance between r_i and r_j; w_j be the portfolio weight of asset j, decomposed into the weights of the long position (+) and short position (−), w_j = w_j^+ − w_j^−; p_1, p_2, p_3, and p_4 be the transaction costs of buying, selling, short selling, and repurchasing the short-sale, respectively; k be the initial margin requirement for short selling; w_j^+ be the total proportion of security j invested at portfolio rebalancing; and w_j^− be the total weight of security j sold short by investors at portfolio rebalancing. At each rebalancing, l_j^+ is the buying weight of security j; l_j^− is the selling weight of security j; s_j^+ is the short-selling weight of security j; and s_j^− is the repurchasing weight of security j. With the required return E, the MV model that incorporates the transaction costs, the lower bounds of the weights, and the optimization of the short-sale weights is

Min  Σ_{i=1}^{n} Σ_{j=1}^{n} σ_ij (w_i^+ − w_i^−)(w_j^+ − w_j^−) + Σ_{j=1}^{n} w_j^− + Σ_{j=1}^{n} (p_1 l_j^+ + p_2 l_j^− + p_3 s_j^+ + p_4 s_j^−),   (104.1)

s.t.  Σ_{j=1}^{n} r̄_j (w_j^+ − w_j^−) ≥ E,   (104.2)

Σ_{j=1}^{n} (w_j^+ + k w_j^− + p_1 l_j^+ + p_2 l_j^− + p_3 s_j^+ + p_4 s_j^−) = 1,   (104.3)

w_j^+ = w_{j,0}^+ + l_j^+ − l_j^−,  j = 1, 2, . . . , n,   (104.4)

w_j^− = w_{j,0}^− + s_j^+ − s_j^−,  j = 1, 2, . . . , n,   (104.5)

0.01 u_j ≤ w_j^+ ≤ u_j,  j = 1, 2, . . . , n,   (104.6)

0.01 v_j ≤ w_j^− ≤ v_j,  j = 1, 2, . . . , n,   (104.7)

u_j + v_j ≤ 1,  j = 1, 2, . . . , n,   (104.8)

u_j, v_j ∈ {0, 1},  j = 1, 2, . . . , n,   (104.9)
where p_1, p_2, p_3, and p_4 are the transaction costs of buying, selling, short selling, and repurchasing short-sales of asset j, respectively; k is the initial margin requirement for short selling; w_j^+ is the total proportion of security j invested at portfolio rebalancing; w_j^− is the total weight of security j sold short by investors at portfolio rebalancing; w_{j,0}^+ is the long-position weight of security j prior to portfolio rebalancing; and w_{j,0}^− is the short-position weight of security j prior to portfolio rebalancing. At each rebalancing, l_j^+ is the buying weight of security j; l_j^− is the selling weight of security j; s_j^+ is the short-selling weight of security j; and s_j^− is the repurchasing weight of security j. The binary variables u_j and v_j indicate the long and short-selling positions and model the upper bounds of the weights. We set 1% as the minimal threshold for investing in one asset; this value can be adjusted accordingly. Regarding the equations in the model, equation (104.2) specifies the required portfolio return; (104.3) specifies the budget allocated to buying and short selling; (104.4) shows the long position after rebalancing; (104.5) represents the short-selling position after rebalancing; and (104.6) and (104.7) define the upper and lower bounds of the total weights of each security in the long position and in the short-selling position, respectively. The definition of the binary variables in equations (104.8) and (104.9) ensures that long and short positions do not occur simultaneously.

104.2.2 Mean–absolute deviation model

The mean–absolute deviation (MAD) model of Chang (2005) and Konno and Yamazaki (1991) is modified here. The absolute deviation can be split into upside and downside deviations, |Σ_{j=1}^{n} (r_jt − E(R_j)) w_j| = d_t = d_t^+ + d_t^−,
with d_t^+ ≥ 0 and d_t^− ≥ 0, and one will have Σ_{j=1}^{n} (r_jt − E(R_j)) w_j = d_t^+ − d_t^−. The objective function of the MAD model can then be linearized as

Min  (1/T) Σ_{t=1}^{T} d_t,   (104.10)

s.t.  d_t + Σ_{j=1}^{n} (r_jt − E(R_j)) w_j ≥ 0,  t = 1, . . . , T,   (104.11)

d_t − Σ_{j=1}^{n} (r_jt − E(R_j)) w_j ≥ 0,  t = 1, . . . , T,   (104.12)
and Eqs. (104.2)–(104.9), where T is the ending period and r_jt is the return on security j in period t. The MAD model is suitable for modeling a portfolio with a large number of assets since it does not require solving a quadratic program.

104.2.3 Linearized value-at-risk model

The value-at-risk (VaR) approach has been widely applied by financial institutions to control downside losses. To ensure global optima, we apply the linearized VaR model of Yu, Chiou, and Mu (2015) to construct and rebalance portfolios. This model improves on Lin (2009) and Benati and Rizzi (2007) and avoids multiple solutions. Specifically,

Max  δ r^VaR + (1 − δ) (1/T) Σ_{t=1}^{T} (A_t^+ − A_t^−) − Σ_{j=1}^{n} (p_1 l_j^+ + p_2 l_j^− + p_3 s_j^+ + p_4 s_j^−),   (104.13)
s.t.  x_t = Σ_{j=1}^{n} w_j r_jt,  t = 1, . . . , T,   (104.14)

x_t ≥ r^Min + A_t^+ − A_t^− − r^Min y_t,  t = 1, . . . , T,   (104.15)

A_t^+ ≤ r^VaR+,  t = 1, . . . , T,   (104.16)

A_t^+ ≥ r^VaR+ − U(1 − y_t),  t = 1, . . . , T,   (104.17)

A_t^− ≤ r^VaR−,  t = 1, . . . , T,   (104.18)

A_t^− ≥ r^VaR− − U(1 − y_t),  t = 1, . . . , T,   (104.19)

A_t^+, A_t^− ≥ 0,  t = 1, . . . , T,   (104.20)

r^VaR = r^VaR+ − r^VaR−,   (104.21)

r^VaR+, r^VaR− ≥ 0,   (104.22)

(1/T) Σ_{t=1}^{T} (1 − y_t) ≤ α^VaR,   (104.23)

y_t ∈ {0, 1},  t = 1, . . . , T,   (104.24)
and Eqs. (104.2)–(104.9), where the return threshold r^VaR = r^VaR+ − r^VaR−. If r^VaR > 0, then r^VaR+ > 0 and r^VaR− = 0; if r^VaR < 0, then r^VaR+ = 0 and r^VaR− > 0; if r^VaR = 0, then r^VaR+ = r^VaR− = 0. Let x_t be the portfolio return on day t, r^Min be the minimal return of the investable assets among the T days, y_t be a binary variable indicating whether the portfolio return exceeds r^Min or r^VaR on day t, and α^VaR be the confidence level of the return distribution. We have A_t = r^VaR y_t = (r^VaR+ − r^VaR−) y_t = r^VaR+ y_t − r^VaR− y_t = A_t^+ − A_t^−.

104.2.4 Conditional value-at-risk model

We modify Rockafellar and Uryasev (2000) by transforming the objectives (the portfolio variance and the short-selling weights) into a linear function. The VaR bears some undesirable mathematical characteristics, such as a lack of sub-additivity and convexity; the measured risk of a portfolio may then exceed the sum of the risk measures of the individual assets. This attribute may discourage diversification since it presents an increase in portfolio risk. To address these issues, Rockafellar, Uryasev, and Zabarankin (2006) suggest the CVaR model. Ogryczak and Ruszczynski (2002) suggest that the VaR criterion is equivalent to first-order stochastic dominance, while Ma and Wong (2010) show that CVaR is equivalent to second-order stochastic dominance. Our model is

Min  ξ + 1/((1 − α)T) Σ_{t=1}^{T} η_t + Σ_{j=1}^{n} (p_1 l_j^+ + p_2 l_j^− + p_3 s_j^+ + p_4 s_j^−),   (104.25)
s.t.  η_t ≥ − Σ_{j=1}^{n} r_jt (w_j^+ − w_j^−) − ξ,  t = 1, . . . , T,   (104.26)

η_t ≥ 0,  t = 1, . . . , T,   (104.27)
and Eqs. (104.2)–(104.9), where η_t is an auxiliary variable for the loss value and ξ is a threshold, which represents the CVaR. The calculation of the VaR and the CVaR is usually in monetary values; in this chapter, we present the rates of return and asset values according to the corresponding performance measures. Since previous studies do not propose how to determine the risk-aversion coefficient in practice, our study sets α = 95% and compares the various models.

104.2.5 Omega ratio model

As a lower-partial-moment (LPM) measure, the Omega ratio is free from the assumption of a Gaussian distribution. Kapsos et al. (2014a) develop the worst-case Omega ratio model to deal with uncertainty in asset returns. In this model, the Omega ratio optimization is transformed into a linear program and hence can be solved as a mixed integer linear programming problem. When there is one scenario, i.e., i = 1 in Kapsos et al. (2014a), the portfolio is the Omega ratio maximization under certain returns. The following rebalancing model, which maximizes the Omega ratio while minimizing the trading costs and the portfolio weights of the short-sales, is transformed into a linearized single-objective function:

Max  ω − Σ_{j=1}^{n} w_j^− − Σ_{j=1}^{n} (p_1 l_j^+ + p_2 l_j^− + p_3 s_j^+ + p_4 s_j^−),   (104.28)

s.t.  δ (Σ_{j=1}^{n} (w_j^+ − w_j^−) r̄_j − τ) − (1 − δ) (1/T) Σ_{t=1}^{T} η_t ≥ θ,   (104.29)

η_t ≥ τ − Σ_{j=1}^{n} r_jt (w_j^+ − w_j^−),  t = 1, 2, . . . , T,   (104.30)

η_t ≥ 0,  t = 1, 2, . . . , T,   (104.31)

η_t ≥ − Σ_{j=1}^{n} r_jt (w_j^+ − w_j^−) − ξ,  t = 1, 2, . . . , T,   (104.32)

η_t^i ≥ 0,  t = 1, 2, . . . , T,   (104.33)

Σ_{j=1}^{n} r̄_j (w_j^+ − w_j^−) ≥ E,   (104.34)

and equations (104.2)–(104.9),
where r̄_j is the mean return of asset j and r_jt is the return of asset j at time t. In equation (104.28), the three objectives, maximizing the Omega ratio, minimizing the trading costs, and minimizing the portfolio weights of the short sales, are transformed into a linearized single-objective function when the portfolio is rebalanced. Differing from the crisp-return model in which i = 1, equation (104.29) describes the trade-off decision between the profit and the loss in different scenarios.
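As a concrete illustration of the quantities these models control, the sketch below (illustrative Python, not the chapter's Lingo implementation) computes an empirical VaR and CVaR at confidence level α and an empirical Omega ratio at threshold τ from a sample of portfolio returns, using standard order-statistic and partial-moment estimators assumed here for illustration.

```python
def var_cvar(returns, alpha=0.95):
    """Empirical VaR and CVaR of a return sample at confidence level alpha.

    Losses are negated returns; VaR is the alpha order-statistic of the
    losses, and CVaR is the average of the losses at or beyond the VaR.
    """
    losses = sorted(-r for r in returns)
    idx = min(int(alpha * len(losses)), len(losses) - 1)
    var = losses[idx]
    tail = losses[idx:]
    return var, sum(tail) / len(tail)

def omega_ratio(returns, tau=0.0):
    """Empirical Omega ratio: total gain above tau over total shortfall below tau."""
    gains = sum(max(r - tau, 0.0) for r in returns)
    shortfalls = sum(max(tau - r, 0.0) for r in returns)
    return float("inf") if shortfalls == 0.0 else gains / shortfalls
```

By construction the CVaR estimate is never below the VaR estimate, since CVaR averages the losses beyond the VaR quantile.
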
104.3 Data and Performance Measures

To verify the effectiveness of our models, we perform empirical tests using the daily returns of the exchange traded funds (ETFs) on the Morgan Stanley Capital International (MSCI) stock indices of 21 countries between October 10, 2001 and June 10, 2015. Table 104.1 presents the summary statistics of the assets, including their Sharpe ratios, skewness, and kurtosis during the sample period. These markets represent more than 95% of world capitalization and widely traded alternative investments during the sample period. Portfolio strategies using the MSCI index funds are feasible and highly liquid. We evaluate the realized performance of the diversification strategies as they are executed over time. Previous studies such as DeMiguel, Garlappi, and Uppal (2009) indicate that poor estimation of asset returns challenges the application of risk portfolio models. For a naïve investor without knowledge of optimization techniques, an asset allocation that follows the market can potentially be a feasible diversification strategy. An equally weighted diversification (1/N) serves as the comparison portfolio. The realized portfolio values (RPVs) are generated when the portfolios are rebalanced every period according to the optimal asset allocation. We use the realized portfolio returns to calculate the Sharpe ratio and Omega ratio of the above risk–return portfolio models and to evaluate their ex post effectiveness. The Omega ratio is more feasible for estimating performance when the return distribution departs from normality. The proposed mixed integer linear programming models are run with the Lingo 11 software (Schrage, 2002). Our models are linearized and thus obtain the globally optimal solution for each model. Figure 104.1 shows the process of rebalancing the asset allocation over the 3440 trading days.
The portfolio is formed by using the previous 60 daily returns to estimate the parameters. The asset allocation is adjusted every 20 transaction days.
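The rolling scheme just described can be written as a small scheduling helper. The sketch below is hypothetical (the chapter's exact day-indexing convention may differ), but it reproduces the chapter's count of 170 rebalances over 3440 trading days with a 60-day estimation window and a 20-day step.

```python
def rebalance_schedule(n_days, window=60, step=20):
    """Trading-day indices at which the portfolio is re-optimized.

    The first rebalance occurs once `window` days of return history exist;
    afterwards the weights are recomputed every `step` trading days, each
    time re-estimating parameters from the preceding `window` daily returns.
    """
    return list(range(window, n_days + 1, step))

# The chapter's setting: 3440 trading days, 60-day window, 20-day step.
schedule = rebalance_schedule(3440)
```

With these parameters the schedule contains 170 rebalancing dates, matching the count reported later in this section.
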
Table 104.1: Summary statistics of the investments.

Asset                                 Symbol   Mean     Sharpe ratio   Skewness   Kurtosis
iShares MSCI Australia Index          EWA      0.0999   0.363          −0.239     11.072
iShares MSCI Austria Index            EWO      0.0936   0.331          −0.011     14.245
iShares MSCI Belgium Index            EWK      0.0627   0.231          −0.395      6.717
iShares MSCI Brazil Index             EWZ      0.1538   0.428          −0.055      8.903
iShares MSCI Canada Index             EWC      0.0985   0.434          −0.456      8.364
iShares MSCI France Index             EWQ      0.0615   0.221          −0.031      4.686
iShares MSCI Germany Index            EWG      0.0897   0.327           0.165      7.850
iShares MSCI Hong Kong Index          EWH      0.1028   0.418           0.303      6.868
iShares MSCI Italy Index              EWI      0.0326   0.092          −0.106      5.024
iShares MSCI Japan Index              EWJ      0.0641   0.255           0.092      5.765
iShares MSCI Malaysia Index           EWM      0.0929   0.467          −0.529      8.571
iShares MSCI Mexico Index             EWW      0.1387   0.535           0.132     13.490
iShares MSCI Netherlands Index        EWN      0.0631   0.232          −0.308      5.394
iShares MSCI Singapore Index          EWS      0.1038   0.414           0.161      7.153
iShares MSCI South Korea Index        EWY      0.1631   0.495           0.484     12.626
iShares MSCI Spain Index              EWP      0.0804   0.278          −0.010      5.493
iShares MSCI Sweden Index             EWD      0.1227   0.399           0.017      4.900
iShares MSCI Switzerland Index        EWL      0.0959   0.458          −0.232      5.960
iShares MSCI Taiwan Index             EWT      0.0965   0.323           0.152      6.084
iShares MSCI United Kingdom Index     EWU      0.0435   0.164          −0.307      7.818
SPDR S&P 500                          SPY      0.0628   0.345          −0.020     10.648

Note: The summary statistics of the MSCI exchange traded funds (ETFs) included in this study over the period from 10/10/2001 to 06/10/2015 are reported. The mean and standard deviation of return are annualized.
The first of the 170 rebalances takes place on January 4, 2002, using the data from the preceding 60 days. The last rebalance is on June 10, 2015. The transaction costs and portfolio diversity are used to assess the effectiveness of these models. Although rebalancing the portfolio enhances investment efficiency, frequent asset replacements lead to high transaction costs that erode market value. The ratio of the transaction costs over the portfolio value provides information regarding asset management efficiency. The
Figure 104.1: Portfolio rebalancing mechanism.
after-cost realized returns are obtained by applying the weights of each portfolio in each period, assuming an initial investment of $1 million. For the computation of market value when short-sales are allowed, the margin (k), set as 100% in this study, needs to be paid before an asset is sold short. The budget for the investment in the next period thus depends on the market value at the end of the previous period. To generate practical results, we design a portfolio rebalancing mechanism that considers short-sales and trading costs. Trading costs vary from broker to broker and from asset to asset. In this study, we refer to the fees generally accepted in the US market and set all transaction costs (p_1, p_2, p_3, and p_4) at 25 basis points of the trading value. For the Omega model, the return threshold (τ) is specified as 0.01%. Other than the ex ante expected return, we use the realized portfolio returns to measure the Sharpe and Omega ratios of the portfolios in order to evaluate the effectiveness of the above risk–return portfolio models. The Sharpe ratio (SR) is defined as

SR = (r̄_P − r_f) / σ_P,   (104.35)
where r¯P is the expected return of the portfolio, rf is the risk free rate, and σP is the standard deviation of the portfolio. Considering the impact of departure of normality in return distribution and estimation errors, we also use the Omega ratio to measure portfolio
page 3661
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch104
W.-J. P. Chiou & J.-R. Yu
3662
performance: ω=
E(rj ) − τ + 1, E[τ − rj ]+
(104.36)
where τ is a threshold that partitions the return into the desirable (gain) and the undesirable (loss), and r_j denotes the random return of asset j. The value of τ is determined by the investor. Here, the return threshold is set to 0 for easy comparison of the ex post performance of all portfolios. Since the Omega ratio is free from the assumption of a Gaussian distribution, it is suitable for modeling performance under various likelihood distributions with a threshold return (Nawrocki, 1999; Ogryczak and Ruszczynski, 1999; Keating and Shadwick, 2002; Kapsos et al., 2014a). Unlike the Sharpe ratio, which uses only the first two moments, the lower partial moment is more feasible for estimating the performance of a return distribution that departs from normality.

We next analyze asset restructuring. We report the Herfindahl–Hirschman Index (HHI) of the portfolio weights to measure diversity. To evaluate how the candidate assets are utilized by each portfolio model, we calculate the asset coverage ratio, the number of assets in the portfolio divided by the number of all available assets. The ratio of the transaction costs over the portfolio value (TC%) is used to assess asset management efficiency:

TC%_t = Transaction Costs_t / RPV_t.    (104.37)
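The evaluation measures above, i.e., the Sharpe ratio (104.35), the Omega ratio (104.36), the TC% ratio (104.37), the HHI, and the asset coverage ratio, can be sketched in a few lines. This is an illustrative sketch of our own, not the authors' code; the function names and the NumPy dependency are our choices:

```python
import numpy as np

def sharpe_ratio(returns, rf=0.0):
    """Sharpe ratio, equation (104.35): excess mean return over volatility."""
    r = np.asarray(returns, dtype=float)
    return (r.mean() - rf) / r.std(ddof=1)

def omega_ratio(returns, tau=0.0):
    """Omega ratio, equation (104.36): (E[r] - tau) / E[(tau - r)^+] + 1."""
    r = np.asarray(returns, dtype=float)
    lower_partial_moment = np.maximum(tau - r, 0.0).mean()
    return (r.mean() - tau) / lower_partial_moment + 1.0

def hhi(weights):
    """Herfindahl-Hirschman Index of the portfolio weights (sum of squares)."""
    w = np.asarray(weights, dtype=float)
    return float((w ** 2).sum())

def asset_coverage(weights, tol=1e-8):
    """Fraction of the candidate pool held with a non-zero weight."""
    w = np.asarray(weights, dtype=float)
    return float((np.abs(w) > tol).sum()) / w.size

def tc_pct(transaction_costs, portfolio_value):
    """TC%, equation (104.37): transaction costs over realized portfolio value."""
    return transaction_costs / portfolio_value
```

For example, `hhi([0.5, 0.3, 0.2, 0.0])` is about 0.38 and `asset_coverage([0.5, 0.3, 0.2, 0.0])` is 0.75, three of the four candidate assets being held.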
104.4 Empirical Results

We rebalance each of the portfolios every 20 trading days by using the data of the previous 60 trading days. For this study, each portfolio is rebalanced 170 times over the period.

104.4.1 Ex ante performance

We first present the ex ante performance of the portfolio models in Table 104.2. Panel A shows that these models, although their objectives vary, all generate low-volatility, low-return outcomes over the sample period. The fact that the return distributions are highly concentrated in the range between 0% and 5% suggests conservative in-sample optimal solutions. It is natural that all the optimal outcomes are non-negative, even during market downturns. Among the models, the CVaR yields the lowest mean return while the LVaR yields the highest. However, one should
Table 104.2: Summary statistics of ex ante performance: Expected return.

                      MV       MAD    CVaR (95%)  VaR (95%)   Omega
Mean                0.0110   0.0107    0.0100      0.0143     0.0132
St. Dev             0.0027   0.0028    0.0024      0.0033     0.0038
Distribution (%)
ER ≤ 0                0.00     0.00      0.00        0.00       0.00
0 < ER ≤ 5%         100.00   100.00     99.41       98.82      99.41
5% < ER ≤ 10%         0.00     0.00      0.59        0.59       0.59
10% ≤ ER              0.00     0.00      0.00        0.59       0.00

Note: The table reports the annualized mean, standard deviation, and distribution of ex ante returns of the above portfolio models over the sample period between October 10, 2001 and June 10, 2015.
keep in mind that the above optimal solutions do not guarantee a profit when the portfolios are realized. Figure 104.2 shows the time-variation of the ex ante return over the sample period. Among the models, the Omega portfolio yields the most volatile expected returns. This is because the Omega model considers both the upside profit and the downside loss that are defined by the return threshold. The investor may adjust this parameter in portfolio optimization.

The summary statistics of the long-term Sharpe ratio over the sample period are reported in Table 104.3. The fact that all the risky portfolios yield higher mean–variance efficiency than the equally weighted portfolio shows the usefulness of the diversification constructed by these models. Among the models, on average, the LVaR and CVaR demonstrate the highest Sharpe ratios, as their in-sample volatilities are lower than those of the other models. The comparison of the in-sample Sharpe ratio between the portfolio models and the naïve diversification is demonstrated in Figure 104.3. These portfolio models persistently yield higher mean–variance efficiency than the equally weighted portfolio. The time-variation of the Sharpe ratio of each of the portfolios is affected by the market dynamics.

104.4.2 Realized performance

One of the core questions in asset management is whether realizing portfolio models yields economic value to investors. We generate out-of-sample results by rebalancing the above portfolios while considering transaction costs and short selling. The empirical tests evaluate the ex post effectiveness in managing the portfolios. Figure 104.4 demonstrates the realized market value for the portfolio models and the 1/N diversification. All the risky models
Figure 104.2: Ex ante expected return. (a) Mean–Variance Model; (b) Mean–Absolute Deviation Model; (c) Linearized Value-at-Risk Model; (d) Conditional Value-at-Risk Model; (e) Omega Model.
Table 104.3: Summary statistics of ex ante Sharpe ratio.

                    1/N      MV      MAD    CVaR (95%)  VaR (95%)   Omega
Mean              0.2886  0.4599  0.5014    0.5738      0.7332     0.4964
St. Dev           0.2450  0.4037  0.4359    0.4689      0.5214     0.4281
Distribution (%)
SR ≤ 0              0.00    0.00    0.00      0.00        0.00       0.00
0 < SR ≤ 0.2       43.20   39.05   36.69     30.18       21.89      35.50
0.2 < SR ≤ 0.5     36.69   18.93   17.75     20.12       14.79      20.71
0.5 < SR ≤ 0.8     18.34   20.71   20.12     20.12       18.34      19.53
0.8 ≤ SR            1.78   21.30   25.44     29.59       44.97      24.26

Note: The table reports the mean, standard deviation, and distribution of the ex ante Sharpe ratio of the above portfolio models over the sample period between October 10, 2001 and June 10, 2015.
Figure 104.3: Ex ante Sharpe ratio over the sample period. (a) Mean–Variance Model; (b) Mean–Absolute Deviation Model; (c) LVaR Model; (d) CVaR Model; (e) Omega Model.
outperform the naïve diversified portfolio. The superiority of the advanced portfolio models comes primarily from the bear-market period (between 2007 and 2010). The patterns of time-variation in the market value for these portfolio models are similar over the sample period, as they are all affected by the business cycle and market dynamics. The fact that these portfolios outperform the equally weighted diversification shows the usefulness of the models in managing risky assets.

The difference in the objectives of portfolio management and in the measures of risk affects performance. The models that are based on both return and volatility, such as the MV and MAD shown in Panels A and B of Figure 104.4, and on the partial moment, such as the Omega ratio shown in Panel E, generate more ex post profit than those that mainly focus on controlling the downside loss, such as
Figure 104.4: The market value of the portfolio models. (a) Mean–Variance Model; (b) Mean–Absolute Deviation Model; (c) LVaR Model; (d) CVaR Model; (e) Omega Model.
the LVaR and CVaR models. The fact that these risky portfolio models outperform the equally weighted diversification suggests their usefulness in managing risky assets.

Table 104.4 presents the summary statistics of the realized returns of the portfolios. The ex post performance is higher and more volatile than the ex ante return due to the unpredictability of returns. The risk models outperform the equally weighted diversification in the long run. Among them, the risk-return optimization models, such as the MV, MAD, and Omega, outperform the other risky portfolios that focus on downside loss (VaR and CVaR) in terms of both raw return and risk-adjusted return, i.e., the realized Sharpe ratio (RSR) and the Omega ratio, although the latter models yield lower volatility.
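The after-cost recursion behind these realized market values can be sketched as follows. The 25-basis-point cost on traded value and the $1 million initial investment follow the setup in Section 104.3; the target weights and 20-day returns below are hypothetical inputs, not data from the study:

```python
import numpy as np

COST_RATE = 0.0025  # 25 basis points of the trading value

def rebalance_step(value, w_old, w_new, period_return):
    """One rebalancing period: charge costs on the traded value, then
    compound the after-cost value at the realized portfolio return."""
    traded = value * np.abs(np.asarray(w_new) - np.asarray(w_old)).sum()
    cost = COST_RATE * traded
    return (value - cost) * (1.0 + period_return), cost

value = 1_000_000.0  # initial investment
w = np.array([0.25, 0.25, 0.25, 0.25])
total_cost = 0.0
# hypothetical target weights and realized 20-day portfolio returns
for w_new, r in [(np.array([0.40, 0.30, 0.20, 0.10]), 0.012),
                 (np.array([0.10, 0.20, 0.30, 0.40]), -0.004)]:
    value, cost = rebalance_step(value, w, w_new, r)
    total_cost += cost
    w = w_new
```

Dividing the cost charged at each step by the running portfolio value gives the TC% series of the kind summarized in Table 104.6.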
Table 104.4: Summary statistics of the realized return.

                    1/N      MV      MAD    CVaR (95%)  VaR (95%)   Omega
Mean              0.0293  0.1065  0.0957    0.0800      0.0741     0.0923
St. Dev           0.2043  0.1572  0.1542    0.1297      0.1414     0.1620
RSR               0.0700  0.5822  0.5231    0.5014      0.4179     0.4770
Omega             1.2432  1.8712  1.7354    1.5432      1.3390     1.7903
Max               0.1607  0.1107  0.1107    0.1097      0.1163     0.1287
Min              −0.2446 −0.2046 −0.2046   −0.1693     −0.2145    −0.2046
Distribution (%)
RR ≤ 0             43.45   39.29   39.88     38.10       39.29      38.69
0 < RR ≤ 5%        39.88   51.79   51.79     46.43       45.24      45.83
5% < RR ≤ 10%      14.29    8.33    7.14     14.29       14.29      14.88
10% ≤ RR            2.38    0.60    1.19      1.19        1.19       0.60

Note: The table reports the mean, standard deviation, realized Sharpe ratio (RSR), Omega ratio, and maximum and minimum of the returns on the realized portfolio market values of the above portfolio models over the sample period between October 10, 2001 and June 10, 2015. The distribution of the realized return is also presented.
The distributions of realized returns show that similar percentages of periods, from 60% to 62% over the sample period across the various risk-return models, yield a profit from exercising the strategies. Among them, the CVaR model yields the lowest volatility (12.97%) and the smallest downside loss (−16.93%). The distribution of the naïve diversification is consistent with its high volatility, low return, and wide range of outcomes compared with the risk-return models. Our empirical results show that risk-return portfolios can be used to enhance the efficiency of asset management.

104.4.3 Portfolio structure

Table 104.5 analyzes the structure of the portfolios. We report the statistics of the Herfindahl–Hirschman Index (HHI) of the portfolio weights and the number of assets in the portfolio, along with their distributions, in Panel A. The assets are least concentrated for the models that focus on downside risk, the LVaR (95%) and CVaR (95%) models, whose mean HHIs range between 0.290 and 0.336. On the other hand, the mean HHI of the portfolio weights for the MAD model is 0.639, suggesting that its portfolios on average consist of the equivalent of about 1.6 assets over the sample period. The fact that its minimum value (0.2644) is also higher than those of the other four models indicates that the MAD creates less diversified portfolios over time. We show the statistics of the Asset Coverage Ratio, the number of assets included in the portfolio over the total number of assets in the selection pool, in Panel B.
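The "equivalent number of assets" reading of the HHI can be verified directly: an equally weighted portfolio of n assets has HHI = 1/n, so the reciprocal of a portfolio's HHI gives its effective size. A quick check of the MAD figure quoted above (our own illustration):

```python
def effective_n(hhi: float) -> float:
    """Effective number of equally weighted assets implied by an HHI."""
    return 1.0 / hhi

# Mean HHI of the MAD model from Table 104.5: about 1.6 effective assets
print(round(effective_n(0.6397), 2))
```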
Table 104.5: Portfolio diversity.

                          MV      MAD    CVaR (95%)  VaR (95%)   Omega
Panel A. Herfindahl–Hirschman Index
Mean                    0.4616  0.6397    0.3359      0.2901     0.4186
St. Dev                 0.2526  0.2067    0.1699      0.1660     0.2376
Max                     0.9661  1.0000    0.9660      0.9609     0.9703
Min                     0.1583  0.2644    0.1212      0.1030     0.1466
Panel B. Asset Coverage Ratio (%)
Mean                     17.90    9.09     23.19       28.14      19.77
St. Dev                   9.04    3.64      9.42       12.21       7.70
Max                      48.15   18.52     44.44      100.00      37.04
Min                       3.70    3.70      3.70        3.70       3.70

Note: The table reports the summary statistics of the over-time concentration measures of the portfolio models and of the Asset Coverage Ratio over the sample period. The Asset Coverage Ratio is the number of assets included in the portfolio over the total number of assets in the pool.
Compared with the models that include both risk and return in the decision, the LVaR and CVaR models tend to include more of the available assets in the optimal portfolios over the sample period. On the other hand, the MAD portfolio in general includes only 9.1% of the assets in the selection pool. Portfolio diversity may provide an explanation for the difference in performance. The outperformance of the MV, MAD, and Omega models over the LVaR and CVaR models seems to be related to better selection of assets. The MV, MAD, and Omega models exhibit lower diversity and less inclusion of the available assets in the optimal portfolios: the percentage of available assets included in the portfolio ranges from 9.1% (MAD) to 19.8% (Omega).

We further investigate possible explanations for the higher performance generated by incorporating a floating required rate in optimizing the portfolio. The superior profitability of the portfolio models may come from (1) lower transaction costs and/or (2) better asset allocation strategies. Table 104.6 shows that the LVaR and CVaR models have higher transaction costs, measured both by total dollar value and as a percentage of portfolio value, than the MV, MAD, and Omega models. The MV model has the lowest cost-to-value ratio and the Omega model has the lowest total trading expenses. However, the difference in transaction costs between the high-cost models (like LVaR) and the low-cost models (like Omega) over the entire investment horizon represents merely a fraction of the difference in portfolio market value. A reasonable explanation is that flexibility in the model setting may be more effective in capturing the uncertainty in future returns. The
Table 104.6: Transaction costs.

                                  MV       MAD    CVaR (95%)  VaR (95%)    Omega
A. Cumulative total cost ($)   135,342  179,387    377,527     568,652    123,283
B. TC%
Mean                              0.36     0.58       1.30        2.21       0.43
St. Dev                           0.27     0.35       0.29        0.32       0.27

Note: The table reports the total transaction costs of the portfolio models as dollar value and as an annualized percentage over the portfolio market value (TC%) defined in equation (104.37).
higher market values yielded by the risk-return models can be attributed to more efficient asset allocation strategies.

We present and empirically test the effectiveness of various portfolio models in managing international assets. We compare their ex ante return and volatility, ex post performance, realized market value, portfolio diversity, and trading costs. The results show that the MV, MAD, and Omega models yield similarly higher returns and lower transaction costs, while the LVaR and CVaR models realize lower return volatility. The increase in realized profitability comes from the better modeling of uncertainty in future returns, not from savings in transaction costs.

104.5 Conclusions

This chapter presents and develops an empirical framework for various portfolio models that considers factors associated with application feasibility. In particular, we model short-sales, transaction costs, and bounds on the portfolio weights. We find in general that the models that consider both risk and return, like the MV, MAD, and Omega models, yield higher realized returns, higher trading costs, and lower portfolio diversity than the models that focus on downside risk, like the LVaR and CVaR models. Notably, the outperformance of the MV, MAD, and Omega models over the LVaR and CVaR models is significant when the market recovers from the financial crisis. Better asset allocation, and not only the saving of transaction costs, accounts for their superior profitability.

We present portfolio models that are widely applied in financial institutions and evaluate their benefits under various scenarios while considering the factors affecting the practicality of the strategies. Our study synthesizes the major concepts and modi operandi of the previous research and maximizes
the flexibility in managing international portfolios. The empirical results show the superiority of the MV, MAD, and Omega models in portfolio and risk management when we consider short-sales and trading costs.

Bibliography

Angel, J.J., Christophe, S.E. and Ferri, M.G. (2003). A Close Look at Short Selling on Nasdaq. Financial Analysts Journal 59, 66–74.
Angelelli, E., Mansini, R. and Speranza, M.G. (2008). A Comparison of MAD and CVaR Models with Real Features. Journal of Banking and Finance 32, 1188–1197.
Atkinson, C. and Mokkhavesa, S. (2004). Multi-asset Portfolio Optimization with Transaction Cost. Applied Mathematical Finance 11, 95–123.
Benati, S. and Rizzi, R. (2007). A Mixed Integer Linear Programming Formulation of the Optimal Mean/Value-at-Risk Portfolio Problem. European Journal of Operational Research 176, 423–434.
Carroll, R., Conlon, T., Cotter, J. and Salvador, E. (2017). Asset Allocation with Correlation: A Composite Trade-off. European Journal of Operational Research 262, 1164–1180.
Chen, H.H. and Yang, C.B. (2017). Multiperiod Portfolio Investment Using Stochastic Programming with Conditional Value at Risk. Computers and Operations Research 81, 305–321.
DeMiguel, V., Garlappi, L. and Uppal, R. (2009). Optimal versus Naive Diversification: How Inefficient is the 1/N Portfolio Strategy? Review of Financial Studies 22, 1915–1953.
Gotoh, J. and Takeda, A. (2012). Minimizing Loss Probability Bounds for Portfolio Selection. European Journal of Operational Research 217, 371–380.
Green, R.C. and Hollifield, B. (1992). When Will Mean–Variance Efficient Portfolios be Well Diversified? Journal of Finance 47, 1785–1809.
Jagannathan, R. and Ma, T.S. (2003). Risk Reduction in Large Portfolios: Why Imposing the Wrong Constraints Helps. Journal of Finance 58, 1651–1683.
Jorion, P. (2000). Value-at-Risk: The New Benchmark for Managing Financial Risk. McGraw-Hill, New York.
Kapsos, M., Christofides, N. and Rustem, B. (2014a). Worst-Case Robust Omega Ratio. European Journal of Operational Research 234, 499–507.
Kapsos, M., Zymler, S., Christofides, N. and Rustem, B. (2014b). Optimizing the Omega Ratio Using Linear Programming. Journal of Computational Finance 17, 49–57.
Keating, C. and Shadwick, W.F. (2002). A Universal Performance Measure. Journal of Performance Measurement 6, 59–84.
Konno, H. and Yamazaki, H. (1991). Mean-Absolute Deviation Portfolio Optimization Model and its Applications to Tokyo Stock Market. Management Science 37, 519–531.
Kwan, C.C.Y. (1997). Portfolio Selection Under Institutional Procedures for Short Selling: Normative and Market-Equilibrium Considerations. Journal of Banking & Finance 21, 369–391.
Li, K., Sarkar, A. and Wang, Z. (2003). Diversification Benefits of Emerging Markets Subject to Portfolio Constraints. Journal of Empirical Finance 10, 57–80.
Markowitz, H. (1952). Portfolio Selection. Journal of Finance 7, 77–91.
Mausser, H. and Rosen, D. (1999). Beyond VaR: From Measuring Risk to Managing Risk. ALGO Research Quarterly 1, 5–20.
Nawrocki, D.N. (1999). A Brief History of Downside Risk Measures. Journal of Investing 8, 9–25.
Ogryczak, W. and Ruszczynski, A. (1999). From Stochastic Dominance to Mean-Risk Models: Semideviations as Risk Measures. European Journal of Operational Research 116, 33–50.
Rockafellar, R.T. and Uryasev, S. (2000). Optimization of Conditional Value-at-Risk. Journal of Risk 2, 21–41.
Rockafellar, R.T. and Uryasev, S. (2002). Conditional Value-at-Risk for General Loss Distributions. Journal of Banking and Finance 26, 1443–1471.
Rockafellar, R.T., Uryasev, S. and Zabarankin, M. (2006). Generalized Deviations in Risk Analysis. Finance and Stochastics 10, 51–74.
Roy, A.D. (1952). Safety-First and the Holdings of Assets. Econometrica 20, 431–449.
Simaan, Y. (1997). Estimation Risk in Portfolio Selection: The Mean-Variance Model versus the Mean-Absolute-Deviation Model. Management Science 43, 1437–1446.
Speranza, M.G. (1993). Linear Programming Models for Portfolio Optimization. Finance 14, 107–123.
Strub, O. and Baumann, P. (2018). Optimal Construction and Rebalancing of Index-Tracking Portfolios. European Journal of Operational Research 264, 370–387.
White, J.A. (1990). More Institutional Investors Selling Short: But Tactic is Part of Wider Strategy. Wall Street Journal 7, 742–747.
Woodside-Oriakhi, M., Lucas, C. and Beasley, J. (2013). Portfolio Rebalancing with an Investment Horizon and Transaction Costs. Omega 41, 406–420.
Yu, J., Chiou, W. and Liu, R. (2017). Incorporating Transaction Costs, Weighting Management, and Floating Required Return in Robust Portfolios. Computers and Industrial Engineering 109, 48–58.
Yu, J., Chiou, W. and Mu, D. (2015). A Linearized Value-at-Risk Model with Transaction Costs and Short Selling. European Journal of Operational Research 247, 872–878.
Yu, J., Chiou, W. and Yang, Y. (2017). Diversification Benefits of Risk Portfolio Models: A Case of Taiwan's Stock Market. Review of Quantitative Finance and Accounting 48, 467–502.
Yu, J.R. and Lee, W.Y. (2011). Portfolio Rebalancing Model Using Multiple Criteria. European Journal of Operational Research 209, 166–175.
Chapter 105
The Path Leading up to the New IFRS 16 Leasing Standard: How was the Restructuring of Lease Accounting Received by Different Advocacy Groups?

Christian Blecher and Stephanie Kruse

Christian Blecher, Institute for Business Administration, Kiel University, email: [email protected]
Stephanie Kruse, Institute for Business Administration, Kiel University, email: [email protected]

Contents
105.1 Introduction . . . . . 3676
105.2 Methodological Procedure and Overview of the Submitted Comment Letters . . . . . 3678
105.3 Empirical Analysis of the Comment Letters . . . . . 3682
  105.3.1 General . . . . . 3682
  105.3.2 Question 1: Definition and criteria for identifying leases . . . . . 3685
  105.3.3 Question 4: Application of the consumption principle to classify leases . . . . . 3687
  105.3.4 Question 2: Variation of the recognition, valuation, and presentation of leases for the lessee . . . . . 3689
  105.3.5 Question 3: Variation of the recognition, valuation, and presentation of leases for the lessor . . . . . 3691
  105.3.6 Question 5: Lease term with remeasurement in the event of changes to relevant factors . . . . . 3692
  105.3.7 Question 6: Variable lease payments with remeasurement in the event of changes to the underlying index or interest rate . . . . . 3695
  105.3.8 Question 12: Right-of-use assets under IAS 40 . . . . . 3697
105.4 Discussion of the Most Important Results . . . . . 3699
105.5 Summary . . . . . 3700
Bibliography . . . . . 3701
Abstract

The due process of the International Financial Reporting Standards (IFRS) enables interested parties to comment on the development of new IFRS. Unsurprisingly, different advocacy groups have very different perspectives and interests. For example, businesses are more likely to be interested in "user-friendly" rules, whereas standard-setters and academics tend to prefer theoretically coherent standards. This paper analyzes the response behavior of different advocacy groups, for which the lease accounting reform seems to be a promising example. First, to analyze the response behavior, five different advocacy groups are defined. The 657 comment letters submitted for the Re-Exposure Draft "Leases" are then assigned to these five advocacy groups. The Re-Exposure Draft formulates questions about different aspects of the new standard and asks for comments regarding these aspects. Next, the response behavior of the different advocacy groups with respect to the most relevant questions is examined quantitatively and qualitatively. The quantitative analysis uses the Kruskal–Wallis test (H-test) and the Mann–Whitney test (U-test) to evaluate the response behavior. The main result of the study is that the response behavior to various questions differs significantly between advocacy groups. In particular, it is shown that the response behavior differs drastically between more "user-oriented" and more "theoretically oriented" advocacy groups.

Keywords: Lease accounting • Advocacy groups • Due process • Comment letters • Statistical analysis of response behavior • Kruskal–Wallis test • Mann–Whitney test • Significance identification.
105.1 Introduction

For over a decade, the IASB has been working to revise the IAS 17 leasing standard. The most prominent criticism of IAS 17 relates to its all-or-nothing approach, according to which leased assets are either fully ignored (operating leases) or fully recognized (finance leases) by the lessee. This can potentially cause very similar leasing arrangements to be reported in completely different ways. IAS 17 uses a risks-and-rewards approach to distinguish between these two types of lease. One of the key objectives of the project to
revise these leasing regulations was to eliminate this all-or-nothing approach, and hence prevent off-balance-sheet treatment by the lessee. But revising IAS 17 proved a complicated and lengthy task; the first discussion paper was published in 2009, followed by the first exposure draft in 2010 and another re-exposure draft (Re-ED) in 2013, before the new leasing standard IFRS 16 was finally published in early 2016.1 This chapter analyzes the 657 comment letters (CLs)2 submitted to the IASB and the FASB regarding the Re-ED, classifies them into advocacy groups, and investigates the extent to which the criticism of the Re-ED was ultimately addressed by the final version of the standard. A systematic analysis of these comment letters leads to the following particularly striking conclusion: there were major differences in response behavior between highly application-oriented advocacy groups (users: businesses, auditing companies, banks/insurance companies) and more theoretically oriented advocacy groups (developers: standard-setters, auditing associations). For six of the seven topics considered, there were (in some cases extremely) significant differences in the response behavior of these "users" and "developers" of lease accounting. As a result, despite an extended development phase and three preparatory papers, the IASB was unable to establish a consensus regarding key elements of the controversial topic of lease accounting. Instead, the analysis suggests the existence of two opposing "factions" that present very uniform arguments in support of their own positions.

The approach used in the present paper follows the procedure applied in Wielenberg et al. (2007), in which the CLs responding to an exposure draft concerning amendments to IAS 37 with respect to the accounting for non-financial liabilities are evaluated in a similar way. Moreover, there are other papers that use a comparable approach, in particular Blecher et al. (2009) and Schmidt and Blecher (2015). Blecher et al. (2009) analyze a major reform of pension accounting, whereas Schmidt and Blecher (2015) investigate the CLs with respect to the revision of the IFRS framework. In all three papers, the user-oriented advocacy groups tend to show a more critical response behavior than the theoretically oriented advocacy groups. However, in the previous papers this tendency is limited to some specific questions, whereas in the present paper these differences arise
1. The papers are titled as follows: Discussion Paper DP/2009/1: "Leases: Preliminary Views," Exposure Draft ED/2010/9 "Leases," and Re-Exposure Draft ED/2013/6: "Leases."
2. The 657 CLs on ED/2013/6 may be viewed online on the website of the FASB: http://www.fasb.org/jsp/FASB/CommentLetter C/CommentLetterPage%26cid =1218220137090%26project id=2013-270 (last retrieved: 23/05/2018).
for all evaluated questions. Such a large deviation in the response behavior between users and developers has not been identified in the previous papers.

This chapter is structured as follows. Section 2 gives a brief description of the procedure used to systematically evaluate the CLs, as well as an overview of the CLs that were submitted. Details of the statistical analysis of the questions asked by the Re-ED are given in Section 3. In each case, some background is given on the topics covered by the question, followed by a quantitative analysis and a summary of any recurring arguments that could be identified in the CLs. The response of the IASB to these critical comments is then presented, as well as any resulting changes made in the final draft of IFRS 16. Section 4 discusses the main findings of the study. Finally, a summary of the conclusions is given in Section 5.

105.2 Methodological Procedure and Overview of the Submitted Comment Letters

The Re-ED invited any parties with an interest in IFRS accounting to comment on 12 questions relating to key aspects of the content of the draft.3 This paper considers seven of these questions (Q), those deemed most relevant, and performs a statistical analysis of the response behavior of respondents:4 Q1: identification of a lease, Q4: classification of leases, Q2: accounting by the lessee, Q3: accounting by the lessor, Q5: lease term, Q6: variable lease payments, Q12: resulting modifications of IAS 40.

The total of 657 submitted CLs were attributed to 641 distinct institutions.5 The first step of the process was to classify the CLs into the advocacy groups listed in Table 105.1.6 After identifying the advocacy group of each CL, the second step was to evaluate the position of the authors of the CL on the questions to which they responded. Each statement, if present, was graded on a scale from one to four. A grade of one represents strong agreement, whereas a grade of four indicates
3. See IASB (2013a, ED/2013/6, pp. 8–12).
4. The statistical analysis follows the approach described in Wielenberg et al. (2007) and Blecher et al. (2009).
5. The other 16 letters consisted of multiple statements by the same institution; for example, a total of nine letters, CL 47 and 47A to 47H, were respectively submitted by First Financial Corporate Leasing Inc. and First Financial Corporate Services Inc.
6. Deviating from the procedure in Wielenberg et al. (2007) and Blecher et al. (2009), the interest groups of banks and insurance companies were combined, since only few CLs were submitted by insurance companies, the activities of both groups overlap, and their response behaviors were similar.
page 3678
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
The Path Leading up to the New IFRS 16 Leasing Standard

b3568-v4-ch105

3679

Table 105.1: Classification of advocacy groups by subgroup.
Banks/Insurance companies: Banks and financial institutions, financial service providers, asset management services, analysts, rating agencies, investors, insurance companies, and associations of such entities.

Standard-setters: Institutions of national or international standard-setters or other accounting-related committees, government institutions, and representatives of universities.

Auditing companies: Auditing companies, auditors in general, tax and management consultancy firms, including in the financial sector, as well as law firms.

Auditing associations: Associations of auditors, auditing companies, and tax consultancy firms, or leading CFOs.

Businesses: Leasing companies and associations of leasing companies, firms and business associations from various fields, and private individuals that cannot be clearly attributed to any of the other advocacy groups.
strong disagreement. Grades of two and three indicate weak agreement and weak disagreement, respectively. The third step was to statistically evaluate the CLs. The statistical parameters used for the evaluation were the mean value (M), the standard deviation (SD), and the total number (N) of responses to each of the seven topics. As the fourth step, Kruskal–Wallis non-parametric rank-variance analysis (the H-test) was used to verify whether the answers to each question were systematically influenced by association with an advocacy group or whether the differences in response behavior might be attributable to chance.7 The Kruskal–Wallis test evaluates the hypothesis that multiple independent groups come from different populations, in contrast to the Friedman test, which is used for multiple dependent conditions. To carry out the test, the data is first ranked from lowest to highest without regard to advocacy group. Second, the scores are returned to their groups, and the ranks within each group are summed. Third, the test statistic H is calculated:

    H = [12 / (N(N + 1))] Σ_{i=1}^{k} (R_i^2 / n_i) − 3(N + 1),        (105.1)

7 See McKillup (2012, p. 331); Weinberg and Abramowitz (2002, p. 549).
page 3679
July 6, 2020
16:3
3680
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch105
C. Blecher & S. Kruse
where R_i is the sum of ranks within each advocacy group, n_i the sample size of each of the five advocacy groups, N the total sample size (i.e., 629 CLs), and k the number of advocacy groups (i.e., five). The equation indicates that the sum of ranks for each of the five advocacy groups is squared, divided by the sample size of the corresponding group, and the results are summed.8 One way to evaluate the result is to compare H against an appropriate sampling distribution, which in this case is the chi-square distribution with k − 1 degrees of freedom, since at least five CLs were assigned to each of the five advocacy groups (see Figure 105.1).9 After running the Kruskal–Wallis test with SPSS, the output shows the test statistic H and, among other things, the p-value (the asymptotic significance) for each of the seven questions. A comparison of these p-values against the desired significance threshold establishes significance.10 The following notation is used: p > 0.05 for not significant, p ∈ (0.01, 0.05] for significant, p ∈ (0.001, 0.01] for highly significant, and p ≤ 0.001 for extremely significant.11 If the H-test finds a significant result, it is likely that at least one of the five advocacy groups differed from the others in terms of response behavior. However, this does not allow any conclusions to be drawn as to which of the five advocacy groups differed.
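The ranking steps and the significance banding above can be sketched in Python. This is a minimal illustration on made-up grade data, not the study's dataset; ties receive mid-ranks, and the tie correction applied by packages such as SPSS is deliberately omitted so that the code matches eq. (105.1) as printed:

```python
from itertools import chain

def kruskal_wallis_h(groups):
    """H statistic of eq. (105.1). `groups` is a list of lists of graded
    responses (1 = strong agreement ... 4 = strong disagreement).
    Tied observations receive mid-ranks; the tie correction used by
    standard software is omitted, matching the formula as printed."""
    data = sorted(chain.from_iterable(groups))
    n_total = len(data)
    rank_of = {}  # value -> mid-rank
    i = 0
    while i < n_total:
        j = i
        while j < n_total and data[j] == data[i]:
            j += 1
        rank_of[data[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j
    total = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

def significance_label(p):
    """Verbal banding used in this chapter."""
    if p > 0.05:
        return "not significant"
    if p > 0.01:
        return "significant"
    if p > 0.001:
        return "highly significant"
    return "extremely significant"

# Two illustrative advocacy groups with clearly different grade profiles:
h = kruskal_wallis_h([[1, 1, 2], [3, 4, 4]])  # H = 27/7, roughly 3.857
```

In a real analysis, H would then be compared against the chi-square distribution with k − 1 degrees of freedom, as described in the text.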
To answer this question, whenever significance was found, the H-test was subsequently extended by a pairwise comparison of the mean values of each advocacy group based on the U-test.12 The U-test, also known as the Mann–Whitney test, checks whether the differences in response behavior between two independent advocacy groups could have arisen by chance, or whether there is a systematic connection between each advocacy group and its response.13 Here, the U-test is applied as a two-sided test, since the direction of the differences may be deduced by comparing the mean values of the responses. (The parametric equivalent of the non-parametric U-test for comparing two independent advocacy groups is the t-test.) First, the data from both groups are pooled and rank-ordered, which indicates whether the two advocacy groups are randomly mixed in ranks or whether they cluster at the ends of the distribution. Then a calculation of the test statistic
8 See Field (2017, pp. 306–308).
9 See Weinberg and Abramowitz (2002, p. 549).
10 See Field (2017, p. 313).
11 See Lecoutre and Poitevineau (2014, p. 40).
12 See Field (2017, p. 318); Burns and Burns (2008, pp. 315, 317).
13 See Gravetter and Wallnau (2013/2010, p. 743f); Weinberg and Abramowitz (2002, p. 545).
U occurs:

    U_i = N_1 N_2 + N_i(N_i + 1)/2 − R_i,        (105.2)

where R_i is the sum of ranks within each of the two advocacy groups being compared and N_i the corresponding sample size; e.g., the number of values from the first sample is N_1. After running the U-test with SPSS, the output shows the test statistic U and again the p-value. If the sample size is covered by the two-tailed Mann–Whitney table, then significance can be examined on the basis of these critical values. Otherwise, a large-sample approximation has to be performed, in which a z-score is calculated and evaluated against the normal distribution. The latter case applies to the advocacy-group analysis in this study, so comparing the asymptotic significance against the desired significance threshold provides the information concerning significance.14 Of the total of 657 submitted CLs, 28 letters were excluded from the analysis due to being unusable.15 The analysis therefore considered 629 CLs, which were classified into the five advocacy groups as follows (see also Figure 105.1): the group of businesses accounted for 371 of the CLs, or
[Figure 105.1: Distribution of the CLs into advocacy groups (frequency analysis). Businesses: 371 (58.98%); Banks/Insurance companies: 100 (15.90%); Standard-setters: 59 (9.38%); Auditing companies: 54 (8.59%); Auditing associations: 45 (7.15%).]
14 See Corder and Foreman (2009, pp. 16–18, 24).
15 The reasons for exclusion included: multiple statements from the same institution with inconsistent positions, the submission of multiple CLs to clarify or reinforce an opinion, comments that did not clearly reveal the opinion of the author on the relevant question, comments in which multiple institutions expressed differing positions, CLs that related to another standard draft, and comments that could not be assigned to an advocacy group. CL 499 was not published. Thirteen of the multiple CLs from the same institution were condensed into three letters.
just under 59%. Banks/insurance companies were the second-largest group, with 100 comments. The group of standard-setters submitted 59 responses, and the groups of auditing companies and auditing associations submitted 54 and 45 comments, respectively. The degree of participation by businesses was remarkably high, even compared to similar studies conducted in the past. Leasing seems to represent a very sensitive topic that prompted a larger number of companies to participate in the standard-setting process.

105.3 Empirical Analysis of the Comment Letters

105.3.1 General

Figure 105.2 summarizes the statistical analysis performed on the responses to each of the seven questions across all advocacy groups, showing the mean value (M) and standard deviation (SD) for each group. Overall, Questions 2, 3, and 4 seem to have been met with clear disagreement, and Question 12 was received with slight agreement. The responses to Questions 1, 5, and 6 appear to have been largely indifferent. Table 105.2 lists the M, SD, and N for each of the seven questions considered in this study. The number of CLs evaluated for each question ranged from a minimum of 181 for Question 12 to a maximum of 535 for Question 2, since not every respondent addressed every question asked by the IASB. Respondents tended to discuss only the questions that seemed most relevant to their own positions. Analysis of this data revealed that the response behavior was relatively homogeneous between businesses, auditing companies, and banks/insurance companies on the one hand, and standard-setters and auditing associations on the other. One possible explanation is that the former groups are more interested in the concrete applicability of the standards, whereas the latter groups focus more on their theoretical consistency.
Motivated by this observation, the advocacy groups of businesses, auditing companies, and banks/insurance companies were aggregated into the broader category of "users," and the standard-setters and auditing associations were combined into the category of "developers."16 Table 105.3 shows the aggregated results

16 The distinction between "users" and "developers" was previously proposed by Wielenberg et al. (2007, pp. 453–459). However, the differences in response behavior found by their study were much less pronounced than here.
[Figure 105.2: Mean values and standard deviations of opinions across all advocacy groups.]
for the categories of users and developers, and also presents the results of the Mann–Whitney test verifying the differences in response behavior between the categories. A comparison of the mean values (M) shows that users were more critical of the proposals than developers on all seven questions. Only Question 2 narrowly failed to meet the 5% significance threshold, with a p-value of 0.055. For each of the other six questions, users and developers adopted systematically different response behavior. The differences in Questions 3, 5, 6, and 12 were extremely significant (p ≤ 0.001), the differences in Question 4 were highly significant (p ∈ (0.001, 0.01]), and the differences in Question 1 were significant at the 5% threshold. The magnitude of the deviations in the response behavior is remarkable and has not yet been observed so distinctly in any comparable study.
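The pairwise comparison described in Section 105.2 can be sketched as follows. This is a minimal Python illustration with made-up grade samples, not the study's data; it uses the large-sample normal approximation mentioned in the text and, for simplicity, omits the tie correction to the variance that statistical software would normally apply:

```python
import math

def mann_whitney_two_sided(sample1, sample2):
    """U statistic of eq. (105.2) for the first sample, with a two-sided
    p-value from the large-sample normal approximation. Tied pairs count
    one half (equivalent to mid-ranks); the tie correction to the
    variance is omitted for simplicity."""
    n1, n2 = len(sample1), len(sample2)
    # U1 = N1*N2 + N1(N1+1)/2 - R1, computed here by direct pair counting
    u1 = sum(1.0 if x > y else 0.5 if x == y else 0.0
             for x in sample1 for y in sample2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p

# Hypothetical "users" grading more critically (higher grades) than "developers":
u, p = mann_whitney_two_sided([3, 4, 4, 3, 4], [1, 2, 1, 2, 2])  # u = 25.0
```

For the illustrative samples above the p-value falls below 0.05, so the two hypothetical groups would be judged to differ systematically under the banding used in this chapter.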
Table 105.2: Mean, standard deviation, and total number of evaluated responses, classified by advocacy group.

(Q1: identification; Q2: accounting by the lessee; Q3: accounting by the lessor; Q4: classification; Q5: lease term; Q6: variable lease payments; Q12: modifications of IAS 40)

                            Q1     Q2     Q3     Q4     Q5     Q6     Q12
Businesses            M   2.355  3.435  3.349  3.512  2.722  2.408  2.433
                      N     259    301    232    291    216    191     67
                      SD  1.106  1.003  1.126  0.868  1.098  1.032  1.510
Banks/Insurance       M   2.137  3.247  3.103  3.429  2.446  2.303  2.212
companies             N      80     93     78     77     74     66     33
                      SD  1.052  1.195  1.265  1.057  0.995  0.928  1.431
Standard-setters      M   2.045  3.286  2.959  3.240  1.826  1.605  1.417
                      N      44     49     49     50     46     43     36
                      SD  0.914  1.099  1.258  1.080  0.825  0.728  1.052
Auditing companies    M   2.333  3.420  3.195  3.674  2.541  2.686  2.273
                      N      39     50     41     43     37     35     22
                      SD  1.199  1.071  1.269  0.747  1.095  1.022  1.486
Auditing associations M   1.868  2.952  2.472  3.025  1.914  1.973  1.391
                      N      38     42     36     40     35     37     23
                      SD  1.044  1.324  1.424  1.271  0.981  1.118  1.033
Overall               M   2.246  3.350  3.174  3.447  2.485  2.280  2.039
                      N     460    535    436    501    408    372    181
                      SD  1.090  1.086  1.227  0.959  1.088  1.029  1.416
Table 105.3: Mean values and results of the U-test for the categories of users and developers.

                    Q1      Q2      Q3      Q4      Q5           Q6         Q12
Users M           2.307   3.394   3.276   3.513   2.639        2.418      2.344
Developers M      1.963   3.132   2.753   3.144   1.864        1.775      1.407
U-test (p-value)  0.0114  0.0552  0.0007  0.0069  0.000000006  0.0000005  0.00002
Below, each of the seven questions selected from the Re-ED is examined in more detail. First, the content of each question is briefly introduced. Then, a statistical analysis is performed, and the typical lines of reasoning presented by each advocacy group are identified. Finally, the extent to which the critical comments were addressed by the IASB in the final draft of IFRS 16 is discussed.
105.3.2 Question 1: Definition and criteria for identifying leases

The definition of a lease was readopted from IAS 17 and ASC 840 without any modifications in terms of content.17 At the beginning of a contract, the existence of a contractual lease is determined by two criteria. The first step is to check whether the fulfillment of the contract depends on the use of a named asset. If so, the first criterion of IAS 17 and ASC 840 is preserved, with minor modifications concerning, for example, the existence of substantive rights of substitution.18 Thus, it must be determined whether the asset or one of its physically separable components is explicitly associated with any substantive rights of substitution.19 The second step verifies whether the contract awards the right to exercise control over the usage of the named asset for a specified period of time against payment.20 The Re-ED further refines this second criterion by introducing the new concept of a power element and revising the benefits element, in order to distinguish between leasing and service contracts and to focus on usage rather than ownership.21 The power element is present if the customer controls the usage of the named asset. The benefits element is met if the customer is able and legally entitled to derive economic benefit from the usage of the named asset.22

A total of 460 responses were submitted for Question 1. The mean value (M) over all responses was 2.25, indicating very slight approval. Among the specific advocacy groups, auditing associations expressed by far the strongest approval (M = 1.87), followed by standard-setters (M = 2.05) and banks/insurance companies (M = 2.14). Businesses (M = 2.36) and auditing companies (M = 2.33) were more or less indifferent. The H-test narrowly failed to find significance at the 5% level (p = 0.051).
After aggregating the businesses, auditing companies, and banks/ insurance companies into the broader category of users, and similarly combining the standard-setters and auditing associations into the category of developers as described above, the differences in response behavior between both categories were significant at the 5% level (p = 0.011). The developers supported the proposals more strongly than the users, as noted 17
See See 19 See 20 See 21 See 22 See 18
IASB IASB IASB IASB IASB IASB
(2013b, (2013b, (2013a, (2013a, (2013b, (2013a,
Basis for Conclusions, p. 36). Basis for Conclusions, p. 36f). ED/2013/6, p. 13f). ED/2013/6, p. 14f). Basis for Conclusions, p. 38f). ED/2013/6, p. 15f); IASB (2013b, Basis for Conclusions, p. 38).
above.23 For example, many of the users called for the components of the identification process newly introduced by the Re-ED to be removed, expressing the opinion that the definitions and identification process of IAS 17 and ASC 840 should be kept unchanged.24 Changes to one of the two criteria were often requested in the CLs. In particular, the distinction between leasing and service contracts made by the second criterion frequently attracted criticism. New assumptions were requested, such as the option to treat the lease as a service contract whenever the service aspect dominates the contract or is inseparable from it.25 Some users also called for less drastic changes, adopting similar positions to the developers, such as the view that short-term exceptions should be revised,26 e.g., by extending them to include additional arrangements that would cover any long-term leases that are globally insignificant for the financial statements and ancillary business activities.27 In the final draft of IFRS 16, the definition proposed by ED/2013/6 was essentially adopted without modification, with an added clarification stating that a lease can also exist as a component of a broader contract. Both criteria for the identification process proposed by the Re-ED were adopted in IFRS 16.28 Thus, the key principles of ED/2013/6 were upheld by the IASB, and some of the details proposed in the CLs were also implemented, notably the proposed improvements regarding rights of substitution and the lessee's ability to control usage.29 The IASB did not comply with the frequent request to exclude low-value assets from the scope of the new standard and to extend the duration of short-term leases. Instead, IFRS 16 introduced an optional practical exemption clause for assets with values of less than $5,000.30
23 The grades of user responses were distributed as follows: 34% one, 18% two, 32% three, 16% four. The grades of developer responses were distributed as follows: 39% one, 35% two, 16% three, 10% four.
24 See CL 199, KPMG IFRG Limited, p. 8f.; CL 220, DONLEN Corporation, pp. 1–3; CL 267, Magellan Midstream Partners L. P., p. 1f.; CL 270, Financial Corporation, p. 4; CL 321, Enterprise Holdings Inc., p. 2; CL 333, Suntrust Banks Inc., p. 4; CL 419, AELA and AFLA, p. 4f.; CL 464, Canadian Natural Resources Limited, p. 3f.
25 See CL 452, Rolls-Royce Holdings PLC, p. 4.
26 See CL 521, ESBG-WSBI, p. 2; CL 329, AIG, p. 4f.; CL 373, MSCPA AP & APC, pp. 2, 4; CL 391, First American Equipment Finance, p. 1f.; CL 424, KASB, p. 2; CL 472, CPA Ireland, p. 1f.
27 See CL 246, AFLAC Inc., p. 2f.; CL 273, Regions Financial Corporation, p. 4.
28 See Deloitte (2016, p. 3).
29 See IASB (2016a, IFRS 16 Leases Project Summary and Feedback Statement, p. 11).
30 See KPMG (2016b, p. 1).
105.3.3 Question 4: Application of the consumption principle to classify leases

Two new classes, Type A and Type B, were introduced to classify the leases defined by Q1. First, a preliminary grouping into property and non-property is established based on the nature of the underlying asset.31 Any land and/or buildings or sections of buildings are viewed as property. Examples of non-property (equipment) include business equipment, aircraft, automobiles, trucks, etc. The classification rule states that real estate should be classified as Type B and equipment as Type A by default.32 However, assets can be reassigned to the other class under certain conditions, as explained below. The consumption principle is applied to establish whether reassignment is necessary.33 Two classification criteria are applied to the preliminary groups of property and non-property. The first criterion compares the lease term against the economic lifespan of the asset, and the second criterion compares the lease payments against the fair value of the underlying asset.34 For example, real estate can be reassigned as Type A if a long-term real estate lease has been established that spans a significant proportion of the remaining economic lifetime of the underlying asset. This is effectively similar to purchasing the asset, i.e., a financing rationale exists. Conversely, short-term equipment leases with an insignificant lease term relative to the overall economic lifespan of the asset do not have a financing rationale; their purpose is instead to generate cash flows, as is typically the case for real estate. Accordingly, they must be reassigned as Type B.35

The statistical analysis considered 501 responses to Question 4. The mean value (M) over all responses was 3.45. This was the strongest level of disagreement registered for any of the seven questions considered by this study. The standard deviation (SD), at 0.96, was also lower than for any other question.
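The default-and-reassignment logic described above can be sketched as follows. Note that the numeric cut-offs are purely illustrative assumptions: the Re-ED speaks only of "significant" and "insignificant" proportions and fixes no thresholds.

```python
def classify_lease(is_property, lease_term, economic_life, payments, fair_value,
                   significant=0.75, insignificant=0.25):
    """Illustrative sketch of the ED/2013/6 Type A/B classification.
    The `significant`/`insignificant` cut-offs are hypothetical; the
    Re-ED itself sets no numeric thresholds."""
    term_share = lease_term / economic_life
    value_share = payments / fair_value
    if is_property:
        # Property defaults to Type B; a financing rationale (the lease
        # consumes a significant part of the asset) reassigns it to Type A.
        if term_share >= significant or value_share >= significant:
            return "Type A"
        return "Type B"
    # Equipment defaults to Type A; insignificant consumption reassigns to Type B.
    if term_share <= insignificant and value_share <= insignificant:
        return "Type B"
    return "Type A"

# A 28-year lease over a building with a 35-year remaining life (financing rationale):
classify_lease(True, 28, 35, 900_000, 1_000_000)  # -> "Type A"
# A 1-year truck rental against a 10-year economic life (no financing rationale):
classify_lease(False, 1, 10, 8_000, 100_000)      # -> "Type B"
```

The two-criterion structure (term versus economic life, payments versus fair value) follows the text; everything quantitative in the sketch is invented for illustration.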
Among specific advocacy groups, the weakest disagreement was expressed by auditing associations (M = 3.03) and standard-setters (M = 3.24). The disagreement from banks/insurance companies (M = 3.43) and businesses (M = 3.51) was stronger, and the strongest opposition came
31 See IASB (2013a, ED/2013/6, p. 18).
32 See IASB (2013a, ED/2013/6, p. 6).
33 See IASB (2013a, ED/2013/6, p. 8); this involves an assessment of whether the expected consumption of the lessee represents a significant proportion of the economic benefit derived from the underlying asset.
34 See IASB (2013a, ED/2013/6, p. 18).
35 See IASB (2013b, Basis for Conclusions, p. 44).
from auditing companies (M = 3.67). The p-value found by the Kruskal–Wallis test slightly exceeded the 5% significance threshold, with a value of 0.055; accordingly, the U-test was not performed to compare specific groups. After aggregating the advocacy groups into the categories of users and developers, the mean value (M) was 3.51 for the users and 3.14 for the developers. The Mann–Whitney test found a p-value of 0.006, revealing highly significant differences in the response behavior of the two categories. The arguments advanced to justify the sweeping disagreement with Question 4 typically focused on the proposed definition of two distinct classes. Representatives of both users and developers considered this proposal to be superfluous, for various reasons.36 Some respondents described the type system as unsuitable due to its excessive complexity and conceptual flaws, and also rejected the preliminary grouping into property and non-property. Instead, they called for a single-accounting model that would apply to both the lessor and the lessee.37 Other parties rejected the consumption principle and the pre-grouping system.38 Among the respondents who did not fully reject the type system, instead calling for an alternative classification, the most common request was to preserve the classification into operating and finance leases prescribed by IAS 17 and ASC 840, or to modify it slightly for lessors and/or lessees. This key argument was mostly advanced by respondents from the category of users to support their negative stance on this question.39 In their final implementation of Question 4, the IASB acknowledged and addressed the strong opposition across all advocacy groups by removing the classification into Type A and Type B from the definitive version of the standard. For the lessor, the classification into operating and finance leases was adopted intact from IAS 17.40 For the lessee, the classification of leases was fully abolished in IFRS 16.41

36 See CL 125, Modular Building Institute, p. 3; CL 602, EFAA, p. 3; CL 608, Chartered Accountants Ireland AC, pp. 4–6.
37 For example, see CL 257, Fedex Corporation, pp. 2, 4; CL 244, Sherwin-Williams Company, p. 3; CL 317, Westjet Airlines Ltd., pp. 2, 4.
38 See CL 516, Norwegian Accounting Standards Board, p. 2f.
39 Among banks/insurance companies, around 38% of the commenters cited this argument, as did around 33% of auditing companies and 29% of businesses. As for the category of developers, only 9 of the 50 responses by standard-setters and 3 of the 40 responses by auditing associations included this request; see for example CL 138, RBS, p. 3; CL 194, Siemens AG, p. 6f.; CL 586, Leaseurope, p. 17f.
40 See KPMG (2016a, p. 34).
41 See KPMG (2016a, p. 17); PwC (2016a, p. 1). In terms of this key aspect of lease accounting, the objective of establishing convergence between US GAAP and IFRS was not achieved.
105.3.4 Question 2: Variation of the recognition, valuation, and presentation of leases for the lessee

The lessee must recognize leases of both classes as a lease liability and register a right-of-use (RoU) asset in the form of a contra item. Both entries must be adjusted in the event of remeasurement.42 The first key difference between the two classes is the depreciation methodology for RoU assets. For Type A, straight-line depreciation is typically required over the relatively short period ranging from the time of delivery until the end of either the useful life of the asset or the lease term. For Type B, the depreciation is calculated as the periodic leasing cost minus the periodic interest cost of the lease liability.43 The accounting principles for Type A are largely equivalent to those prescribed by IAS 17 and ASC 840 for finance leases. For Type B, the decrease in the economic production potential of the underlying asset is assumed to be insignificant, justifying the recognition of a constant lease cost. The second key difference relates to the presentation requirements for the P&L statement, where the interest cost of the lease liability and the depreciation of the RoU asset must be recognized separately for Type A, and aggregated as the lease cost for Type B.44 The reasoning underpinning this distinction is that, in the case of Type B, the lessee does not acquire a substantial share of the underlying asset and is therefore exclusively paying for the usage of this asset, whereas in the case of Type A, a partial acquisition effectively unfolds in accordance with the consumption principle.45

A total of 535 responses were submitted for Question 2. The second-highest overall level of disagreement was registered for this question, with a mean value (M) of 3.35. The auditing associations rejected the dual-accounting model the least, by a considerable margin (M = 2.95), followed by the banks/insurance companies (M = 3.25) and standard-setters (M = 3.29).
The disagreement voiced by businesses (M = 3.44) and auditing companies (M = 3.42) was especially strong. The Kruskal–Wallis test found a p-value of 0.212, considerably higher than the significance threshold, which means that no significant differences could be established between the response behaviors of the various advocacy groups regarding the accounting principles for lessees. Similarly, when considering the broader categories of users and developers, the U-test gives a p-value of 0.055, meaning that the possibility that the differences in response behavior are random cannot be

42 See IASB (2013a, ED/2013/6, p. 19f).
43 See IASB (2013a, ED/2013/6, p. 22).
44 See IASB (2013a, ED/2013/6, p. 23).
45 See IASB (2013b, Basis for Conclusions, p. 62).
excluded at the 5% level, albeit by a thin margin. Question 2 is the only one of the seven questions considered in this chapter for which significant differences between the response behaviors of users and developers could not be established. The discussion of Question 2 revolved around two key points. The first challenged the claim that the practice of capitalizing operating leases is justified, as opposed to off-balance-sheet treatment, and the second questioned whether a dual-accounting model is more appropriate than a single-accounting model. The request to preserve the model of IAS 17 and ASC 840, either in full or with improved disclosure requirements, which corresponds to a disclosure-only approach, was cited as a key reason for strong disagreement by 129 respondents.46 To some respondents, the basic principles were acceptable but the definition of two distinct classes was disliked, also leading them to strongly reject the proposals. These respondents usually called for the implementation of a single-accounting model, which was requested a total of 119 times, at frequencies that varied by advocacy group.47 A single-accounting model based on Type A was another popular request.48 In the final draft of IFRS 16, the IASB addressed some of the key arguments proposed by the CLs. Firstly, lessees must recognize both a RoU asset and a lease liability for almost every type of lease.49 Thus, the IASB ultimately decided against implementing one of the key requests from users, namely to preserve the off-balance-sheet treatment from IAS 17 and ASC 840. The broad strokes of the lease accounting revisions pursued by the IASB

46 This request was most frequently cited by banks/insurance companies, featuring in 35% of their responses, followed by auditing companies and businesses, cited in 26% and 24% of responses, respectively. It was far less of a priority for the developers than the users, being mentioned in only 12% of their responses. For example, see CL 170, ICSC, p. 2; CL 194, Siemens AG, p. 4f.; CL 300, BPCE, p. 4f.; CL 308, ACCA, pp. 2–4, 8; CL 521, ESBG-WSBI, pp. 2–4; CL 583, CCMC US Chamber of Commerce, pp. 4, 6; CL 616, Dow Lohnes PLLC, pp. 1f., 5f., 8; CL 625, ANC, pp. 2, 11f.
47 This request was most frequently cited by auditing associations and standard-setters, in 33% and 32% of their responses respectively, followed by businesses with 24%, and then by banks/insurance companies and auditing companies with frequencies of 17% and 14%. Thus, this was more frequently a concern for developers than for users.
48 For example, see CL 86, FEE, pp. 5–7; CL 88, Israel Accounting Standards Board, p. 3f.; CL 90, Shell International B. V., p. 4f.; CL 99, Financial Reporting Advisors LLC., pp. 5–7; CL 131, Hydro-Quebec, p. 2; CL 164, Deutsche Telekom AG, p. 12f.; CL 186, Air Canada, pp. 1f., 4; CL 305, Chick-Fil-A Inc., p. 2; CL 309, ASCG, pp. 8–11; CL 429, Classic Technology, pp. 7, 13; CL 516, Norwegian Accounting Standards Board, p. 2; CL 608, Chartered Accountants Ireland AC, p. 4.
49 See PwC (2016b, pp. 5, 7).
were therefore upheld despite considerable resistance. The lease liability is capitalized according to the fair value of the outstanding lease payments and remeasured by the effective interest method. The value of the RoU asset at initial valuation is equal to the lease liability, adjusted for a few specific items such as the initial direct costs. As noted earlier for Question 4, the IASB abolished the classification system, replacing it with a single-accounting model for lessees with consistent remeasurements of the RoU asset.50 Therefore, the accounting principles of finance leases prescribed by IAS 17 were largely readopted for IFRS 16.51 The emphatic and relatively uniform disagreement with the proposals of the Re-ED effectively contributed to major revisions of IFRS 16 on the topics raised by this question.

105.3.5 Question 3: Variation of the recognition, valuation, and presentation of leases for the lessor

The accounting approach for lessors proposed by ED/2013/6 differentiates between the two classes outlined above. For Type A, the lessor must derecognize the underlying asset and instead recognize a lease receivable and a residual asset. Remeasurement increases the lease receivable by the interest yield according to the effective interest method and reduces it by any settled lease payments. The carrying amount of the residual asset is then increased by the scheduled interest from the expected amounts at the end of the lease term. Impairment tests are also required for the lease receivable and the residual asset.52 For Type B, the accounting principles prescribed by IAS 17 and ASC 840 for operating leases were readopted in largely intact form.

A total of 436 responses were evaluated for Question 3. Overall, the respondents expressed strong disagreement (M = 3.17), with significant differences in response behavior across advocacy groups.
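The effective interest mechanics mentioned above for the lease liability (and, for Type A lessors, the lease receivable) can be sketched with illustrative numbers; the payment, discount rate, and term below are assumptions for the example, not figures from the standard:

```python
def amortize_liability(present_value, annual_payment, rate, years):
    """Roll a lease liability (or receivable) forward under the effective
    interest method: each period the balance accrues interest at the
    discount rate and is reduced by the lease payment settled in arrears.
    Returns the year-end balances."""
    balances = []
    balance = present_value
    for _ in range(years):
        balance = balance * (1 + rate) - annual_payment
        balances.append(round(balance, 2))
    return balances

# A hypothetical 3-year lease, payments of 100 in arrears, discounted at 5%:
pv = 100 / 1.05 + 100 / 1.05**2 + 100 / 1.05**3  # initial liability, about 272.32
schedule = amortize_liability(pv, 100, 0.05, 3)
# The balance unwinds to (approximately) zero at the end of the term.
```

Because the liability starts at the present value of the payments discounted at the same rate used to accrue interest, the schedule necessarily runs down to zero, which is the defining property of the effective interest method.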
The strongest disagreement came from businesses (M = 3.35), followed by auditing companies (M = 3.20) and banks/insurance companies (M = 3.10). The disagreement voiced by standard-setters was somewhat less pronounced (M = 2.96), and the opinions put forth by auditing associations were relatively neutral (M = 2.47). The Kruskal–Wallis test established differences in response behavior across advocacy groups, finding an asymptotic significance of 0.003. The Mann–Whitney test supported the conclusion that differences

50 See Deloitte (2016, p. 3); PwC (2016a, p. 1); the ASU 2016-02 diverges substantially from IFRS 16 in terms of remeasurements of the RoU asset.
51 See IASB (2016b, IFRS 16 Effects Analysis, p. 12).
52 See IASB (2013b, Basis for Conclusions, pp. 70, 75).
C. Blecher & S. Kruse
existed at the 5% significance threshold between businesses and standard-setters (p = 0.028), as well as between auditing associations on the one hand and banks/insurance companies (p = 0.029) and auditing companies (p = 0.030) on the other. The differences in response behavior between auditing associations and businesses were found to be extremely significant (p = 0.00022). The response behavior was also confirmed to differ across the aggregated categories of users and developers. The mean value (M) of the responses by users was 3.28, compared to 2.75 for developers. The difference was found to be highly significant, with a p-value of 0.0007. The majority of both users and developers expressed strong disagreement in their responses.53 The most common request associated with strongly disapproving stances was to preserve the accounting model specified by IAS 17 and ASC 840 for lessors either intact or with some minor improvements.54 The strong resistance likely contributed to the decision of the IASB to retain the accounting model of IAS 17 for lessors in the final draft of the IFRS 16 standard.55 A few updates were introduced, such as adjustments to the revenue standard.56 This decision by the IASB tended to benefit the interests of users, and in particular the large number of businesses who responded to the draft.

105.3.6 Question 5: Lease term with remeasurement in the event of changes to relevant factors

ED/2013/6 defines the lease term as the non-cancellable duration of the lease, extended by any periods covered by a renewal option whenever the lessee has a significant economic incentive to exercise this option, and including any periods covered by a termination option whenever the lessee does not have a significant economic incentive to exercise this option.57 Thus,

53 The grades of user responses were distributed as follows: 17% one, 7% two, 7% three, 70% four. The grades of developer responses were distributed as follows: 29% one, 16% two, 4% three, 51% four.
54 For example, see CL 166, HKAP, p. 4f.; CL 229, KEIDANREN, p. 9f.; CL 266, Tim Hortons Inc., p. 3; CL 308, ACCA, pp. 2f., 8; CL 319, NAR, p. 4; CL 337, COHNREZNICK LLP, p. 4f.; CL 349, Praxair Inc., pp. 5, 10; CL 521, ESBG-WSBI, p. 4; CL 625, ANC, pp. 2, 13f.
the significant economic incentive method was chosen as the basis of the valuation approach.58 The existence of a significant economic incentive is determined by performing a combined analysis of contractual, asset-related, and business- and market-based valuation factors.59 According to IAS 17 and ASC 840, the lease term must be remeasured whenever the lease changes without a separate contract being concluded.60 ED/2013/6 also stipulates that remeasurement should be performed whenever a significant economic incentive arises or disappears, whenever an option is exercised despite the non-existence of a significant economic incentive, and whenever an option is not exercised despite the existence of a significant economic incentive to do so.61

A total of 408 comments were submitted for Question 5. The positions expressed by the respondents were broadly indifferent overall (M = 2.49). However, there were significant differences across advocacy groups. The standard-setters (M = 1.83) and auditing associations (M = 1.91) were clearly in favor of the proposals. Banks/insurance companies (M = 2.45) and auditing companies (M = 2.54) were largely neutral, and businesses were slightly opposed (M = 2.72). The Kruskal–Wallis test confirmed that the responses were extremely dependent on association with an advocacy group, finding a p-value of 0.0000002. The Mann–Whitney test revealed extremely significant differences between businesses and standard-setters (p = 0.0000006) and businesses and auditing associations (p = 0.000067), as well as between banks/insurance companies and standard-setters (p = 0.00054). There were also highly significant differences between banks/insurance companies and auditing associations (p = 0.006), as well as standard-setters and auditing companies (p = 0.002). The differences in response behavior between auditing associations and auditing companies remained significant, with a p-value of 0.013.
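The Kruskal–Wallis H-test used here compares the rank distributions of more than two groups at once. A rough, self-contained sketch in pure Python, using invented grade samples for the five advocacy groups (not the study's raw data, which is not published per group) and the closed-form chi-square upper tail for df = 4:

```python
import math
from collections import Counter

def midranks(values):
    """1-based ranks, with ties receiving the average (mid) rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic."""
    data = [v for g in groups for v in g]
    n = len(data)
    ranks = midranks(data)
    h, pos = 0.0, 0
    for g in groups:
        r = sum(ranks[pos:pos + len(g)])
        h += r * r / len(g)
        pos += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3.0 * (n + 1)
    ties = sum(t**3 - t for t in Counter(data).values())
    return h / (1.0 - ties / (n**3 - n))

def chi2_sf_df4(x):
    """Upper-tail probability of a chi-square with 4 degrees of freedom."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

# Invented grade samples (1 = agree ... 4 = disagree) for five groups,
# ordered roughly like the reported Question 5 group means.
groups = [
    [1, 1, 1, 2, 2, 1, 2, 3],  # standard-setters
    [1, 2, 1, 2, 2, 1, 3, 2],  # auditing associations
    [2, 3, 2, 3, 2, 4, 2, 3],  # banks/insurance companies
    [3, 3, 2, 4, 3, 2, 3, 4],  # auditing companies
    [3, 4, 4, 3, 4, 2, 4, 4],  # businesses
]
h_stat = kruskal_wallis_h(groups)
p_value = chi2_sf_df4(h_stat)  # df = 5 groups - 1 = 4
```

Because all responses take only four distinct grades, the tie correction matters; without it the H statistic is understated.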
After aggregating the advocacy groups into the categories of users and developers, the users had a mean value (M) of 2.64, compared to 1.87 for the developers. The U-test revealed that the differences between categories were extremely significant, with a p-value of 0.0000000064.
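The U-test (Mann–Whitney) applied to the aggregated categories can be sketched in pure Python. The samples below are rebuilt from the rounded Question 5 grade percentages in footnote 62 (one observation per percentage point); since the true per-category sample sizes are not reported, the resulting statistic and p-value are only indicative, not the study's:

```python
import math
from collections import Counter

def mann_whitney_u(x, y):
    """U statistic: pairs with x > y count 1, ties count 1/2."""
    return sum(1.0 if a > b else (0.5 if a == b else 0.0)
               for a in x for b in y)

def mann_whitney_p(x, y):
    """Two-sided p-value via the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    u = mann_whitney_u(x, y)
    mu = n1 * n2 / 2.0
    ties = sum(t**3 - t for t in Counter(list(x) + list(y)).values())
    var = n1 * n2 / 12.0 * ((n + 1) - ties / (n * (n - 1)))
    z = (u - mu) / math.sqrt(var)
    return u, math.erfc(abs(z) / math.sqrt(2.0))

# Question 5 grade distributions (percent -> counts), from footnote 62.
users = [1] * 16 + [2] * 35 + [3] * 19 + [4] * 31      # n = 101
developers = [1] * 38 + [2] * 46 + [3] * 7 + [4] * 9   # n = 100

u_stat, p_value = mann_whitney_p(users, developers)
# users rank higher (more disagreement), so U >> n1*n2/2 and p is tiny
```

The normal approximation is appropriate at these sample sizes; an exact permutation test would be needed only for very small groups.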
58 See IASB (2013b, Basis for Conclusions, p. 49).
59 See IASB (2013a, ED/2013/6, p. 17).
60 See Deloitte (2016, p. 4).
61 See IASB (2013a, ED/2013/6, p. 17f).
Both users and developers typically advanced arguments that were only partially in agreement with the proposals.62 This was often due to negative feedback regarding the significant economic incentive approach, frequently combined with requests to readopt the thresholds of reasonably certain and reasonably assured from IAS 17 and ASC 840.63 The desire for fixed remeasurement dates and/or the introduction of a materiality criterion was also frequently cited.64 The significant differences in the responses from users and developers stemmed from the strong approval expressed by many of the developers. Developers often expressed unreserved approval65 and/or only requested minor changes.66 For example, the auditing association CPA Ireland requested support for contracts with automatic renewal options.67 By contrast, users often opposed the inclusion of any renewal and termination options in the definition of the lease term; in other words, the users tended to reject a measurement-based approach.68

The IASB ultimately confirmed the definition of the lease period proposed by ED/2013/6 but replaced the significant economic incentive concept with the reasonably certain threshold from IAS 17. The requirements for lessors to perform remeasurements were dropped, except in the event of changes to the contract. However, IFRS 16 stipulates that the lessee must perform a remeasurement in the circumstances outlined by the Re-ED, replacing the significant economic incentive threshold with the reasonably certain threshold and restricting the circumstances that require remeasurement to material changes that are controllable by the lessee.69 For the definition of the lease term from the perspective of the lessor, the IASB opted to reuse the
62 The grades of user responses were distributed as follows: 16% one, 35% two, 19% three, 31% four. The grades of developer responses were distributed as follows: 38% one, 46% two, 7% three, 9% four.
63 For example, see CL 140, HSBC, p. 6f.; CL 262, Deloitte Touche Tohmatsu Limited, p. 10f.; CL 301, PricewaterhouseCoopers International Limited, p. 6f.; CL 456, Singapore Accounting Standards Council, p. 13f.
64 See CL 268, Pfizer Inc., p. 6; CL 297, Ernst & Young Global Limited, p. 16f.; CL 332, TRW Automotive Holdings Group, p. 4; CL 394, Sensiba San Filippo LLP, p. 2; CL 629, ICAS, p. 5.
65 See CL 21, FICPA AP&ASC, p. 2.
66 See CL 134, ESMA, p. 7; CL 218, MSCPA TIG, p. 3.
67 See CL 472, CPA Ireland, p. 2.
68 See CL 247, L Brands Inc., p. 4f.; CL 369, Coresite Realty Corporation, p. 4; CL 421, BHP Billiton Limited, p. 7; CL 493, Petroleo Brasileiro, pp. 5, 7; CL 503, Kesko Corporation, p. 6; CL 519, UBS AG, p. 2; CL 562, Anglo American PLC, p. 5; CL 599, Businesseurope, pp. 13–15; CL 600, Sanofi, p. 5f.
accounting model from IAS 17 and ASC 840 largely unchanged. The IASB's final decision therefore tended to align with the favorable opinions expressed by the developers on issues relating to the accounting models used by lessees.

105.3.7 Question 6: Variable lease payments with remeasurement in the event of changes to the underlying index or interest rate

ED/2013/6 stipulates that lessees must recognize variable lease payments under certain conditions in the initial valuation of the lease liability for both classes of lease, whereas lessors need only recognize them for Type-A leases in the lease receivable.70 The lessor may alternatively opt to recognize variable lease payments in the initial valuation of the residual asset instead of the lease receivable.71 In general, variable lease payments based on an interest rate or index, or variable lease payments that are de facto fixed, are a prerequisite of capitalization.72 The treatment of Type-B assets by the lessor was not opened for discussion.73 Any variable lease payments coupled to an interest rate or index that were incorporated into the initial measurement must be remeasured if the relevant indicators change over the course of a reporting period.74 The key difference between the treatment of variable lease payments under IAS 17 and ASC 840 and the treatment proposed by ED/2013/6 is the requirement that interest-rate and index dependencies must be remeasured instead of recognizing changes directly in the P&L where applicable.75

A total of 372 responses were submitted for Question 6. Figure 105.2 shows that the respondents were in slight agreement overall, but the responses across different advocacy groups diverged strongly. Standard-setters clearly approved of the proposals (M = 1.61), and auditing associations also defended distinctly positive positions (M = 1.97). Banks/insurance companies (M = 2.30) and businesses (M = 2.41) also expressed faint approval.
Auditing companies were positioned within the region of slight disapproval, with a mean value of 2.69. The H-test found an asymptotic significance of p = 0.0000038, indicating considerable differences in response behavior between advocacy groups. The Mann–Whitney test showed that

70 See KPMG (2013, pp. 37, 44).
71 See IASB (2013a, ED/2013/6, p. 27).
72 See KPMG (2013, p. 21).
73 See IASB (2013a, ED/2013/6, p. 31).
74 See IASB (2013a, ED/2013/6, pp. 21, 28).
75 See IASB (2008, Leases — Contingent rentals, p. 3).
each of the three groups of users differed extremely significantly from the standard-setters (businesses: p = 0.0000036; banks/insurance companies: p = 0.0001134; auditing companies: p = 0.0000056). The auditing associations differed highly significantly from the auditing companies (p = 0.006), and significantly from the businesses (p = 0.020). The stark contrast between users and developers was again apparent at the aggregated level: the mean value (M) of respondents from the category of users was 2.42, compared to 1.78 for the respondents from the category of developers.76 The U-test found that these differences in response behavior were significant at the level of p = 0.0000005.

Many developers expressed unreserved agreement with the proposals77 and/or requested minor changes, such as additional support for variable lease payments that are de facto fixed.78 Users often defended slightly unfavorable positions, instead asking for the remeasurement requirement in response to changes in the index or interest rate to be fully abolished, i.e., requesting the same treatment of variable lease payments as under IAS 17 and ASC 840.79 Criticism of the remeasurement requirement, often together with requests to introduce a materiality criterion, featured among the slightly favorable positions frequently advanced by respondents from both categories.80 The same was true of requests to adjust the scope of the variable lease payments covered by the standard.81

The implementation of the proposals in the final version of IFRS 16 distinguishes between lessors and lessees. For lessors, variable lease payments are treated as in IAS 17.82 This is connected to the decision to preserve the dual-accounting model for lessors, as discussed in Question 3, and was likely
76 The grades of user responses were distributed as follows: 23% one, 27% two, 34% three, 15% four. The grades of developer responses were distributed as follows: 51% one, 26% two, 16% three, 6% four.
77 For example, see CL 522, Zambia Institute of Chartered Accountants, p. 4.
78 See CL 437, MASB, p. 6; CL 594, Belgian Accounting Standards Board, p. 5; CL 618, EFRAG, p. 10f.; CL 630, ICPAK, p. 5.
79 For example, see CL 332, TRW Automotive Holdings Group, p. 4; CL 344, ENSCO PLC, p. 7; CL 348, Grassi & Co., p. 5f.; CL 371, TTX Company, p. 4; CL 469, Confederation of Finnish Industries, p. 4; CL 447, RND, p. 7.
80 For example, see CL 194, Siemens AG, p. 8f.; CL 297, Ernst & Young Global Limited, p. 18f.; CL 301, PricewaterhouseCoopers International Limited, p. 7f.; CL 456, Singapore Accounting Standards Council, pp. 14–16; CL 523, Leaseteam Inc., p. 4; CL 603, ProSiebenSat.1 Group, p. 8.
81 See CL 150, John M J Williamson, p. 12f.; CL 442, Standard & Poor's Rating Services, p. 8; CL 489, FAR, p. 12; CL 527, Eumedion, p. 7; CL 547, GDF Suez, p. 8.
82 See KPMG (2016b, p. 5).
motivated by reasons of practicability. Thus, on this issue, the decision of the IASB accepted and addressed the criticism from the category of users. For lessees, the IASB ultimately opted to accept most of the proposals outlined by ED/2013/6. Accordingly, variable lease payments associated with an interest rate or index must be incorporated into the initial valuation, as well as any variable lease payments that are effectively fixed. This is not considered to be necessary for other variable lease payments. The remeasurement requirement was not abolished in IFRS 16 despite frequent requests, although an additional limitation was introduced on the form of remeasurement required when the lease payments are globally remeasured due to other triggers, such as changes in the lease term, or contractual changes to the cash flows.83 This can be viewed as the introduction of a materiality threshold.

105.3.8 Question 12: Right-of-use assets under IAS 40

From the perspective of the lessee, if a Type-B RoU asset meets the definition of an investment property and all investment properties are remeasured according to the fair-value model, then IAS 40, and therefore the fair-value model, is mandatory for this RoU asset. Accordingly, the lessee's right to opt for an operating lease does not apply to real-estate RoU assets.84 IAS 40 defines investment properties as real-estate assets acquired with the intent of long-term investment and/or leasing that are not intended for exploitation as part of ordinary business activities or for the production and distribution of products and services (IAS 40.5). Renting out any such investment property creates a lease in the sense of IAS 17.85 Type-A real-estate leases are never classified as investment properties.86

A total of 181 comments were submitted for Question 12.
With an overall mean value (M) of 2.04, this question received the most approval of any of the seven questions considered in this study, but also provoked the largest deviations in response behavior (SA = 1.42). Once again, the response behavior differed considerably across advocacy groups. Standard-setters (M = 1.42) and auditing associations (M = 1.39) welcomed the adjustments very distinctly. Banks/insurance companies (M = 2.21) and auditing companies (M = 2.27) only expressed very slight approval. Businesses responded more or less indifferently (M = 2.43). The Kruskal–Wallis test revealed

83 See KPMG (2016b, p. 5).
84 See IASB (2013a, ED/2013/6, pp. 22, 71, 77).
85 See RBS (2015, p. 420).
86 See IASB (2013a, ED/2013/6, p. 73).
that the differences in response behavior between advocacy groups were highly significant (p = 0.00098). The Mann–Whitney test showed that the responses of businesses and standard-setters (p = 0.00067) differed extremely significantly, as did the responses of standard-setters and auditing companies (p = 0.0099). The differences between businesses and auditing associations (p = 0.003) were also highly significant, as were the differences between banks/insurance companies and standard-setters (p = 0.006). The 5% significance threshold was also met by the differences between banks/insurance companies and auditing associations (p = 0.016), as well as auditing associations and auditing companies (p = 0.020). Extreme differences in response behavior were found between the aggregated categories of users and developers: the mean value (M) of respondents from the category of users was 2.34, compared to 1.41 for developers. The U-test found a p-value of 0.00002.

The majority of both users and developers fully supported the proposals.87 The significant differences between the two categories resulted primarily from strongly negative positions defended by a large segment of users. These users typically argued that applying the FV model to RoU assets is inappropriate, and so the requirements for treatment under IAS 40 are not met.88 Another aspect frequently criticized by users, cited as the basis of strongly disapproving positions, was the mandatory nature of the treatment under IAS 40. These users instead wanted the right to choose to be preserved, as is the case for operating leases.89 The IASB ultimately decided to incorporate the proposals of the Re-ED regarding the treatment of real-estate RoU assets under IAS 40 into the final draft of the IFRS 16 standard without modifications.
Hence, RoU assets are required to be remeasured according to the fair-value model whenever the relevant prerequisites are satisfied (IFRS 16.34, IAS 40.33, IAS 40.40A, IAS 40.41).90 This decision by the IASB therefore aligned with the strongest
87 The grades of user responses were distributed as follows: 53% one, 2% two, 1% three, 43% four. The grades of developer responses were distributed as follows: 86% one, 0% two, 0% three, 14% four. For example, see CL 196, Hermes Equity Ownership Services Limited, p. 5; CL 297, Ernst & Young Global Limited, p. 23; CL 516, Norwegian Accounting Standards Board, p. 4; CL 576, Polish Accounting Standards Committee, p. 6; CL 629, ICAS, p. 6.
88 See CL 154, A Group of Japanese Companies, p. 9f.; CL 229, KEIDANREN, p. 14f.; CL 619, Allianz SE, p. 4.
89 For example, see CL 148, ACLI, p. 6; CL 162, Bank of Communications, p. 3f.; CL 327, Canadian Bankers Association, p. 8; CL 557, EACB, p. 8; CL 591, BNP Paribas, p. 11.
90 See IASB (2016c, International Financial Reporting Standard 16 Leases, p. A744); IASB (2016d, International Accounting Standard 40 Investment Property, p. A1350f).
consensus across advocacy groups expressed in the comments of the Re-ED out of any of the questions considered in this study.
105.4 Discussion of the Most Important Results

The results of this study compellingly illustrate the difficulties encountered by the IASB as they sought to reform the principles of lease accounting, and explain why such an extended development period was required for the final draft. According to the findings presented here, the obstacles ultimately seem to have arisen from the existence of two fundamentally different approaches to lease accounting, both of which can be justified by very strong arguments.

On the one hand, a more practically oriented approach can be identified; from a practical perspective, it could be argued that the long-standing IAS 17 standard has proven to be perfectly sufficient for ordinary activities, and so there is no real need to revise the regulations. Concerns about the theoretical consistency of accounting standards are less relevant to practitioners. This line of reasoning appears to be much more popular among the advocacy groups that were collectively characterized as the category of users in this paper.

On the other hand, a more theoretically oriented perspective can also be adopted, placing much stronger emphasis on the theoretical consistency of the standard. From the theoretical point of view, the all-or-nothing approach of IAS 17 is extremely unsatisfactory. Relatively small differences, e.g., in the duration of the lease term relative to the total useful life of a leased asset, completely change the accounting picture (operating lease vs. finance lease), which is difficult to justify from a theoretical standpoint. It is therefore completely understandable that the more theoretically inclined respondents were supportive of attempts to abolish the IAS 17 regulations. The more theoretical approach appeared to be much more popular among the advocacy groups described above as belonging to the category of developers. Unsurprisingly, this resulted in differences in response behavior.
Nevertheless, the extent of the discrepancies and the uniformity of the arguments advanced by the two major factions are remarkable. For all seven questions, the highest level of approval was expressed by one of the developer advocacy groups — the auditing associations in five of the questions, and the standard-setters in the other two. In six of the seven questions, the second-strongest agreement came from the other developer advocacy group (banks/insurance companies were in second place for the seventh question). In each of the
seven questions, the strongest disapproval was articulated by one of the user advocacy groups — businesses in four cases and auditing companies in the other three. In every one of these cases, the other group in this pair was second-strongest in terms of disapproval. Banks/insurance companies occupied the middle position on six of the seven issues.

Previous studies have already reported that users tend to judge newly proposed standards more harshly than developers,91 but their results tended to focus on specific questions. On the subject of lease accounting, however, the differences were apparent in virtually every question considered. As shown in Table 105.3, after aggregating into users and developers, significant differences in response behavior were established for six of the seven questions. Only Question 2 resulted in a p-value slightly above the 5% threshold. In four of the seven questions, the differences were even extremely significant (p ≤ 0.001).

These results illustrate the obstacles faced by the IASB in their project to reform the principles of lease accounting. Two highly polarized "factions" seem to have formed, described as users and developers in this paper. Within each category, the respondents presented strongly uniform arguments. Despite the many years of preparation and three drafts leading up to the final version of the standard, the IASB seems to have been unsuccessful in reconciling the diametrically opposed views of these two factions and unable to establish a broad consensus across all advocacy groups for their lease accounting reform. Even though some modifications were made in the final draft of IFRS 16, it seems unlikely that the skepticism of users toward the proposed accounting regulations will have been appeased.
105.5 Summary

This paper performs a quantitative analysis of CLs submitted for the Re-ED on lease accounting published by the IASB. Evaluating these CLs in terms of their association with an advocacy group immediately reveals enormous differences in response behavior: standard-setters and auditing associations consistently expressed more favorable opinions than the other advocacy groups, especially businesses and auditing companies. Thus, the five advocacy groups were combined into the broader categories of users and developers. At this aggregated level, the systematic deviations in response
91 In particular, see Wielenberg et al. (2007, pp. 453–459).
behavior become even more pronounced. Users responded to every question with much greater skepticism than developers.
Bibliography

Blecher, C., Recke, T. and Wielenberg, S. (2009). Das Diskussionspapier zu IAS 19 - Eine Systematische Auswertung der Eingegangenen Kommentare. Zeitschrift für Internationale und Kapitalmarktorientierte Rechnungslegung 9, 565–575.
Burns, R.B. and Burns, R.A. (2008). Business Research Methods and Statistics Using SPSS. Sage Publications, London.
Corder, G.W. and Foreman, D.I. (2009). Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach. Wiley, Hoboken.
Deloitte (2016). Deloitte Development LLC, FASB's New Standard Brings Most Leases Onto the Balance Sheet, Heads Up Newsletter, Vol. 23, Issue 5, 2016. Available online at: https://www2.deloitte.com/content/dam/Deloitte/us/Documents/audit/ASC/HU/2016/us-aers-headsup-fasb-new-standard-brings-most-leases-onto-the-balance-sheet.pdf (last retrieved: 23/05/2018).
Field, A. (2017). Discovering Statistics Using IBM SPSS, 5th Edition. Sage Publications, London.
Gravetter, F.J. and Wallnau, L.B. (2013/2010). Statistics for the Behavioral Sciences, 9th Edition. Wadsworth, Belmont.
IASB (2008). Leases — Contingent rentals (Agenda Paper 13C).
IASB (2009). Discussion Paper DP/2009/1 Leases: Preliminary Views. Available online at: https://www.ifrs.org/-/media/project/leases/discussion-paper/published-documents/dp-leases-preliminary-views-march-2009.pdf (last retrieved: 23/05/2018).
IASB (2010). Exposure Draft ED/2010/9 Leases. Available online at: https://www.ifrs.org/-/media/project/leases/exposure-draft/published-documents/ed-leases-august-2010.pdf (last retrieved: 23/05/2018).
IASB (2013a). Re-Exposure Draft ED/2013/6 Leases. Available online at: https://www.ifrs.org/-/media/project/leases/revised-ed/published-documents/ed-leases-may-2013.pdf (last retrieved: 23/05/2018).
IASB (2013b). Basis for Conclusions, Exposure Draft ED/2013/6 Leases. Available online at: www.ifrs.org/-/media/project/leases/revised-ed/published-documents/ed-leases-basis-for-conclusions-may-2013.pdf (last retrieved: 23/05/2018).
IASB (2016a). IFRS 16 Leases Project Summary and Feedback Statement. Available online at: https://www.ifrs.org/-/media/project/leases/ifrs/published-documents/ifrs16-project-summary.pdf (last retrieved: 23/05/2018).
IASB (2016b). IFRS 16 Effects Analysis. Available online at: https://www.ifrs.org/-/media/project/leases/ifrs/published-documents/ifrs16-effects-analysis.pdf (last retrieved: 23/05/2018).
IASB (2016c). International Financial Reporting Standard 16 Leases.
IASB (2016d). International Accounting Standard 40 Investment Property.
KPMG (2013). KPMG IFRG Limited, New on the Horizon: Leases. Available online at: https://assets.kpmg.com/content/dam/kpmg/pdf/2013/06/Leases-O-201305.pdf (last retrieved: 23/05/2018).
KPMG (2016a). KPMG IFRG Limited, IFRS 16 Leases — A More Transparent Balance Sheet, First Impressions. Available online at: https://home.kpmg.com/content/dam/kpmg/pdf/2016/01/leases-first-impressions-2016.pdf (last retrieved: 23/05/2018).
KPMG (2016b). KPMG LLP, Summary of Similarities and Differences between New US GAAP and IFRS Lease Accounting Standards. Available online at: http://www.kpmg-institutes.com/content/dam/kpmg/financialreportingnetwork/pdf/2016/defining-issues-summary.pdf (last retrieved: 23/05/2018).
Lecoutre, B. and Poitevineau, J. (2014). The Significance Test Controversy Revisited: The Fiducial Bayesian Alternative. Springer.
McKillup, S. (2012). Statistics Explained: An Introductory Guide for Life Scientists, 2nd Edition. Cambridge University Press, Cambridge.
PwC (2016a). PricewaterhouseCoopers, Lease Accounting: The Long-Awaited FASB Standard Has Arrived. Available online at: www.pwc.com/us/en/cfodirect/assets/pdf/in-brief/us-2016-05-fasb-lease-accounting-standard-asc842.pdf (last retrieved: 23/05/2018).
PwC (2016b). PricewaterhouseCoopers, IFRS 16: The Leases Standard Is Changing. Available online at: https://www.pwc.com/gx/en/communications/pdf/ifrs-16-leases-standard-changing.pdf (last retrieved: 23/05/2018).
RBS (2015). RoeverBroennerSusat GmbH & Co. KG, IFRS International Financial Reporting Standards, 5th Edition, 2015. Available online at: www.ifrs-portal.com/Publikationen/IFRS_Texte_5.0_2015_04.pdf (last retrieved: 23/05/2018).
Schmidt, M. and Blecher, C. (2015). Das Diskussionspapier des IASB zur Überarbeitung des Conceptual Framework — eine Systematische Auswertung der Comment Letters. Zeitschrift für Internationale und Kapitalmarktorientierte Rechnungslegung 15, 252–260.
Weinberg, S.L. and Abramowitz, S.K. (2002). Data Analysis for the Behavioral Sciences Using SPSS. Cambridge University Press, Cambridge.
Wielenberg, S., Blecher, C. and Puchala, A. (2007). Die Reform der Bilanzierung von Non-Financial Liabilities: Systematische Auswertung der Kommentare zum ED IAS 37. Zeitschrift für Internationale und Kapitalmarktorientierte Rechnungslegung 7, 453–459.
Chapter 106
Implied Variance Estimates for Black–Scholes and CEV OPM: Review and Comparison

Cheng Few Lee, Yibing Chen and John Lee

Contents
106.1 Introduction . . . . . . 3704
106.2 Literature Review . . . . . . 3705
106.3 MATLAB Approach to Estimate Implied Variance . . . . . . 3717
106.4 Approximation Approach to Estimate Implied Variance . . . . . . 3720
106.5 Some Empirical Results . . . . . . 3725
  106.5.1 Cases from US — individual stock options . . . . . . 3725
  106.5.2 Cases from China — 50 ETF options . . . . . . 3726
106.6 Implied Volatility in Terms of CEV Model and Related Excel Program . . . . . . 3728
106.7 Conclusion . . . . . . 3735
Bibliography . . . . . . 3735
Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

Yibing Chen
Asset Allocation & Research Department, National Council for Social Security Fund
e-mail: [email protected]

John Lee
Center for PBBEF Research
e-mail: [email protected]
Abstract

The main purpose of this chapter is to demonstrate how to estimate implied variance for both the Black–Scholes option pricing model (OPM) and the constant elasticity of variance (CEV) OPM. For the Black–Scholes OPM, we classify the estimation methods into two different routines: numerical search methods and closed-form derivation approaches. Both the MATLAB approach and the approximation method are used to empirically estimate implied variance for American and Chinese options. For the CEV model, we present the theory and demonstrate how to use the related Excel program in detail.

Keywords: Implied variance • Black–Scholes model • MATLAB approach • Approximation approach • CEV model.
106.1 Introduction

It is well known that implied variance estimation is important for evaluating option pricing. There are two classes of methods for determining the implied volatility. The first class uses iterative root finders to estimate implied volatility, for example, Manaster and Koehler (1982), Delbourgo and Gregory (1985), Jackel (2006, 2015), and the MATLAB implied volatility function blsimpv. The second class uses non-iterative approximations to calculate implied volatility, for example, Brenner and Subrahmanyam (1988), Lai et al. (1992), Chance (1996), Chambers and Nawalkha (2001), Li (2008), Ang et al. (2013), Glau et al. (2017), Salazar Celis (2017) and Pagliarani and Pascucci (2017).

In this chapter, we first review several alternative methods to estimate implied variance in Section 106.2. We classify them into two different estimation routines: numerical search methods and closed-form derivation approaches. Closed-form derivation approaches make use of either Taylor expansion or an inverse function to calculate analytical solutions for the ISD. In Section 106.3, we show how the MATLAB computer program can be used to estimate implied variance. This computer program is based upon the Black–Scholes model using the Newton–Raphson method. In Section 106.4, we discuss how the approximation method derived by Ang et al. (2013) can be used to estimate implied variance in the case of continuous dividends. This approximation method can also estimate implied volatility from two options with the same maturity, but different exercise prices and values. In Section 106.5, real data from American option markets and Chinese option markets are used to compare the performances of three typical alternative methods: the regression method proposed by Lai et al., the MATLAB computer program approach, and the approximation method derived by Ang et al. In
Section 106.6, we introduce how to estimate implied volatility using the constant elasticity of variance (CEV) model, and present the related Excel program for calculation. Section 106.7 summarizes the chapter.

106.2 Literature Review

The derivation and use of the implied volatility for an option, as originated by Latane and Rendleman (1976), has become a widely used methodology for variance estimation. Latane and Rendleman (1976) argued that although it is impossible to solve the B–S equation directly, one can use a numerical search to closely approximate the standard deviation implied by a given option price. The exact form of the Black and Scholes model they used is given below:

C = SN(d1) − Xe^{−rT} N(d2),   (106.1)

where

d1 = [ln(S/X) + (r + σ²/2)T]/(σ√T);
d2 = d1 − σ√T;
S = current market price of the underlying stock;
X = exercise price;
r = continuous constant interest rate;
T = remaining life of the option.

Their procedure is to find an implied standard deviation which makes the theoretical option value, i.e., the right-hand side of equation (106.1), fall within ±0.001 of the observed actual call price. This is a kind of trial-and-error method. Later researchers such as Beckers (1981), Manaster and Koehler (1982), Brenner and Subrahmanyam (1988), Lai et al. (1992), Chance (1996), Corrado and Miller (1996), Hallerbach (2004), Li (2005) and Corrado and Miller (2006) have studied implied variance estimation in more detail.

Since the Black–Scholes option pricing model is a nonlinear equation, an explicit analytic solution for the ISD is not available in the literature (except for the at-the-money call), and numerical methods are generally used to approximate the ISD. Manaster and Koehler (1982) used the Newton–Raphson method to provide an iterative algorithm for the ISD. They rewrote the Black–Scholes formula as in equation (106.2), for given values of S, X, r and T:

C = f(S, X, r, T, σ) = f(σ).   (106.2)
For given values of S, X, r and T, f is a function of σ alone, and satisfies

lim_{σ→0+} f(σ) = max(0, S − Xe^{−rT}),
lim_{σ→∞} f(σ) = S.
Equation (106.2) will have a positive solution for the implied standard deviation σ*, if and only if the option is rationally priced so that max(0, S − Xe^{−rT}) < C < S. This is because the function f(·) is strictly monotone increasing in σ over (0, ∞),¹ and the monotonicity and continuity of f(·) guarantee there is a unique solution. The Newton–Raphson method is a commonly used method for solving nonlinear systems of equations. In this case, for equation (106.2), the method is stated as

σ_{n+1} = σ_n − [f(σ_n) − C]/f′(σ_n),   (106.3)

where σ_n is the nth estimate of σ*, and f′(σ_n) is the first derivative of f(σ) at σ = σ_n.
¹ Here we briefly prove that the function f(·) is strictly monotone increasing in σ:

f′(σ) = ∂C/∂σ = S ∂N(d1)/∂σ − Xe^{−rT} ∂N(d2)/∂σ
      = S N′(d1) ∂d1/∂σ − Xe^{−rT} N′(d2) ∂d2/∂σ.

Using N′(d2) = N′(d1)(S/X)e^{rT} and ∂d2/∂σ = ∂d1/∂σ − √T, the terms in ∂d1/∂σ cancel, leaving

f′(σ) = S√T N′(d1) = S√T (1/√2π) e^{−d1²/2}.

See Chapter 20 of Lee et al. (2013) for more details on the derivation of the partial derivatives. Since f′(σ) > 0 when S, X, r and T > 0, and σ > 0, f(·) is strictly monotone increasing in σ.
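As a sketch (not part of the original text), the Newton–Raphson iteration of equation (106.3) can be implemented in a few lines of Python, with f′(σ) taken as the option's vega S√T N′(d1); all numerical values below are hypothetical:

```python
from math import erf, exp, log, pi, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def bs_call(S, X, r, T, sigma):
    """Black-Scholes call price, equation (106.1)."""
    d1 = (log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return S * norm_cdf(d1) - X * exp(-r * T) * norm_cdf(d1 - sigma * sqrt(T))

def implied_sd_newton(C, S, X, r, T, sigma0=0.5, tol=1e-10, max_iter=100):
    """Newton-Raphson iteration of equation (106.3); f'(sigma) is the vega."""
    sigma = sigma0
    for _ in range(max_iter):
        d1 = (log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
        vega = S * sqrt(T) * norm_pdf(d1)
        diff = bs_call(S, X, r, T, sigma) - C
        if abs(diff) < tol:
            break
        sigma -= diff / vega
    return sigma

# Round trip: price an option at sigma = 0.25, then recover the ISD.
price = bs_call(100.0, 95.0, 0.03, 0.5, 0.25)
sigma_hat = implied_sd_newton(price, 100.0, 95.0, 0.03, 0.5)
```

Because f(·) is strictly monotone increasing in σ, the iteration converges to the unique root whenever the option is rationally priced.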
Mean-Value Theorem. Let f be a continuous function on the closed interval [a, b] that is differentiable on the open interval (a, b), where a < b. Then there exists some c ∈ (a, b) such that

f′(c) = [f(b) − f(a)]/(b − a).   (106.4)

In this case, the mean-value theorem is stated as

f′(λσ_n + (1 − λ)σ*) = [f(σ_n) − f(σ*)]/(σ_n − σ*) = [f(σ_n) − C]/(σ_n − σ*),  for some λ ∈ (0, 1).   (106.5)

Combining the above equation and equation (106.3) in the main text, we easily get

|σ_{n+1} − σ*|/|σ_n − σ*| = |1 − [f(σ_n) − C]/[f′(σ_n)(σ_n − σ*)]| = |1 − f′(λσ* + (1 − λ)σ_n)/f′(σ_n)|.   (106.6)

This motivates choosing σ_1 as the σ that maximizes f′(σ), so that σ_2 will be closer to σ* than σ_1. From footnote 1, we know that maximizing f′(σ) amounts to maximizing N′(d1), where N′(·) is the standard normal density function, N′(d1) = (1/√2π) exp(−d1²/2). For simplicity of presentation, we denote N′(d1) = g(d1). The first-order condition for maximizing N′(d1), i.e., g(d1), is ∂g(d1)/∂σ = 0. We then have

∂g(d1)/∂σ = (1/√2π) e^{−d1²/2} · (−d1) · ∂d1/∂σ
          = (1/√2π) e^{−d1²/2} · (−d1) · [−(ln(S/X) + rT)/(σ²√T) + √T/2]
          = (1/√2π) e^{−d1²/2} · d1 · [(ln(S/X) + rT)/(σ²√T) − √T/2]
          = (1/√2π) e^{−d1²/2} d1 d2/σ
          = g(d1) d1 d2/σ = 0.

Therefore, the first-order condition is simplified to d1 d2 = 0. This happens either if d1 = 0, in which case σ² = −2[ln(S/X) + rT]/T, or if d2 = 0, in which case σ² = 2[ln(S/X) + rT]/T. Now we check the second-order conditions under both cases:

∂²g(d1)/∂σ² = ∂/∂σ [g(d1) d1 d2/σ]
            = g′(d1)(∂d1/∂σ)(d1 d2/σ) + g(d1) ∂/∂σ(d1 d2/σ)
            = g(d1)[d1²d2² − d1² − d2² − d1 d2]/σ².

The first-order conditions give that either d1 = 0 or d2 = 0. When d1 = 0, ∂²g(d1)/∂σ² = −g(d1) d2²/σ² < 0. Similarly, when d2 = 0, ∂²g(d1)/∂σ² = −g(d1) d1²/σ² < 0.
Thus ∂²g(d1)/∂σ² < 0 holds under both cases; therefore, g(d1) and f′(σ) are simultaneously maximized. From the above discussion, we know that the starting point σ_1 should be chosen by maximizing the partial derivative of the call option with respect to volatility, f′(σ), as given in equation (106.7):

σ_1 = [ |ln(S/X) + rT| · (2/T) ]^{1/2}.   (106.7)

Manaster and Koehler (1982) claimed that by starting with the above σ_1, the implied variance estimate converges monotonically and quadratically.

Brenner and Subrahmanyam (1988) applied a Taylor series expansion to the cumulative normal function at zero, up to the first order, in the Black–Scholes option pricing model. For at-the-money options, they set the underlying asset price S equal to the present value of the exercise price Xe^{−rT}, i.e., S = Xe^{−rT}; then d1 and d2 in equation (106.1) are

d1 = σ√T/2,   d2 = −σ√T/2.   (106.8)

The Taylor series expansion is applied to the cumulative normal function at zero, ignoring all remaining terms beyond d1:

N(d1) = N(0) + N′(0)d1 + · · · = 1/2 + d1/√2π + o(d1).   (106.9)
Therefore, we have

N(d1) ≈ 1/2 + d1/√2π = 1/2 + σ√T/(2√2π),   (106.10)
N(d2) ≈ 1 − N(d1) = 1/2 − σ√T/(2√2π).   (106.11)

Substituting equations (106.10) and (106.11) into the call option pricing equation demonstrated in equation (106.1), we get

C = Sσ√T/√2π.   (106.12)

The implied standard deviation can then be solved from equation (106.13), shown below:

σ = C√2π/(S√T).   (106.13)

Note that Brenner and Subrahmanyam's method can only be used to estimate the implied standard deviation from at-the-money, or at least not too far in- or out-of-the-money, options.

Lai et al. (1992) derived a closed-form solution for the ISD in terms of the deltas ∂C/∂S and ∂C/∂X and other observable variables. From equation (106.1), ceteris paribus, the effects of a change in the stock price S and the exercise price X on the call price are determined by Smith Jr (1976) as follows, respectively:

∂C/∂S = N(d1),   (106.14)²

² The derivation of ∂C/∂S is as follows:

∂C/∂S = N(d1) + S ∂N(d1)/∂S − Xe^{−rT} ∂N(d2)/∂S
      = N(d1) + S N′(d1) ∂d1/∂S − Xe^{−rT} N′(d2) ∂d2/∂S
      = N(d1) + S (1/√2π)e^{−d1²/2} · 1/(Sσ√T) − Xe^{−rT} (1/√2π)e^{−d1²/2} (S/X)e^{rT} · 1/(Sσ√T)
      = N(d1) + (1/(σ√2πT)) e^{−d1²/2} − (1/(σ√2πT)) e^{−d1²/2}
      = N(d1).
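As a quick numerical check of the Brenner–Subrahmanyam formula (106.13), the sketch below builds an exact at-the-money Black–Scholes price (hypothetical values, with the strike chosen so that S = Xe^{−rT}) and recovers the volatility from it:

```python
from math import erf, exp, sqrt, pi

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_atm_isd(C, S, T):
    """Brenner-Subrahmanyam approximation, equation (106.13)."""
    return C * sqrt(2.0 * pi) / (S * sqrt(T))

# Hypothetical at-the-money example: sigma = 0.2, T = 1, S = X e^{-rT}.
S, T, sigma, r = 100.0, 1.0, 0.2, 0.05
X = S * exp(r * T)                            # strike chosen so S = X e^{-rT}
d1 = 0.5 * sigma * sqrt(T)                    # equation (106.8)
C_exact = S * (norm_cdf(d1) - norm_cdf(-d1))  # exact Black-Scholes ATM price
sigma_hat = bs_atm_isd(C_exact, S, T)
```

The recovered value differs from the true 0.2 only by the first-order truncation error of the expansion, which is small for realistic volatilities.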
∂C/∂X = −e^{−rT} N(d2).   (106.15)³

Equations (106.14) and (106.15) can be rearranged as equations (106.16) and (106.17), respectively:

d1 = N^{−1}(∂C/∂S),   (106.16)
d2 = d1 − σ√T = N^{−1}(−e^{rT} ∂C/∂X).   (106.17)

Combining equations (106.16) and (106.17) yields

σ = [N^{−1}(∂C/∂S) − N^{−1}(−e^{rT} ∂C/∂X)]/√T,   (106.18)

where N^{−1}(·) is the inverse cumulative normal distribution function. Equation (106.18) shows that the ISD calculation depends on two partial derivatives of the call option with respect to the stock price and exercise

Note that

N′(d2) = (1/√2π) e^{−(d1−σ√T)²/2} = (1/√2π) e^{−d1²/2} · e^{d1σ√T} · e^{−σ²T/2}
       = (1/√2π) e^{−d1²/2} · e^{ln(S/X)+(r+σ²/2)T} · e^{−σ²T/2}
       = (1/√2π) e^{−d1²/2} · (S/X) e^{rT}.

See Chapter 20 of Lee et al. (2013) for details if interested.

³ The derivation of ∂C/∂X is as follows:

∂C/∂X = S ∂N(d1)/∂X − e^{−rT} N(d2) − Xe^{−rT} ∂N(d2)/∂X
      = S N′(d1) ∂d1/∂X − e^{−rT} N(d2) − Xe^{−rT} N′(d2) ∂d2/∂X
      = −(S/(Xσ√2πT)) e^{−d1²/2} − e^{−rT} N(d2) + Xe^{−rT} (1/√2π)e^{−d1²/2} (S/X)e^{rT} · 1/(Xσ√T)
      = −(S/(Xσ√2πT)) e^{−d1²/2} − e^{−rT} N(d2) + (S/(Xσ√2πT)) e^{−d1²/2}
      = −e^{−rT} N(d2).

See Chapter 20 of Lee et al. (2013) for details if interested.
July 6, 2020
16:3
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
Implied Variance Estimates for Black–Scholes and CEV OPM
b3568-v4-ch106
3711
price, i.e., ∂C/∂S and ∂C/∂X, and two other observable variables: the time to maturity T and the risk-free rate r. Note that the implied volatility σ should not be negative; therefore, a negative right-hand side of equation (106.18) is not feasible.

Lai et al. (1992) argued that although the Black–Scholes option pricing model is a function of five variables, according to Merton (1973) the BS model exhibits homogeneity of degree one in the stock price and exercise price, which is shown in equation (106.19):

C = (∂C/∂S)S + (∂C/∂X)X = β_S S + β_X X,   (106.19)

where β_S = ∂C/∂S and β_X = ∂C/∂X. The two partial derivatives can be estimated by running the following linear multiple regression:

C_it = α + β_S S_t + β_X e^{−rT} X_it + ε_it.   (106.20)

Substituting the least squares estimators β̂_S and β̂_X from equation (106.20) into equation (106.18), the implied variance can be estimated as

σ̂ = [N^{−1}(β̂_S) − N^{−1}(−β̂_X)]/√T.   (106.21)

Instead of running a linear regression to estimate the two partial derivatives ∂C/∂S and ∂C/∂X, we can first find ∂C/∂X by simple or weighted averaging of C/X for various exercise prices (S being held constant, provided the call price quotes are simultaneous). Then the other partial derivative ∂C/∂S is obtained from equation (106.19). Lai et al. (1992) mentioned that this alternative approach would work best for index options, where there are many simultaneous quotes.

It should also be noted that, following their method, there is an alternative way to estimate the implied standard deviation using only one partial derivative, ∂C/∂S. From equation (106.19), we have

(∂C/∂X)X = C − (∂C/∂S)S  ⇒  ∂C/∂X = C/X − (S/X)(∂C/∂S).   (106.22)

Substituting equation (106.22) into equation (106.18), we will have a new closed-form solution for the ISD depending only on the delta ∂C/∂S and other
observable variables, as shown in equation (106.23):

σ = { N^{−1}(∂C/∂S) − N^{−1}[ e^{rT}((S/X)(∂C/∂S) − C/X) ] }/√T.   (106.23)
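The inverse-function estimators (106.18) and (106.23) can be sketched in Python with the standard library's inverse normal CDF. Note this is only an illustrative round trip with hypothetical values: the two partials are generated from a known volatility via (106.14) and (106.15) rather than estimated by the regression (106.20):

```python
from math import erf, exp, log, sqrt
from statistics import NormalDist

ninv = NormalDist().inv_cdf                      # N^{-1}(.)

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def isd_from_partials(dC_dS, dC_dX, r, T):
    """Equation (106.18): sigma = [N^{-1}(dC/dS) - N^{-1}(-e^{rT} dC/dX)]/sqrt(T)."""
    return (ninv(dC_dS) - ninv(-exp(r * T) * dC_dX)) / sqrt(T)

# Hypothetical inputs: build the partials from a known sigma = 0.3.
S, X, r, T, sigma = 100.0, 105.0, 0.02, 0.75, 0.3
d1 = (log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
d2 = d1 - sigma * sqrt(T)
dC_dS = norm_cdf(d1)                             # equation (106.14)
dC_dX = -exp(-r * T) * norm_cdf(d2)              # equation (106.15)
sigma_hat = isd_from_partials(dC_dS, dC_dX, r, T)
```

With exact partials the inversion is exact, because N^{−1}(N(d1)) = d1 and N^{−1}(N(d2)) = d2, so (d1 − d2)/√T = σ.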
Brenner and Subrahmanyam (1988)'s formula for estimating implied variance is simple, but limited to at-the-money, or at least not too far in- or out-of-the-money, cases. On the basis of their research, Chance (1996) developed a generalized formula that can also be implemented when options are in-the-money or out-of-the-money. Recall that the Brenner–Subrahmanyam formula for the ISD is

σ* = C*√2π/(S√T),   (106.24)

where C* is the price of the at-the-money call. We assume this call has an exercise price X*. Chance (1996) proposed a model that starts with equation (106.24) and adds terms to reflect both the moneyness and the sensitivity of the standard deviation. The option with the unknown implied standard deviation is priced at C and has an exercise price of X. By definition, the difference between the at-the-money call and the call with the unknown ISD is given as

ΔC* = C − C*.   (106.25)

He argued that the difference in the prices of the two calls comes from: (1) the difference in exercise prices, i.e., ΔX* = X − X*; and (2) the difference in standard deviations, i.e., Δσ* = σ − σ*. He applied a second-order Taylor series expansion to ΔC*, which yields

ΔC* = (∂C*/∂X*)(ΔX*) + (1/2)(∂²C*/∂X*²)(ΔX*)² + (∂C*/∂σ*)(Δσ*)
    + (1/2)(∂²C*/∂σ*²)(Δσ*)² + (∂²C*/∂σ*∂X*)(Δσ* ΔX*).   (106.26)
Since the partial derivatives which appear in equation (106.26) are for at-the-money calls, their formulas can be simplified using the following relationships:

S = X* e^{−rT},   (106.27)
d1* = σ*√T/2,   (106.28)
d2* = −σ*√T/2.   (106.29)
Therefore, we have the following important equations for the partial derivatives, respectively:

∂C*/∂X* = −e^{−rT} N(d2*),   (106.30)⁴

∂²C*/∂X*² = −e^{−rT} (∂N(d2*)/∂d2*)(∂d2*/∂X*) = e^{−rT} e^{−d2*²/2}/(X*σ*√2πT).   (106.31)

For an at-the-money call, equation (106.31) is given as in equation (106.32):

∂²C*/∂X*² = [e^{−rT}/(X*σ*√2πT)] e^{−σ*²T/8}.   (106.32)

∂C*/∂σ* = S ∂N(d1*)/∂σ* − X*e^{−rT} ∂N(d2*)/∂σ* = S√T e^{−d1*²/2}/√2π.   (106.33)⁵

Given the call is at the money, equation (106.33) is given as in equation (106.34):

∂C*/∂σ* = [X*e^{−rT}√T/√2π] e^{−σ*²T/8}.   (106.34)

∂²C*/∂σ*² = (S√T/√2π) e^{−d1*²/2} · (−d1*)(∂d1*/∂σ*) = (S√T/√2π) e^{−d1*²/2} (d1* d2*/σ*),   (106.35)

using ∂d1/∂σ = −d2/σ. For an at-the-money call, equation (106.35) becomes (note that d1*d2* = −σ*²T/4 < 0):

∂²C*/∂σ*² = −[X*e^{−rT} T^{3/2} σ*/(4√2π)] e^{−σ*²T/8}.   (106.36)

⁴ The derivation of equation (106.30) has been shown in footnote 3.
⁵ The derivation of equation (106.33) has been shown in footnote 1.
∂²C*/∂σ*∂X* = −e^{−rT} (∂N(d2*)/∂d2*)(∂d2*/∂σ*)
            = −e^{−rT} (1/√2π) e^{−d2*²/2} · (−d1*/σ*)
            = S e^{−d1*²/2} d1*/(X*σ*√2π),   (106.37)

using ∂d2/∂σ = −d1/σ and N′(d2) = N′(d1)(S/X)e^{rT}. Given the call is at the money, equation (106.37) becomes

∂²C*/∂σ*∂X* = [e^{−rT}√T/(2√2π)] e^{−σ*²T/8}.   (106.38)
Equation (106.25) can be restated as

C* − C + ΔC* = 0.   (106.39)

Substituting equation (106.26) into equation (106.39), equation (106.39) can be viewed as a quadratic equation in Δσ*, written as

a(Δσ*)² + b(Δσ*) + c = 0,   (106.40)

where

a = (1/2) ∂²C*/∂σ*²,
b = ∂C*/∂σ* + (∂²C*/∂σ*∂X*)(ΔX*),
c = C* − C + (∂C*/∂X*)(ΔX*) + (1/2)(∂²C*/∂X*²)(ΔX*)².

Therefore, the solution of equation (106.40) should be

Δσ* = [−b ± √(b² − 4ac)]/(2a).   (106.41)

Experiments in Chance (1996) suggest that the positive root for Δσ* of equation (106.40) gives the correct solution for the implied variance when it is added to the value of σ* from the Brenner–Subrahmanyam formula. One thing that needs to be noted is that in order to apply Chance's formula to compute the ISD, the standard deviation and the option price in the at-the-money case must be given. In other words, if the underlying asset price deviates from the present value of the exercise price and the at-the-money call option price is not available (or unobservable) in the market, then Chance's formula for the ISD may not apply, just as in the case of the Brenner–Subrahmanyam formula.
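A sketch of Chance's procedure in Python, with hypothetical values: all prices are generated from an assumed "true" volatility so that the recovery can be checked (in practice C* and C would be observed market prices). The root of smaller magnitude is selected, which corresponds to the small correction Δσ* ≈ −c/b:

```python
from math import erf, exp, log, pi, sqrt

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def npdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def bs_call(S, X, r, T, sig):
    d1 = (log(S / X) + (r + 0.5 * sig * sig) * T) / (sig * sqrt(T))
    return S * N(d1) - X * exp(-r * T) * N(d1 - sig * sqrt(T))

def chance_isd(C, X, C_star, S, r, T):
    """Chance (1996): ISD via the quadratic (106.40)-(106.41)."""
    X_star = S * exp(r * T)                          # ATM strike: S = X* e^{-rT}
    sig_s = C_star * sqrt(2.0 * pi) / (S * sqrt(T))  # sigma* via (106.24)
    d1s, d2s = 0.5 * sig_s * sqrt(T), -0.5 * sig_s * sqrt(T)
    dX = X - X_star
    dC_dX = -exp(-r * T) * N(d2s)                            # (106.30)
    d2C_dX2 = exp(-r * T) * npdf(d2s) / (X_star * sig_s * sqrt(T))  # (106.32)
    vega = S * sqrt(T) * npdf(d1s)                           # (106.34)
    vomma = vega * d1s * d2s / sig_s                         # (106.36), negative ATM
    cross = S * npdf(d1s) * d1s / (X_star * sig_s)           # (106.38)
    a = 0.5 * vomma
    b = vega + cross * dX
    c = C_star - C + dC_dX * dX + 0.5 * d2C_dX2 * dX * dX
    disc = sqrt(b * b - 4.0 * a * c)
    roots = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
    return sig_s + min(roots, key=abs)

# Hypothetical market: true sigma = 0.25; generate C* and C, then recover.
S, r, T, sigma = 100.0, 0.05, 0.25, 0.25
C_star = bs_call(S, S * exp(r * T), r, T, sigma)   # "observed" ATM call
C = bs_call(S, 105.0, r, T, sigma)                 # call with unknown ISD
sigma_hat = chance_isd(C, 105.0, C_star, S, r, T)
```

The residual error reflects the second-order truncation of expansion (106.26), and grows as the strike moves away from the at-the-money level.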
To allow for the deviation between the underlying asset price and the present value of the exercise price, Corrado and Miller (1996) expanded the cumulative normal function at zero to the first-order term in the Black–Scholes OPM to derive a quadratic equation for the ISD. Their approach followed the method employed by Brenner and Subrahmanyam, and made use of the expansion of the normal distribution function as stated in equation (106.42):

N(z) = 1/2 + (1/√2π)(z − z³/6 + · · ·).   (106.42)

Substituting equation (106.42) into the normal probabilities N(d1) and N(d2) in the classic Black–Scholes model as equation (106.1) states, equation (106.43) holds when cubic and higher-order terms are ignored:

C = S[1/2 + d1/√2π] − Xe^{−rT}[1/2 + (d1 − σ√T)/√2π].   (106.43)

K is defined as the present value of the strike price X, i.e., K = Xe^{−rT}. Recall that the expressions for d1 and d2 are

d1 = [ln(S/X) + (r + σ²/2)T]/(σ√T) = [ln(S/K) + σ²T/2]/(σ√T),
d2 = d1 − σ√T.

Equation (106.43) can be restated as

C = S[1/2 + (ln(S/K) + σ²T/2)/(σ√2πT)] − K[1/2 + (ln(S/K) − σ²T/2)/(σ√2πT)].   (106.44)

Equation (106.44) can be formulated as a quadratic equation in σ√T, as shown in equation (106.45):

σ²T(S + K) − σ√T[2√2πC − √2π(S − K)] + 2(S − K)ln(S/K) = 0.   (106.45)

Corrado and Miller (1996) proved that only the largest root of equation (106.45) reduces to the original Brenner–Subrahmanyam formula, which is shown in equation (106.46):

σ√T = √2π (2C − S + K)/(2(S + K)) + √[ (π/2)((2C − S + K)/(S + K))² − 2(S − K)ln(S/K)/(S + K) ].   (106.46)
After solving the quadratic equation in σ√T, they improved the accuracy of the approximation by minimizing its concavity.⁶ Therefore, their final formula to compute the implied standard deviation was given as

σ = [√(2π/T)/(S + K)] · [ C − (S − K)/2 + √( (C − (S − K)/2)² − (S − K)²/π ) ].   (106.47)

Li (2005) also followed Brenner and Subrahmanyam, expanded the expression to the third-order term, and solved for the ISD from a cubic equation. The Taylor expansion stated in equation (106.42) was used in Li's paper. He retained the cubic order and substituted equation (106.42) into the normal probabilities in the Black–Scholes model as stated in equation (106.1), which yielded:

C = S[1/2 + d1/√2π − d1³/(6√2π)] − Xe^{−rT}[1/2 + d2/√2π − d2³/(6√2π)].   (106.48)

For at-the-money calls, d1 = σ√T/2, d2 = −σ√T/2, and S = Xe^{−rT}. Defining ξ = σ√T/2, the following equation holds for at-the-money calls:

√2π C/S ≈ 2ξ − ξ³/3.   (106.49)

Equation (106.49) can be solved by using the cubic formula.⁷
⁶ This approach involved several steps. First, the logarithmic approximation ln(S/K) ≈ 2(S − K)/(S + K) was used. Second, they replaced the resulting value "4" with a parameter α to restate equation (106.46) as

σ√T = √2π (2C − S + K)/(2(S + K)) + √[ (π/2)((2C − S + K)/(S + K))² − α((S − K)/(S + K))² ].

They chose a value for α such that the above equation is approximately linear in the stock price when the option is near the at-the-money case. Setting the second derivative of the right-hand side of the above equation with respect to the stock price equal to zero, they found the realistic estimated value for α was close to 2.

⁷ The general cubic equation has the form ax³ + bx² + cx + d = 0, with a ≠ 0. If the cubic equation is in the form t³ + pt + q = 0, it is called a depressed cubic equation. Note that any general cubic equation can be reduced to the depressed form by dividing the general equation by a and substituting x = t − b/(3a). For a depressed cubic equation t³ + pt + q = 0, the real roots (when three exist) are

t_k = 2√(−p/3) cos[ (1/3) arccos( (3q/(2p))√(−3/p) ) − 2πk/3 ],   k = 0, 1, 2.
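The Corrado–Miller estimator (106.47) is easy to check numerically. The sketch below (hypothetical values, not from the original text) prices an option at a known volatility and recovers it; a guard is added because the inner square-root term can turn slightly negative deep in or out of the money:

```python
from math import erf, exp, log, pi, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, X, r, T, sig):
    d1 = (log(S / X) + (r + 0.5 * sig * sig) * T) / (sig * sqrt(T))
    return S * norm_cdf(d1) - X * exp(-r * T) * norm_cdf(d1 - sig * sqrt(T))

def corrado_miller_isd(C, S, X, r, T):
    """Corrado-Miller closed-form estimator, equation (106.47)."""
    K = X * exp(-r * T)                  # present value of the strike
    m = C - (S - K) / 2.0
    inner = m * m - (S - K) ** 2 / pi
    inner = max(inner, 0.0)              # guard for far-from-the-money inputs
    return sqrt(2.0 * pi / T) / (S + K) * (m + sqrt(inner))

# Hypothetical example: true sigma = 0.3, slightly in the money.
S, X, r, T, sigma = 100.0, 98.0, 0.04, 0.5, 0.3
C = bs_call(S, X, r, T, sigma)
sigma_hat = corrado_miller_isd(C, S, X, r, T)
```

Near the money the estimate agrees with the true volatility to roughly three decimal places, consistent with the accuracy Corrado and Miller report.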
He went through some tedious derivations and simplifications, and finally obtained the formula to compute the implied standard deviation:

σ = (1/√T) [ 2√2 z − √( 8z² − 6α/(√2 z) ) ],   (106.50)

where α = √2π C/S and z = cos[(1/3) cos⁻¹(3α/√32)].

Since Li included the third-order term of the Taylor expansion of the cumulative normal distribution in his derivation, Li claimed that his formula for the ISD provides a consistently more accurate estimate of the true ISD than previous studies.

To sum up, the existing research mainly follows two different routines to estimate implied volatility. Numerical search methods try to find an approximate solution for implied volatility which makes the theoretical option value equal to, or very close to, the market-observed option price. These methods do not provide a closed-form solution for the estimated implied volatility, and need iterative algorithms to approximate the ISD. Closed-form derivation approaches make use of either Taylor expansions or inverse functions to calculate analytical solutions for the ISD. First-order, second-order, and third-order Taylor expansions of the cumulative normal distribution function were applied in previous studies to estimate the implied volatility. There were also studies using the inverse function of the normal distribution to derive closed-form solutions for the ISD.

An important point to be noted is that some methods rely upon the existence of "at-the-money" options, or at least not too far in- or out-of-the-money options. These approaches include Brenner and Subrahmanyam (1988), Chance (1996), and Li (2005). Table 106.1 classifies the existing research on estimating implied volatility accordingly.

106.3 MATLAB Approach to Estimate Implied Variance

Usually, implied variance can be obtained from a call or put option model by an optimization technique. For each individual option, the implied variance can be obtained by first choosing an initial estimate σ_0; equation (106.51) is then used to iterate towards the correct value:

C^M_{j,t} − C^T_{j,t}(σ_0) = (∂C^T_{j,t}/∂σ)|_{σ_0} (σ − σ_0) + e_{j,t},   (106.51)

where
Table 106.1: Classification of the ISD estimation methods.

Numerical search:
— Trial and error: Latane and Rendleman (1976)
— Choose an initial point, iterative algorithm: Manaster and Koehler (1982)

Closed-form derivation:
— Taylor series expansion:
  First-order expansion: Brenner and Subrahmanyam (1988); Corrado and Miller (1996)
  Second-order expansion: Chance (1996)
  Third-order expansion: Li (2005)
— Inverse function:
  Estimate parameters by regression: Lai et al. (1992)
C^M_{j,t} = market price of call option j at time t;
σ = true or actual implied standard deviation;
σ_0 = initial estimate of the implied standard deviation;
C^T_{j,t}(σ_0) = theoretical price of call option j at time t given σ = σ_0;
(∂C^T_{j,t}/∂σ)|_{σ_0} = partial derivative of the call option with respect to the standard deviation σ at σ = σ_0;
e_{j,t} = error term.

The partial derivative of the call option with respect to the standard deviation, ∂C^T_{j,t}/∂σ, from the Black–Scholes model is

∂C^T_{t,j}/∂σ = Xe^{−rτ}√τ N′(d2) = S√τ N′(d1) = S(√τ/√2π) e^{−d1²/2}.   (106.52)

It is also called the vega of the option. The iteration proceeds by reinitializing σ_0 to equal σ_1 at each successive stage until an acceptable tolerance level is attained. The tolerance level used is

|(σ_1 − σ_0)/σ_0| < 0.001.   (106.53)

The MATLAB Financial Toolbox provides a function blsimpv to search for the implied volatility. The algorithm used in the blsimpv function is Newton's method, just as in the procedure described in equation (106.51). This approach minimizes the difference between the observed market option value and the theoretical Black–Scholes value, and obtains the ISD estimate once the tolerance level is attained.
The complete command for the function blsimpv is:

Volatility = blsimpv(Price, Strike, Rate, Time, Value, Limit, Yield, Tolerance, Class)

and the command with default settings is:

Volatility = blsimpv(Price, Strike, Rate, Time, Value)

There are nine inputs in total, of which the last four are optional. Detailed explanations of all the inputs are as follows:

Inputs:
Price — Current price of the underlying asset.
Strike — Strike (i.e., exercise) price of the option.
Rate — Annualized continuously compounded risk-free rate of return over the life of the option, expressed as a positive decimal number.
Time — Time to expiration of the option, expressed in years.
Value — Price (i.e., value) of a European option from which the implied volatility of the underlying asset is derived.

Optional Inputs:
Limit — Positive scalar representing the upper bound of the implied volatility search interval. If empty or missing, the default is 10, or 1000% per annum.
Yield — Annualized continuously compounded yield of the underlying asset over the life of the option, expressed as a decimal number. For example, this could represent the dividend yield or the foreign risk-free interest rate for options written on stock indices and currencies, respectively. If empty or missing, the default is zero.
Tolerance — Positive scalar implied volatility termination tolerance. If empty or missing, the default is 1e-6.
Class — Option class (i.e., whether a call or put) indicating the option type from which the implied volatility is derived. This may be either a logical indicator or a cell array of characters. To specify call options, set Class = true or Class = {'Call'}; to specify put options, set Class = false or Class = {'Put'}. If empty or missing, the default is a call option.

Output:
Volatility — Implied volatility of the underlying asset derived from European option prices, expressed as a decimal number. If no solution can be found, a NaN (i.e., Not-a-Number) is returned.
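For readers without MATLAB, a rough Python analogue of this search can be sketched as follows. The bisection search is our own assumption chosen for robustness (blsimpv itself uses Newton's method); the parameter names mirror the inputs listed above, and None plays the role of MATLAB's NaN:

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(price, strike, rate, time, vol, yield_=0.0):
    """Black-Scholes call with a continuous dividend yield."""
    d1 = (log(price / strike) + (rate - yield_ + 0.5 * vol * vol) * time) / (vol * sqrt(time))
    d2 = d1 - vol * sqrt(time)
    return price * exp(-yield_ * time) * norm_cdf(d1) - strike * exp(-rate * time) * norm_cdf(d2)

def blsimpv_py(price, strike, rate, time, value, limit=10.0, yield_=0.0, tol=1e-6):
    """Bisection analogue of blsimpv for a call; returns None when no
    volatility in (0, limit] reproduces the observed value."""
    lo, hi = 1e-12, limit
    if not (bs_call(price, strike, rate, time, lo, yield_) <= value
            <= bs_call(price, strike, rate, time, hi, yield_)):
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(price, strike, rate, time, mid, yield_) < value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical inputs matching blsimpv(90, 95, 0.03, 0.25, 5, [], 0.05):
vol = blsimpv_py(90.0, 95.0, 0.03, 0.25, 5.0, yield_=0.05)
```

Because the call price is strictly increasing in volatility, bisection on the interval (0, Limit] is guaranteed to converge whenever a solution exists.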
Example: Consider a European call option trading at $5 with an exercise price of $95 and 3 months until expiration. Assume the underlying stock pays 5% annual
dividends, which is trading at $90 at this moment, and the risk-free rate is 3% per annum. Under these conditions, the command used in MATLAB will be either of the following two:

Volatility = blsimpv(90, 95, 0.03, 0.25, 5, [], 0.05, [], {'Call'})
Volatility = blsimpv(90, 95, 0.03, 0.25, 5, [], 0.05, [], true)

Note that this function provided by MATLAB's toolbox can only estimate implied volatility from a single option. For more than one option, users need to write their own programs to estimate implied variances.

106.4 Approximation Approach to Estimate Implied Variance

In this section, we discuss the alternative method proposed by Ang et al. (2009) to use the call option model and the put option model to estimate implied volatility. Our approximation approach can also estimate implied volatility from two options with the same maturity, but different exercise prices and values.

Recall the Black–Scholes call option pricing model (with continuous dividends):

C = S′N(d1) − KN(d2),   (106.54)

where

d1 = [ln(S/X) + (r + σ²/2 − q)T]/(σ√T) = ln(S′/K)/(σ√T) + σ√T/2;
d2 = d1 − σ√T;
C = call price;
S = stock price;
q = annual dividend yield;
S′ = Se^{−qT};
X = exercise price;
r = risk-free interest rate;
K = Xe^{−rT}, the present value of the exercise price;
T = time to maturity of the option in years;
N(·) = standard normal distribution function;
σ = stock volatility.

We derive a formula to estimate the ISD by applying the Taylor series expansion to a single call option. We show that, following the method proposed by Ang et al. (2009) and Ang et al. (2013), the formula for the ISD derived by Corrado and Miller (1996) can be improved further without any replacements.
Recall the Taylor series expansion for approximating a function, from calculus (Lee et al., 2009, Appendix 5B), which can be written as follows:

F_n(x) = F(a) + F′(a)(x − a) + (F″(a)/2!)(x − a)² + ··· + (F⁽ⁿ⁾(a)/n!)(x − a)ⁿ,  (106.55)

where F_n(x) is the nth-order Taylor approximation of F(x); F′(a) is the first derivative of the function at a; F⁽ⁿ⁾(a) is the nth derivative; n! is the factorial of n, i.e., n! = n(n − 1)···(2)(1); and a is the value near which we approximate F(x).

Let L = ln(S′/K)/(σ√T). Here, we apply the Taylor series expansion to both cumulative normal terms in the Black–Scholes formula at L. Then we have

N(L + σ√T/2) = N(L) + N′(L)(L + σ√T/2 − L) + N″(L)(L + σ√T/2 − L)²/2! + ···
             = N(L) + N′(L)σ√T/2 + N″(L)(σ√T/2)²/2 + e1
             = N(L) + N′(L)(σ√T/2)[1 − ln(S′/K)/4] + e1  (106.56)

and

N(L − σ√T/2) = N(L) + N′(L)(L − σ√T/2 − L) + N″(L)(L − σ√T/2 − L)²/2! + ···
             = N(L) − N′(L)σ√T/2 + N″(L)(σ√T/2)²/2 + e2
             = N(L) − N′(L)(σ√T/2)[1 + ln(S′/K)/4] + e2,  (106.57)

where e1 and e2 are the remainder terms of Taylor's formula. The last line of each expansion follows from the fact that N″(x) = −N′(x)x.
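The identity N″(x) = −N′(x)x holds because N′ is the standard normal density φ(x) = e^(−x²/2)/√(2π). A quick finite-difference check in Python (the evaluation point x = 0.7 is arbitrary):

```python
import math

def phi(x):
    """Standard normal density, i.e., N'(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Central finite difference for N''(x) = d/dx N'(x)
x, h = 0.7, 1e-5
second_deriv = (phi(x + h) - phi(x - h)) / (2.0 * h)
print(second_deriv, -x * phi(x))  # the two values agree closely
```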
Given N(0) = 1/2, N′(0) = 1/√(2π), N‴(0) = −N′(0), and N″(0) = N⁗(0) = 0, we expand N(L) and N′(L) at 0, respectively:

N(L) = N(0) + N′(0)L + N″(0)L²/2 + e3 = 1/2 + L/√(2π) + e3,  (106.58)

N′(L) = N′(0) + N″(0)L + N‴(0)L²/2 + e4 = 1/√(2π) − L²/(2√(2π)) + e4.  (106.59)
Substituting equations (106.56)–(106.59) into equation (106.54) and dropping all remainder terms, equation (106.54) becomes

C = (S′ − K)/2 + (ln(S′/K)/(σ√(2πT)))[(S′ − K)(1 + [ln(S′/K)/4]²) − ln(S′/K)(S′ + K)/4]
  + (σ√T/(2√(2π)))[S′ + K − ln(S′/K)(S′ − K)/4].  (106.60)

Equation (106.60) is a quadratic equation in σ√T and can be rewritten as

σ²T[8(S′ + K) − 2(S′ − K)ln(S′/K)] − 8σ√T·√(2π)(2C − S′ + K)
  + ln(S′/K)[(S′ − K)(16 + (ln(S′/K))²) − 4(S′ + K)ln(S′/K)] = 0.  (106.61)

Solving equation (106.61) for σ√T yields

σ√T = (−b ± √(b² − 4ac))/(2a),  (106.62)

where
a = 8(S′ + K) − 2(S′ − K)ln(S′/K),
b = −8√(2π)(2C − S′ + K),
c = ln(S′/K)[(S′ − K)(16 + (ln(S′/K))²) − 4(S′ + K)ln(S′/K)].

A merit of equation (106.62) is that it circumvents the ad hoc substitution present in Corrado and Miller (1996) and improves the accuracy of the ISD estimate. Other methods to calculate the implied volatility can be found in Lai et al. (1992). According to Lee et al. (2013), put–call parity can be defined as in equation (106.63), and we can calculate implied volatility, stock price per share, and
exercise price per share in terms of the put option model:

P = C + Xe^(−rT) − Se^(−qT).  (106.63)

Letting Xe^(−rT) = K and Se^(−qT) = S′, we have the following equation:

P = C + K − S′.  (106.64)
Substituting equation (106.60) into equation (106.64), we obtain the following equation:

P = (K − S′)/2 + (ln(S′/K)/(σ√(2πT)))[(S′ − K)(1 + [ln(S′/K)/4]²) − ln(S′/K)(S′ + K)/4]
  + (σ√T/(2√(2π)))[S′ + K − ln(S′/K)(S′ − K)/4].  (106.65)
Equation (106.65) is also a quadratic equation in σ√T and can be rewritten as

σ²T[8(S′ + K) − 2(S′ − K)ln(S′/K)] − 8σ√T·√(2π)(2P − K + S′)
  + ln(S′/K)[(S′ − K)(16 + (ln(S′/K))²) − 4(S′ + K)ln(S′/K)] = 0.  (106.66)

Solving equation (106.66) for σ√T yields

σ√T = (−b ± √(b² − 4ac))/(2a),  (106.67)

where a = 8(S′ + K) − 2(S′ − K)ln(S′/K), b = −8√(2π)(2P − K + S′), and c = ln(S′/K)[(S′ − K)(16 + (ln(S′/K))²) − 4(S′ + K)ln(S′/K)].

We rearrange equation (106.61) in terms of S′ and obtain equation (106.68):

[8σ²T + 8σ√(2πT) + 2σ²T ln K − 16 ln K − 4(ln K)² − (ln K)³]S′
  + [2σ²TK − 16K + 8K ln K − 3K(ln K)²] ln S′
  + [16 − 2σ²T + 8 ln K + 3(ln K)²]S′ ln S′
  + (3K ln K − 4K)(ln S′)² − (3 ln K + 4)S′(ln S′)²
  − K(ln S′)³ + S′(ln S′)³
= 16Cσ√(2πT) + 8σ√(2πT)K − 8σ²TK + 2σ²TK ln K − 16K ln K + 4K(ln K)² − K(ln K)³.  (106.68)

Equation (106.68) can be used to estimate S′ if we have the information on the other five variables. The solution for S′ can only be obtained by a trial-and-error method.
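Both routes can be sketched numerically: the closed-form estimator of equation (106.62), and a trial-and-error search for the implied S′. For the latter, rather than iterating equation (106.68) itself, the sketch below bisects directly on the option price, which is monotonically increasing in S′; this is a standalone illustration with r = q = 0 (so S′ = S and K = X), and all numerical values are hypothetical:

```python
import math
from statistics import NormalDist

N = NormalDist().cdf

def bs_call_fwd(S1, K, T, sigma):
    """Black-Scholes call on 'forward' inputs S' = S e^{-qT}, K = X e^{-rT}."""
    srt = sigma * math.sqrt(T)
    d1 = math.log(S1 / K) / srt + srt / 2.0
    return S1 * N(d1) - K * N(d1 - srt)

def isd_quadratic(C, S1, K, T):
    """Implied standard deviation, eq. (106.62): the positive root of the
    quadratic (106.61) in sigma*sqrt(T)."""
    L = math.log(S1 / K)
    a = 8.0 * (S1 + K) - 2.0 * (S1 - K) * L
    b = -8.0 * math.sqrt(2.0 * math.pi) * (2.0 * C - S1 + K)
    c = L * ((S1 - K) * (16.0 + L ** 2) - 4.0 * (S1 + K) * L)
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a) / math.sqrt(T)

def implied_S(C, K, T, sigma, lo=1e-6, hi=10000.0):
    """Trial-and-error (bisection) for the implied S' given the other inputs."""
    while hi - lo > 1e-8:
        mid = 0.5 * (lo + hi)
        if bs_call_fwd(mid, K, T, sigma) < C:
            lo = mid          # model price too low -> S' must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

C = bs_call_fwd(100.0, 102.0, 0.25, 0.20)              # "observed" call price
print(round(isd_quadratic(C, 100.0, 102.0, 0.25), 4))  # close to 0.20
print(round(implied_S(C, 102.0, 0.25, 0.20), 4))       # recovers 100.0
```

Near the money the quadratic estimator recovers the generating volatility almost exactly, which is the sense in which it improves on simpler closed-form approximations.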
Now, consider two call options, C1 and C2, with the same time to maturity and exercise prices X1 and X2. As before, q is the annual dividend yield, S is the underlying asset value, and we denote S′ = Se^(−qT). We also denote the present values of the exercise prices by K1 = X1e^(−rT) and K2 = X2e^(−rT), respectively. For C1, we apply Taylor's expansion to equation (106.54) at K2. This yields the following equation:

C1 = C2 − N(ln(S′/K2)/(σ√T) − σ√T/2)(K1 − K2) + ε1,  (106.69)

where ε1 is the remainder term of Taylor's formula. Similarly, for C2, we apply Taylor's expansion to equation (106.54) at K1, which yields the following equation:

C2 = C1 − N(ln(S′/K1)/(σ√T) − σ√T/2)(K2 − K1) + ε2,  (106.70)

where ε2 is the remainder term of Taylor's formula. Rearranging the equations, dividing both sides by (K2 − K1), and then applying the inverse of the cumulative normal function to both sides, we have the following two equations:

N⁻¹[(C1 − C2)/(K2 − K1)] = ln(S′/K1)/(σ√T) − σ√T/2 + η1,  (106.71)

N⁻¹[(C1 − C2)/(K2 − K1)] = ln(S′/K2)/(σ√T) − σ√T/2 + η2.  (106.72)

Combining the two equations above, the effect of the remainder terms may be partially offset. Then we obtain the quadratic equation in σ√T:

σ²T + 2N⁻¹[(C1 − C2)/(K2 − K1)](σ√T) − ln(S′/K1) − ln(S′/K2) = 0.  (106.73)

Then we can solve for the implied volatility as

σ√T = −N⁻¹((C1 − C2)/(K2 − K1)) ± √([N⁻¹((C1 − C2)/(K2 − K1))]² + ln(S′²/(K1K2))).  (106.74)

Similarly, we consider two put options, P1 and P2, with the same time to maturity and exercise prices X1 and X2. q is the annual dividend yield, S is the underlying asset value, and we denote S′ = Se^(−qT). We also denote the present values of the exercise prices by K1 = X1e^(−rT) and K2 = X2e^(−rT), respectively.
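As an aside before treating puts, the two-call estimator in equation (106.74) is straightforward to apply once N⁻¹ is available; statistics.NormalDist in the Python standard library supplies both N and its inverse. A sketch with illustrative prices generated from Black–Scholes (r = q = 0, so S′ = S and Ki = Xi):

```python
import math
from statistics import NormalDist

nd = NormalDist()

def bs_call_fwd(S1, K, T, sigma):
    """Black-Scholes call on 'forward' inputs S' and K."""
    srt = sigma * math.sqrt(T)
    d1 = math.log(S1 / K) / srt + srt / 2.0
    return S1 * nd.cdf(d1) - K * nd.cdf(d1 - srt)

def isd_two_calls(C1, C2, S1, K1, K2, T):
    """Implied volatility from two calls, eq. (106.74); takes the '+' root."""
    x = nd.inv_cdf((C1 - C2) / (K2 - K1))
    srt = -x + math.sqrt(x * x + math.log(S1 * S1 / (K1 * K2)))
    return srt / math.sqrt(T)

# Two calls on the same underlying (S' = 100, T = 1) priced at sigma = 0.25
C1 = bs_call_fwd(100.0, 95.0, 1.0, 0.25)
C2 = bs_call_fwd(100.0, 105.0, 1.0, 0.25)
print(round(isd_two_calls(C1, C2, 100.0, 95.0, 105.0, 1.0), 4))  # near 0.25
```

The recovered value is close to, but not exactly, the generating volatility, because the remainder terms in (106.71)–(106.72) offset only partially.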
According to the put–call parity defined in equation (106.64), we have the following equations:

P1 = C1 + K1 − S′,  (106.75)

P2 = C2 + K2 − S′.  (106.76)

If we substitute the above equations into equations (106.69) and (106.70), we have the following equations:

P1 = P2 + (K1 − K2) − N(ln(S′/K2)/(σ√T) − σ√T/2)(K1 − K2) + δ1,  (106.77)

P2 = P1 + (K2 − K1) − N(ln(S′/K1)/(σ√T) − σ√T/2)(K2 − K1) + δ2.  (106.78)

Rearranging the equations, dividing both sides by (K2 − K1), and then applying the inverse of the cumulative normal function to both sides, we have the following two equations:

N⁻¹((P1 − P2)/(K2 − K1) + 1) = ln(S′/K2)/(σ√T) − σ√T/2 + γ1,  (106.79)

N⁻¹((P1 − P2)/(K2 − K1) + 1) = ln(S′/K1)/(σ√T) − σ√T/2 + γ2.  (106.80)

By combining the two equations above, the effect of the remainder terms may be partially offset. Then we obtain the quadratic equation in σ√T:

σ²T + 2N⁻¹[(P1 − P2)/(K2 − K1) + 1](σ√T) − ln(S′/K1) − ln(S′/K2) = 0.  (106.81)

Solving the equation for σ√T, we obtain

σ√T = −N⁻¹((P1 − P2)/(K2 − K1) + 1) ± √([N⁻¹((P1 − P2)/(K2 − K1) + 1)]² + ln(S′²/(K1K2))).  (106.82)

106.5 Some Empirical Results

106.5.1 Cases from US — individual stock options

We select 10 constituent companies of the S&P 500 as US examples to compare the implied volatility estimation methods. The selected companies have relatively large market values and come from different industries. Tables 106.2–106.4 show the details of our sample.
Table 106.2: Details of sample companies.

Security ID  Ticker  Company name                                 SIC code  Industry
101594       AAPL    Apple Inc.                                   3571      Electronic Computers
104533       XOM     Exxon Mobil Corporation                      2911      Petroleum Refining
121812       GOOGL   Google Inc.                                  7370      Computer Program
107525       MSFT    Microsoft Corporation                        7372      Prepackaged Software
106566       JNJ     Johnson & Johnson                            2834      Pharmaceutical Preparations
111953       WFC     Wells Fargo & Company                        6021      National Commercial Banks
105169       GE      General Electric Company                     3600      Electronic Equipment
111860       WMT     Wal-Mart Stores Inc.                         5331      Variety Stores
102968       CVX     Chevron Corporation                          2911      Petroleum Refining
109224       PG      The Procter & Gamble Company                 2840      Cosmetics
102936       JPM     JPMorgan Chase & Co.                         6021      National Commercial Banks
111668       VZ      Verizon Communications Inc.                  4813      Radiotelephone Communications
108948       PFE     Pfizer Inc.                                  2834      Pharmaceutical Preparations
106276       IBM     International Business Machines Corporation  3570      Computer/Office
109775       T       AT&T, Inc.                                   4813      Radiotelephone Communications

Note: The sample spans May 2018 to June 2018. The 10-year Treasury rate is used as the risk-free rate; we calculate the continuously compounded annual risk-free interest rate accordingly as the parameter r. We select actively traded option pairs to estimate implied volatility.
106.5.2 Cases from China — 50 ETF options

In the Chinese financial market, there were no exchange-traded stock options until February 2015. The 50 ETF options are still the only traded options in the Chinese market (Li et al., 2018). We choose the options that were traded actively on June 29, 2018 as our sample to compare the alternative methods of implied volatility estimation. (There were 120 call options trading on that day; we choose the most actively traded 10% for implied volatility estimation.) The underlying asset, the 50 ETF, traded at 2.495 on June 29, 2018.
Table 106.3: Implied volatility estimation for individual stock options: comparison of alternative estimation methods.

Ticker  IV-matlab  IV-approximation  IV-regression
AAPL    0.357      0.320             0.426
XOM     0.196      0.178             0.124
GOOGL   0.369      0.378             0.315
MSFT    0.341      0.325             0.298
JNJ     0.206      0.198             No positive solution
WFC     0.287      0.269             No positive solution
GE      0.156      0.154             0.184
WMT     0.142      0.132             0.156
CVX     0.312      0.296             0.339
PG      0.186      0.172             No positive solution
JPM     0.175      0.186             0.169
VZ      0.182      0.190             0.232
PFE     0.202      0.197             0.167
IBM     0.433      0.428             No positive solution
T       0.192      0.186             No positive solution

Table 106.4: Implied volatility estimation from ETF 50 call options.

Option ticker  Exercise price  Expiration date  IV-matlab  IV-approximation
10001361.SH    2.9             2018/7/25        0.452      0.462
10001217.SH    3.4             2018/9/26        0.463      0.478
10001216.SH    3.3             2018/9/26        0.416      0.406
10001218.SH    3.5             2018/9/26        0.427      0.415
10001219.SH    3.6             2018/9/26        0.452      0.447
10001343.SH    2.85            2018/7/25        0.406      0.429
10001215.SH    3.2             2018/9/26        0.414      0.428
10001214.SH    3.1             2018/9/26        0.392      0.406
10001213.SH    3               2018/9/26        0.401      0.415
10001320.SH    2.85            2018/12/26       0.422      0.429
10001342.SH    2.8             2018/7/25        0.418      0.422
10001334.SH    2.95            2018/12/26       0.372      0.388
10001321.SH    2.9             2018/12/26       0.422      0.409
10001212.SH    2.95            2018/9/26        0.398      0.402
10001225.SH    3.3             2018/9/26        0.420      0.413
10001211.SH    2.9             2018/9/26        0.376      0.399
10001395.SH    2.3             2018/8/22        0.401      0.412
10001333.SH    2.95            2018/12/26       0.388      0.396
10001316.SH    2.65            2018/12/26       0.401      0.398
10001341.SH    2.75            2018/7/25        0.387      0.397
106.6 Implied Volatility in Terms of CEV Model and Related Excel Program

One alternative to Black–Scholes is the constant elasticity of variance (CEV) model. In this section, we discuss how to use the CEV model to forecast implied volatility and present the related Excel program for this approach. Cox (1975) and Cox and Ross (1976) developed the constant elasticity of variance (CEV) model, which incorporates the observed market phenomenon that the variance of the underlying asset tends to fall as the asset price increases (and vice versa). Schroder (1989) derives a computational formula for the CEV model, which is generally used for empirical computation of either the option value or the implied variance. An advantage of the CEV model is that it describes the interrelationship between stock prices and volatility. The constant elasticity of variance (CEV) model for a stock price, S, can be represented as follows:

dS = (r − q)S dt + δS^α dZ,  (106.83)
where r is the risk-free rate, q is the dividend yield, dZ is a Wiener process, δ is a volatility parameter, and α is a positive constant. The relationship between the instantaneous volatility of the asset return, σ(S, t), and the parameters of the CEV model can be represented as

σ(S, t) = δS^(α−1).  (106.84)
When α = 1, the CEV model is the geometric Brownian motion model we have been using up to now. When α < 1, the volatility increases as the stock price decreases. This creates a probability distribution similar to that observed for equities, with a heavy left tail and a less heavy right tail. When α > 1, the volatility increases as the stock price increases, giving a probability distribution with a heavy right tail and a less heavy left tail. This corresponds to a volatility smile where the implied volatility is an increasing function of the strike price; this type of volatility smile is sometimes observed for options on futures. The formula for pricing a European call option under the CEV model is

C_t = S_t e^(−qτ)[1 − χ²(a, b + 2, c)] − Ke^(−rτ)χ²(c, b, a)   when α < 1,
C_t = S_t e^(−qτ)[1 − χ²(c, −b, a)] − Ke^(−rτ)χ²(a, 2 − b, c)  when α > 1,  (106.85)
3729
St [Ke−(r−q)τ ]2(1−α) 1 , b = 1−α , c = (1−α) 2υ , υ = (1−α)2 υ 2(r−q)(α−1)τ 2 [e − 1], and χ (z, k, v) is the cumulative probability
where a =
δ2 2(r−q)(α−1)
that a variwith non-centrality parameter v and able with a non-central k degrees of freedom is less than z. Hsu, Lin and Lee (2008) provided the detailed derivation of approximate formula for CEV model. Based on the approximated formula, CEV model can reduce computational and implementation costs rather than the complex models such as jump-diffusion stochastic volatility model. Therefore, CVE model with one more parameter than Black–Scholes–Merton Option Pricing Model (BSM) can be a better choice to improve the performance of predicting implied volatilities of index options (Singh and Ahmad 2011). In order to price a European option under a CEV model, we need a noncentral chi-square distribution. The following figure shows the charts of the non-central chi-square distribution with 5 degrees of freedom for non-central parameter δ = 0, 2, 4, 6. χ2 distribution8
[Figure: density of the non-central chi-square distribution, df = 5, for ncp = 0, 2, 4, 6.]
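Outside Excel, equation (106.85) can be sketched using only the Python standard library: the non-central chi-square CDF is evaluated as a Poisson-weighted mixture of central chi-squares, the representation underlying the algorithms of Ding (1992) and Benton and Krishnamoorthy (2003) cited in the footnote. This simple series is only adequate for moderate parameter values (it assumes r ≠ q and α ≠ 1, and very large non-centrality values would need the refinements in those references); all numbers below are illustrative:

```python
import math

def reg_lower_gamma(a, x, tol=1e-13, max_terms=5000):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return min(total, 1.0)

def ncx2_cdf(z, k, v, max_terms=5000):
    """Non-central chi-square CDF chi2(z; k, v) as a Poisson mixture:
    sum_j e^{-v/2}(v/2)^j / j! * P(k/2 + j, z/2)."""
    if z <= 0.0:
        return 0.0
    if v == 0.0:
        return reg_lower_gamma(0.5 * k, 0.5 * z)
    log_w = -0.5 * v          # log of the j = 0 Poisson weight
    total = cum_w = 0.0
    for j in range(max_terms):
        w = math.exp(log_w)
        total += w * reg_lower_gamma(0.5 * k + j, 0.5 * z)
        cum_w += w
        if cum_w > 1.0 - 1e-12 and j > 0.5 * v:
            break
        log_w += math.log(0.5 * v) - math.log(j + 1.0)
    return total

def cev_call(S, X, r, q, T, delta, alpha):
    """European call under the CEV model, eq. (106.85); requires r != q, alpha != 1."""
    v = delta ** 2 / (2.0 * (r - q) * (alpha - 1.0)) * (
        math.exp(2.0 * (r - q) * (alpha - 1.0) * T) - 1.0)
    a = (X * math.exp(-(r - q) * T)) ** (2.0 * (1.0 - alpha)) / ((1.0 - alpha) ** 2 * v)
    b = 1.0 / (1.0 - alpha)
    c = S ** (2.0 * (1.0 - alpha)) / ((1.0 - alpha) ** 2 * v)
    if alpha < 1.0:
        return (S * math.exp(-q * T) * (1.0 - ncx2_cdf(a, b + 2.0, c))
                - X * math.exp(-r * T) * ncx2_cdf(c, b, a))
    return (S * math.exp(-q * T) * (1.0 - ncx2_cdf(c, -b, a))
            - X * math.exp(-r * T) * ncx2_cdf(a, 2.0 - b, c))

# Illustrative: S = X = 100, r = 5%, q = 0, T = 1, alpha = 0.5, and delta = 2,
# so the instantaneous volatility at S = 100 is delta*S^(alpha-1) = 20%.
price = cev_call(100.0, 100.0, 0.05, 0.0, 1.0, 2.0, 0.5)
print(round(price, 2))
```

With the instantaneous volatility matched at the money, the resulting price should come out close to the corresponding Black–Scholes value (about 10.45 for these inputs); the two models then differ mainly in the wings of the smile.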
Building on the theoretical foundation of the CEV model in this chapter, we can compute a call option price under the CEV model in a worksheet. The following figure shows how:
⁸The calculation of the χ²(z, k, v) value is described in Ding (1992). The complementary non-central chi-square distribution function can be expressed as an infinite double sum of gamma functions; see Benton and Krishnamoorthy (2003).
Hence, the formula for the CEV call option in cell B14 is

=IF(B9<1, B3*EXP(-B6*B8)*(1-ncdchi(B11,B12+2,B13)) - B4*EXP(-B5*B8)*ncdchi(B13,B12,B11), B3*EXP(-B6*B8)*(1-ncdchi(B13,-B12,B11)) - B4*EXP(-B5*B8)*ncdchi(B11,2-B12,B13))

Here ncdchi is the non-central chi-square cumulative distribution function. The IF function separates the two cases of the formula, 0 < α < 1 and α > 1. We can also write a VBA function to price the call option under the CEV model. Below is the code to accomplish this:

' CEV Call Option Value
Function CEVCall(S, X, r, q, T, sigma, alpha)
    Dim v As Double
    Dim aa As Double
    Dim bb As Double
    Dim cc As Double
    v = (Exp(2 * (r - q) * (alpha - 1) * T) - 1) * (sigma ^ 2) / (2 * (r - q) * (alpha - 1))
    aa = ((X * Exp(-(r - q) * T)) ^ (2 * (1 - alpha))) / (((1 - alpha) ^ 2) * v)
    bb = 1 / (1 - alpha)
    cc = (S ^ (2 * (1 - alpha))) / (((1 - alpha) ^ 2) * v)
    If alpha < 1 Then
        CEVCall = Exp(-q * T) * S * (1 - ncdchi(aa, bb + 2, cc)) - Exp(-r * T) * X * ncdchi(cc, bb, aa)
    Else
        CEVCall = Exp(-q * T) * S * (1 - ncdchi(cc, -bb, aa)) - Exp(-r * T) * X * ncdchi(aa, 2 - bb, cc)
    End If
End Function

Using this function to value the call option is shown below:
The CEV call option formula in C14 is =CEVCall(B3, B4, B5, B6, B8, B7, B9). The value of the CEV call option in C14 equals the value in B14.
Next, we use the Goal Seek procedure to calculate the implied volatility, with the settings shown in the figure below:

Set cell: B14
To value: 4
By changing cell: $B$7

After pressing the OK button, we obtain the sigma value in B7.
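Goal Seek is essentially a one-dimensional root search on the pricing formula. The same search can be sketched with the Newton–Raphson iteration used by the MATLAB approach discussed earlier, here applied to the Black–Scholes price as a standalone illustration (for the CEV price one would substitute the CEV formula and a numerical vega); the parameter values are illustrative:

```python
import math
from statistics import NormalDist

nd = NormalDist()

def bs_call(S, X, r, q, T, sigma):
    """Black-Scholes call with continuous dividend yield."""
    srt = sigma * math.sqrt(T)
    d1 = (math.log(S / X) + (r - q + 0.5 * sigma ** 2) * T) / srt
    return S * math.exp(-q * T) * nd.cdf(d1) - X * math.exp(-r * T) * nd.cdf(d1 - srt)

def bs_vega(S, X, r, q, T, sigma):
    """dC/dsigma = S e^{-qT} phi(d1) sqrt(T)."""
    srt = sigma * math.sqrt(T)
    d1 = (math.log(S / X) + (r - q + 0.5 * sigma ** 2) * T) / srt
    return S * math.exp(-q * T) * nd.pdf(d1) * math.sqrt(T)

def implied_vol_newton(C, S, X, r, q, T, sigma0=0.2, tol=1e-10):
    """Newton-Raphson: sigma <- sigma - (price(sigma) - C) / vega(sigma)."""
    sigma = sigma0
    for _ in range(100):
        diff = bs_call(S, X, r, q, T, sigma) - C
        if abs(diff) < tol:
            break
        sigma -= diff / bs_vega(S, X, r, q, T, sigma)
    return sigma

C_obs = bs_call(100.0, 100.0, 0.05, 0.0, 1.0, 0.30)
print(round(implied_vol_newton(C_obs, 100.0, 100.0, 0.05, 0.0, 1.0), 6))  # 0.3
```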
If we want to calculate the implied volatility of the stock return, we show this result in B16 of the following figure. The formula for the implied volatility of the stock return in B16 is: =B7*B3^(B9-1).
We use the bisection method to write a function that calculates the implied volatility of the CEV model. The following code accomplishes this task:

' Estimate implied volatility by bisection
' Uses the CEVCall function
Function CEVIVBisection(S, X, r, q, T, alpha, callprice, a, b)
    Dim yb, ya, c, yc
    yb = CEVCall(S, X, r, q, T, b, alpha) - callprice
    ya = CEVCall(S, X, r, q, T, a, alpha) - callprice
    If yb * ya > 0 Then
        CEVIVBisection = CVErr(xlErrValue)
    Else
        Do While Abs(a - b) > 0.000000001
            c = (a + b) / 2
            yc = CEVCall(S, X, r, q, T, c, alpha) - callprice
            ya = CEVCall(S, X, r, q, T, a, alpha) - callprice
            If ya * yc < 0 Then
                b = c
            Else
                a = c
            End If
        Loop
        CEVIVBisection = (a + b) / 2
    End If
End Function

After supplying the parameters to the above function, we obtain the sigma and the implied volatility of the stock return. The result is shown below.
The formula for sigma in the CEV model in C15 is: =CEVIVBisection(B3, B4, B5, B6, B8, B9, F14, 0.01, 100). The value of sigma in C15 is close to that in B7 as calculated by the Goal Seek procedure. In the same way, we can calculate the volatility of the stock return in C16; its value is also near B16.
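The VBA bisection routine above carries over almost line by line to other languages. A standalone Python sketch run against the Black–Scholes price (swapping in a CEV pricer would only change the pricing function; values are illustrative):

```python
import math
from statistics import NormalDist

N = NormalDist().cdf

def bs_call(S, X, r, q, T, sigma):
    """Black-Scholes call with continuous dividend yield."""
    srt = sigma * math.sqrt(T)
    d1 = (math.log(S / X) + (r - q + 0.5 * sigma ** 2) * T) / srt
    return S * math.exp(-q * T) * N(d1) - X * math.exp(-r * T) * N(d1 - srt)

def iv_bisection(S, X, r, q, T, callprice, a, b):
    """Bisect sigma on [a, b] until the model price matches callprice;
    returns None if the root is not bracketed (cf. CVErr in the VBA)."""
    ya = bs_call(S, X, r, q, T, a) - callprice
    yb = bs_call(S, X, r, q, T, b) - callprice
    if ya * yb > 0.0:
        return None
    while abs(b - a) > 1e-9:
        c = 0.5 * (a + b)
        yc = bs_call(S, X, r, q, T, c) - callprice
        if ya * yc <= 0.0:
            b = c             # root lies in [a, c]
        else:
            a, ya = c, yc     # root lies in [c, b]
    return 0.5 * (a + b)

target = bs_call(100.0, 100.0, 0.05, 0.0, 1.0, 0.30)
print(round(iv_bisection(100.0, 100.0, 0.05, 0.0, 1.0, target, 0.01, 2.0), 6))  # 0.3
```

Bisection is slower than Newton–Raphson but needs no derivative, which is convenient for the CEV price where the vega has no simple closed form.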
106.7 Conclusions

The main purpose of this chapter is to discuss alternative methods for estimating implied variance. We first reviewed alternative methods of estimating implied variance, classified them into two estimation routines, numerical search methods and closed-form derivation approaches, and discussed their limitations. Then, we showed how a MATLAB computer program can be used to estimate implied variance; this approach uses the Newton–Raphson method to derive the implied variance from the standard Black–Scholes model. In addition, we discussed how the approximation method derived by Ang et al. (2013) can be used to estimate implied variance and the implied stock price per share. Beyond the single-option case, this approximation method also estimates implied volatility from two options with the same maturity but different exercise prices and values. We selected individual stock options on large-cap stocks from the US S&P 500 and 50 ETF options from China as empirical examples, and compared the performance of three typical methods: the regression method proposed by Lai et al. (1992), the MATLAB computer program approach, and the approximation method derived by Ang et al. (2013). Finally, we introduced how to estimate implied volatility using the constant elasticity of variance (CEV) model, and presented the related Excel program for this approach.
Bibliography
Ang, J.S. et al. (2009). Alternative Formulas to Compute Implied Standard Deviation. Review of Pacific Basin Financial Markets and Policies 12(2), 159–176.
Ang, J.S. et al. (2013). A Comparison of Formulas to Compute Implied Standard Deviation. Encyclopedia of Finance, Springer, 765–776.
Beckers, S. (1981). Standard Deviations Implied in Option Prices as Predictors of Future Stock Price Variability. Journal of Banking & Finance 5(3), 363–381.
Benton, D. and Krishnamoorthy, K. (2003). Computing Discrete Mixtures of Continuous Distributions: Noncentral Chisquare, Noncentral t, and the Distribution of the Square of the Sample Multiple Correlation Coefficient. Computational Statistics & Data Analysis 43(2), 249–267.
Brenner, M. and Subrahmanyam, M.G. (1988). A Simple Formula to Compute the Implied Standard Deviation. Financial Analysts Journal, 80–83.
Chambers, D.R. and Nawalkha, S.K. (2001). An Improved Approach to Computing Implied Volatility. Financial Review 36(3), 89–100.
Chance, D.M. (1996). A Generalized Simple Formula to Compute the Implied Volatility. Financial Review 31(4), 859–867.
Corrado, C.J. and Miller, T.W. (1996). A Note on a Simple, Accurate Formula to Compute Implied Standard Deviations. Journal of Banking & Finance 20(3), 595–603.
Corrado, C.J. and Miller, T.W. (2006). Estimating Expected Excess Returns Using Historical and Option-Implied Volatility. Journal of Financial Research 29(1), 95–112.
Cox, J.C. (1975). Notes on Option Pricing I: Constant Elasticity of Variance Diffusions. Working Paper, Stanford University.
Cox, J.C. and Ross, S.A. (1976). The Valuation of Options for Alternative Stochastic Processes. Journal of Financial Economics 3, 145–166.
Delbourgo, R. and Gregory, J.A. (1985). Shape Preserving Piecewise Rational Interpolation. SIAM Journal on Scientific and Statistical Computing 6(4), 967–976.
Ding, C.G. (1992). Algorithm AS 275: Computing the Non-central χ² Distribution Function. Journal of the Royal Statistical Society 41(2), 478–482.
Glau, K., Herold, P., Madan, D.B. and Pötz, C. (2017). The Chebyshev Method for the Implied Volatility. Preprint, http://cn.arxiv.org/pdf/1710.01797.
Hallerbach, W. (2004). An Improved Estimator for Black–Scholes–Merton Implied Volatility. Erasmus Research Series, Erasmus University.
Hsu, Y.L., Lin, T.I. and Lee, C.F. (2008). Constant Elasticity of Variance (CEV) Option Pricing Model: Integration and Detailed Derivation. Mathematics & Computers in Simulation 79(1), 60–71.
Jackel, P. (2006). By Implication. Wilmott 2006(26), 60–66.
Jackel, P. (2015). Let's be Rational. Wilmott 2015(75), 40–53.
Lai, T.-Y. et al. (1992). An Alternative Method for Obtaining the Implied Standard Deviation. Journal of Financial Engineering 1, 369–375.
Latane, H.A. and Rendleman, R.J. (1976). Standard Deviations of Stock Price Ratios Implied in Option Prices. Journal of Finance 31(2), 369–381.
Lee, A. et al. (2009). Financial Analysis, Planning and Forecasting: Theory and Application.
Lee, C.F. et al. (2013). Statistics for Business and Financial Economics. Springer.
Li, J., Yao, Y., Chen, Y. and Lee, C.F. (2018). Option Prices and Stock Market Momentum: Evidence from China. Quantitative Finance 18(2), 1517–1529.
Li, S. (2005). A New Formula for Computing Implied Volatility. Applied Mathematics and Computation 170(1), 611–625.
Manaster, S. and Koehler, G. (1982). The Calculation of Implied Variances from the Black–Scholes Model: A Note. Journal of Finance 37(1), 227–230.
Merton, R.C. (1973). Theory of Rational Option Pricing. The Bell Journal of Economics and Management Science 4(1), 141–183.
Pagliarani, S. and Pascucci, A. (2017). The Exact Taylor Formula of the Implied Volatility. Finance & Stochastics 21(3), 1–58.
Salazar Celis, O. (2017). A Parametrized Barycentric Approximation for Inverse Problems with Application to the Black–Scholes Formula. IMA Journal of Numerical Analysis 38(2), 976–997.
Schroder, M. (1989). A Reduction Method Applicable to Compound Option Formulas. Management Science 35(7), 823–827.
Smith, Jr., C.W. (1976). Option Pricing: A Review. Journal of Financial Economics 3(1–2), 3–51.
Chapter 107
Crisis Impact on Stock Market Predictability

Rajesh Mohnot

Contents
107.1 Introduction .......................... 3738
107.2 Literature Review ..................... 3740
107.3 Methodology ........................... 3741
      107.3.1 ARCH and GARCH models ........ 3742
107.4 Data and Sample Description ........... 3744
107.5 Analysis of Empirical Results ......... 3744
107.6 Concluding Remarks .................... 3748
Bibliography ................................ 3749
Web Resources ............................... 3751
Abstract
This paper examines the predictability of Spanish stock market returns. Earlier studies suggest that stock market returns in developed countries can be predicted up to a noise term, but this study specifically covers two time horizons, a pre-crisis period and the current crisis period, to evaluate stock market return predictability. Since mean returns cannot always prove to be an efficient predictor while the variance of such returns can, various autoregressive models have been used to test for persistent volatility in the Spanish stock market. The empirical results show that higher-order autoregressive models such as ARCH(5) and GARCH(2, 2) can be used to predict future risk in the Spanish stock market in both the pre-crisis and the current crisis period. The paper also reveals a positive correlation between Spanish stock market returns and the conditional standard deviations produced by ARCH(5) and
Rajesh Mohnot
Middlesex University Dubai
e-mail: [email protected]
GARCH(2, 2), implying that the models have some success in predicting future risk in the Spanish stock market. The predictability of stock market returns is not found to be adversely affected during the crisis period, though the degree of predictability may be.
Keywords: Predictability • ARCH (autoregressive conditional heteroscedasticity) and GARCH (generalized autoregressive conditional heteroscedasticity) • Stock market returns • Financial crisis.
107.1 Introduction
Stock market returns are claimed to be relatively higher than those of other financial assets. Returns in developed stock markets are fairly moderate because developed markets have good transparency, a comprehensive regulatory framework, and good disclosure and accounting information systems. This makes them behave more rationally and scientifically, ensuring consistent returns for all market participants. Emerging stock markets, on the other hand, offer lucrative returns because of their ample growth opportunities, but the risk factor in emerging markets remains comparatively higher than in developed markets. In the past, investments were typically made with a medium- to long-term perspective, but changing dynamics and innovations have enabled market participants to invest for the short term. Market participants such as institutional investors, speculators, arbitrageurs, traders, and brokers can be observed operating in stock markets on different time horizons: monthly, weekly, daily, and to some extent hourly. In such cases, correct and reliable forecasting approaches will certainly help them realize short-term profits from stock market activities. Though stock market predictability rests on market efficiency and performance, volatility has been an issue cropping up in stock markets, especially in the last couple of decades. This is largely due to the globalization process and rapidly growing investment activities by multinational enterprises across the globe. Spain's stock market has been comparatively mature in terms of transparency, good disclosure requirements, and an overall effective corporate governance system. But the recent crisis has severely impacted the entire Spanish economy. An abnormally high unemployment rate of 14%, the bankruptcy of major real estate companies, a huge trade deficit accounting for 10% of GDP, high oil prices, and a comparatively high inflation rate have been the main reasons behind the downturn of the economy and the poor performance of the financial markets. The situation worsened in 2009, when the IMF revised its estimates for the Spanish economy to minus 4.6% of the
GDP for the year 2009, with a further contraction of 0.8% for the year 2010. The banking sector of Spain, however, remained resilient; otherwise the crisis might have taken an even greater toll. All these facts compel us to look into the financial market and determine whether it has shown resistance to the crisis, especially when the leading stock markets around the world have lost approximately 30% of their total capitalization value. As mentioned earlier, volatility is a distinct feature of financial markets as a whole and of stock markets in particular. In fact, stock market volatility was realized once again in 2008, when the collapse of Lehman Brothers triggered a worldwide recession. Though moderate volatility has always been welcomed in financial market circles, the recent pattern of volatility has become an issue of concern for monetary policy makers and economists. This is not the first time that excess volatility has triggered a debate; financial crises and economic turbulence have been witnessed from time to time, and the same issue has become debatable each time. Cochran, Heck and Shaffer (2003) reveal that volatility was unusually high in most world markets after the 1997 financial crisis. This sparks a further debate on why volatility clusters cannot be captured in times of turbulence. Since volatility refers to fluctuations in a time series due to the flow of time-dependent information, it is of interest whether past returns are able to predict future returns in stock markets. Moreover, once volatility increases, it tends to remain high for several future periods (Apergis and Eleptheriou, 2001). While capturing volatility, one may observe some calm periods with relatively small returns and some wide swings with large positive and negative returns. This is characterized as volatility clustering.
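The clustering just described is exactly what the conditional-variance models used later in this chapter generate. A minimal GARCH(1, 1) simulation sketch in Python (the parameter values are illustrative, not estimates for the IBEX-35):

```python
import random

def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, seed=42):
    """Simulate r_t = sigma_t * z_t with
    sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2.

    Stationary when alpha + beta < 1; the unconditional variance is
    omega / (1 - alpha - beta) (= 1.0 for the defaults above)."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    returns, sigmas = [], []
    for _ in range(n):
        sigma = var ** 0.5
        r = sigma * rng.gauss(0.0, 1.0)
        returns.append(r)
        sigmas.append(sigma)
        var = omega + alpha * r * r + beta * var
    return returns, sigmas

returns, sigmas = simulate_garch11(5000)
sample_var = sum(r * r for r in returns) / len(returns)
print(round(sample_var, 3))  # fluctuates around 1.0, the unconditional variance
```

Plotting such a simulated series shows the calm stretches and bursts of large swings that motivate ARCH/GARCH modeling of crisis-period returns.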
If the variance of a stock market return series depends on its past, the series is likely to exhibit conditional heteroskedasticity. Researchers are constantly experimenting with new ways to measure volatility in order to provide more reliable and consistent predictability in stock markets. Though there has been extensive research on the measurement and forecasting of stock market volatility, most of it relates to normal economic times. Volatility patterns have been estimated under normal economic scenarios, but times of turbulence have remained relatively under-researched. A renewed research interest in this area seems inevitable, as researchers would like to ascertain volatility patterns during times of turbulence. The main objective of this paper is to examine the predictability of volatility patterns in stock market returns, especially in the current crisis period. It runs various autoregressive models to predict volatility in the time series data of Spain's
flagship index — “IBEX-35”. The second section of this paper deals with the review of existing research literature in the area of stock market returns volatility and its forecasting. Section 107.3 discusses the methodology used in this paper. Section 107.4 presents the data and sample description. Section 107.5 outlines the analytical part of empirical results based on methodology as discussed in Section 107.3. The last section presents the concluding remarks.
107.2 Literature Review

Forecasting financial assets’ returns has never been a naïve exercise. The evolution of its mechanisms, methods and models attests that it has been practiced for quite a long time now. Forecasting of stock market returns, especially in view of volatility, has been well attempted by researchers in the past. Therefore, it is important to review some of the key characteristics of forecasting volatility and the factors affecting such forecasts. In developed markets, the volatility structure has changed quite significantly, especially in the last two decades. Long-run volatility has been observed to be low, and the same phenomenon can be observed in many emerging market returns as well. Brazil, China, India and Russia are, among others, some of the countries whose stock market volatilities have fairly moderated in recent years. Chiou (2009) has revealed that emerging stock markets in Southeast Asia, South Europe and Latin America are highly volatile compared to the developed stock markets around the world. In the same context, Arora, Das and Jain (2009) reveal a similar result: the ratios of mean return to volatility for emerging markets are higher than those of developed markets. This could be partially attributed to the fact that many emerging markets are transforming their corporate governance systems, their transparency and disclosure requirements, and their operational mechanisms from manual to fully automated systems. But this low volatility structure should not be construed as an insignificant issue on the assumption that it has become a permanent feature. Statistical measures may not help us capture volatility in absolute terms; therefore, it is pertinent to structure some advanced models in order to measure volatility in the first instance, and then to forecast it effectively for the future.
Goretti (2005) highlighted that nonlinear models work better than linear models in financial time series analysis because the latter sometimes ignore unobservable factors such as herding behavior, investors’ beliefs, financial panic and political uncertainty. Jiang, Xu and Yao (2009) have tested idiosyncratic
Crisis Impact on Stock Market Predictability
volatility in stock returns, stating that this type of volatility is affected by the information content of future earnings. On the other hand, Dennis, Mayhew and Stivers (2007) find that systematic market-wide factors, rather than aggregate firm-level effects, drive asymmetric volatility. Alberg, Shalit and Yosef (2008) also find that an asymmetric GARCH model performs better in measuring conditional variance. Some authors (Chowdhury and Sarno, 2004; McMillan and Speight, 2006) have used different forms of the GARCH model to describe intra-day volatility in financial markets. Chowdhury and Sarno (2004) used multivariate stochastic volatility models to investigate the degree of persistence of volatility at different frequencies, while McMillan and Speight (2006) used the FIGARCH model to capture long-memory dynamics in intra-day volatility. Leon, Rubio and Serna (2005) tested a GARCH model for time-varying volatility incorporating skewness and kurtosis, and revealed a significant presence of both. It is interesting to learn that different versions of GARCH may provide different outcomes when the sample is based on different time intervals. Volatility forecasting over a one-week or longer horizon may be better with standard asymmetric GARCH models, while MRS-GARCH models provide better outcomes over horizons shorter than one week (Marcucci, 2005). Okimoto (2008) finds that asymmetries in dependence tend to be high in volatile markets, and that the Markov switching model combined with copula theory can best capture them. Studies conducted in the 1980s and 1990s (Chou, 1988; Baillie and DeGennaro, 1990; Kim and Kon, 1994) have categorically remarked that GARCH specifications are able to predict returns if volatility clusters can be captured. Hansen and Lunde (2001) have made an extensive comparison of various volatility models, drawing the inference that the GARCH(1, 1) model best forecasts volatility.
On the contrary, Johnston and Scott (2000) have observed that GARCH models with normality assumptions do not provide a good description of return dynamics, thereby raising a question about the contribution of GARCH-type models to the determination of the stochastic process. Lee, Chen and Rui (2001) have applied GARCH and EGARCH models to China’s stock markets, finding that returns exhibit long memory and that volatility is highly persistent and predictable.
107.3 Methodology

As mentioned earlier, it is important to check whether short-term opportunities exist in crisis times so that short-term traders, speculators,
and arbitrageurs can continue benefiting from the market. This obviously depends on the predictability of any stock market under the given crisis scenario. If stock market returns can be predicted, short-term traders can grab those opportunities. In this regard, it is important to note that the autoregressive conditional heteroskedasticity model, first propounded by Engle (1982) and subsequently generalized by Bollerslev (1986) into the GARCH model, deals with changing variance in time series.

107.3.1 ARCH and GARCH models

The autoregressive conditional heteroskedasticity (ARCH) model has evolved over the last couple of decades in the fields of economics and finance. There are challenging issues with regard to investment in stocks and similar securities, especially how prices will behave in the future. This time-series-based model helps capture the oscillations present in historical data and predict future behavior. The use of the model has been extended to market efficiency determination, capital asset pricing modeling, hedging strategies, interest rate measurement and debt portfolio construction, among others (Bera and Higgins, 1993). The application of the ARCH model primarily relies on certain properties of the historical time series. For example, asset prices in financial markets tend to have thick tails; hence they are leptokurtic. More often, those assets demonstrate volatility clustering over time, meaning large changes tend to be followed by large changes and small changes by small changes, of either sign. Another important characteristic of asset prices is the leverage effect: if the value of the firm falls, its capital structure becomes more leveraged. Authors such as Chou, Fan Denis and Lee (1996) have used a vector error correction model to check the relationship between variables. Last but not least, information flow, e.g., corporate earnings announcements, pushes stock prices into a more volatile zone.
Thus, ARCH becomes more relevant in terms of volatility measurement and estimation. The ARCH model specifies the following equation:

r_t = β0 + β1 z_t + u_t.    (107.3a)

According to this static equation, Var(u_t | Z) is constant under homoskedasticity, where Z denotes all n outcomes of z_t. But heteroskedasticity may arise if we look at the conditional variance of u_t. With this heteroskedasticity, Engle suggested the following first-order ARCH model:

E(u_t^2 | u_{t−1}, u_{t−2}, ...) = E(u_t^2 | u_{t−1}) = α0 + α1 u_{t−1}^2.    (107.3b)
This way, it can be stated that the conditional variance is based on its past observations and the errors are serially uncorrelated. Since no dynamics are presumed in the variance equation, the model must be constrained so that the conditional variances are positive. Thus, the equation can be rewritten as

u_t^2 = α0 + α1 u_{t−1}^2 + v_t.    (107.3c)
This is an AR(1) representation for the squared errors; generalizing the conditional variance to depend also on its own lags yields the GARCH model of Bollerslev (1986). The main objective of this model is to represent changes in the variance, knowing that the mean cannot be used as a predictor of the future. The forecasting accuracy depends on how well the model captures changes in variance. However, there are always challenges in estimating the model when a large number of factors are involved. Bollerslev (1986) suggested a solution to this issue, requiring that all the coefficients in the infinite-order linear ARCH representation be positive. The GARCH(p, q) model can be specified as follows:

R_t = μ + ε_t    (Mean Equation),

σ_t^2 = ω + Σ_{i=1}^{q} α_i ε_{t−i}^2 + Σ_{j=1}^{p} β_j σ_{t−j}^2    (Variance Equation).    (107.3d)
It is important to examine the autocorrelations of the AR term before setting up the null hypothesis that there are no ARCH or GARCH errors. If the null hypothesis is rejected, it signifies that ARCH/GARCH effects are present in the conditional variance. GARCH(1, 1) refers to the presence of a first-order GARCH term (the first term in parentheses) and a first-order ARCH term (the second term in parentheses). An ordinary ARCH model is a special case of a GARCH specification in which there are no lagged forecast variances in the conditional variance equation. The model specifies that volatility in the current period is related to its past value plus a white noise error term, v_t. As mentioned earlier, it is the variance which should be used as a measure of volatility. During a highly volatile period, if residuals are found to be large, then the estimated conditional variances will also be relatively large, and the forecast intervals will consequently be wider as well. Arora et al. (2009) also confirm that the GARCH(1, 1) model is appropriate for capturing volatility in stock returns. Rahman, Lee and Ang (2002) made a similar attempt in applying the GARCH(1, 1) model to the NASDAQ stock market and revealed that this model performs better for intraday volatility estimation.
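The GARCH(1, 1) recursion just described is simple to implement. The sketch below is a minimal illustration, with made-up parameter values and returns rather than estimates from this chapter; it filters the conditional variance through a return series and produces the one-step-ahead forecast as the final element:

```python
def garch11_variance(returns, omega, alpha, beta, mu=0.0):
    """Filter conditional variances for a GARCH(1, 1):
    sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1},
    with eps_t = r_t - mu. The returned list holds the in-sample
    variances followed by the one-step-ahead forecast."""
    # Initialize at the unconditional variance omega / (1 - alpha - beta).
    sigma2 = omega / (1.0 - alpha - beta)
    variances = [sigma2]
    for r in returns:
        eps = r - mu
        sigma2 = omega + alpha * eps ** 2 + beta * sigma2
        variances.append(sigma2)  # the final entry is the t+1 forecast
    return variances

# Illustrative parameters; alpha + beta < 1 ensures covariance stationarity.
omega, alpha, beta = 2e-6, 0.08, 0.90
returns = [0.01, -0.02, 0.015, -0.03, 0.005]
v = garch11_variance(returns, omega, alpha, beta)
print("one-step-ahead variance forecast:", v[-1])
```

In a high-volatility stretch the large squared residuals feed directly into the variance recursion, which is why forecast intervals widen, as the text observes.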
107.4 Data and Sample Description

As the main objective of this study is to find out whether short-term opportunities exist even in a crisis period, data selection has been done accordingly. Since the study aims to reveal this feature on a comparative basis, the data have been evenly split between the pre-crisis and the current crisis period. It is well known that the global financial crisis broke out in June 2008 and spread through the European region and soon across the globe. Since then, the crisis has hardly calmed down, and its impact is still mounting on developed and developing countries alike. Keeping this fact in mind, the current crisis period has been fixed from July 2008 to December 2010, consisting of 30 months. It is unarguably the most affected period and demands a proper investigation with regard to volatility forecasting. A similar range of data represents the pre-crisis period, starting in January 2006 and ending in June 2008. In this way, it is a fair strategy to first look into the short-term opportunities available in these two periods and subsequently to measure the degree of those opportunities. The analysis is done for one of the most prominent stock markets in the European region, i.e., Spain’s stock market. The IBEX-35, Spain’s main benchmark index, comprises the stocks of 35 leading companies from a variety of sectors and is weighted by market capitalization. Adopted in 1992, the index is governed by the Bolsa de Madrid. For the analysis, daily IBEX-35 index returns are compiled. The total number of observations is 633 in each period. The data have been collected from the official website of Spain’s stock market.

107.5 Analysis of Empirical Results

First of all, the daily changes in the IBEX-35 have been calculated using the following equation:

ΔIBEX-35_t = IBEX-35_t − IBEX-35_{t−1}.    (107.5a)
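Summary statistics of the kind reported below in Table 107.1 can be reproduced for any return series. A minimal sketch follows (the sample returns are made up for illustration, not IBEX-35 data):

```python
import statistics

def return_stats(returns):
    """Mean, maximum, minimum, standard deviation and (Pearson, non-excess)
    kurtosis of a return series, as in Table 107.1-style summaries."""
    n = len(returns)
    mean = statistics.fmean(returns)
    sd = statistics.pstdev(returns)  # population standard deviation
    m4 = sum((r - mean) ** 4 for r in returns) / n  # fourth central moment
    return {"mean": mean, "max": max(returns), "min": min(returns),
            "std": sd, "kurtosis": m4 / sd ** 4}

# Hypothetical daily returns, for illustration only (not IBEX-35 data):
sample = [0.004, -0.012, 0.021, -0.035, 0.002, 0.009, -0.001, 0.015]
stats = return_stats(sample)
for name, value in stats.items():
    print(f"{name:>8}: {value:+.6f}")
```

A kurtosis well above 3 (the Gaussian benchmark) would signal the fat tails that motivate ARCH-type models.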
The two series’ descriptive statistics clearly show that the mean daily return in the pre-crisis period was 0.024%, while in the current crisis period it is −0.0012%, quite in consonance with the decline characteristic of a crisis period. The range of maximum and minimum daily returns is also substantially different across the two periods. The maximum daily return in the pre-crisis period is 6.95%, while it is 14.43% in the current crisis period, clearly indicating high volatility. The minimum daily returns are −7.54% and −9.14% in the pre-crisis and current crisis period,
Table 107.1: IBEX-35 daily returns statistics.

                  Mean         Maximum    Minimum     Standard deviation   Kurtosis
Pre-crisis        0.000241     0.069533   −0.075399   0.011457             8.997421
Current-crisis    −0.0000118   0.144349   −0.091408   0.020936             9.347610

[Figure 107.1: Daily returns pre-crisis.]
[Figure 107.2: Daily returns post-crisis.]
respectively. The standard deviation has almost doubled in the current crisis period, to 2.1% from a pre-crisis level of 1.15%, again consistent with the fact that volatility remains high in a crisis period (Table 107.1). Figures 107.1 and 107.2 show the changes in the daily returns of the IBEX-35 index with many swings, which further indicate the existence of volatility clustering. Having evaluated the autocorrelations of both periods’ series, it was evident that the correlations were very small, but when the series were converted into squared returns, significant correlations could be observed, implying that returns cannot be predicted from their past but risk can be. Moreover, the kurtosis was also found to be very high in both periods. All these properties suggest that GARCH can prove to be a better
Table 107.2: ARCH(5) statistics (pre-crisis).

Variable            Coefficient   Std. Error   z-Statistic   Prob.
C                   0.000928      0.000367     2.525969      0.0115
Variance equation
C                   4.49E−05      4.50E−06     9.961367      0.0000
ARCH(1)             0.052345      0.042001     1.246259      0.2127
ARCH(2)             0.170307      0.039170     4.347935      0.0000
ARCH(3)             0.123792      0.041429     2.988054      0.0028
ARCH(4)             0.261833      0.038687     6.768074      0.0000
ARCH(5)             0.090124      0.041927     2.149565      0.0316

Table 107.3: GARCH(2, 2) statistics (pre-crisis).

Variable            Coefficient   Std. Error   z-Statistic   Prob.
C                   0.000873      0.000393     2.225067      0.0261
Variance equation
C                   6.42E−06      2.35E−06     2.731659      0.0063
ARCH(1)             0.033027      0.038472     0.858462      0.3906
ARCH(2)             0.084761      0.061103     1.387170      0.1654
GARCH(1)            1.238894      0.258790     4.787250      0.0000
GARCH(2)            −0.401740     0.206050     −1.949723     0.0512
model for predicting future risk in Spain’s stock market. Two versions of the model are used to analyze the data and determine predictability for the future: one with an ARCH(5) process and the second with a GARCH(2, 2) process. Finally, the two versions are compared in the context of the crisis period. Tables 107.2 and 107.3 show that the ARCH and GARCH parameters are significant except for the second lagged term in the GARCH(2, 2) model and the first lagged term in the ARCH(5) model. The standardized residuals contain no remaining ARCH, as p = 0.69 in ARCH(5). Moreover, the coefficient of the variable and the coefficient of the variance are close to zero. The kurtosis has decreased to around 4.7 for the ARCH(5) and 4.9 for the GARCH(2, 2). If we look at the sample mean of the daily returns, which is 0.024%, and compare it with the estimated mean daily return μ, being 0.093% in ARCH(5) and 0.087% in GARCH(2, 2),
Table 107.4: ARCH(5) statistics (current crisis period).

Variable            Coefficient   Std. Error   z-Statistic   Prob.
C                   0.000977      0.000632     1.545897      0.1221
Variance equation
C                   0.000119      1.92E−05     6.189857      0.0000
ARCH(1)             0.063898      0.039407     1.621466      0.1049
ARCH(2)             0.113871      0.046327     2.457973      0.0140
ARCH(3)             0.102771      0.042216     2.434421      0.0149
ARCH(4)             0.246052      0.039844     6.175448      0.0000
ARCH(5)             0.210344      0.052836     3.981088      0.0001

Table 107.5: GARCH(2, 2) statistics (current crisis period).

Variable            Coefficient   Std. Error   z-Statistic   Prob.
C                   0.000892      0.000625     1.428161      0.1532
Variance equation
C                   7.03E−06      4.78E−06     1.470417      0.1414
ARCH(1)             0.045127      0.041939     1.076019      0.2819
ARCH(2)             0.065392      0.075365     0.867675      0.3856
GARCH(1)            1.274544      0.319440     3.989937      0.0001
GARCH(2)            −0.394917     0.261651     −1.509326     0.1312
it becomes clear that the sample mean cannot be an efficient predictor, as returns are not normally distributed. Therefore, ARCH and GARCH models are believed to be more reliable. Analyzing the statistics of these two versions of the model in the context of the current crisis period (Tables 107.4 and 107.5), it is evident that the ARCH and GARCH parameters are significant with the exception of the second lagged term in GARCH(2, 2) and the first lagged term in the ARCH(5) model, conforming to the pre-crisis period outcomes. In both versions of the model, the standardized residuals showed no remaining ARCH, as p = 0.75 and 0.44 in ARCH(5) and GARCH(2, 2), respectively. The kurtosis also decreases significantly, to 3.82 and 3.99 in ARCH(5) and GARCH(2, 2), respectively. The sample mean of daily returns is −0.0012%, which is not surprising, as a crisis period tends to exhibit a declining trend.
Table 107.6: Correlations between IBEX-35 returns and S.D. (pre-crisis period).

                    IBEX-35 returns   S.D. (ARCH-5)   S.D. (GARCH2,2)
IBEX-35 returns     1.000000          0.245168        0.278001
S.D. (ARCH-5)       0.245168          1.000000        0.921450
S.D. (GARCH2,2)     0.278001          0.921450        1.000000

Table 107.7: Correlations between IBEX-35 returns and S.D. (current crisis period).

                    IBEX-35 returns   S.D. (ARCH-5)   S.D. (GARCH2,2)
IBEX-35 returns     1.0000000         0.3046341       0.2937160
S.D. (ARCH-5)       0.3046341         1.0000000       0.8841684
S.D. (GARCH2,2)     0.2937160         0.8841684       1.0000000
But the estimated mean daily return μ is 0.098% and 0.089% in the ARCH(5) and GARCH(2, 2) models, respectively. Again, these results indicate that the two versions of the model can predict the future more efficiently even in the crisis period. The models’ predictability can also be evaluated by comparing the conditional standard deviations produced by ARCH(5) and GARCH(2, 2) with the series of absolute mean returns. It is evident from Tables 107.6 and 107.7 that the correlations are positive in both the pre-crisis and the crisis period. The ARCH(5) and GARCH(2, 2) correlations are 0.25 and 0.28, respectively, in the pre-crisis period, and 0.31 and 0.29 in the crisis period. These figures suggest some success in predicting the risks in the daily movements of the IBEX-35 index. This outcome is in agreement with other empirical work finding that conditional expected returns are positively and statistically significantly related to the conditional variance (Apergis et al., 2008; Campbell and Hentschel, 1992). Since the standard deviations in both periods are much larger than the mean returns, the next-day forecast will indicate the degree of riskiness.

107.6 Concluding Remarks

The volatility issue has emerged as one of the most critical issues in financial circles. Forecasting returns on stocks and other financial assets is becoming a challenging task for security analysts, fund managers,
portfolio managers, institutional investors and other similar traders, speculators and arbitrageurs. The task becomes much more crucial when decisions relate to a crisis period such as the current one. This article has attempted to evaluate the predictability of Spain’s stock market. As is generally known, the average return turned negative while the standard deviation almost doubled in the crisis period, further implying a widening gap between risk and returns. Testing two different versions of the model, i.e., ARCH(5) and GARCH(2, 2), has revealed some interesting facts. Since a crisis period tends to exhibit high volatility, it became imperative to apply a high-order autoregressive model. First of all, in the pre-crisis period, both versions of the model exhibited the autoregressive conditional heteroskedasticity feature, indicating predictability of risk in the stock market. Since the original series is not normally distributed, returns cannot be predicted efficiently; hence, both versions produced mean daily returns μ of 0.093% in ARCH(5) and 0.087% in GARCH(2, 2) in the pre-crisis period, which are significantly different from the series’ original mean returns. Similarly, in the current crisis period, both models produced mean returns μ of 0.098% and 0.089% against the series mean return of −0.0012%. The correlations between IBEX-35 index returns and the standard deviations of ARCH(5) and GARCH(2, 2) are positive in both the pre-crisis and current crisis period, indicating the success of these two models in predicting future risk. The findings are in line with an earlier crisis-effect study (Choudhry, 1996), which revealed that the 1987 stock market crash impacted ARCH parameters and volatility persistence in some emerging markets.
The present study has attempted to evaluate the predictability of volatility patterns in Spain’s stock market in the crisis period; however, it has certain limitations. Further investigations can be carried out applying more versions of the ARCH model in order to gauge an explicit view of risk predictability in financial markets. Further studies, covering more crisis periods, could further validate the results.

Bibliography

Alberg, D., Shalit, H. and Yosef, R. (2008). Estimating Stock Market Volatility Using Asymmetric GARCH Models. Applied Financial Economics 18(15), 1201–1208, DOI: 10.1080/09603100701604225.
Apergis, N. and Eleptheriou, S. (2001). Stock Returns and Volatility: Evidence from the Athens Stock Market Index. Journal of Economics and Finance 25(1), 50–61.
Arora, R.K., Das, H. and Jain, P.K. (2009). Stock Returns and Volatility: Evidence from Select Emerging Markets. Review of Pacific Basin Financial Markets and Policies 12(4), 567–592.
Baillie, R. and DeGennaro, R. (1990). Stock Returns and Volatility. Journal of Financial and Quantitative Analysis 25, 203–214.
Bera, A.K. and Higgins, M.L. (1993). ARCH Models: Properties, Estimation and Testing. Journal of Economic Surveys 7(4), 305–362.
Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroscedasticity. Journal of Econometrics 31, 307–327.
Campbell, J.Y. and Hentschel, L. (1992). No News is Good News: An Asymmetric Model of Changing Volatility in Stock Returns. Journal of Financial Economics 31, 281–318.
Chowdhury, I. and Sarno, L. (2004). Time-Varying Volatility in the Foreign Exchange Market: New Evidence on its Persistence and on Currency Spillovers. Journal of Business Finance and Accounting 31, 759–793.
Cochran, S.J., Heck, J.L. and Shaffer, D.R. (2003). Volatility in World Equity Markets. Review of Pacific Basin Financial Markets and Policies 6(3), 273–290.
Chiou, W.J.P. (2009). Variation in Stock Return Risks: An International Comparison. Review of Pacific Basin Financial Markets and Policies 12(2), 245–266.
Choudhry, T. (1996). Stock Market Volatility and the Crash of 1987: Evidence From Six Emerging Markets. Journal of International Money and Finance 15(6), 969–981.
Chou, R. (1988). Volatility Persistence and Stock Valuations: Some Empirical Evidence Using GARCH. Journal of Applied Econometrics 3, 279–294.
Chou, W.L., Fan Denis, K.K. and Lee, C.F. (1996). Hedging with the Nikkei Index Futures: The Conventional Model versus the Error Correction Model. Quarterly Review of Economics and Finance 36(4), 495–505.
Dennis, P., Mayhew, S. and Stivers, C. (2007). Stock Returns, Implied Volatility Innovations, and the Asymmetric Volatility Phenomenon. Journal of Financial and Quantitative Analysis 41(2), 381–406.
Dooley, M., Dornbusch, R. and Park, Y.C.
(2002). A Framework for Exchange Rate Policy in Korea. Korea Institute for International Economic Policy, Working Paper 02-02.
Engle, R.F. (1982). Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of UK Inflation. Econometrica 50, 987–1008.
Goretti, M. (2005). The Brazilian Currency Turmoil of 2002: A Non-linear Analysis. International Journal of Finance and Economics 10, 289–306.
Hansen, P.R. and Lunde, A. (2001). A Comparison of Volatility Models: Does Anything Beat a GARCH(1, 1)? Working Paper Series No. 84, Center for Analytical Finance.
Jiang, G.J., Xu, D. and Yao, T. (2009). The Information Content of Idiosyncratic Volatility. Journal of Financial and Quantitative Analysis 44(1), 1–28.
Johnston, K. and Scott, E. (2000). GARCH Models and the Stochastic Process Underlying Exchange Rate Price Changes. Journal of Financial and Strategic Decisions 13, 13–24.
Kim, D. and Kon, S. (1994). Alternative Models for the Conditional Heteroscedasticity of Stock Returns. Journal of Business 67, 563–598.
Lee, C.-F., Chen, G.-M. and Rui, O.M. (2001). Stock Returns and Volatility on China’s Stock Markets. Journal of Financial Research 24(4), 523–544.
Leon, A., Rubio, G. and Serna, G. (2005). Autoregressive Conditional Volatility, Skewness and Kurtosis. Quarterly Review of Economics and Finance 45, 599–618.
Marcucci, J. (2005). Forecasting Stock Market Volatility with Regime-Switching GARCH Models. Studies in Nonlinear Dynamics & Econometrics 9(4), doi:10.2202/1558-3708.1145.
McMillan, D.G. and Speight, E.H. (2006). Volatility Dynamics and Heterogeneous Markets. International Journal of Finance and Economics 11, 115–121.
Nobuyoshi, A. (2001). Exchange Rate Policy of Russia: Lessons to Learn from Russian Experience. Economic and Social Resource Institute, 1–29.
Okimoto, T. (2008). New Evidence of Asymmetric Dependence Structures in International Equity Markets. Journal of Financial and Quantitative Analysis 43(3), 787–815.
Rahman, S., Lee, C. and Ang, K.P. (2002). Review of Quantitative Finance and Accounting 19, 155, https://doi.org/10.1023/A:1020683012149.
Web Resources
http://www.bis.org/press
http://www.bolsamadrid.es
http://www.imf.org
http://www.finance.yahoo.com
http://www.bolsasymercados.es
Chapter 108
How Many Good and Bad Funds are There, Really?

Wayne Ferson and Yong Chen

Contents
108.1 Introduction ... 3754
108.2 The Model ... 3758
108.2.1 Estimation by simulation ... 3759
108.2.2 Using the model ... 3762
108.2.3 Relation to the classical FDR method ... 3762
108.3 Data ... 3765
108.4 Simulation Exercises ... 3768
108.4.1 Simulation details ... 3768
108.4.2 Finite sample properties ... 3769
108.4.3 Empirical power evaluation ... 3774
108.5 Empirical Results ... 3778
108.5.1 Mutual funds ... 3778
108.5.2 Hedge funds ... 3780
108.5.3 Joint estimation ... 3782
108.5.4 Rolling estimation ... 3785
108.6 Robustness ... 3790
108.6.1 Pattern of missing values ... 3790
108.6.2 Choice of goodness-of-fit criterion ... 3791
108.6.3 Are the alphas correlated with active management? ... 3792
108.6.4 Alternative alphas ... 3793
108.6.5 Return smoothing ... 3793
108.7 Conclusions ... 3793
Bibliography ... 3795
Appendix 108A Standard Errors ... 3797
Appendix 108B More on the Relation to Previous Approaches ... 3800
Appendix 108C A Two-Distribution Model ... 3802
Appendix 108D Kernel Smoothing ... 3803
Appendix 108E Robustness Results Details ... 3803
Appendix 108F Simulations of the Simulations ... 3811
Appendix 108G Trading Strategies ... 3817
Appendix 108H The Impact of Missing Data Values on the Simulations ... 3823
Appendix 108I Analysis of the Impact of Variation in the δ and β Parameters on the Simulations ... 3825
Appendix 108J Alternative Goodness of Fit Measures ... 3826

Wayne Ferson, University of Southern California, e-mail: [email protected]
Yong Chen, Texas A&M University, e-mail: [email protected]
Abstract

Building on the work of Barras, Scaillet and Wermers (BSW, 2010), we propose a modified approach to inferring performance for a cross-section of investment funds. Our model assumes that funds belong to groups of different abnormal performance, or alpha. Using the structure of the probability model, we simultaneously estimate the alpha locations and the fractions of funds for each group, taking multiple testing into account. Our approach allows for tests with imperfect power that may falsely classify good funds as bad, and vice versa. Examining both mutual funds and hedge funds, we find smaller fractions of zero-alpha funds and more funds with abnormal performance, compared with the BSW approach. We also use the model as prior information about the cross-section of funds to evaluate and predict fund performance.

Keywords: Hedge fund • Mutual fund • Fund performance • False discovery rates • Bayes rule • Bootstrap • Goodness of fit • Test power • Trading strategies • Kernel smoothing.
108.1 Introduction

A big problem for studies that empirically examine a cross-section of investment funds is separating true performance from luck. This is inherently a problem of classifying or grouping the funds. We contribute to the literature by further developing an approach for grouping fund alphas, motivated by Barras, Scaillet and Wermers (BSW, 2010). BSW evaluate fund performance
accounting for multiple comparisons in the cross-section of mutual funds. To illustrate the multiple comparisons problem, consider a large cross-section where 4000 funds are evaluated and a 10% test size is used. We may expect 400 funds to record abnormal performance even if all of the funds have zero alphas and no future performance is expected. BSW describe a model in which the population of funds consists of three subpopulations. A fraction of funds, π0, have zero alphas, while a fraction πg of “good” funds have positive alphas and a fraction πb of “bad” funds have negative alphas. BSW estimate the fractions for mutual funds by adjusting for false discovery rates (FDR). The FDR is the fraction of “discoveries,” or rejections of the null hypothesis of zero alpha when it is correct (Storey, 2002). Their estimate of πg is the fraction of funds where the null hypothesis that alpha is zero is rejected in favor of a good fund, minus the expected FDR among the zero-alpha funds. We refer to this approach in our paper as the “classical” FDR method. BSW’s main estimates for mutual funds are π0 = 75% and πg = 1%, using data up to 2006. A number of subsequent studies have applied their approach to mutual funds and hedge funds.1 In this paper, by modifying the classical FDR method, we propose an approach to estimate the fractions of funds in different alpha groups based on the structure of the probability model. Our model assumes that a fund’s performance is drawn from a mixture of distributions, where there are fixed expected alpha values (to be estimated) and certain fractions of the funds belong to each group. Unlike the classical FDR method, which does not use the locations of the fund groups, we use the location information as part of the model.2 Using simulations, we estimate the alpha locations of the fund groups and the fractions of the funds in each group simultaneously, while accounting for multiple comparisons.
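The multiple-comparisons arithmetic above can be sketched in a few lines. The π0 estimator below follows Storey (2002), on which the classical FDR method builds; the p-values are simulated for illustration only:

```python
import random

def expected_false_discoveries(n_funds, test_size, pi0=1.0):
    """Expected number of zero-alpha funds rejected purely by chance."""
    return pi0 * n_funds * test_size

def storey_pi0(pvalues, lam=0.5):
    """Storey's (2002) estimator of the fraction of true nulls: p-values of
    zero-alpha funds are uniform on [0, 1], so the mass above `lam`
    identifies pi0 (up to sampling error)."""
    tail = sum(1 for p in pvalues if p > lam)
    return min(1.0, tail / (len(pvalues) * (1.0 - lam)))

# With 4000 funds and a 10% test size, chance alone "discovers" 400 funds:
print(expected_false_discoveries(4000, 0.10))  # -> 400.0

# Hypothetical p-values: 3000 true nulls (uniform) and 1000 good funds
# whose p-values concentrate near zero, so the true pi0 is 0.75.
rng = random.Random(0)
pvals = ([rng.random() for _ in range(3000)] +
         [rng.random() * 0.05 for _ in range(1000)])
print(f"estimated pi0: {storey_pi0(pvals):.2f}")  # close to the true 0.75
```

The classical method then subtracts the implied false discoveries from the raw rejection counts to estimate the fractions of good and bad funds.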
Our approach accounts for imperfect test power and the possibility of confusion, where tests with imperfect power can mistake a truly bad fund
1 Cuthbertson et al. (2012) apply the BSW approach to UK mutual funds. Criton and Scaillet (2014) and Ardia and Boudt (2018) apply the method to hedge funds, and Dewaele et al. (2011) apply it to funds of hedge funds. Romano et al. (2008) also present a small hedge fund example. Bajgrowicz and Scaillet (2012) apply false discovery methods to a large sample of technical trading rules.
2 BSW do compute the locations of the nonzero-alpha funds and explore the locations through the tails of the t-ratio distributions, but they do not simultaneously estimate the locations and the fractions of funds in each group. In an example, they estimate the alpha location of the good funds by finding the noncentrality parameter of a noncentral t-distribution that matches the expected fractions of funds rejected (see their Internet Appendix).
Handbook of Financial Econometrics,. . . (Vol. 4)
W. Ferson & Y. Chen
(alpha < 0) for a good fund (alpha > 0) or vice versa. Hence, our estimates of the fractions of funds in each group adjust for both false discovery rates (i.e., type I error) and low test power (i.e., type II error), and we show how the benefit of this adjustment depends on the power of the tests as well as the fractions of funds. Our approach reduces to the BSW estimator when the power of the tests is 100%. However, when the test power is low (as in the tests for hedge funds), our approach can deliver more accurate inferences.

In addition, we show that the structure of our model helps to better separate skill from luck in the cross-section of fund performance. We estimate the probability that a given fund has a positive alpha, using information from the entire cross-section of funds. For example, if the model says that almost no funds have positive alphas, the chances that a particular fund with a positive alpha estimate was just lucky are much higher than if the model says many funds have positive alphas.3 In simulations with known fractions of zero and positive-alpha funds, our method detects the positive-alpha funds more reliably than the classical FDR method. In this sense, our method has improved "power" to detect good funds.

We apply our approach to both active US equity mutual funds during January 1984–December 2011 and hedge funds during January 1994–March 2012.4 For mutual funds, during our sample period the classical estimator suggests that about 72% of the mutual funds have zero alphas. Our model implies that about 51% of the mutual funds have zero alphas, about 7% are bad (alphas of −0.03% per month) and the rest are "ugly" (alphas of −0.20% per month). In our sample of hedge funds, the classical estimate of π0 is 76%. In contrast, we estimate that very few of the hedge funds' alphas are zero. The best models imply that more than 50% of the hedge funds are good, with
3 Formally, we compute the posterior probability of a positive alpha using Bayes rule, where the cross-section of funds, as characterized through the probability model, represents the prior. Previous studies that use Bayesian methods for fund performance measurement and fund selection include Brown (1979), Baks, Metrick, and Wachter (2001), Pástor and Stambaugh (2002), Jones and Shanken (2005), and Avramov and Wermers (2006), among others. Our application is different from the earlier studies because our prior reflects multiple skill distributions for subpopulations of different skill levels.
4 Following BSW, we measure net-of-fee fund alpha based on benchmark factor returns so that we can compare the inferences between the two approaches. Our approach can be applied to other fund performance measures, such as holdings-based measures (Daniel et al., 1997), the stochastic discount factor alpha (Farnsworth et al., 2002), the measure of value added (Berk and van Binsbergen, 2015), and the gross alpha (Pástor, Stambaugh, and Taylor, 2015).
alphas centered around 0.25% per month, and most of the others are bad, with alphas centered around −0.11% per month.

The basic probability model assumes that funds are drawn from one of three distributions centered at different alphas. We find that models with three subpopulations fit the cross-section of funds' alpha t-ratios better than models in which there is only a single group, or in which there are two groups. In principle, the approach can be extended to many different groups. However, in the three-alpha model joint estimation reveals that there are linear combinations of the nonzero alpha values that produce a similar fit for the data, indicating that a three-group model is likely all that is needed.

We estimate the models on 60-month rolling windows to examine trends in the parameters over time, and we use rolling formation periods to assess the information in the model about future fund returns. In our hedge fund sample, the difference between the Fung and Hsieh (2004) alphas of the good and bad fund groups in the first year after portfolio formation is about 7% per year, with a t-ratio of 2.3. The rolling window parameters show worsening performance over time for the good mutual funds and hedge funds, while the alphas of the bad funds are relatively stable over time. Much of the potential investment value for mutual funds comes from avoiding bad funds, whereas for hedge funds there is value in finding good funds.

While the classical FDR method is essentially nonparametric, with no assumption about the distribution of fund performance, our approach uses the structure of the probability model. It is worth noting the tradeoff between these two approaches. The classical method is flexible about the performance distribution, but it may have low power and underestimate πg and πb. Our parametric approach improves the power when the probability model is reasonable, but its accuracy depends on the probability model being correctly specified.
In Section 108.2, we explain the economic reasoning behind the probability model used in our approach. The approach here also generalizes studies such as Kosowski et al. (2006) and Fama and French (2010), who bootstrap the cross-section of mutual fund alphas. In those studies, all of the inferences are conducted under the null hypothesis of zero alphas, so there is only one group of funds. The analysis is directed at the hypothesis that all funds have zero alphas, accounting for the multiple hypothesis tests. The current approach also accounts for multiple hypothesis tests, but allows that some of the funds have nonzero alphas.

Related to our paper, Chen, Cliff and Zhao (2017) consider a parametric model in which there are groups of hedge funds, with each group's alpha drawn from a normal distribution with a different mean and standard deviation. The unconditional distribution in their model is a mixture of
normals. They use a modified EM algorithm to estimate the locations of the groups and the fractions of hedge funds in each group. Subsequent work by Harvey and Liu (2018) develops a full-blown maximum likelihood approach to estimate the parameters. Chen, Cliff and Zhao use a larger sample of hedge funds that includes more strategies than our sample, and they find that a larger number of groups best fits their data: four groups of hedge funds, while we find three. Unlike our approach, these two studies do not explicitly examine the effect of imperfect test power and confusion on the inferences.

This paper is organized as follows. Section 108.2 describes the model and its estimation. Section 108.3 describes the data. Section 108.4 describes our simulation methods and presents a series of simulation experiments to evaluate the models. Section 108.5 presents the empirical results. Section 108.6 discusses robustness checks and Section 108.7 concludes. The Appendix describes our approach to the standard errors. An Internet Appendix provides technical details and ancillary results.
108.2 The Model

The model assumes that the mutual funds or the hedge funds are members of one of three subpopulations. This is appealing, as grouping is common when evaluating choices on the basis of quality that is hard to measure. For example, Morningstar rates mutual funds into "star" groups. Security analysts issue buy, sell and hold recommendations. Academic journals are routinely categorized into groups, as are firms' and nations' creditworthiness, restaurants, hotels, etc. For investment funds, a structure of three groups associated with zero, negative and positive alphas seems natural.

That some funds should have zero alphas is predicted by Berk and Green (2004). Under decreasing returns to scale, new money should flow to positive-ability managers until the performance left for competitive investors is zero. There are also many reasons to think that some funds may have either positive or negative alphas. Frictions (e.g., taxes, imperfect information, agency costs, and cognitive errors) may keep investors from quickly pulling their money out of bad funds. It is also natural to hope for funds with positive alphas, and these funds could be available because costs slow investors' actions to bid them away. A number of empirical studies find evidence that subsets of funds may have positive alphas. Jones and Mo (2016) identify more than 20 fund characteristics that previous studies associate with alphas.
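The three-group structure can be made concrete with a small simulation of the mixture. This is a toy sketch, not the chapter's procedure: the fractions, alpha locations, volatility, and sample sizes below are made-up illustrative values, and no factor model is fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up parameters: fractions and monthly alpha locations (in percent)
# for the bad, zero-alpha, and good subpopulations.
pi = {"bad": 0.23, "zero": 0.75, "good": 0.02}
alpha = {"bad": -0.25, "zero": 0.0, "good": 0.25}
sigma, n_funds, n_months = 1.5, 4000, 120

# Assign each fund to a group, then draw its monthly returns around the
# group's alpha: the cross-section is a mixture of the three distributions.
groups = rng.choice(list(pi), size=n_funds, p=list(pi.values()))
true_alpha = np.array([alpha[g] for g in groups])
returns = true_alpha[:, None] + sigma * rng.standard_normal((n_funds, n_months))

# Per-fund alpha estimate and t-ratio (no factor model in this toy version).
a_hat = returns.mean(axis=1)
t_stat = a_hat / (returns.std(axis=1, ddof=1) / np.sqrt(n_months))

# With a 10% two-sided size, the fraction of funds "discovered" exceeds
# the 10% that pure luck would produce, because some alphas are nonzero;
# yet many of the discoveries among zero-alpha funds are still false.
discovered = (np.abs(t_stat) > 1.645).mean()
```

The point of the sketch is that the observed rejection rate mixes true discoveries with false ones, which is exactly what the estimation below has to disentangle.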
[Figure: densities of the funds' alpha t-ratios under the null hypothesis H0: α = 0 and under the alternatives H: α > 0 and H: α < 0, with the critical values tb and tg marked on the t axis.]

The three subpopulations correspond to [H: α < 0, H0: α = 0, H: α > 0]. A fraction of the funds belongs to each subpopulation. The fractions, which sum to 1.0, are [πb, π0, πg]. The unconditional distribution of funds' alphas is a mixture of the three distributions.

108.2.1 Estimation by simulation

The set of unknown parameters is θ = [πg, πb, αg, αb, βg, βb, δg, δb], where the last four are the power and confusion parameters of the tests. The parameters are estimated in three stages, as follows. In the first stage, for given values of [αb, αg], we use three simulations to estimate the parameters [βb, βg, δb, δg]. We set the size of the tests, (γ/2), to say 10% in each tail. The first simulation of the cross-section of fund alphas, imposing the null hypothesis that all of the alphas are zero, produces two critical values for the t-statistics, tg and tb. The critical value tg is the t-ratio above which 10% of
the simulated t-statistics lie when the null of zero alphas is true. The critical value tb is the value below which 10% of the simulated t-statistics lie under the null hypothesis of zero alphas.

The second simulation imposes the alternative hypothesis that funds are good; that is, the alphas are centered at the value αg > 0. The fraction of the simulated t-ratios above tg is the power of the test for good funds, βg. The fraction of the simulated t-ratios below tb is an empirical estimate of the probability of rejecting the null in favor of finding a bad fund when the fund is actually good. This is the confusion parameter, δb.5

The third simulation adopts the alternative hypothesis that funds are bad; that is, the alphas in the simulation are centered at the value αb < 0. The fraction of the simulated t-ratios below tb is the power of the test to find bad funds, βb. The fraction of the simulated t-ratios above tg is the confusion parameter, δg. Our approach says that a good or bad fund has a single value of alpha. However, it should be robust to a model where the good and bad alphas are random and centered around the values (αg, αb).6

The second stage of our procedure combines the simulation estimates with results from the cross-section of funds in the actual data as follows. Let Fb and Fg be the fractions of rejections of the null hypothesis in the actual data using the simulation-generated critical values, tb and tg. We model

E(Fg) = P(reject at tg | H0)π0 + P(reject at tg | Bad)πb + P(reject at tg | Good)πg = (γ/2)π0 + δg πb + βg πg,
(108.1)
and similarly

E(Fb) = P(reject at tb | H0)π0 + P(reject at tb | Bad)πb + P(reject at tb | Good)πg = (γ/2)π0 + βb πb + δb πg.

(108.2)

5 Formally, δb is one corner of the 3 × 3 probabilistic confusion matrix characterizing the tests. See Das (2013, p. 148) for a discussion.
6 Suppose, for example, that in the simulation we drew for each good fund a random true alpha, αpTRUE, equal to αg plus mean zero independent noise. In order to match the sample mean and variance of the fund's return in the simulation to that in the data, we would reduce the variance of {rpt − αpTRUE} by the amount of the variance of the noise in αpTRUE around αg. The results for the cross-section should come out essentially the same.
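The first two stages can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: it replaces the bootstrap with i.i.d. normal fund returns, and every numeric input, including the "observed" rejection fractions Fg and Fb, is made up.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
gamma = 0.20                    # total test size; gamma/2 = 10% in each tail
T, sigma = 120, 1.5             # months per fund, residual volatility (made up)
alpha_g, alpha_b = 0.25, -0.25  # candidate alpha locations, % per month (made up)

def sim_t(alpha, n=50_000):
    """t-ratios of estimated alphas for n funds whose true alpha is `alpha`."""
    r = alpha + sigma * rng.standard_normal((n, T))
    return r.mean(axis=1) / (r.std(axis=1, ddof=1) / np.sqrt(T))

# Stage 1: critical values under the null, then the power and confusion
# parameters under the two alternatives.
t0 = sim_t(0.0)
t_g, t_b = np.quantile(t0, 1 - gamma / 2), np.quantile(t0, gamma / 2)
beta_g, delta_b = (sim_t(alpha_g) > t_g).mean(), (sim_t(alpha_g) < t_b).mean()
beta_b, delta_g = (sim_t(alpha_b) < t_b).mean(), (sim_t(alpha_b) > t_g).mean()

# Stage 2: solve equations (108.1)-(108.2) for (pi_b, pi_g) by minimizing
# the squared errors subject to pi_b >= 0, pi_g >= 0, pi_b + pi_g <= 1.
Fg, Fb = 0.12, 0.18             # rejection fractions in the "actual" data (made up)

def sse(p):
    pb, pg = p
    p0 = 1.0 - pb - pg
    eg = (gamma / 2) * p0 + delta_g * pb + beta_g * pg - Fg   # eq. (108.1)
    eb = (gamma / 2) * p0 + beta_b * pb + delta_b * pg - Fb   # eq. (108.2)
    return eg**2 + eb**2

res = minimize(sse, x0=[0.1, 0.1], bounds=[(0, 1), (0, 1)],
               constraints=[{"type": "ineq", "fun": lambda p: 1.0 - p[0] - p[1]}])
pi_b, pi_g = res.x
pi_0 = 1.0 - pi_b - pi_g
```

With constraints present, scipy selects the SLSQP method; since the objective is a sum of squared linear functions of (πb, πg), it is convex and the constrained minimum is found reliably.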
Equations (108.1) and (108.2) present two equations in the two unknowns πb and πg (since π0 = 1 − πb − πg), which we can solve for given values of [βg, βb, δg, δb, Fg, Fb]. The solution is found numerically, by minimizing the squared errors of equations (108.1) and (108.2) subject to the Kuhn–Tucker conditions for the constraints πb ≥ 0, πg ≥ 0 and πb + πg ≤ 1. We estimate E(Fg) and E(Fb) by the fractions rejected in the actual data, and we calibrate the parameters [βg, βb, δg, δb] from the simulations. We assume that with enough simulation trials, we can nail down the β and δ parameters with zero error. The impact of this assumption is addressed in the Appendix, where we conclude that variation in these parameters across the simulation trials has a trivial impact on the results.

The estimates of the π's that result from the second stage are conditioned on the values of [αg, αb]. The third stage of our approach is to search for the best-fitting values of the alphas. The search proceeds as follows. Each choice for the alpha values generates estimates π(α) for the fractions. At these parameter values, the model implies a mixture of distributions for the cross-section of fund returns. We simulate data from the implied mixture of distributions, and we search over the choice of alpha values and the resulting π(α) estimates (repeating the first two stages at each point in the search grid) until the simulated mixture distribution generates a cross-section of estimated fund alpha t-ratios that best matches the cross-sectional distribution of alpha t-ratios estimated in the actual data. We determine the best match using the familiar Pearson χ2 statistic as the criterion:

Pearson χ2 = Σi (Oi − Mi)2/Oi, (108.3)
where the sum is over K cells, Oi is the frequency of t-statistics for alpha that appear in cell i in the original data, and Mi is the frequency of t-statistics that appear in cell i using the model; the null hypothesis is that the model frequencies match those of the original data. The Pearson statistic requires choosing the cell sizes. We choose K = 100 cells, with the cell boundaries set so that an approximately equal number of t-ratios in the original data appear in each cell (i.e., Oi ≈ N/100). The Pearson statistic may also be affected by the fact that the alpha t-ratios are estimates, so that the estimation error creates an errors-in-variables problem. In a robustness section we address these issues by examining other goodness-of-fit measures, and find that our results are robust to alternative measures.

In summary, the fractions of actual funds discovered to be good or bad at the simulation-generated critical values determine the fractions of funds
in each group according to equations (108.1) and (108.2), given the locations of the groups. This is a modification of the classical FDR framework, accounting for the power and confusion parameters. Unlike the FDR estimator, which is nonparametric, our approach is based on parametric simulations. We use three simulations and one overall goodness-of-fit measure to identify the four main parameters of our model, [πg, πb, αg, αb], taking the power and confusion parameters of the tests as given.

108.2.2 Using the model

We use the information from the cross-section of funds and the estimated parameters of the probability model to draw inferences about individual funds. Given a fund's alpha estimate αp, we use Bayes rule to compute:

P(α > 0 | αp) = f(αp | α > 0)πg/f(αp), (108.4)

f(αp) = f(αp | α = 0)π0 + f(αp | α > 0)πg + f(αp | α < 0)πb. (108.5)
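Equations (108.4) and (108.5) are direct to implement once the three conditional densities are in hand. A minimal sketch, assuming Gaussian kernel density estimates fit to simulated t-ratios (the group locations, fractions, and sample sizes below are made-up stand-ins for the paper's bootstrap draws):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)

# Made-up inputs: estimated group fractions, and simulated alpha t-ratios
# for each subpopulation.
pi0, pig, pib = 0.75, 0.05, 0.20
t_zero = rng.standard_normal(20_000)          # H0: alpha = 0
t_good = 2.0 + rng.standard_normal(20_000)    # centered at the good location
t_bad = -2.0 + rng.standard_normal(20_000)    # centered at the bad location

# Kernel density estimates of the conditional densities f(t | group).
f0, fg, fb = gaussian_kde(t_zero), gaussian_kde(t_good), gaussian_kde(t_bad)

def posterior_good(t):
    """Equation (108.4): P(alpha > 0 | t) by Bayes rule on the mixture."""
    mix = pi0 * f0(t) + pig * fg(t) + pib * fb(t)   # equation (108.5)
    return pig * fg(t) / mix

# A fund with a strongly positive t-ratio is probably, but not surely,
# good; near t = 0 the posterior is tiny because pi_g is small.
p_high = posterior_good(np.array([3.0]))[0]
p_zero = posterior_good(np.array([0.0]))[0]
```

The small prior weight πg is what keeps the posterior below one even for a large positive t-ratio, which is the "lucky fund" effect described in the text.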
The densities f(αp | α = 0), f(αp | α > 0) and f(αp | α < 0) are the empirical conditional densities of alpha estimates from the three subpopulations of funds. These are estimated by simulation and fit using a standard kernel density estimator for f(·), as described in the Internet Appendix. We actually use the t-ratios of the alphas instead of the alpha estimates in our simulations, because the t-ratio is a pivotal statistic.

The inference for a given fund reflects the estimated fractions of funds in each subpopulation. For example, as the prior probability, πg, that there are good funds approaches zero, the posterior probability that the particular fund has a positive alpha, given its point estimate αp, approaches zero. This captures the idea that if there are not many good funds, a particular fund with a positive alpha estimate is likely to have been lucky. The inference also reflects the locations of the groups through the likelihood that the fund's alpha estimate could be drawn from the subpopulation of good funds. If the likelihood f(αp | α > 0) is small, the fund is less likely to have a positive alpha than if f(αp | α > 0) is large. This uses information about the position of the fund's alpha estimate relative to the other funds in a category.

108.2.3 Relation to the classical FDR method

The classical false discovery rate estimator makes the assumption that the fraction of funds not rejected in the data is the fraction of zero-alpha funds, multiplied by the probability that the test will not reject when the null is true. The estimator πg,C adjusts the fraction of funds found to be good by
the expected fraction of false discoveries among the zero-alpha funds. The idea is that if the size of the test is large enough, as suggested by Storey (2002) and BSW, then all the good funds will have alpha t-statistics larger than the small critical value that results. The Internet Appendix shows that our estimators of the π fractions, derived by solving equations (108.1) and (108.2), reduce to the classical estimators from Storey (2002) and BSW when βg = βb = 1 and δg = δb = 0. The classical FDR estimators of the fractions are:

π0,C = [1 − (Fb + Fg)]/(1 − γ); πg,C = Fg − (γ/2)π0,C. (108.6)
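Equation (108.6) is simple to compute. In the sketch below, the πb,C line is the symmetric counterpart for bad funds, which is our addition rather than a formula stated in the text; the inputs are made-up rejection fractions.

```python
def classical_fdr(Fg, Fb, gamma):
    """Classical FDR fractions, equation (108.6); pi_b,C is the symmetric
    counterpart for bad funds (an assumption, not stated in the text)."""
    pi0_c = (1.0 - (Fb + Fg)) / (1.0 - gamma)
    pig_c = Fg - (gamma / 2.0) * pi0_c
    pib_c = Fb - (gamma / 2.0) * pi0_c
    return pi0_c, pig_c, pib_c

# Example: 12% of funds rejected as good, 18% as bad, at a 20% total size.
pi0_c, pig_c, pib_c = classical_fdr(Fg=0.12, Fb=0.18, gamma=0.20)
```

Note that the estimator uses only the rejection fractions and the test size; it makes no reference to the alpha locations, which is exactly the property discussed next.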
The assumption that βg = βb = 1 says that the tests have 100% power, an assumption that would bias the estimates toward finding too many zero-alpha funds in the presence of imperfect test power. Storey (2002) motivates β = 1 as a "conservative" choice, justified by choosing the size of the tests to be large enough.7 This bias, however, can be important when the power of the tests is low. Low-power tests have long been seen as a problem in the fund performance literature. The potential bias of the classical estimator can be measured by combining equations (108.1), (108.2) and (108.6) as follows:

E(π0,C) = [1 − E(Fb) − E(Fg)]/(1 − γ)
= {1 − [(γ/2)π0 + δg πb + βg πg] − [(γ/2)π0 + βb πb + δb πg]}/(1 − γ)
= [π0(1 − γ) + πg(1 − βg − δb) + πb(1 − βb − δg)]/(1 − γ)
= π0 + [πg(1 − βg − δb) + πb(1 − βb − δg)]/(1 − γ). (108.7)

The second term in the last line of equation (108.7) captures the bias under the structure of the probability model. If we assume βg = βb = β and δg = δb = δ for ease of illustration, then the bias becomes (1 − π0)(1 − β − δ)/(1 − γ). The bias is likely to be small if the true fraction of zero-alpha funds is large or the power of the tests is close to 100%. Intuitively, the

7 BSW, following Storey (2002), search using simulations for the size of the test, γ, that solves Minγ E{[π0(γ) − Minγ π0(γ)]2}, where π0(γ) is the estimate that results from equation (108.6) when γ determines the fractions rejected. This step reduces the estimate of π0 and its bias. For this reason, we refer to estimates based on equation (108.6), but without this additional estimation step, as the "classical" FDR estimator. BSW find that using fixed values of γ/2 near 0.25–0.30 without the additional minimization produces similar results to estimates that minimize the mean squared errors. We use these sizes in the classical estimators below, without the mean square error minimization. In our paper, the value γ is also used in equation (108.6) as the threshold (denoted λ in Storey (2002) and BSW) to estimate π0,C, as BSW (2010, p. 189) suggest a similarly large value (such as 0.5 or 0.6) for the threshold.
accuracy of the classical estimator depends on a large π0 or a large β. However, in simulations and in empirical analysis of actual funds, we find that the bias can be substantial when the fraction of zero-alpha funds is small and the test power is low, as in the case of hedge funds. Our approach delivers more accurate inferences by accounting for imperfect test power.

More specifically, our analysis in the Appendix reveals two offsetting biases in the classical estimator. Assuming perfect power (i.e., setting βg = βb = 1) biases π0 upwards, while assuming that the tests will never confuse a good and bad fund (i.e., setting δg = δb = 0) biases π0 downwards. Our simulations show that the upward bias from assuming perfect power dominates, and the classical estimator finds too many zero-alpha funds.

The π0,C estimator does not rely on the locations of the good and bad funds, [αg, αb], in the sense that it only considers the null distributions. Thus, it is likely to be robust to the structure of the alpha distributions. For example, if the true distribution is multimodal but we assume too small a number of groups, our results can be biased due to model misspecification. The choice between the two approaches trades off the robustness and smaller sampling variability of the classical method against the smaller bias and greater power of our approach, which uses more of the structure of the probability model. Our estimators use the structure of the probability model, including the β's and δ's, which are functions of the alpha locations under the alternative hypotheses. Like BSW, our estimator of πg adjusts the fraction of funds discovered to be good for false discoveries among zero-alpha funds.
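The bias term in equation (108.7) is easy to evaluate numerically. A small illustration with made-up parameter values:

```python
def pi0_bias(pig, pib, beta_g, beta_b, delta_g, delta_b, gamma):
    """Second term in the last line of equation (108.7): the bias in the
    classical estimate of pi_0 implied by the probability model."""
    return (pig * (1 - beta_g - delta_b) + pib * (1 - beta_b - delta_g)) / (1 - gamma)

# High power: the classical estimator is nearly unbiased for pi_0.
small = pi0_bias(pig=0.2, pib=0.3, beta_g=0.95, beta_b=0.95,
                 delta_g=0.0, delta_b=0.0, gamma=0.2)
# Low power and some confusion, as in tests for hedge funds: pi_0 is
# overstated by a large margin.
large = pi0_bias(pig=0.2, pib=0.3, beta_g=0.40, beta_b=0.40,
                 delta_g=0.02, delta_b=0.02, gamma=0.2)
```

Holding the true fractions fixed, dropping the power from 95% to 40% moves the bias from a few percentage points to well over thirty, which is the pattern the text describes for hedge funds.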
Moreover, allowing for imperfect test power, we control for cases where the tests are confused, and "very lucky" funds with negative alphas are found to have significant positive performance.8 In summary, there is a tradeoff between the two approaches in inferring the fractions of funds in each alpha group. The classical FDR method is a good choice if the fraction of zero-alpha funds is known to be large and the power of the tests is high, or if the distribution of fund performance is a complex mixture of distributions with different moments.9 On the other hand, our estimator is ideal if the structure of the probability model (e.g., a

8 The Internet Appendix derives the false discovery rates from our model, relates them to previous studies and applies the methods in a trading strategy.
9 As another useful feature of the classical method, the FDR theory provides asymptotic results that can be used for statistical inference (e.g., Genovese and Wasserman, 2002, 2004).
three-alpha distribution) is justified by economic theory, in which case our parametric approach improves power and provides more accurate inferences.

108.3 Data

We study mutual fund returns measured after expense ratios and funds' trading costs for January 1984–December 2011 from the Center for Research in Security Prices Mutual Fund database, focusing on active US equity funds. We subject the mutual fund sample to a number of screens to mitigate omission bias (Elton, Gruber, and Blake, 2001) and incubation bias (Evans, 2010). We exclude observations prior to the reported year of fund organization, and we exclude funds that do not report a year of organization or that have initial total net assets (TNA) below $10 million or less than 80% of their holdings in stock in their otherwise first eligible year to enter our data set. Funds that subsequently fall below $10 million in assets under management are allowed to remain, in order to avoid a look-ahead bias. We combine multiple share classes for a fund, focusing on the TNA-weighted aggregate share class.10 These screens leave us with a sample of 3619 mutual funds with at least 8 months of returns data.

Our hedge fund data are from Lipper TASS. We study only funds that report monthly net-of-fee US dollar returns, starting in January 1994. We focus on US equity oriented funds, including only those categorized as dedicated short bias, event driven, equity market neutral, fund-of-funds or long/short equity hedge. We require that a fund have more than $10 million in assets under management as of the first date the fund would otherwise be eligible to be included in our analysis. To mitigate backfill bias, we remove the first 24 months of returns and any returns before the dates when funds were first entered into the database, and we drop funds with missing values in the field for the add date. Our sample includes 3620 hedge funds over January 1994–March 2012.
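The entry screens described above can be expressed as a data-filtering step. The sketch below uses hypothetical column names (`fund_id`, `date`, `first_offer_year`, `tna_millions`, `pct_stock`); the actual CRSP and TASS layouts differ, so this is an illustration of the logic, not the authors' code.

```python
import pandas as pd

# Hypothetical column names; the actual CRSP and TASS layouts differ.
def screen_mutual_funds(df: pd.DataFrame) -> pd.DataFrame:
    """Entry screens sketched from the text: no pre-organization returns;
    funds enter once TNA >= $10M and stock holdings >= 80%, and then stay
    in the sample even if TNA later falls (avoiding a look-ahead bias)."""
    df = df[df["date"].dt.year >= df["first_offer_year"]]
    eligible = (df["tna_millions"] >= 10) & (df["pct_stock"] >= 80)
    entry = df[eligible].groupby("fund_id")["date"].min().rename("entry_date")
    df = df.join(entry, on="fund_id")
    return df[df["date"] >= df["entry_date"]]

def screen_hedge_funds(df: pd.DataFrame) -> pd.DataFrame:
    """Drop each fund's first 24 monthly returns to mitigate backfill bias."""
    df = df.sort_values(["fund_id", "date"])
    return df[df.groupby("fund_id").cumcount() >= 24]
```

The key design point is that eligibility is evaluated only at entry: once a fund's entry date is fixed, later observations are kept unconditionally.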
Table 108.1 presents summary statistics of the mutual fund and hedge fund data. We require at least 8 monthly observations of return to include

10 We identify and remove index funds both by Lipper objective codes (SP, SPSP) and by searching the funds' names for the key word "index." Our funds include those with Policy code (1962–1990) CS, Wiesenberger OBJ (1962–1993) codes G, G-I, G-I-S, G-S, G-S-I, I, IFL, I-S, I-G, I-G-S, I-S-G, S-G, S-G-I, S-I-G, GCI, IEQ, LTG, MCG, SCG, Strategic Insight OBJ code (1993–1998) AGG, GMC, GRI, GRO, ING, SCG, Lipper OBJ/Class code (1998–present) CA, EI, G, GI, MC, MR, SG, EIEI, ELCC, LCCE, LCGE, LCVE, LSE, MCCE, MCGE, MCVE, MLCE, MLGE, MLVE, SESE, SCCE, SCGE, SCVE and S.
Table 108.1: Summary statistics.

Panel A: Mutual Fund Returns: January 1984–December 2011 (336 months)

Fractile   Nobs    Mean     Std    Rho1   Alphas (%)   Alpha t-ratios
0.01        335    1.95    9.88    0.38      0.505          2.407
0.05        276    1.30    7.72    0.28      0.270          1.474
0.10        214    1.11    7.04    0.24      0.170          1.020
0.25        148    0.88    6.13    0.19      0.031          0.194
Median       89    0.62    5.35    0.12     −0.094         −0.665
0.75         39    0.33    4.71    0.00     −0.234         −1.511
0.90         26    0.04    4.15   −0.09     −0.415         −2.381
0.95         23   −0.18    3.68   −0.13     −0.573         −2.952
0.99         12   −1.25    2.50   −0.21     −1.109         −4.024

Panel B: Hedge Fund Returns: January 1994–March 2012 (219 months)

Fractile   Nobs    Mean     Std    Rho1   Alphas (%)   Alpha t-ratios
0.01        208    2.60   14.44    0.61      2.671         12.471
0.05        153    1.47    8.68    0.50      1.251          4.175
0.10        122    1.15    6.70    0.43      0.861          3.108
0.25         79    0.74    4.22    0.30      0.437          1.617
Median       46    0.37    2.62    0.16      0.112          0.428
0.75         25   −0.04    1.73    0.03     −0.210         −0.656
0.90         15   −0.63    1.21   −0.12     −0.732         −1.809
0.95         11   −1.25    0.95   −0.21     −1.356         −2.640
0.99          8   −3.20    0.52   −0.37     −3.697        −14.136

Notes: Monthly returns are summarized for mutual funds and hedge funds, stated in monthly percentage units. The values at the cutoff points for various fractiles of the cross-sectional distributions of the sample of funds are reported. Each column is sorted on the statistic shown. Nobs is the number of available monthly returns, where a minimum of 8 is required. Mean is the sample mean return and Std is the sample standard deviation of return, reported as monthly percentages; Rho1 is the first-order sample autocorrelation in raw units. The alpha estimates are based on OLS regressions using the Fama–French three factors for mutual funds (3619 mutual funds in the sample), and the Fung and Hsieh seven factors for hedge funds (3620 hedge funds in the sample). The alpha t-ratios are based on heteroskedasticity-consistent standard errors.
a fund in this table. The mean hedge fund return (0.37% per month) is smaller than the average mutual fund return (0.62%), but the longer sample for the mutual funds includes the high-return 1984–1993 period. The range of average returns across funds is much greater in the hedge fund sample, especially in the negative-return, left tail. A larger fraction of the hedge funds lose money for their investors, and the losses have been larger than in the mutual funds. The two right-hand columns of Panel A of Table 108.1 summarize the Fama–French (1996) three-factor alphas and their heteroskedasticity-consistent t-ratios for the mutual funds. For the hedge funds in Panel B, we
use the Fung and Hsieh (2001, 2004) seven factors. The Internet Appendix shows the results for hedge funds when the Fama and French factors are used.11 We use standard alphas, similar to BSW, in all of our analyses. The median alpha for the hedge funds is positive, while for the mutual funds it is slightly negative. The tails of the cross-sectional alpha distributions extend to larger values for the hedge funds. For example, the upper 5% tail value for the t-ratio of the alphas in the hedge fund sample is 4.18 (the alpha is 1.25% per month), while for the mutual funds it is only 1.47 (the alpha is 0.27%). In the left tails the two types of funds also present different alpha distributions, with a thicker lower tail for the alphas and t-ratios in the hedge fund sample. One of our goals is to see how these impressions of performance hold up when we consider multiple hypothesis testing and use bootstrapped samples to capture the correlations and departures from normality that are present in the data.

Table 108.1 shows that the sample volatility of the median hedge fund return (2.62% per month) is smaller than that of the median mutual fund (5.34%). The range of volatilities across the hedge funds is greater, with more mass in the lower tail. For example, between the 10% and 90% fractiles of hedge funds the volatility range is 1.2%–6.7%, while for the mutual funds it is 4.2%–7.0%. Getmansky, Lo and Makarov (2004) study the effect of return smoothing on the standard deviations of hedge fund returns and show that smoothed returns reduce the standard deviations and induce positive autocorrelation in the returns. The autocorrelations of the returns are slightly higher for the hedge funds, consistent with more return smoothing in the hedge funds. The median autocorrelation for the hedge funds is 0.16, compared with 0.12 for the mutual funds, and some of the hedge funds have substantially higher autocorrelations.
The 10% right tail for the autocorrelations is 0.50 for the hedge funds, versus only 0.24 for the mutual funds. Asness, Krail, and Liew (2001) show that return smoothing can lead to upwardly biased estimates of hedge fund alphas. We consider the effect of
11 The hedge fund alphas are slightly smaller on average and the cross-sectional distribution of the alphas shows thinner tails when the Fama and French factors are used. The Fung and Hsieh seven factors include the excess stock market return and a "small minus big" stock return similar to the Fama and French factor, except constructed from the S&P 500 and the Russell 2000: the difference between the Russell 2000 index return and the S&P 500 return. In addition, they include three "trend-following" factors constructed from index option returns, one each for bonds, currencies, and commodities. Finally, there are two yield-change factors: one for ten-year US Treasury bonds and one for the spread between Baa and ten-year Treasury yields.
W. Ferson & Y. Chen
return smoothing in the robustness section and in the Internet Appendix, and conclude that smoothing is not likely to be material for our results.

108.4 Simulation Exercises

This section describes the simulation method in more detail and presents simulation exercises with two main goals. The first goal is to evaluate small-sample biases in the estimators and their standard errors, and to inform our choices for the values of some of the parameters, such as the size of the tests (γ/2), for our empirical analyses. The second goal is to evaluate the "power", through the discovery rates, of our approach compared with classical methods.

108.4.1 Simulation details

The first stage of our procedure follows Fama and French (2010), simulating under the null hypothesis that alpha is zero. We draw randomly with replacement from the rows of {rpt − αp , ft }t , where αp is the vector of funds' alpha estimates in the actual data, rpt is the funds' excess returns vector, and ft is a vector of the factor excess returns. This imposes the null that the "true" alphas are zero in the simulation, while allowing for correlation between the residuals of the factor models used to estimate alphas. (Alternative approaches are considered in the robustness section.) All returns are in excess of the one-month Treasury bill return. When we simulate under the assumption that the true alphas are not zero for a given fraction of the population, π, we select N π funds at random, where N is the total number of funds in the sample, and add the relevant value of alpha to their returns, net of the estimated alpha. (An alternative approach is considered in the robustness section.) Each trial of the three simulations delivers an estimate of the β and δ parameters. We use and report the average of these across 1000 simulation trials. Following Fama and French (2010), we use an 8-month survival screen for mutual funds (and 12 months for hedge funds).
We impose the selection criterion only after a fund is drawn for an artificial sample. This raises the issue of a potential inconsistency in the bootstrap, as the missing values will be distributed randomly through “time” in the artificial sample, while they tend to occur in blocks in the original data. We consider an alternative approach to address this issue in a robustness section. While we describe the results in terms of the alpha values, all of the simulations are conducted using the t-ratios for the alphas as the test statistic,
where the standard errors are the White (1980) heteroskedasticity-consistent standard errors. We use the t-ratio because it is a pivotal statistic, which should improve the properties of the bootstrap compared with simulating the alphas themselves. An overview of the bootstrap is provided by Efron and Tibshirani (1993).

108.4.2 Finite sample properties

To evaluate the finite sample properties of the estimators, we conduct simulations of the simulation method. In each of 1000 draws, artificial data are generated from a mixture of three fund distributions, determined by given values of the π fractions and alphas. A given draw is generated by resampling months at random from the hedge fund data, where the relevant fractions of the funds have the associated alphas added to each of their returns, after the sample estimates of their alphas have been subtracted. We use the hedge fund data because, as Table 108.1 suggests, the departures from normality are likely to be greater for hedge funds than for mutual funds, providing a tougher test of the finite sample performance. For each draw of artificial data from the mixture with known parameters, we run the estimation by simulation for a given value of the size of the tests, γ/2. The π fractions are estimated in each of these trials from the three simulations as described above, each using 1000 artificial samples generated by resampling from the one draw from the mixture distribution, treating that draw the same way we treat the original sample when conducting the estimation by simulation. The α-parameters are held fixed for these experiments. Table 108.2 presents the results of simulating the simulations. The π fractions are set to π0 = 0.10, πg = 0.60 and πb = 0.30, and the bad and good alphas are set to −0.138% and 0.262% per month.
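The core machinery described above, resampling months jointly under the imposed zero-alpha null and computing alpha t-ratios with White (1980) standard errors, can be sketched in a few lines. This is a minimal illustration, not the authors' code; the function names and the array conventions (R for fund excess returns, F for factor excess returns) are assumptions:

```python
import numpy as np

def alpha_tstats(R, F):
    """Alpha t-ratios from time-series factor regressions, using
    White (1980) heteroskedasticity-consistent (HC0) standard errors.
    R: (T, N) fund excess returns; F: (T, K) factor excess returns."""
    T, N = R.shape
    X = np.column_stack([np.ones(T), F])       # intercept column = alpha
    XtX_inv = np.linalg.inv(X.T @ X)
    B = XtX_inv @ X.T @ R                      # (K+1, N) coefficients
    U = R - X @ B                              # regression residuals
    t = np.empty(N)
    for i in range(N):
        meat = X.T @ (U[:, i:i + 1] ** 2 * X)  # sum_t u_t^2 x_t x_t'
        V = XtX_inv @ meat @ XtX_inv           # HC0 coefficient covariance
        t[i] = B[0, i] / np.sqrt(V[0, 0])
    return t

def bootstrap_null(R, F, n_trials=1000, gamma=0.20, seed=0):
    """Fama-French (2010)-style bootstrap under the zero-alpha null:
    subtract each fund's estimated alpha, then resample months (rows)
    jointly with the factors, preserving cross-fund correlation."""
    T = R.shape[0]
    X = np.column_stack([np.ones(T), F])
    alpha_hat = np.linalg.lstsq(X, R, rcond=None)[0][0]
    R0 = R - alpha_hat                         # impose the null: alphas = 0
    rng = np.random.default_rng(seed)
    t_null = np.empty((n_trials, R.shape[1]))
    for b in range(n_trials):
        rows = rng.integers(0, T, size=T)      # draw months with replacement
        t_null[b] = alpha_tstats(R0[rows], F[rows])
    # empirical critical values for a two-tailed test of size gamma
    lo, hi = np.quantile(t_null, [gamma / 2, 1 - gamma / 2])
    return t_null, lo, hi
```

Comparing `lo` and `hi` with the corresponding standard normal quantiles shows how far the empirical critical values are from the normal ones, the comparison discussed later in this section.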
We report experiments in the Internet Appendix where we set all the π fractions to 1/3, and where we set the π fractions to the banner values reported by BSW: π0 = 0.75, πg = 0.01 and πb = 0.24. These experiments generate broadly similar results. The Avg. estimates in Table 108.2 are the averages over the 1000 draws from the mixture distribution. These capture what we expect to find when applying a given estimator in the simulated economy. The empirical standard deviations are the standard deviations of the parameter estimates, taken across the 1000 simulation draws. This is the variability in the estimators that the reported standard errors should capture. The Root MSEs are the square roots of the averages, over the 1000 draws, of the squared difference
Table 108.2: Finite sample properties of the estimators.

                              πzero    πgood    πbad

Panel A: γ/2 = 0.05
Population values             0.100    0.600    0.300
Our Avg. Estimates            0.417    0.408    0.175
Classical FDR Avg. Estimates  0.955    0.052   −0.007
Empirical SDs                 0.270    0.246    0.240
Avg. Reported SDs             0.059    0.409    0.421
Root MSE                      0.416    0.312    0.270
Classical Empirical SD        0.039    0.038    0.014
Classical Avg. Reported SD    0.007    0.008    0.013
Classical Root MSE            0.856    0.549    0.307

Panel B: γ/2 = 0.10
Population values             0.100    0.600    0.300
Our Avg. Estimates            0.266    0.523    0.211
Classical FDR Avg. Estimates  0.859    0.133    0.008
Empirical SDs                 0.208    0.195    0.200
Avg. Reported SDs             0.247    0.313    0.414
Root MSE                      0.266    0.210    0.219
Classical Empirical SD        0.056    0.062    0.030
Classical Avg. Reported SD    0.010    0.012    0.018
Classical Root MSE            0.761    0.471    0.293

Panel C: γ/2 = 0.20
Population values             0.100    0.600    0.300
Our Avg. Estimates            0.214    0.542    0.245
Classical FDR Avg. Estimates  0.784    0.195    0.022
Empirical SDs                 0.173    0.161    0.170
Avg. Reported SDs             0.531    0.285    0.515
Root MSE                      0.207    0.172    0.178
Classical Empirical SD        0.060    0.072    0.042
Classical Avg. Reported SD    0.014    0.015    0.024
Classical Root MSE            0.686    0.412    0.281

Panel D: γ/2 = 0.30
Population values             0.100    0.600    0.300
Our Avg. Estimates            0.203    0.549    0.248
Classical FDR Avg. Estimates  0.748    0.226    0.026
Empirical SDs                 0.167    0.151    0.166
Avg. Reported SDs             0.753    0.321    0.628
Root MSE                      0.196    0.160    0.174
Classical Empirical SD        0.066    0.081    0.050
Classical Avg. Reported SD    0.020    0.020    0.030
Classical Root MSE            0.651    0.382    0.279
Notes: In each of 1000 bootstrap simulation trials, artificial data are generated from a mixture of three fund distributions. The "population" values of the fractions of funds in each group, π, shown here in the first row, determine the mixture, combined with the good, zero or bad alpha values that we estimate as the best-fitting values for the full sample period. Hedge fund data over January 1994–March 2012 are used, and the values of the bad and good alphas are −0.138% and 0.262% per month. For each simulation draw from the mixture distribution we run the estimation by simulation with 1000 trials, to generate the parameter and standard error estimates. Standard error estimates are removed for a given trial when an estimated fraction is on the boundary of a constraint. The empirical SDs are the standard deviations taken across the remaining simulation draws. The Avg. estimates are the averages over the 1000 draws. The Root MSEs are the square roots of the average, over the 1000 trials, of the squared difference between an estimated and true parameter value. γ/2 indicates the size of the tests (the area in one tail of the two-tailed tests). The Classical FDR estimators follow Storey (2002) and BSW, except with a fixed test size.
between an estimated and true parameter value.12 The four panels of the table use different choices for γ/2. Table 108.2 shows that, under the mixture distribution, the classical estimator of π0 can be severely biased in favor of finding too many zero-alpha funds, and the estimators of the fractions of good and bad funds are biased toward zero. When 10% of the funds have zero alphas, the classical estimates are 75–96%, depending on the size of the tests. The bias is smaller at the larger test sizes, as suggested by BSW. Like the classical estimator, our approach finds too many zero-alpha funds and too few good and bad funds. But our point estimates are much less biased than the classical estimators. Our point estimates are typically within one empirical standard deviation of the true values of the π fractions at the 5% test size. Our estimator is more accurate at the 10% size and slightly more accurate still at the 20% size, where the expected point estimate is within 0.3–0.6 standard errors of the true parameter value.
12 In the event that a parameter estimate is on the boundary of the parameter space (a π fraction is zero or 1.0), we drop the estimated standard error for that simulation trial for the calculations. This choice has only a small effect on the results.
The results of Table 108.2 are remarkably different from Figure 108.3 in BSW, and from reported simulation and empirical results in the literature for the Storey (2002) estimator. There are several reasons for the differences. While BSW use mutual fund data, we conduct our simulations using hedge fund data that exhibit greater dispersion in alpha, as shown in Table 108.1. We do not employ the additional minimization step described in footnote 7. This difference will make the bias in the BSW approach smaller for smaller test sizes, but as BSW report, the results are not changed much by this step for the larger-sized tests reported in Table 108.2, Panels C and D. Another difference is that BSW use standard normal p-values to estimate the fractions rejected, Fg and Fb, whereas we use the empirical critical values from the bootstrapped null distribution. With standard normal p-values the tests will be improperly sized. We compare the critical values in our simulations with the standard normal values and find that the empirical critical values are larger. Using empirical critical values in our analysis as opposed to the normal ones, we get smaller Fb and Fg, and thus a larger estimate of π0 in our calculations. When the test sizes γ/2 are 0.3 and 0.4, however, these differences are small. Panel D of Table 108.2 takes the size of the tests to 30% in each tail. This is the approximate test size that BSW advocate. The classical point estimates are improved but remain substantially biased. This means that, for the hedge fund sample, even when we use a large size of the tests in the classical estimator π0,C, there are still too many good and bad funds whose alpha t-ratios fall between tb and tg in the simulations, a situation that BSW assume to be unlikely for mutual funds. Thus, the simulation results imply that the test power for hedge funds is well below 100% even when the test size is set to a large value.13
13 As a numerical example, suppose the power of the tests is βg = βb = 0.63 and the confusions are δg = δb = 0.12, roughly the case with hedge funds when γ/2 = 0.3 for the chosen alpha values in Panel A of Table 108.5. Given the true π0 = 0.1 in the simulations and based on equation (108.7), E(π0,C) = π0 + (1 − π0)(1 − β − δ)/(1 − γ) = 0.1 + (0.9 × 0.25)/0.4 = 66.3%. Here, π0,C is severely biased when the fraction of zero-alpha funds is small and the test power is low. On the other hand, if the true π0 is large at, say, 75% and the test power is as high as 0.98 with no confusion when γ/2 = 0.3, then the bias of π0,C would be (0.25 × 0.02)/0.4 = 1.25%, a very small value. The choice of a large γ in the classical estimator is used to deliver high test power, but in the case of hedge funds, the power is still well below 100% even at γ/2 = 0.3 or 0.4. Our approach allows for imperfect test power. Thus, while the classical FDR method may well suit the mutual fund setting, where large γ is associated with high test power as in BSW, our approach has advantages in the setting of hedge funds, in which π0 is relatively small and large γ does not provide near-perfect power.
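The footnote's arithmetic can be checked directly; a small sketch of equation (108.7) as quoted in the footnote (the function name is illustrative):

```python
def expected_classical_pi0(pi0, beta, delta, gamma):
    """Expected value of the classical estimator pi0_C, per equation (108.7):
    E(pi0_C) = pi0 + (1 - pi0) * (1 - beta - delta) / (1 - gamma)."""
    return pi0 + (1 - pi0) * (1 - beta - delta) / (1 - gamma)

# Hedge-fund case: low power, few zero-alpha funds (gamma/2 = 0.3, so gamma = 0.6)
print(expected_classical_pi0(0.10, 0.63, 0.12, 0.60))  # ≈ 0.6625, the 66.3% in the footnote
# Mutual-fund case: high power, no confusion, many zero-alpha funds
print(expected_classical_pi0(0.75, 0.98, 0.00, 0.60))  # ≈ 0.7625, a bias of only 1.25%
```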
Table 108.2 also evaluates the standard errors of the estimators. The reported classical standard errors understate the sampling variability of the estimates for all test sizes. Even when γ/2 = 0.20 or 0.30 (Panels C and D), where they perform the best, the average classical standard errors range from 20% to 60% of the empirical standard errors.14 Our standard error estimates are also biased. When the test size is 5%, they are far too small for π0 and too large for πg and πb. When the size is 10%, the standard errors are reasonably accurate for π0 but still too large for πg and πb by 50–100%.15 The empirical standard errors for our estimators in Table 108.2 get smaller as the size of the tests is increased from 5% to 20%. The average empirical standard error at the 20% size is about 60% of the value at the 5% size. This is the opposite of the pattern in the average reported standard errors, which are larger at the larger test sizes. Given this tradeoff, the 10% test size appears to be the best choice in our method. The reported standard errors for π0 are fairly accurate, and they are overstated for πg and πb, and thus conservative.16 The reported standard errors are close to the RMSEs when the test size is 10%. The simulations show that the classical estimators display lower sampling variability than our estimators. This makes sense, given their relative simplicity. However, the classical estimators concentrate around biased values. For example, when the size is 20%, the classical estimators' RMSEs are larger than those of our estimators by 150–300%. When the size of the tests is 30% in Panel D of Table 108.2, the average point estimates, RMSEs and empirical standard errors of our estimators are similar to those in Panel C, but the average reported standard errors are more overstated. The reported standard errors still average only 30–60% of
14 As a check, the reported standard error in BSW (Table II on p. 197) for πb is 2.3%. Our simulations of the reported classical standard errors, when γ/2 = 0.20, average 2.2%. The simulations in BSW do not account for the variance of the factors, while our simulations do capture the factor variances. According to Fama and French (2010), not accounting for the factor variances understates the sampling variability of the estimates. Consequently, understated variability would inflate the test power in the simulations. In addition, while BSW study mutual funds, our simulations here use hedge funds that exhibit greater dispersion than mutual funds (see Table 108.1).
15 We conduct experiments where we set the correlation of the tests across funds to zero, as assumed by BSW, and we find that the standard errors are then an order of magnitude too small.
16 As previously described, our reported standard errors do not reflect sampling variability in the δ and β parameters, but this variation is captured in the simulations. Formally incorporating this variation in the standard errors would make them larger, but the impact would be small as shown in the Appendix.
the empirical standard errors. The RMSEs are slightly improved, but still larger than the RMSEs of our estimators by 160–300%. Overall, the simulations of the simulation approach lead to several conclusions. First, in finite samples the classical estimator of π0 overstates the fraction of zero-alpha funds, and understates the fractions of good and bad funds, even at the large test sizes. The classical standard errors are understated in finite samples and the mean squared errors can be quite large. Our approach has a smaller finite sample bias, and performs best overall at the sample sizes used here when the size of the tests is set to γ/2 = 10%. When the size is 10%, the point estimates are usually within about one empirical standard error of the true parameter values. Our standard errors are reasonably accurate for π0 but overstated for πg and πb. They are closer to the RMSEs in these cases. We conduct some experiments where we expand the number of time-series observations in the simulations to 5000, in order to see which of the biases are finite sample issues and which are likely inconsistencies. (These experiments are reported in the Appendix.) These large-sample experiments suggest that the classical standard error estimators are consistent. The experiments also suggest that the classical point estimator of π0 is inconsistent. For example, the average estimated value is about 30% when the true value is 10%, and the expected estimate is 49% when the true value is 1/3. Our estimates are much closer to the true values when T = 5000, suggesting that their biases in Table 108.2 are finite sample biases.

108.4.3 Empirical power evaluation

The empirical power of a test is usually measured as the fraction of simulation trials in which the test rejects the null hypothesis in favor of the alternative, when the alternative hypothesis is true. To evaluate "power" in a multiple comparisons setting it is natural to measure the expected discovery rates.
For example, in a model with zero and positive alphas, the correct discovery rate is the fraction of funds that the tests find to have positive alphas, which actually have positive alphas. The false discovery rate is the fraction of funds that the method detects as good funds, but which actually have zero alphas. We are interested in both the correct discovery rates and the false discovery rates. The total discovery rate is the sum of the correct and false discovery rates. Of course, we cannot know which funds actually have positive and zero alphas, except in a simulation exercise. This section evaluates the discovery rates in simulations, comparing our approach with two classical methods. To simplify we use the two-group model
where there are only good funds and zero-alpha funds. We vary the true fraction πg and the good alpha parameter αg, and keep track of which funds in the simulation actually have good or zero alphas. We record the discovery rates for each method as a function of the parameter values, averaging the number of true and false discoveries across 1000 simulation trials. As in the previous section, we bootstrap from the hedge fund data, but now we form a mixture of the two known distributions and we run a two-group version of our model on each draw of the mixture, treating each draw as if it were the original data. (The equations for the two-group model are given in the Internet Appendix.) We compare the discovery rates of three approaches. The first is a naïve application of the simple t-ratio. Setting the size of the one-tailed test to 10%, an empirical critical value of the t-ratio is found by simulating each fund separately under the null hypothesis that its alpha is zero. The funds whose alpha t-ratios exceed their critical values are discovered to have positive alphas. The calculation is naïve in that it takes no account of the multiple comparisons. The second approach is the classical false discovery rate (FDR) method. Here, the critical value of the t-ratio is adjusted to obtain a desired false discovery rate in the cross-section of funds. We search for a size of the test, γ, so that γ·π0,C/Fg = 0.10, where π0,C is the classical estimator of the fraction of zero-alpha funds and Fg is the fraction of funds where the null hypothesis is rejected. We select FDR = 10%, following BSW, who find (their Table V) their best results at this FDR value. Both π0,C and Fg depend on the size of the tests. Simulation under the null hypothesis at the optimal test size determines a critical value for the alpha t-ratio, and all funds with alpha t-ratios in excess of this single critical value are discovered to have positive alphas.
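The FDR calibration just described, searching for the test size γ at which γ·π0,C/Fg hits the target rate, can be sketched as a grid search. This is a hedged illustration: the Storey-style estimate π0,C = (1 − Fg)/(1 − γ) stands in for the classical estimator, and the chapter's exact implementation may differ.

```python
import numpy as np

def fdr_calibrate(t_stats, t_null, target_fdr=0.10):
    """Search over one-tailed test sizes gamma for the value whose implied
    false discovery rate, gamma * pi0_C / Fg, is closest to the target.
    t_stats: (N,) alpha t-ratios from the data.
    t_null:  pooled bootstrapped t-ratios under the zero-alpha null."""
    best = None
    for gamma in np.arange(0.01, 0.51, 0.01):
        crit = np.quantile(t_null, 1.0 - gamma)       # empirical critical value
        Fg = np.mean(t_stats > crit)                  # fraction rejected
        if Fg == 0:
            continue
        pi0_C = min(1.0, (1.0 - Fg) / (1.0 - gamma))  # Storey-style pi0 estimate
        fdr = gamma * pi0_C / Fg
        if best is None or abs(fdr - target_fdr) < abs(best[1] - target_fdr):
            best = (gamma, fdr, crit)
    gamma, fdr, crit = best
    return gamma, crit, t_stats > crit                # discoveries: good funds
```

All funds whose t-ratios exceed the single critical value returned here are "discovered" to be good, mirroring the single-cutoff feature of the FDR method noted above.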
Note that since there is no confusion parameter in the two-group model, the classical FDR cannot be biased by confusion in this example. The third approach is our method as described in equations (108.4) and (108.5). For each draw from the mixture of distributions we implement our simulation method, treating that draw as we do the original data. We use a two-group model with a test size of γ = 10%. Using the estimated model parameters, a fund is discovered to have a positive alpha if the posterior probability given its alpha estimate, αp, satisfies P(α > 0|αp) > P(α = 0|αp), so that a positive alpha is more likely than a zero alpha. Table 108.3 summarizes the results of our analysis of the discovery rates. We present results for two choices of the true alpha values. The first, 0.252% per month, is the value from Table 108.5 below, estimated on the hedge funds when the test size is 10%. The second value, 0.861%, is the cutoff for
Table 108.3: Analysis of discovery rates.

                    Simple t-test        FDR Method                  CF Method
Good    Fraction
αg      Good, πg   Correct   False    πg,C    Correct  False     πg,CF   Correct  False
0.252   0.00       0.000     0.108    0.008   0.000    0.108     0.078   0.000    0.042
0.252   0.10       0.032     0.097    0.031   0.031    0.095     0.131   0.013    0.059
0.252   0.20       0.069     0.085    0.059   0.067    0.082     0.218   0.045    0.095
0.252   0.30       0.107     0.074    0.090   0.103    0.070     0.325   0.109    0.139
0.252   0.40       0.132     0.065    0.106   0.126    0.061     0.385   0.166    0.155
0.252   0.50       0.159     0.054    0.126   0.154    0.051     0.457   0.243    0.174
0.252   0.60       0.198     0.044    0.158   0.191    0.041     0.569   0.377    0.196
0.252   0.70       0.231     0.031    0.181   0.224    0.028     0.649   0.513    0.175
0.252   0.80       0.271     0.020    0.213   0.262    0.019     0.743   0.659    0.135
0.252   0.90       0.301     0.011    0.236   0.291    0.011     0.801   0.798    0.078
0.252   1.00       0.333     0.000    0.260   0.324    0.000     0.851   0.907    0.000
0.861   0.00       0.000     0.109    0.009   0.000    0.110     0.031   0.000    0.013
0.861   0.10       0.069     0.096    0.078   0.073    0.088     0.106   0.042    0.031
0.861   0.20       0.138     0.087    0.148   0.143    0.078     0.202   0.117    0.046
0.861   0.30       0.205     0.077    0.233   0.210    0.070     0.291   0.192    0.056
0.861   0.40       0.287     0.062    0.284   0.298    0.055     0.404   0.298    0.065
0.861   0.50       0.335     0.053    0.363   0.366    0.048     0.493   0.389    0.075
0.861   0.60       0.420     0.044    0.429   0.426    0.041     0.589   0.483    0.079
0.861   0.70       0.492     0.032    0.500   0.521    0.033     0.687   0.597    0.083
0.861   0.80       0.576     0.020    0.587   0.617    0.022     0.806   0.742    0.086
0.861   0.90       0.635     0.011    0.645   0.680    0.012     0.884   0.856    0.070
0.861   1.00       0.709     0.000    0.719   0.758    0.000     0.971   0.990    0.000
Notes: This table presents simulated discovery rates for three methods. The values of the good alpha, αg, in the simulated populations are shown in the first column and the fractions of good funds, πg, are shown in the second column. The discovery rates are the fractions of funds in the simulated sample that a test finds to be a positive-alpha fund, averaged over the 1000 simulation trials. The total fraction discovered to be good is the sum of the False discoveries and the Correct discoveries. The three methods are the simple t-test, the classical false discovery rate method (FDR Method) and our approach (CF Method). The symbol πg,C denotes the classical FDR estimate of the fraction of positive-alpha funds averaged across simulation trials. The symbol πg,CF denotes our average estimate. The simulations are based on a parametric bootstrap from a sample of 3620 hedge funds during January 1994–March 2012.
the upper 10% tail of the alphas in the original hedge fund sample, as shown in Table 108.1. The rows in Table 108.3 vary the true fraction of good funds between 0% and 100%. Consider the results in the first block of rows (αg = 0.252%). In the first row, when the true fraction of good funds is zero, there are no correct discoveries to be had. The number of false discoveries by both the classical t-ratio and the FDR
method are close to the desired 10%. In the last row of the block, when all of the funds are good, there are no false discoveries to be had. As the fraction of good funds increases across the rows, the FDR method delivers the smallest number of false discoveries among the three methods, but the classical t-ratio is very close behind. The FDR method is based on the biased classical estimator, which finds too many zero-alpha funds, so it tends to over-correct for false discoveries. For example, when the true fraction of good funds exceeds 50%, the FDR method delivers 5% or fewer false discoveries, even though it is calibrated to a 10% false discovery rate. The FDR method also posts the smallest number of correct discoveries among the methods, topping out at only 32.4% when the true fraction is 100%. The correct discovery rates of the classical t-ratio are slightly better. Interpreting the discovery rates as size and power, this says that both the naïve t-ratio and the FDR method using the classical estimator are undersized: when the desired false discovery rate is 10% the actual false discovery rate is lower than 10%. Our "CF" method turns in the highest rate of correct discoveries. The estimates of the πg fractions are considerably more accurate, and the total discovery rates are usually much closer to the actual fractions of good funds than with the other methods. Our method excels in correct detection, especially when the fraction of good funds reaches and exceeds 50% (55% is the value that we estimate using the two-group model for hedge funds in Table 108.5). The correct discovery rates of the CF method top out at 90.7% when the true fraction is 100%, where the other two methods deliver 33% or less. The cost of the improved power of our CF method is more false discoveries than either the classical t-ratio or the FDR method. When the true fraction of good funds is in the 30–80% range, our method posts false discovery rates in the 14–20% range.
It would be possible, in future research, to assign utility costs to the different cases of potential misclassification, and our method could then make a different tradeoff between correct and false discoveries. In this example, once the investor has incurred the costs of engaging in active fund selection, the utility cost of falsely discovering a zero-alpha fund to be a good fund is likely to be very low, given that the investor's alternative decision is to invest in a zero-alpha fund. The improved ability of our method to correctly detect positive-alpha funds is the important result of this example. The second block of rows of Table 108.3 represents a world in which the good alpha is the value that defines the top 10% in our hedge fund sample. The classical estimates of πg are less biased in this example. This illustrates that while the classical estimator does not refer to the locations of the nonzero
alpha funds, its performance under the alternative hypothesis does depend on the locations of the nonzero alpha funds. The FDR method correctly discovers more good funds than it did before. It now outperforms the classical t-test in this regard, and with slightly smaller false discovery rates. Our approach again delivers the best correct discovery rates, matching or beating the other methods for all true fractions of good funds above 40%, and the false discovery rates are also improved. In this exercise our method has false discovery rates below 9% for all values of the true fractions of good funds. In summary, the simulations show that our approach to discriminating between good funds and zero-alpha funds presents an improvement over previous methods, especially when the fraction of good funds is large. By using the full probability structure of the model, we obtain better power to detect funds with nonzero alphas. We examine below the performance of our methods when there are three groups of funds, in rolling estimation on actual data.

108.5 Empirical Results

108.5.1 Mutual funds

Table 108.4 presents empirical results for the mutual fund data. Our parameter estimates are compared with the classical FDR estimators. The alphas are estimated using the Fama and French (1996) three-factor model. (We check the sensitivity of these findings to the factor model in a robustness section.) The fractions of managers in the population with zero, good or bad alphas are estimated using our method with 1000 simulation trials. The standard errors for the π fractions, shown in parentheses, are the empirical standard errors from our bootstrap simulations. These are the standard deviations of the parameter values obtained across the 1000 trials of the simulations. The standard errors account for the dependence of the tests across funds (see the analyses in the Appendix).
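The π fractions reported in this section are recovered from tail rejection frequencies together with the power and confusion parameters. Equations (108.4) and (108.5) are not reproduced in this excerpt, so the following is a hypothetical sketch of the inversion idea only: the mixing identities and the function name are illustrative assumptions, chosen to be consistent with the expectation formula in footnote 13 rather than taken from the chapter.

```python
import numpy as np

def estimate_fractions(Fg, Fb, beta_g, beta_b, delta_g, delta_b, size):
    """Recover (pi0, pig, pib) from the tail rejection rates Fg and Fb.
    Assumed mixing identities (illustrative, not the chapter's equations):
      Fg = pi0*size + pig*beta_g + pib*delta_b
      Fb = pi0*size + pib*beta_b + pig*delta_g
    with pi0 + pig + pib = 1, then clipped to the probability simplex."""
    A = np.array([[beta_g - size, delta_b - size],
                  [delta_g - size, beta_b - size]])
    b = np.array([Fg - size, Fb - size])
    pig, pib = np.linalg.solve(A, b)
    pig, pib = max(pig, 0.0), max(pib, 0.0)   # boundary constraints
    s = pig + pib
    if s > 1.0:                               # renormalize onto the simplex
        pig, pib = pig / s, pib / s
    return 1.0 - pig - pib, pig, pib
```

Here `size` plays the role of γ/2, the one-tail rejection rate for a zero-alpha fund; with the true parameters the identities invert exactly, and the clipping mimics the boundary constraints flagged by the "na" standard errors in Table 108.4.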
In Panel A of Table 108.4 we first set the good and bad alphas equal to the values specified by BSW and vary the size of the tests. This confirms in the data the importance of using the right test sizes, as suggested by BSW and by our simulations of the simulations. The fourth and fifth columns show the π fractions using the classical FDR estimators. In our sample period, when the test size is 10%, the BSW estimators say π0 = 81.3% and πg = −1.7%.17
17 The negative values arise in the BSW calculation because the fraction rejected, Fg, is smaller than (γ/2)π0. Our simultaneous approach, using constrained optimization, avoids negative probabilities.
Table 108.4: Estimated fractions of mutual funds.

Panel A: Estimates for Given Alpha Values and Various Test Sizes

Alphas (%)         Size   FDR Calcs.     Powers        Confusions   Fractions
αg      αb         γ/2    π0     πg      βg     βb     δg    δb     π0           πg
0.317   −0.267     0.025  90.9   −0.8    50.0   40.7   0.3   0.3    74.7 (5.8)   0.0 (na)
0.317   −0.267     0.05   87.0   −1.4    60.4   51.2   0.5   0.4    70.4 (8.1)   0.0 (na)
0.317   −0.267     0.10   81.3   −1.7    73.5   66.9   0.9   0.7    65.6 (10.8)  0.0 (na)
0.317   −0.267     0.20   77.0   −3.9    81.7   80.0   2.1   1.5    61.0 (14.9)  0.0 (na)
0.317   −0.267     0.30   72.3   −4.4    87.7   83.7   3.4   2.5    56.9 (22.0)  0.0 (na)
0.317   −0.267     0.40   67.9   −4.4    91.7   88.5   5.4   3.9    53.2 (42.1)  0.0 (na)

Panel B: Joint Estimation of Alphas and Fractions in the Populations

Alphas (%)         Size          Powers        Confusions   Fractions
αg      αb         γ/2    Fit    βg     βb     δg    δb     π0           πg

Unconstrained Alpha Domains, 3-Group Model
−0.034  −0.204     0.10   2155   6.9    52.4   1.5   15.4   50.7 (26.7)  6.9 (37.2)

Constrained Alpha Domains (αg ≥ 0, αb ≤ 0), 3-Group Model
0.001   −0.173     0.10   3816   10.3   45.7   1.8   10.3   0.0 (na)     55.2 (44.5)

2-Group Model
0.0     −0.172     0.10   2862   na     45.5   na    na     48.4 (37.8)  0.0 (na)

Single-Alpha Model
na      −0.205     0.10   3401   na     na     na    na     0.0 (na)     0.0 (na)
Notes: The fractions of funds in the population with specified values of zero, good or bad alphas are estimated using simulation. The symbol πg denotes the estimated fraction of good funds and π0 denotes the fraction of zero-alpha funds. Alphas are stated in monthly percentage units. The power parameters of the test are βg , the power to reject against the alternative of a good fund, and βb , the power to reject against the alternative of a bad fund. The confusion parameters are δb , the probability of finding a good fund when it is bad, and δg , the probability of finding a bad fund when it is good. All fractions except for the test sizes are stated as percentages. Empirical standard errors for the π fractions are indicated in parentheses, except when a constraint is binding (na). Panel A presents the estimates with pre-set alpha values and various test sizes. Panel B summarizes the joint estimation of the alphas and π fractions using the 10% test size. Fit is the goodness of fit based on the Pearson χ2 statistic. The sample period for mutual funds is January 1984–December 2011 (336 months).
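The parenthesized empirical standard errors have a simple mechanical form, which can be sketched as follows (an illustrative helper, not the chapter's code): the π fractions are re-estimated in each of the 1000 bootstrap trials, and the standard error is the standard deviation of the estimates across trials.

```python
import numpy as np

# Hedged sketch of the empirical standard errors in Tables 108.4-108.5:
# standard deviations of the pi estimates across bootstrap trials.
def empirical_standard_errors(trial_estimates):
    """trial_estimates: n_trials x k array, one row of (pi0, pig, pib)
    estimates per bootstrap trial. Returns (mean estimate, standard error)."""
    est = np.asarray(trial_estimates, float)
    point = est.mean(axis=0)           # average across trials
    se = est.std(axis=0, ddof=1)       # sample standard deviation across trials
    return point, se
```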
W. Ferson & Y. Chen
As noted by BSW, the estimate of π0 gets smaller as the size of the tests γ/2 increases. We find 67.9% when γ/2 = 0.40. Interpolating, we obtain a value close to BSW's banner estimate of 75% when γ/2 is about 0.25. This reconfirms the appeal of the larger test sizes, as suggested by BSW and by our simulations, for the classical FDR approach. As BSW and Storey (2002) argue, the power parameters β increase as the size of the tests increases, ranging from about 50% to just over 90% in Table 108.4. The confusion parameters δ also increase with the size of the tests, but are 3.9% or less.

Our approach delivers smaller estimates for the fractions of zero-alpha funds than the classical estimators at each test size. At the preferred 10% test size, our point estimate of π0 is 65.6% (with a standard error of 10.8%), which is fairly close to the FDR estimate of 67.9% when γ/2 = 0.40, given the specified good and bad alpha values. At the size γ/2 = 0.40, the combined values of powers and confusions are close to one, at 95.6% (βg + δb ) and 93.9% (βb + δg ), which suggests a small bias with the classical estimator (see equation (108.7)).

We find no evidence for any good mutual funds in the population in Panel A, as all of the πg estimates are 0.0. The inference that there are no good funds is consistent with the conclusions of Fama and French (2010), who simulate the cross-section of alphas for mutual funds under the null hypothesis that the alphas are zero, but do not estimate the π fractions. Our approach simultaneously considers the alpha locations as well as the fractions of funds. (In Table 108E.2 of the Appendix, we show how the alpha locations affect the estimates of the π fractions.)

Panel B of Table 108.4 presents the results when the alpha parameters are set equal to the "optimal" values that best fit the cross-section of the t-ratios for alpha in the actual data, as discussed below. We focus on the preferred 10% test size.
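The accounting role of the powers and confusions can be sketched with a small linear system, in the spirit of the simultaneous approach: each observed tail rejection fraction mixes the test size (for zero-alpha funds), the power (for the matching alternative), and the confusion (for the opposite alternative). The numerical inputs below are hypothetical, not the chapter's estimates.

```python
import numpy as np

# Hedged sketch: given power and confusion parameters, recover the pi
# fractions from the observed rejection fractions. Clipping to [0, 1]
# and renormalizing mimics the constrained estimation that rules out
# the negative probabilities of the plain FDR calculation.
def solve_pi(F_good, F_bad, size, beta_g, beta_b, delta_g, delta_b):
    A = np.array([
        [size, beta_g,  delta_b],   # right-tail rejection fraction
        [size, delta_g, beta_b ],   # left-tail rejection fraction
        [1.0,  1.0,     1.0    ],   # fractions sum to one
    ])
    y = np.array([F_good, F_bad, 1.0])
    pi = np.linalg.solve(A, y)      # (pi0, pig, pib)
    pi = np.clip(pi, 0.0, 1.0)
    return pi / pi.sum()            # renormalize after clipping
```

With hypothetical inputs size = 0.10, βg = 0.70, βb = 0.65, δg = δb = 0.02 and true fractions (0.6, 0.3, 0.1), the implied rejection fractions are F_good = 0.272 and F_bad = 0.131, and the system recovers the true fractions exactly.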
(In Table 108E.2, we report results for the 5% test size to show that the results are relatively insensitive to this choice.) In the unconstrained domain case, where the search is free to pick positive or negative alpha values, we estimate that 50.7% of the mutual funds have zero alphas, whereas the classical estimates would suggest about 72% (the results of the classical estimator are reported in Table 108E.2). We estimate that the rest of the mutual funds have negative alphas. These results are interpreted more fully in Section 108.5.3.

108.5.2 Hedge funds

Table 108.5 repeats the analysis in Table 108.4 for the hedge fund sample. We use the Fung and Hsieh seven-factor model to compute alphas. (Table 108E.3 contains results using the Fama and French three-factor alphas for hedge
Table 108.5: Estimated fractions of hedge funds.

Panel A: Estimates for Given Alpha Values and Various Test Sizes

Alphas (%)                    FDR Calcs.      Powers          Confusions      Fractions
Good αg   Bad αb   Size γ/2   π0      πg      βg      βb      δg      δb      π0            πg
0.317    −0.267    0.025      100.0   0.0     4.1     4.9     1.8     1.9     100.0 (na)    0.0 (na)
0.317    −0.267    0.05       91.3    7.9     20.9    17.8    2.9     2.9     40.4 (24.2)   48.2 (21.2)
0.317    −0.267    0.10       85.9    12.3    35.7    35.5    5.0     5.0     41.0 (19.7)   46.8 (16.1)
0.317    −0.267    0.20       77.7    19.5    56.6    52.2    8.0     7.0     41.6 (18.6)   45.4 (14.4)
0.317    −0.267    0.30       76.1    20.6    64.4    62.2    12.1    11.2    37.3 (16.9)   47.1 (12.7)
0.317    −0.267    0.40       73.3    22.6    73.1    71.2    16.7    15.5    36.9 (20.4)   47.3 (14.7)

Panel B: Joint Estimation of Alphas and Fractions in the Populations

Alphas (%)                          Powers          Confusions      Fractions
Good αg   Bad αb   Size γ/2   Fit   βg      βb      δg      δb      π0            πg

Unconstrained Alpha Domains, 3-Group Model
0.252    −0.108    0.10       2225  31.4    17.4    7.0     5.0     0.0 (na)      53.2 (16.8)

2-Group Model
0.252    na        0.10       4807  31.9    na      na      na      44.6 (28.2)   55.4 (33.5)

Single-Alpha Model
0.434    na        0.10       3120  na      na      na      na      0.0 (na)      100.0 (na)
Notes: The fractions of funds in the population with specified values of zero, good or bad alphas are estimated using simulation. The symbol πg denotes the estimated fraction of good funds and π0 denotes the fraction of zero-alpha funds. Alphas are stated in monthly percentage units. The power parameters of the test are βg , the power to reject against the alternative of a good fund, and βb , the power to reject against the alternative of a bad fund. The confusion parameters are δb , the probability of finding a good fund when it is bad, and δg , the probability of finding a bad fund when it is good. All fractions except for the test sizes are stated as percentages. Empirical standard errors for the π fractions are indicated in parentheses, except when a constraint is binding (na). Panel A presents the estimates with pre-set alpha values and various test sizes. Panel B summarizes the joint estimation of the alphas and π fractions using the 10% test size. Fit is the goodness of fit based on the Pearson χ2 statistic. The sample period for hedge funds is January 1994–March 2012 (219 months).
funds. The results are similar.) Many of the patterns in Table 108.5 are similar to the results for the mutual funds. For example, in Panel A the BSW estimates of π0 decrease in the test size (γ/2), and the power of the tests increases but remains substantially below 100% even at (γ/2) = 0.40, where the power parameter is 73%. The power parameter is lower than it is for the mutual fund sample, consistent with the greater dispersion in the hedge fund data. The confusion parameters get larger with the size of the tests, as in Table 108.4, topping out here at just over 15%.

Our estimates of π0 in Panel A of Table 108.5 indicate smaller fractions of zero-alpha hedge funds and thus more good and bad hedge funds than the classical estimator at any test size. The classical estimator says that 76% of the funds have zero alphas at the preferred 30% test size. Our estimate at the preferred 10% size is 41%. The empirical standard errors of the π fractions, shown in Panel A, are larger for the hedge funds than we saw for the mutual funds. They do not increase with the size of the tests as they did for the mutual funds. (The asymptotic standard errors, however, do increase with the test size for the hedge funds; see Table 108E.3.)

Panel B of Table 108.5 presents the results when the true alpha parameters are set equal to the values that best fit the cross-section of the actual alpha estimates in the data. For these alpha values we estimate that very few hedge funds have zero alphas, whereas the classical estimates suggest 76% (see Table 108E.2). We estimate that 53.2% of the hedge funds have positive alphas, while the classical estimator suggests about 22%. These estimates are now described and interpreted.

108.5.3 Joint estimation

As discussed above, our inferences about the fractions of good and bad funds in the population are sensitive to the assumptions about the alpha locations of the good and bad funds.
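The sensitivity to the alpha locations can be illustrated with a back-of-the-envelope power calculation. Under the simplifying assumption that a fund's alpha t-ratio is approximately normal around a location implied by its true alpha, the power of a one-tailed test rises steeply with that location, so the assumed alpha values drive the estimated π fractions. This is only an illustration, not the chapter's simulation-based power calculation.

```python
from math import erf, sqrt

# Standard normal CDF via the error function (avoids a scipy dependency).
def normal_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Hedged illustration: probability that the t-ratio exceeds t_crit when
# its distribution is centered at t_location. The default 1.2816 is the
# one-tailed critical value for a 10% test size under normality.
def one_tailed_power(t_location, t_crit=1.2816):
    return 1.0 - normal_cdf(t_crit - t_location)
```

For example, `one_tailed_power(0.0)` returns the test size (about 0.10), while a fund whose alpha sits three standard errors from zero is detected with probability above 0.95.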
Our estimates are sensitive because the power of the tests is strongly sensitive to the alpha locations. The confusion parameters also vary with the alpha values, but with a smaller effect. In this section we search over the choice of the good and bad alpha parameters, and the corresponding estimates of the π fractions, to find those values of the parameters that best fit the distribution of the t-ratios in the actual data, according to the χ2 statistic in equation (108.3). (We consider alternative distance measures in a robustness section.) The best-fitting good and bad alpha parameters minimize the difference between the cross-section of fund alpha t-ratios estimated in the actual data, versus the cross-section
estimated from a mixture of return distributions, formed from the zero, good and bad alpha parameters and the estimated π fractions for each of the three types. The good and bad alpha parameters, αg and αb , are found with a grid search. The search runs from the lower 5% to the upper 95% tail values of the alpha t-ratio estimates in the data, summarized in Table 108.1, with a grid size of 0.001% for mutual funds and 0.005% for hedge funds. At each point in the grid, the π fractions are estimated using simulation.

We start with models in which there are three groups, with zero, good and bad alphas. In the first case, the "unconstrained domain" case, the search does not impose the restriction that the good alpha is positive or the bad alpha is negative. The probability model remains valid without these restrictions, so we let the data speak to what the best-fitting values are.

Figure 108.2 depicts the results of the grid search for the alpha parameters for hedge funds. The search is able to identify global optima at αg = 0.237, αb = −0.098 when the size of the tests is 5% in each tail. When the size is 10% the values are αg = 0.252, αb = −0.108, as shown in Panel B of Table 108.5. Table 108E.4 presents the joint estimation results using different values for the size of the tests, γ/2. Figure 108.2 reveals "valleys" in the criterion surface where linear combinations of the two nonzero alpha values produce a similar fit for the data. The impression is that three groups based on their alphas are plenty to describe the data, and that even fewer groups might suffice. Based on this impression we do not consider models with more than three groups.

The joint estimation for the three-group model applied to mutual funds is summarized in the first row of Panel B of Table 108.4. We find two negative alphas and some zero-alpha funds, but no mutual funds with positive alphas.
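The grid search can be sketched as follows. The sketch scores the fit with a Pearson χ2 over histogram cells, as in equation (108.3), but it approximates the model cross-section by a simple three-part Gaussian mixture with placeholder π fractions; the chapter instead re-estimates the π fractions by simulation at every grid point, so this is an illustration of the search loop only.

```python
import numpy as np

# Hedged sketch of the Pearson chi-square fit criterion over cells
# defined by pooled quantiles of the two t-ratio samples.
def pearson_chi2(data_t, model_t, n_cells=20):
    edges = np.quantile(np.concatenate([data_t, model_t]),
                        np.linspace(0, 1, n_cells + 1))
    obs, _ = np.histogram(data_t, bins=edges)
    exp, _ = np.histogram(model_t, bins=edges)
    exp = exp * len(data_t) / max(len(model_t), 1)  # scale to data counts
    mask = exp > 0
    return float(np.sum((obs[mask] - exp[mask]) ** 2 / exp[mask]))

# Hedged sketch of the grid search over (alpha_g, alpha_b).
def grid_search(data_t, alpha_grid_g, alpha_grid_b, n_sim=5000, seed=0):
    rng = np.random.default_rng(seed)
    best = (np.inf, None, None)
    for ag in alpha_grid_g:
        for ab in alpha_grid_b:
            # Placeholder pi fractions; the chapter re-estimates them
            # by simulation at each grid point.
            pis = np.array([1/3, 1/3, 1/3])
            locs = rng.choice([0.0, ag, ab], size=n_sim, p=pis)
            model_t = locs + rng.standard_normal(n_sim)
            fit = pearson_chi2(data_t, model_t)
            if fit < best[0]:
                best = (fit, ag, ab)
    return best   # (best fit, best alpha_g, best alpha_b)
```

When the data t-ratios are drawn from the null, grid points with large nonzero alpha locations fit badly and the search settles near zero.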
For the 10% test size we estimate that 50.7% of mutual funds are "good" (which here means zero alpha), 6.9% are "bad" (meaning alphas of −0.034% per month) and the remaining 42.4% are "ugly" (meaning alphas of −0.204% per month). Thus, our estimates of mutual fund performance paint a picture that is similar to, but somewhat more pessimistic than, the estimates in BSW. Similar to BSW, we find that a large fraction of mutual funds have zero alphas and some funds have strong negative alphas. The bad alpha estimate is similar to the −0.267% per month value suggested by BSW, but our negative "good" alpha estimate is much smaller than the 0.317% good alpha value they suggest.

In the second row of Panel B of Table 108.4, we summarize the results from repeating the joint estimation for mutual funds, where we constrain the values of the good alpha to be positive and the bad alpha to be negative.
[Figure: criterion surface from the grid search. Vertical axis: "newfit − best overall fit" (×10^5, 0 to 2.5); horizontal axes: Alfgood (0 to 1.4) and Alfbad (−0.3 to 0.2).]
Figure 108.2: Simultaneous estimation of α’s and π’s for hedge funds. This figure depicts the results of grid search for the good and bad alpha parameters for the hedge fund sample. The vertical axis is the goodness of fit based on the Pearson χ2 statistic in equation (108.3). The sample period for hedge funds is January 1994–March 2012 (219 months).
The goodness-of-fit measures are larger, indicating a relatively poor fit to the data compared with the unconstrained case.18 It is interesting that the best-fitting good alphas for the mutual funds are very close to zero: 0.001% per month at the 10% test size. Because the good alphas are so close to zero, the zero-alpha null and the good-alpha alternative distributions are very close to each other. As a result, the power of the tests to find a good alpha and the confusion parameter δb are both very close to the size of the tests.

The evidence for mutual funds suggests that the best-fitting alphas are either zero or negative, which motivates a simpler model with only two
18 Asymptotically, the goodness-of-fit statistic is Chi-squared with 99 degrees of freedom and the standard error is about 14. The p-values for all of the statistics and the differences across the models are essentially zero.
distributions in the population instead of three. The Internet Appendix describes the model when there are two groups. In the hedge fund sample, we let there be one zero and one positive alpha. The results from the two-group models are summarized in the third row of Panel B of Table 108.4 for mutual funds, and in the second row of Panel B of Table 108.5 for hedge funds. For mutual funds the nonzero alpha is −0.17% per month and the model says that 52% of the mutual funds have the negative alpha. For hedge funds the nonzero alpha is positive, 0.25% per month, and the estimates say that about 55% of the hedge funds have the positive alpha and 45% have a zero alpha. The larger fraction of zero alphas for hedge funds in the two-group model, compared to the three-group model, makes sense, as the two-group model best fits the data by assigning a zero alpha to some of the previously negative-alpha hedge funds. The goodness-of-fit measure, however, shows that the two-group models do not fit the cross-section of funds' alphas as well as the three-group models.

Finally, we consider models in which there is only a single value of alpha around which all the funds are centered. The results are summarized in the last row of Panel B of Tables 108.4 and 108.5. For mutual funds, the single alpha is estimated to be negative, at −0.21%. For the hedge funds, the single alpha is estimated to be positive, at 0.43%. For mutual funds the goodness-of-fit statistics say that the one-group model fits the data better than the constrained three-group model or the two-group model, but not as well as the unconstrained three-group model. For hedge funds the three-group model provides the best fit.

In summary, the joint estimation results indicate that smaller fractions of funds have zero alphas and larger fractions have nonzero alphas, compared with the evidence using the classical FDR approach.
The difference in the results between the two approaches is larger for hedge funds than for mutual funds. As shown in the simulation exercises, the classical FDR approach tends to overestimate π0 , and our approach fares better in correct detection, when the true π0 is small. Thus, our approach is appealing for inferring the performance of funds with disperse performance, like hedge funds.

108.5.4 Rolling estimation

We examine the models in 60-month rolling estimation periods. Our goals are two-fold. First, we wish to see how stable the model parameters are over time and to detect any trends. Second, the end of each estimation period serves as a formation period for assigning funds annually into one of the three groups. If there is no information in the model's parameter estimates about
future performance, the subsequent performance of the three groups should be the same. If the group of positive-alpha (negative-alpha) funds continues to have abnormal performance, it indicates persistence in performance that may have investment value. The first formation period ends in December of 1998 for hedge funds and in December of 1988 for mutual funds.19

Figure 108.3 summarizes the 60-month rolling estimates of the good and bad alphas for both mutual funds (Panel A) and hedge funds (Panel B). These are jointly estimated as in Panel B of Tables 108.4 and 108.5. The bad alphas fluctuate with no obvious trend, but the good alphas show a marked downward trend for mutual funds, and especially for the hedge funds. The smoothness of these graphs, especially for the hedge funds, increases our confidence that the alpha estimates are well identified, despite the 60-month estimation window. For hedge funds the good alpha starts at more than 1% per month and leaves the sample at 0.3%. The ending value is similar to the full-sample estimate for the good alpha of 0.25% when the test size is 10%. For mutual funds, both the good and bad alphas are below zero after 2007, consistent with the full-sample estimates. It makes sense that the full-sample estimates are strongly influenced by the end of the sample period, when there are many more funds in the data. For both kinds of funds the good and the bad alphas get closer together over time.

There are reasons to think that fund performance should be worse and more similar across funds in more recent data. BSW (2010) find evidence of better mutual fund performance in the earlier parts of their sample. Cremers and Petajisto (2009) find a negative trend in funds' active shares over time, and suggest that recent data may be influenced by more "closet indexing" among "active" mutual funds.
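The per-fund step of the rolling estimation can be sketched as follows. The sketch shows only the formation-period alpha regressions on a 60-month window stepped forward twelve months at a time; the chapter additionally re-estimates the alpha locations and π fractions jointly in each window, which is omitted here. The `min_obs` screen follows the eight-observation minimum described in footnote 19.

```python
import numpy as np

# Hedged sketch of 60-month rolling formation-period alphas: for each
# window, each fund's alpha is the intercept of an OLS regression of
# its excess returns on the factors.
def rolling_alphas(excess_returns, factors, window=60, min_obs=8):
    """excess_returns: T x N (NaN when a fund is missing); factors: T x K.
    Returns a list of (window end index, alpha vector) pairs."""
    T, N = excess_returns.shape
    out = []
    for end in range(window, T + 1, 12):      # roll forward year by year
        sl = slice(end - window, end)
        X = np.column_stack([np.ones(window), factors[sl]])
        alphas = np.full(N, np.nan)
        for i in range(N):
            y = excess_returns[sl, i]
            ok = ~np.isnan(y)
            if ok.sum() >= min_obs:           # observation screen (footnote 19)
                coef, *_ = np.linalg.lstsq(X[ok], y[ok], rcond=None)
                alphas[i] = coef[0]           # intercept = alpha
            # funds with too few observations stay NaN
        out.append((end, alphas))
    return out
```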
Kim (2012) finds that the flow-performance relation in mutual funds attenuates after the year 2000, which could be related to the trend toward more similar performance in the cross-section of funds.

Next, funds are assigned each year to one of the three alpha groups on the basis of the model parameters estimated during the formation period. We do this in three different ways. The first way uses the classical false discovery rate method. We find critical t-ratios that control the false discovery rates
19 The cross-section includes every fund with at least eight observations (12 for hedge funds) during the formation period. BSW use five-year fund performance records and a 60-month survival screen on the funds. Fama and French (2010) criticize the 60-month survival screen, and we prefer not to impose such a stringent screen. If we encounter an alpha estimate larger than 100% per month in any simulation trial, we discard that trial.
[Figure 108.3 — Panel A: Mutual Funds; Panel B: Hedge Funds]
Figure 108.3: 60-month rolling alphas. Panel A depicts the time series of formation period estimates of good and bad mutual fund alphas. Panel B depicts the time series of formation period estimates of good and bad hedge fund alphas. The fund alphas are estimated jointly with the π fractions over 60-month rolling windows. The date shown is the last year of the 60-month formation period. Alphas are stated in monthly percentage units.
in the cross-section of funds, accounting for lucky funds with zero alphas that are found to be good funds, and also for the very lucky bad funds that the test confuses with good. The second approach uses a simple group assignment based on the ranked alphas and the estimated proportions of funds in each group. The results of these two approaches are presented in the Internet Appendix. Here we present the results using Bayesian selection to group the funds. Bayesian methods for fund performance evaluation and fund selection have been used in previous studies, such as Brown (1979),
Baks, Metrick, and Wachter (2001), Pástor and Stambaugh (2002), Jones and Shanken (2005), and Avramov and Wermers (2006).

The assignment using Bayesian selection follows equations (108.4) and (108.5). A fund is assigned to the Good group based on its point estimate of alpha, αp , if f (αp | α > 0) πg > f (αp | α < 0) πb and f (αp | α > 0) πg > f (αp | α = 0) π0 . A fund is assigned to the Bad group if f (αp | α < 0) πb > f (αp | α > 0) πg and f (αp | α < 0) πb > f (αp | α = 0) π0 . For these calculations the densities f (·|·) are the simulated conditional distributions estimated recursively to the end of the formation period, and evaluated at the rolling alpha estimates using standard kernel density estimation as described in the Internet Appendix.

Equal-weighted portfolios of the selected funds are examined during a holding period. If a fund ceases to exist during the holding period, the portfolio allocates its investment equally among the remaining funds in the group that month. The holding period is a one-year future period: either the first, second, third or fourth year after formation. The 60-month formation period is rolled forward year by year. This gives us a monthly series of holding-period returns for each of the first four years after portfolio formation, starting in January 1999 for the hedge funds and in January 1989 for the mutual funds. The holding-period returns for the fourth year after portfolio formation start in January 2002 for the hedge funds, and in January 1992 for the mutual funds.

The average returns, their alphas and t-ratios during the holding periods are shown in Table 108.6. The alphas use the Fama and French factor model in the case of mutual funds, and the Fung and Hsieh factor model in the case of hedge funds. We show results for the selected good funds (Good), the funds in the zero-alpha group (Zero) and the funds in the bad-alpha group (Bad). We also show the excess returns of a Good–Bad portfolio (G–B).
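The Bayesian selection rule in equations (108.4)–(108.5) can be sketched as follows. The conditional densities here come from a simple Gaussian kernel density estimate over simulated alphas for each group; the chapter uses simulated conditional distributions estimated recursively to the end of the formation period, so the inputs below are illustrative stand-ins.

```python
import numpy as np

# Gaussian kernel density estimate with Silverman's rule-of-thumb
# bandwidth, evaluated at a single point.
def kde_density(x, sample, bandwidth=None):
    s = np.asarray(sample, float)
    if bandwidth is None:
        bandwidth = 1.06 * s.std() * len(s) ** (-1 / 5)
    z = (x - s) / bandwidth
    return np.mean(np.exp(-0.5 * z ** 2)) / (bandwidth * np.sqrt(2 * np.pi))

# Hedged sketch of the selection rule: assign the fund to the group
# whose prior-weighted conditional density, evaluated at the fund's
# alpha estimate, is largest.
def assign_group(alpha_hat, sims_by_group, priors):
    """sims_by_group and priors are dicts keyed by "good", "zero", "bad"."""
    scores = {g: priors[g] * kde_density(alpha_hat, sims_by_group[g])
              for g in priors}
    return max(scores, key=scores.get)
```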
Of course, the G–B excess return is not obtainable when we cannot short mutual funds or hedge funds. It should be interpreted as the difference between the return obtained by identifying the good funds and that obtained by choosing bad funds. The differences between the means and alphas of the good and bad groups are not equal to the G–B values, because in many years no bad hedge funds are selected or no good mutual funds are selected, and the G–B series uses only those months where funds exist in both groups.20
20 The estimate of the fraction of bad hedge funds is less than 12% in all of the formation years, and either one or zero hedge funds are selected as bad for the first seven years. There are no good mutual funds selected during six of the last 13 formation years.
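The month-by-month equal weighting with dropouts described above can be sketched as follows (an illustrative helper, not the chapter's code): each month the weight is spread equally over the funds that still exist.

```python
import numpy as np

# Hedged sketch of the holding-period portfolio: equal weights over the
# selected funds each month; funds that have ceased to exist (NaN
# returns) are dropped and the weight is spread over the survivors.
def equal_weight_returns(returns):
    """returns: T x N array of monthly fund returns, NaN when a fund is
    missing. Returns the length-T equal-weighted portfolio series
    (NaN in months with no surviving funds)."""
    r = np.asarray(returns, float)
    out = np.full(r.shape[0], np.nan)
    for t in range(r.shape[0]):
        live = ~np.isnan(r[t])
        if live.any():
            out[t] = r[t, live].mean()
    return out
```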
Table 108.6: Holding period returns after Bayesian fund selection.

Each year-column reports the mean return μ, the alpha α, and the t-ratio T for the indicated year after the formation period.

Panel A: Hedge Funds during January 1999–March 2012

Portfolio   N       Year 1: μ, α, T       Year 2: μ, α, T      Year 3: μ, α, T       Year 4: μ, α, T
Good        377.7   0.35, 0.24, 3.0       0.23, 0.20, 2.4      0.21, 0.13, 1.3       0.96, 0.78, 2.1
Zero        245.0   0.38, 0.28, 3.6       0.26, 0.16, 1.9      0.40, 0.24, 2.3       0.06, −0.03, −0.1
Bad         7.8     −0.68, −0.57, −2.3    0.28, 0.17, 2.0      0.55, 0.24, 0.9       −0.15, −0.10, −0.4
G–B                 0.41, 0.59, 2.3       0.12, 0.11, 1.3      −0.08, 0.16, 0.5      −0.50, −0.72, −2.0

Panel B: Mutual Funds during January 1989–December 2011

Portfolio   N       Year 1: μ, α, T       Year 2: μ, α, T      Year 3: μ, α, T       Year 4: μ, α, T
Good        216.2   0.19, −0.01, −0.1     0.43, −0.11, −1.9    1.10, −0.07, −1.3     0.71, 0.07, 1.1
Zero        510.8   0.59, −0.07, −1.9     0.59, −0.03, −0.6    0.44, −0.13, −3.3     0.66, −0.06, −1.3
Bad         290.8   0.45, −0.13, −2.3     0.40, −0.15, −2.6    0.50, −0.11, −2.7     0.37, −0.17, −3.5
G–B                 −0.01, 0.09, 1.0      0.10, 0.14, 1.8      0.19, −0.07, −1.1     0.23, 0.29, 3.8
Notes: A 60-month rolling formation period is used to estimate the model of good, zero-alpha and bad funds. The π fractions are estimated by simulation using a test size of 10% in each tail, recursive estimation and 1000 simulation trials. Bayesian selection is used to assign funds to one of three groups, held for evaluation during the next four years. Good is the equal-weighted portfolio of funds detected to have high alphas during the formation period, Zero is the portfolio of funds found to have zero alphas and Bad is the portfolio of low-alpha funds. G–B is the excess return of the good over the bad funds, during the months when both exist. N is the average number of funds in the holding portfolio returns during the holding period, taken over all of the formation periods, μ is the sample mean portfolio return during the holding periods, and α is the portfolio alpha, formed using the Fama–French factors for mutual funds and the Fung and Hsieh factors for hedge funds during the holding period. The holding period is one year in length and follows the formation period by one to four years. Mean returns and alphas are stated in monthly percentage units. T is the heteroskedasticity-consistent t-ratio.
The second column of Table 108.6 shows the average numbers of funds in the various portfolios, averaged across the formation years. The smallest group of hedge funds is the bad group. On average, only eight hedge funds are in the bad group, while 378 are in the good group and 245 are in the zero-alpha group. For the mutual funds in Panel B, there are many more zero-alpha funds (511) and bad funds (291) than there are good funds (216). Early in the evaluation period there are more good funds, and later in the sample there are more bad funds.

A portfolio of all hedge funds has a positive alpha over the first annual holding period, as does the zero-alpha group, both with t-ratios in excess of three. This reflects the good performance of the hedge funds during our sample period. The bad hedge fund group has a negative alpha, −57 basis points per month, and the G–B difference alpha is 59 basis points per month,
or about 7% per year, with a t-ratio of 2.3. This compares favorably with the evidence in BSW and with our findings using FDR methods to select funds.21 Thus, there is persistence in the hedge fund performance, detectable by our grouping procedures. In the second and later years after portfolio formation, with one exception, the three groups become statistically indistinguishable.

The results for the mutual funds are summarized in Panel B of Table 108.6. All three groups have negative alphas during the first year, and the bad-alpha t-ratio is −2.3. This reflects the poor performance of the mutual funds during our sample period. The G–B difference alpha is 9 basis points per month, with a t-ratio of 1.0. During the second year, the G–B alpha is 14 basis points per month, with a t-ratio of 1.8. With one exception, in the third and fourth years after portfolio formation the three groups become statistically indistinguishable. The evidence for persistence is weaker than it is for the hedge funds.

108.6 Robustness

This section describes a number of experiments to assess the sensitivity of our results to several issues. These include the pattern of missing values, the level of noise in the simulated fund returns, a possible relation between funds' alphas and active management, alternative factor models and return smoothing. More details are provided in the Appendix.

108.6.1 Pattern of missing values

There is a potential issue of inconsistency in the bootstrap, as the missing values will be distributed randomly through "time" in the artificial sample, while they tend to occur in blocks in the original data. In fund return data we are much more concerned with cross-sectional dependence and conditional heteroskedasticity, which the simulations do preserve, than with the serial dependence that can create inconsistency, which is very small in monthly returns. Nevertheless, we conduct an experiment to assess the impact of this issue.
21 BSW use the FDR method for selecting good mutual funds and find alphas of 1.45% per year or less, with p-values for the alphas of 4% or larger. In the Internet Appendix we select funds using our modification of the FDR application, and find that the G–B alpha difference for mutual funds in our sample is eight basis points per month, or 0.96% per year, during the first year after portfolio formation, with a t-ratio of 1.4. For hedge funds, our G–B alpha is 13 basis points per month, with a t-ratio of 1.8.
We exploit the fact that the beta and alpha estimates when funds are combined are the results of a seemingly unrelated regression model (SURM) with the same right-hand-side variables for each fund. Thus, equation-by-equation OLS produces the same point estimates of the alphas as does the estimation of the full system. We bootstrap artificial data for each fund, i, separately, drawing rows at random from the data matrix (f, rf , ri ), which concatenates the factors (f ), the risk-free rate (rf ) and the returns data for fund i, ri . If we encounter a missing value for a fund, we keep searching randomly over the rows until we find one that is not missing, and we include the value with its associated monthly observation for (f, rf ). In this way, we preserve the relation between ri , the risk-free rate and the vector of factors. When the time series has been filled out for each fund, we have a simulated sample with no missing values.

We then form a "Hole Matrix," H, which is the same size as the original fund sample, with zeros where the original fund data are missing and ones elsewhere. We apply the H matrix to assign missing values in the simulated data for the same months in which they appear in the original data. We estimate the alphas treating this simulated data the same way we treat the original data and the baseline simulation data.

We compare the results of this approach with those of our baseline simulation method in the Appendix and find that the Baseline and Hole-preserving simulations deliver similar statistical properties for funds' residual standard deviations and factor model R-squares. Either method closely reproduces the statistical properties of the original data. The alphas and t-ratios for alpha at various fractiles show that the cross-sectional distributions of alphas and alpha t-ratios produced by the two simulation methods are very similar. These results are tabulated in the Appendix.
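The hole-preserving resampling step can be sketched as follows. Drawing only from a fund's non-missing rows is equivalent to "searching randomly until we find one that is not missing"; the original missing-value pattern (the Hole Matrix H) is then re-imposed. This is an illustrative sketch, not the chapter's code.

```python
import numpy as np

# Hedged sketch of the hole-preserving bootstrap: rows of
# (factors, risk-free rate, fund return) are resampled fund by fund,
# skipping months where the fund is missing, and the original pattern
# of missing values is re-imposed on the simulated panel.
def hole_preserving_bootstrap(factors, rf, fund_returns, rng=None):
    rng = np.random.default_rng(rng)
    T, N = fund_returns.shape
    H = ~np.isnan(fund_returns)             # True where data exist
    sims = []
    for i in range(N):
        rows = np.flatnonzero(H[:, i])      # months with data for fund i
        draw = rng.choice(rows, size=T, replace=True)
        # each drawn return keeps its own month's (f, rf), preserving
        # the factor-return relation for fund i
        f_i, rf_i = factors[draw], rf[draw]
        r_i = fund_returns[draw, i]
        r_i = np.where(H[:, i], r_i, np.nan)   # re-impose the hole pattern
        sims.append((f_i, rf_i, r_i))
    return sims
```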
108.6.2 Choice of goodness-of-fit criterion

We assess the sensitivity of our joint estimation results to the use of the Pearson chi-square goodness-of-fit measure. The Pearson measure has the disadvantage that the number of cells must be chosen. Because the alpha t-ratios are estimates, they carry estimation error that can affect the measure. In the Internet Appendix we consider two alternative goodness-of-fit measures. The first alternative measure is the two-sample Kolmogorov–Smirnov distance:

DKS = sup_x |F1(x) − F2(x)|,  (108.8)
where F1(·) and F2(·) are the two empirical cumulative distribution functions (CDFs) of the alpha t-ratios, one from the data and one from the model. This measure looks at the maximum vertical distance between the CDFs. The second alternative measure is the Cramér–von Mises distance,

DCvM = Ex{[F1(x) − F2(x)]²},  (108.9)

which looks at the mean squared distance. To implement the alternative measures, we combine the observations of the alpha t-ratios from the original data and from a model, rank the values, and calculate the two CDFs at each of the discrete points. The results, reported in the Appendix, are similar to those using the original measures.

108.6.3 Are the alphas correlated with active management?

Several studies suggest that more active funds have larger alphas (e.g., Cremers and Petajisto, 2009; Titman and Tiu, 2011; Amihud and Goyenko, 2013; Ferson and Mo, 2016). In particular, funds with lower factor model regression R-squares are found to have larger alphas. We examine the correlations between the factor model R-squares and the estimated alphas and find a correlation of −0.015 in the mutual fund sample and −0.111 in the hedge fund sample. The mixtures of distributions simulated above do not accommodate this relation.

We modify the simulations to allow a cross-sectional relation between alpha and active management, measured by the R-squares in the factor model regressions that deliver the alphas. We sort the funds by their factor model R-squares and group them into three groups, with the group sizes determined by the π-fractions at any point in the simulations, assigning the good alpha first to the low R-square group and the bad alpha first to the high R-square group. This approach thus builds in a relation between alpha and active management, measured by the factor model R-squares.

The results are summarized in Table 108E.4, Panel D. We find that this modification improves the goodness-of-fit statistic. The best-fitting alpha parameters are further out in the tails. In the left tail, the bad alpha moves about 1/4 of a percent to the left. The estimates of the good alpha and the fraction of good hedge funds are similar to those in the original design.
Smaller fractions of the hedge funds are estimated to have the more pessimistic bad alpha, and the fraction of zero-alpha hedge funds increases to 30–44%, which is a significant positive fraction when the test size is 10%. The BSW estimates are similar to those in the original design. The results suggest that incorporating the relatively poor performance of the high R-square hedge
funds improves the fit of the model. This result may not be surprising, but it does suggest that future work on estimation by simulation might profit from building in associations between other fund characteristics and the performance groups.

108.6.4 Alternative alphas

While the Fama and French (1996) three-factor model is less controversial for fund performance evaluation than for asset pricing, it is still worth asking whether the results are sensitive to the use of different models for alpha. We examine two alternatives for mutual funds: one with fewer factors and one with more factors. The first is the Capital Asset Pricing Model (Sharpe, 1964), with a single market factor, and the second is the Carhart (1997) model, which adds a momentum factor. For the hedge fund sample we use the multifactor model of Fung and Hsieh (2001, 2004) in the main tables and try the Fama and French three-factor model as a robustness check. Results using the alternative alphas are similar, and some are reported in the Internet Appendix.

108.6.5 Return smoothing

Return smoothing tends to reduce the standard errors of fund returns, can increase alphas by lowering the estimated betas, and may be important for hedge funds (e.g., Asness et al., 2001). We address return smoothing by replacing the estimates of the betas and alphas with Scholes–Williams (1977) estimates. Here we include the current and lagged values of the factors in the performance regressions, and the beta is the sum of the coefficients on the contemporaneous and lagged factors. The first thing to check is the impact on the t-ratios for alpha in the original data. The results are presented in the Internet Appendix. The effect of using the Scholes–Williams betas on the alpha t-ratio distributions is small, so we do not investigate the issue further.

108.7 Conclusions

We build on the approach to mutual fund classification of Barras, Scaillet and Wermers (BSW, 2010).
We simultaneously estimate the fractions of good, zero-alpha and bad funds in the population along with the alpha locations of the three groups. We modify the False Discovery Rate framework of BSW to allow for imperfect test power and confusion, where a test indicates that
a good fund is bad, or vice versa. We show how to use the model as prior information about the cross-section of funds to predict fund performance. We apply our approach to a sample of US active equity mutual funds and a sample of TASS hedge funds. Large fractions of hedge funds are estimated to have either positive or negative alphas. For mutual funds, a model with only zero and negative alphas best fits the data. Both mutual funds and hedge funds present a trend toward decreasing performance over time in the high-alpha group, while the performance of the low-alpha group shows no trends. We study the finite sample performance of the estimators through a parametric bootstrap, simulating the simulations. We show both analytically and through the simulations that the classical FDR approach finds too many zero-alpha funds, and thus too few good and bad funds, when the true fraction of zero-alpha funds is small. Our approach offers improved power in the sense of better detection rates for funds with abnormal performance. Our simulation-based empirical standard errors indicate that the confidence intervals around the fractions of good and bad funds are wide. In an example using hedge funds, the classical FDR method implies a two-standard-error confidence band of (11.0%, 15.8%) for the fraction of zero-alpha funds. However, adjusting for finite sample biases in the standard errors, the confidence band is (21.1%, 99.1%). Despite the low precision, we can say with statistical confidence, using our estimators, that there are positive and negative alpha hedge funds in our sample. The mutual funds are a different story: there is no evidence of positive alphas and strong evidence for negative alphas. Our results motivate future research. For example, one of our robustness checks suggests that more precise inferences might be available by associating fund performance groups with other fund characteristics.
Chen, Cliff and Zhao (2017) present some analyses along these lines, and further investigation of this idea seems warranted. Another tack is to find more precise performance measures. We illustrate our approach with the alphas from standard factor models, but it can be applied to other fund performance measures, such as holdings-based measures (e.g., Daniel et al., 1997), stochastic discount factor alphas (Farnsworth et al., 2002; Ferson, Henry and Kisgen, 2006), measures of value added (Berk and van Binsbergen, 2015), and gross alphas (Pástor, Stambaugh and Taylor, 2015). Each of these measures has been shown to have its own appeal in measuring fund performance.
Bibliography

Amihud, Y. and Goyenko, R. (2013). Mutual Fund R² as a Predictor of Performance. Review of Financial Studies 26, 667–694.
Ardia, D. and Boudt, K. (2018). The Peer Performance Ratios of Hedge Funds. Journal of Banking and Finance 87, 351–368.
Asness, C., Krail, R. and Liew, J. (2001). Do Hedge Funds Hedge? Journal of Portfolio Management 28, 6–19.
Avramov, D. and Wermers, R. (2006). Investing in Mutual Funds when Returns are Predictable. Journal of Financial Economics 81, 339–377.
Bajgrowicz, P. and Scaillet, O. (2012). Technical Trading Revisited: False Discoveries, Persistence Tests, and Transaction Costs. Journal of Financial Economics 106, 473–491.
Bajgrowicz, P., Scaillet, O. and Treccani, A. (2016). Jumps in High-Frequency Data: Spurious Detections, Dynamics, and News. Management Science 62, 2198–2217.
Baks, K., Metrick, A. and Wachter, J. (2001). Should Investors Avoid all Actively Managed Mutual Funds? A Study in Bayesian Performance Evaluation. Journal of Finance 56, 45–85.
Barras, L., Scaillet, O. and Wermers, R. (2010). False Discoveries in Mutual Fund Performance: Measuring Luck in Estimated Alphas. Journal of Finance 65, 179–216.
Barras, L., Scaillet, O. and Wermers, R. (2010). Internet Appendix to: "False Discoveries in Mutual Fund Performance: Measuring Luck in Estimated Alphas." Journal of Finance 65, 179–216.
Berk, J. and Green, R. (2004). Mutual Fund Flows and Performance in Rational Markets. Journal of Political Economy 112, 1269–1295.
Berk, J. and van Binsbergen, J.H. (2015). Measuring Skill in the Mutual Fund Industry. Journal of Financial Economics 118, 1–20.
Brown, S.J. (1979). Optimal Portfolio Choice Under Uncertainty: A Bayesian Approach. In Bawa, V.S., Brown, S.J. and Klein, R.W. (Eds.), Estimation Risk and Optimal Portfolio Choice. North-Holland, Amsterdam, pp. 109–144.
Carhart, M.M. (1997). On Persistence in Mutual Fund Performance. Journal of Finance 52, 57–82.
Chen, Y., Cliff, M. and Zhao, H. (2017). Hedge Funds: The Good, the Bad, and the Lucky. Journal of Financial and Quantitative Analysis 52, 1081–1109.
Cremers, M. and Petajisto, A. (2009). How Active is Your Mutual Fund Manager? A New Measure that Predicts Performance. Review of Financial Studies 22, 3329–3365.
Criton, G. and Scaillet, O. (2014). Hedge Fund Managers: Luck and Dynamic Assessment. Bankers, Markets & Investors 129, 1–15.
Cuthbertson, K., Nitzsche, D. and O'Sullivan, N. (2012). False Discoveries in UK Mutual Fund Performance. European Financial Management 19, 444–463.
Daniel, K., Grinblatt, M., Titman, S. and Wermers, R. (1997). Measuring Mutual Fund Performance with Characteristic-Based Benchmarks. Journal of Finance 52, 1035–1058.
Das, S.R. (2013). Data Science: Theories, Models, Algorithms and Analytics, A Web Book.
Dewaele, B., Pirotte, H., Tuchschmid, N. and Wallerstein, E. (2011). Assessing the Performance of Funds of Hedge Funds. Working Paper, Solvay Brussels School.
Efron, B. and Tibshirani, R.J. (1993). An Introduction to the Bootstrap. Chapman & Hall.
Elton, E.J., Gruber, M.J. and Blake, C.R. (2001). A First Look at the Accuracy of the CRSP Mutual Fund Database and A Comparison of the CRSP and Morningstar Mutual Fund Databases. Journal of Finance 56, 2415–2430.
Evans, R.B. (2010). Mutual Fund Incubation. Journal of Finance 65, 1581–1611.
Fama, E.F. and French, K.R. (1996). Multifactor Explanations of Asset Pricing Anomalies. Journal of Finance 51, 55–87.
Fama, E.F. and French, K.R. (2010). Luck Versus Skill in the Cross Section of Mutual Fund Returns. Journal of Finance 65, 1915–1947.
Farnsworth, H., Ferson, W., Jackson, D. and Todd, S. (2002). Performance Evaluation with Stochastic Discount Factors. Journal of Business 75, 473–504.
Ferson, W., Henry, T. and Kisgen, D. (2006). Evaluating Government Bond Fund Performance with Stochastic Discount Factors. Review of Financial Studies 19, 423–456.
Ferson, W. and Lin, J. (2014). Alpha and Performance Measurement: The Effects of Investor Disagreement and Heterogeneity. Journal of Finance 69, 1565–1596.
Ferson, W. and Mo, H. (2016). Performance Measurement with Market and Volatility Timing and Selectivity. Journal of Financial Economics 121, 93–110.
Fung, W. and Hsieh, D.A. (2001). The Risk in Hedge Fund Strategies: Theory and Evidence for Trend Followers. Review of Financial Studies 14, 313–341.
Fung, W. and Hsieh, D.A. (2004). Hedge Fund Benchmarks: A Risk Based Approach. Financial Analysts Journal 60, 65–80.
Genovese, C. and Wasserman, L. (2002). Operating Characteristics and Extensions of the FDR Procedure. Journal of the Royal Statistical Society B 64, 499–517.
Genovese, C. and Wasserman, L. (2004). A Stochastic Process Approach to False Discovery Control. Annals of Statistics 32, 1035–1061.
Getmansky, M., Lo, A. and Makarov, I. (2004). An Econometric Model of Serial Correlation and Illiquidity in Hedge Fund Returns. Journal of Financial Economics 74, 529–610.
Harvey, C. and Liu, Y. (2018). Detecting Repeatable Performance. Review of Financial Studies 31, 2499–2552.
Jones, C. and Mo, H. (2016). Out-of-Sample Performance of Mutual Fund Predictors. Working Paper.
Jones, C. and Shanken, J. (2005). Mutual Fund Performance with Learning Across Funds. Journal of Financial Economics 78, 507–552.
Kim, M.S. (2011). Changes in Mutual Fund Flows and Managerial Incentives. Working Paper, University of New South Wales.
Kosowski, R., Timmermann, A., White, H. and Wermers, R. (2006). Can Mutual Fund "Stars" Really Pick Stocks? Evidence from a Bootstrap Analysis. Journal of Finance 61, 2551–2569.
Pástor, L. and Stambaugh, R.F. (2002). Investing in Equity Mutual Funds. Journal of Financial Economics 63, 351–380.
Pástor, L., Stambaugh, R.F. and Taylor, L.A. (2015). Scale and Skill in Active Management. Journal of Financial Economics 116, 23–45.
Romano, J., Shaikh, A. and Wolf, M. (2008). Control of the False Discovery Rate Under Dependence Using the Bootstrap and Subsampling. TEST 17, 417–442.
Scholes, M. and Williams, J. (1977). Estimating Betas from Nonsynchronous Data. Journal of Financial Economics 5, 309–328.
Sharpe, W.F. (1964). Capital Asset Prices: A Theory of Market Equilibrium Under Conditions of Risk. Journal of Finance 19, 425–442.
Storey, J.D. (2002). A Direct Approach to False Discovery Rates. Journal of the Royal Statistical Society B 64, 479–498.
Titman, S. and Tiu, C. (2011). Do the Best Hedge Funds Hedge? Review of Financial Studies 24, 123–168.
White, H. (1980). A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica 48, 817–838.
Appendix 108A Standard Errors

Solving equations (108.1) and (108.2) for the π fractions, we obtain the estimators:

πg = B(Fg − γ/2) + C(Fb − γ/2),
πb = D(Fg − γ/2) + E(Fb − γ/2),  (108A.1)
where the constants B, C, D and E depend only on γ, the β's, and the δ coefficients.22 We assume that, by simulating with a large enough number of trials, we can accurately identify the β and the δ parameters as constants. When the power and confusion parameters are equal, the coefficients in (108A.1) imply division by zero, and the π fractions are not identified. Using (108A.1), we compute the variances of the π fractions:

Var(πg) = B²Var(Fg) + C²Var(Fb) + 2BC Cov(Fg, Fb),
Var(πb) = D²Var(Fg) + E²Var(Fb) + 2DE Cov(Fg, Fb).  (108A.2)
The variance of the π0 estimator is found from Var(1 − πb − πg) = Var(πb) + Var(πg) + 2Cov(πb, πg), where the covariance term is evaluated by plugging in the expressions in (108A.1). The standard errors depend on Cov(Fg, Fb), Var(Fb) and Var(Fg). Consider that the fractions Fg and Fb are the result of Bernoulli trials. Let xi be a random variable which, under the null hypothesis that alpha is zero, takes the value 1 if test i rejects the null (with probability γ/2) and 0 otherwise (with probability 1 − γ/2). Then under the null, E(xi) = γ/2 = E(xi²) and Var(xi) = (γ/2)(1 − γ/2), and we have

Var(Fg) = Var(Fb) = Var((1/N) Σi xi)
= (γ/2)(1 − γ/2)(1/N)[1 + (N − 1)ρ],  (108A.3)

when there are N funds tested, and ρ = [N(N − 1)]⁻¹ Σj Σi≠j ρij is the average correlation of the tests, where ρij is the correlation between the tests for fund i and fund j.22
22 The coefficients are D = (−δb + γ/2)/G, E = (βg − γ/2)/G, B = (βb − γ/2)/G, C = (−δg + γ/2)/G, with G = −(δg − γ/2)(δb − γ/2) + (βb − γ/2)(βg − γ/2). Setting the β parameters equal to 1.0 and the δ parameters equal to 0.0, 1 − πb − πg in (108A.1) is equal to the estimator used by BSW.
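As a numerical sketch (all parameter values are made up, not from the chapter), the estimators with the coefficients above and the null-hypothesis variance (108A.3) can be coded as follows; we pair πg with the B, C coefficients and πb with D, E, consistent with the solution in Appendix 108B:

```python
import numpy as np

def pi_fractions(Fg, Fb, gamma, beta_g, beta_b, delta_g, delta_b):
    """Point estimates (pi_g, pi_b, pi_0) from the rejection fractions Fg, Fb,
    the test size gamma, the powers beta_*, and the confusions delta_*."""
    g2 = gamma / 2.0
    G = -(delta_g - g2) * (delta_b - g2) + (beta_b - g2) * (beta_g - g2)
    B, C = (beta_b - g2) / G, (-delta_g + g2) / G
    D, E = (-delta_b + g2) / G, (beta_g - g2) / G
    pi_g = B * (Fg - g2) + C * (Fb - g2)
    pi_b = D * (Fg - g2) + E * (Fb - g2)
    return pi_g, pi_b, 1.0 - pi_g - pi_b

def var_F(gamma, N, rho_bar):
    """Var(Fg) = Var(Fb) under the null, eq. (108A.3), with average
    test correlation rho_bar across the N funds."""
    g2 = gamma / 2.0
    return g2 * (1.0 - g2) * (1.0 / N) * (1.0 + (N - 1) * rho_bar)

# With powers of 1.0 and confusions of 0.0, the estimator collapses to
# the classical FDR case: 1 - pi_g - pi_b = [1 - (Fg + Fb)]/(1 - gamma).
pi_g, pi_b, pi_0 = pi_fractions(Fg=0.10, Fb=0.20, gamma=0.10,
                                beta_g=1.0, beta_b=1.0,
                                delta_g=0.0, delta_b=0.0)
```

Dependence across funds enters through the average correlation in var_F: with a positive average correlation, the variance does not vanish as N grows, which is one reason the confidence intervals for the fractions remain wide.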
We proxy for the correlation ρ by the average of the pairwise correlations of the mutual fund returns, adjusted for the extent of data overlap among the fund returns. The adjustment to the average correlation assumes that the correlations of tests for funds with no overlapping data are zero. The estimated correlation ρ is 0.044 for the mutual fund sample and 0.086 in the hedge fund sample. BSW estimate the same average correlation in their mutual fund sample, adjusted for data overlap (p. 193), of 0.08 × 0.55 = 0.044.
To derive the standard errors, we introduce indicator variables for tests rejecting the null hypothesis that fund i has a zero alpha in favor of the alternative that fund i is a good fund: xig = I(ti > tg), where ti is the t-statistic for fund i's alpha and tg is the empirical critical value for the one-sided t-test, computed by simulation under the null hypothesis. Similarly, xib = I(ti < tb), where tb is the empirical critical value for the alternative of a bad fund. Then Fb = (1/N) Σi xib and Fg = (1/N) Σi xig, and the variances and covariances of the sums are computed as functions of the variances and covariances of the xib and xig. We generalize E(xi) above to consider the conditional expectations of xig and xib given each of the three hypothesized values for the alpha parameters. The unconditional expectations of the x's are then computed as the averages of the conditional expectations, given the three subpopulations, weighted by the estimated π fractions. We find that the use of an asymptotic normal approximation in these calculations provides improved finite sample performance for the standard errors.
We use the fact that the expectation of an indicator variable is the probability that it takes the value 1.0 and, because an indicator equals its square, the expected value is also the expected value of the square. We compute Var(xig) = E(xig²) − E(xig)² = E(xig) − E(xig)², and Cov(xib, xig) = −E(xib)E(xig). Since our calculations allow for dependence across funds, the standard errors depend on covariance terms for funds i ≠ j: Cov(xib, xjb), Cov(xig, xjg) and Cov(xib, xjg). We make an asymptotic normality assumption for the t-ratios and use the bivariate normal probability function with
correlation ρ described in the main text to compute these covariances, as well as expectations like E(xib). To compute the probability that a t-ratio exceeds a critical value, we require the expected t-ratio given the hypothesized value of alpha. These conditional expected values of the t-ratios in the subpopulations are approximated as μg = αg/σ(α), where αg is the alpha value assumed for the good funds and σ(α) is the average, across all funds, of the consistent standard error estimate for alpha. The expected values for the t-ratios of the bad funds are similarly approximated as μb = αb/σ(α).23

Let F(x, y) be the lower tail region of the bivariate normal cumulative distribution function with correlation equal to ρ. The calculations for the covariances are illustrated with the following example:

Cov(xib, xjg) = E(xib xjg) − E(xib)E(xjg)
= F(tb, −tg)π0² + F(tb, μg − tg)π0πg + F(tb, μb − tg)π0πb
+ F(tb − μb, −tg)πbπ0 + F(tb − μb, μg − tg)πbπg + F(tb − μb, μb − tg)πb²
+ F(tb − μg, −tg)πgπ0 + F(tb − μg, μg − tg)πg² + F(tb − μg, μb − tg)πgπb
− E(xib)E(xjg).  (108A.4)

Note that in (108A.4) we have used the symmetry of the normal density, implying Pr(t > tg) = Pr(−t < −tg). When symmetry is used for only one of the arguments of F(·, ·), the correlation is −ρ instead of ρ. We evaluate expectations like E(xib) = F(∞, tb)π0 + F(∞, tb − μb)πb + F(∞, tb − μg)πg. We use the asymptotic normality assumption here in the calculations, instead of the estimated β and δ parameters.
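The building blocks of these calculations, the mixture expectation E(xib) and the lower-tail bivariate normal probability F(x, y), can be sketched with scipy as follows. This is a simplified illustration with made-up parameter values; the full covariance sum in (108A.4) combines many such terms:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def E_xib(tb, mu_b, mu_g, pi0, pib, pig):
    """E(x_ib) = F(inf, tb)pi0 + F(inf, tb - mu_b)pi_b + F(inf, tb - mu_g)pi_g;
    with one argument at infinity, F reduces to the univariate normal CDF."""
    return (pi0 * norm.cdf(tb)
            + pib * norm.cdf(tb - mu_b)
            + pig * norm.cdf(tb - mu_g))

def F_biv(x, y, rho):
    """Lower-tail bivariate standard normal probability with correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    return multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf([x, y])

# Made-up values: 5% one-sided critical value, subpopulation mean t-ratios of +/-2.
p_bad_reject = E_xib(tb=-1.645, mu_b=-2.0, mu_g=2.0, pi0=0.6, pib=0.3, pig=0.1)
joint = F_biv(-1.645, 1.645, 0.05)  # one term of a covariance calculation
```

Each term in (108A.4) is such a bivariate probability weighted by a product of π fractions, with the correlation flipped to −ρ whenever the normal symmetry is applied to only one argument.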
The bivariate normal probabilities do not exactly match the empirical β and δ parameters, which are estimated under nonnormal distributions; using them, we would have either to estimate all of the joint probabilities in (108A.4) empirically by simulation, or to make other strong simplifying assumptions.

23 Since we assign the nonzero alphas to funds randomly in our bootstrap simulations, we use the unconditional standard errors of the alphas here. In the robustness section in the main text, we build in a relation between the standard deviations and the alphas in the different fund groups.

We also develop a version of our standard error estimator that avoids the asymptotic normality assumption for the t-ratios. We estimate E(Fb) =
E(xib) = (γ/2)π0 + δb πg + βb πb, and similarly for E(Fg) = E(xig). We estimate Var(xib) = E(xib){1 − E(xib)} and use the correlation ρ described above to approximate Cov(xib, xjg) and Cov(xib, xjb) as the correlation times the product of the standard deviations. However, when we simulate the simulations to evaluate the finite sample performance of this version of the standard errors, we find that they perform much worse.

In the two-distribution example for mutual funds, the standard errors of π0 and πb are equal. The standard errors follow from equation (108A.3) with Var(Fb) = Var(xib)[(1/N) + ρ(1 − 1/N)], Var(xib) = E(xib)[1 − E(xib)] and E(xib) = π0 CDFN(tb) + (1 − π0)CDFN(tb − μb), where CDFN(·) is the standard normal cumulative distribution function. These expressions for the variances of the π fractions hold when the constraints that the fractions are positive and sum to less than 1.0 are not binding. When the constraints are binding, the distribution of the estimators is complicated by the truncation, and it involves the sampling variance of the Lagrange multipliers. We do not report standard errors when the constraints are binding.

The standard errors in BSW are based on equation (108A.1), which implies

Var(π0,C) = Var(F)/(1 − γ)²,  (108A.5)

where Var(F) is computed as the sum of Bernoulli random variables, specialized to the case of a two-sided test, with F = Fg + Fb, and ignoring dependence across the tests (ρ = 0). Thus, BSW use Var(F) = γ(1 − γ)/N when computing their standard errors. Similarly, their estimate of πg is Fg − (γ/2)π0,BSW, and they use Var(πg,C) = Var(Fg) + (γ/2)²Var(π0,C) − 2(γ/2)Cov(Fg, π0,C), with Var(Fg) = Fg(1 − Fg)/N and Cov(Fg, π0,C) = Fg(1 − F)/[N(1 − γ)]. It follows that Var(πb,C) = Var(Fg) + Var(π0,C)(1 − γ/2)² + (1 − γ/2)Cov(Fg, π0,C). The BSW calculations do not use the asymptotic normal approximations that we employ, and they do not arrive at the unconditional expectations by averaging over the conditional expectations given each fund group, weighted by the estimated fractions in each group, as our standard error estimators do.

Appendix 108B More on the Relation to Previous Approaches

This appendix describes in more detail than the main text how our approach modifies the framework of BSW (2010) and Storey (2002). Consider the case of a two-tailed test with size γ and power β < 1, and let
F = (Fb + Fg), where Fb and Fg are the fractions of funds where the test rejects the null hypothesis of zero alphas in favor of bad or good alphas, respectively. Then:

1 − E(F) = P(do not reject | Ho)π0 + P(do not reject | Ha)(1 − π0)
= (1 − γ)π0 + (1 − β)(1 − π0).  (108B.1)

Solving (108B.1) for π0 gives the estimator:

π0* = [β − E(F)]/(β − γ).  (108B.2)

BSW estimate the fraction of zero-alpha funds, π0, following Storey (2002). Storey's classical estimator for π0 is, in our notation:24

π0,C = [1 − (Fb + Fg)]/(1 − γ).  (108B.3)
As equation (108B.3) indicates, this estimator assumes that the fraction of zero-alpha funds in the population, multiplied by the probability that the test will not reject the null of zero alpha when it is true, equals the fraction of funds for which the null of zero alpha is not rejected in the actual sample. But a test will also fail to reject the null in cases where alpha is not zero, if the power of the tests is below 100%. This motivates our modification to the case where β < 1. Equation (108B.3) is a special case of (108B.1) when β = 1. Storey motivates β = 1 as a "conservative" choice, justified by choosing the size of the tests to be large enough. Comparing the two estimators, E(π0,C − π0*) > 0 when E(F) > γ and β > γ, so this classical estimator is likely biased in favor of large values of π0 when the power of the tests is below one.

Our estimators in the three-group model are found by solving the two equations (108.1) and (108.2) in the main paper, subject to the constraints that the π fractions are probabilities. When the constraints are not binding, the solutions are:

πg = B(Fg − γ/2) + C(Fb − γ/2),
πb = D(Fg − γ/2) + E(Fb − γ/2),  (108B.4)

24 In the estimation of π0, BSW follow Storey (2002), choosing the value of γ such that, in simulations, the sum of squares of the π0 estimator around the minimum π̂0 value found over all γ values is minimized. This obviously reduces the estimate and the bias in the estimator of the fraction of zero-alpha funds. BSW find that setting γ to 0.5 or 0.6 produces similar results, and that these results are not highly sensitive to the γ value. In our paper, the value γ is also used as the threshold (denoted λ in Storey (2002) and BSW) to estimate π0,C, as BSW suggest a similarly large value (such as 0.5 or 0.6) for the threshold.
where the coefficients are D = (−δb + γ/2)/G, E = (βg − γ/2)/G, B = (βb − γ/2)/G, C = (−δg + γ/2)/G, with G = −(δg − γ/2)(δb − γ/2) + (βb − γ/2)(βg − γ/2). Our estimator for π0 derives from (108B.4) as 1 − πb − πg. Setting the β parameters equal to 1.0 and the δ parameters equal to 0.0, 1 − πb − πg in (108B.4) is equal to the classical FDR estimator π0,C. At these parameter values, assuming Fb > γ/2 and Fg > γ/2, we find that ∂π0/∂δb > 0 and ∂π0/∂δg > 0. There are two offsetting biases: setting the β-parameters to 1.0 creates an upward bias in the estimator of π0, while setting the δ-parameters to 0.0 creates a downward bias.

In the BSW analysis, a central focus is estimating the fractions of skilled and unskilled managers, πg and πb. We would call these high- and low-performance funds, acknowledging that the after-cost alphas used in both of our studies are better measures of the performance available to investors than they are measures of fund skill. BSW estimate the fractions by subtracting the expected fraction of "lucky" funds, P(reject at tg | Ho)π0 = (γ/2)π0, from the observed fraction of funds where the null hypothesis of zero alpha is rejected in favor of the alternative of a positive alpha. In our notation, the classical estimator for πg is Fg − (γ/2)π0,C, where π0,C is the classical estimator in (108B.3). The classical estimator separates skill from luck by subtracting from Fg the fraction of "lucky" funds that the test says have positive alphas but which actually have zero alphas, (γ/2)π0,C. Our approach can better separate skill from luck, because we also consider the expected fraction of "extremely lucky" bad funds, where the test is confused and indicates that the fund is good.

Appendix 108C A Two-Distribution Model

With two distributions we use two-sided tests of the null hypothesis.
Solving the probability model for an estimator of π0 in the two-distribution model with only a zero alpha and a positive alpha, the parameters βb, δb and δg are no longer relevant, and we obtain:

π0 = [βg − E(Fg)]/(βg − γ),  (108C.1)

where γ is the size of the one-tailed test of the null of zero alpha against the alternative of a positive alpha. We use this version of the model in our simulations that examine the discovery rates in the main paper. The two-distribution model with only a zero and a bad alpha modifies (108C.1) in the obvious way. The standard error of the estimator follows from (108C.1) and the variance of Fb.
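A small simulation (with made-up parameter values) illustrates the estimator (108C.1) and the upward bias of the classical β = 1 version discussed in Appendix 108B:

```python
import numpy as np

def pi0_two_group(Fg, beta_g, gamma):
    """Eq. (108C.1): pi0 = [beta_g - E(Fg)] / (beta_g - gamma), with the
    observed rejection fraction Fg standing in for E(Fg)."""
    return (beta_g - Fg) / (beta_g - gamma)

# True population: 40% zero-alpha funds, 60% good funds.
# One-tailed test with size gamma and power beta_g < 1.
rng = np.random.default_rng(2)
pi0_true, gamma, beta_g, N = 0.4, 0.10, 0.60, 100_000
good = rng.random(N) > pi0_true
reject = np.where(good, rng.random(N) < beta_g, rng.random(N) < gamma)
Fg = reject.mean()

corrected = pi0_two_group(Fg, beta_g, gamma)  # accounts for power < 1
classical = pi0_two_group(Fg, 1.0, gamma)     # beta = 1: overstates pi0
```

With these values, E(Fg) = 0.1 × 0.4 + 0.6 × 0.6 = 0.40, so the corrected estimator centers on the true 0.40 while the classical one centers on (1 − 0.40)/0.9 ≈ 0.67, overstating the zero-alpha fraction, which is the pattern documented in the chapter.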
Appendix 108D Kernel Smoothing

We employ a kernel smoothing function to estimate the simulation-generated empirical densities of the alphas, conditional on each subpopulation of funds. Let the sample of the relevant cross-section of funds' alpha estimates be {xi}i=1,...,n. The estimator for the density function evaluated at some particular value, x, is:

f(x) = (1/nh) Σi K((x − xi)/h),  (108D.1)
where the kernel function K(·) must be strictly positive and integrate to 1.0. The symbol h denotes the bandwidth parameter and n is the number of observations. We choose the Epanechnikov optimal kernel function:

K(u) = (3/4)(1 − u²)I(|u| < 1),  (108D.2)
with the bandwidth parameter that approximately minimizes the mean integrated squared error of the kernel approximation for second-order kernels (see Hansen (2009, Section 2.7)), which is 0.374 in our application when the number of funds in the sample is 3865. When the conditional distributions are estimated from simulations and there are 1000 trials, there are 3,865,000 observations of the xi, which is unwieldy. To keep the problem of manageable size, we use the 3865 observations from the first simulation trial. We experiment with concatenating the observations from the first k simulation runs, adjusting the bandwidth of the kernel according to Silverman's rule of thumb by multiplying it by k^(−1/5). We also experiment with using the means of the simulated alphas for each fund, taken across the simulation trials, to characterize the conditional distributions. Neither of these alternatives changes the results much in the full sample. In the rolling, 60-month analysis, k = 1 produces some instability across simulation trials. We find that recursive estimation with k = 2 produces more stable results, so we use recursive estimation with k = 2.

Appendix 108E Robustness Results Details

Tables 108E.1–108E.5 present ancillary results described in the main text. This section provides additional information for interpreting these tables. Table 108E.1 presents summary statistics for the hedge fund returns, comparing the effects of different benchmarks, different numbers of required observations and the effects of return smoothing. In the main text, Table 108.1, the minimum number of observations is 8, while here it is 12. The summary statistics show that the cross-sectional distributions of
page 3803
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch108
W. Ferson & Y. Chen
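Returning to the kernel smoothing of Appendix 108D, equations (108D.1)–(108D.2) and the k^(−1/5) bandwidth adjustment can be sketched in a few lines. This is a minimal illustration with our own function names, not the authors' code:

```python
def epanechnikov(u):
    """Second-order Epanechnikov kernel, as in equation (108D.2)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def kernel_density(sample, x, h):
    """Kernel density estimate at point x, as in equation (108D.1)."""
    n = len(sample)
    return sum(epanechnikov((x - xi) / h) for xi in sample) / (n * h)

def concatenated_bandwidth(h, k):
    """Silverman-style adjustment when observations from the first k
    simulation runs are concatenated: multiply the bandwidth by k**(-1/5)."""
    return h * k ** (-0.2)
```

With a single fund alpha at zero, the density estimate at zero with h = 1 is simply the kernel's peak, 0.75, and the kernel integrates to one over its support, as required of K(·).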
Table 108E.1: Summary statistics with alternative benchmarks.

Hedge fund returns: January 1994–March 2012 (219 months)

                    Monthly returns (%)         α Estimates (%)        α t-ratios
Fractile   Nobs    Mean     Std    Rho1        3-fac    7-fac       3-fac    7-fac   7-fac SW
0.010       208    2.52   14.31    0.61        2.147    2.200       6.395    6.787    7.326
0.050       154    1.46    8.57    0.50        1.042    1.140       3.369    3.380    3.932
0.100       125    1.15    6.65    0.43        0.736    0.811       2.485    2.959    2.919
0.250        83    0.75    4.21    0.30        0.342    0.431       1.287    1.598    1.682
Median       48    0.38    2.63    0.17        0.030    0.119       0.110    0.456    0.504
0.750        28   −0.01    1.74    0.03       −0.307   −0.186      −0.931   −0.605   −0.668
0.900        18   −0.56    1.23   −0.11       −0.813   −0.672      −1.850   −1.642   −1.806
0.950        15   −1.07    0.97   −0.20       −1.296   −1.079      −2.436   −2.248   −2.486
0.990        12   −2.61    0.58   −0.33       −2.626   −2.786      −4.031   −3.742   −4.012
Notes: Monthly returns are summarized for hedge funds, stated in monthly percentage units. The values at the cutoff points for various fractiles of the cross-sectional distributions of the sample of funds are reported. Each column is sorted on the statistic shown. Nobs is the number of available monthly returns, where a minimum of 12 is required. Mean is the sample mean return and Std is the sample standard deviation of return, both reported as monthly percentages, and Rho1 is the first-order sample autocorrelation in raw units. The alpha (α) estimates and their t-ratios are based on OLS regressions using the Fama–French three factors (denoted 3-fac) or the Fung and Hsieh seven factors (denoted 7-fac) and heteroskedasticity-consistent standard errors for the hedge funds. The t-ratios denoted 7-fac SW use Scholes Williams betas, which increase the minimum number of required observations to 19.

Table 108E.2: Estimated fractions of funds with various alpha values.
Panel A: Mutual Funds

Monthly Alphas (%)         FDR Calcs:       Powers           Confusions       Fractions
Good αg    Bad αb   γ/2    π0      πg       βg      βb       δg      δb       π0            πg

Estimates for Given Test Sizes and Various Alpha Values
−0.084    −0.104     ∗    72.6    −4.6      4.1    29.5      3.3    25.6       0.0 (na)     0.0 (na)
 0.031    −0.234     ∗    72.4    −4.5     14.3    57.9      1.1     7.3      63.2 (34.1)   0.0 (na)
 0.170    −0.415     ∗    71.7    −4.3     45.2    81.5      0.4     1.8      75.9 (6.1)    0.0 (na)
 0.270    −0.573     ∗    72.7    −4.5     64.6    91.1      0.2     0.9      77.7 (4.1)    0.0 (na)
How Many Good and Bad Funds are There, Really?

Table 108E.2: (Continued)

Alphas (%)                                 Powers           Confusions       Fractions
Good αg    Bad αb    Fit    γ/2            βg      βb       δg      δb       π0            πg

Joint Estimation of Alphas and Fractions in the Populations

Unconstrained Alpha Domains, 3-Group Model
−0.087    −0.305    2509   0.05            2.1    58.7      0.4    15.6      38.9 (14.6)    4.4 (22.1)
−0.034    −0.204    2155   0.10            6.9    52.4      1.5    15.4      50.7 (26.7)    6.9 (37.2)

Constrained Alpha Domains (αg ≥ 0, αb ≤ 0), 3-Group Model
 0.004    −0.162    3925   0.05            5.6    30.0      1.0     4.8      33.4 (28.3)   12.2 (35.3)
 0.001    −0.173    3816   0.10           10.3    45.7      1.8    10.3       0.0 (na)     55.2 (44.5)

2-Group Model
 0.00     −0.139    4694   0.05            na     24.0      na      na       29.3 (23.2)    0.0 (na)
 0.0      −0.172    3862   0.10            na     45.5      na      na       48.4 (37.8)    0.0 (na)

Single-Alpha Model
 na       −0.220    3431   0.05            na      na       na      na        0.0 (na)      0.0 (na)
 na       −0.205    3401   0.10            na      na       na      na        0.0 (na)      0.0 (na)

Panel B: Hedge Funds

Monthly Alphas (%)         FDR Calcs:       Powers           Confusions       Fractions
Good αg    Bad αb   γ/2    π0      πg       βg      βb       δg      δb       π0            πg

Estimates for Given Test Sizes and Various Alpha Values
 0.040     0.020     ∗    75.8    22.3     12.8     9.4     11.6     8.7       0.0 (na)    100 (na)
 0.342    −0.307     ∗    76.6    21.8     40.4    39.2      4.2     4.1      53.3 (18.8)  39.8 (13.7)
 0.736    −0.813     ∗    75.0    22.6     68.2    73.5      2.1     2.4      77.6 (9.1)   20.6 (7.2)
 1.042    −1.296     ∗    76.5    22.1     80.0    85.8      1.4     1.8      81.5 (6.4)   17.1 (4.7)
Table 108E.2: (Continued)

Alphas (%)                                 Power            Confusions       Fractions
Good αg    Bad αb    Fit    γ/2            βg      βb       δg      δb       π0            πg

Joint Estimation of Alphas and Fractions in the Populations

Unconstrained Alpha Domains, 3-Group Model
 0.237    −0.098    2633   0.05           15.9     7.6      4.0     3.1       0.0 (na)     51.7 (24.5)
 0.252    −0.108    2225   0.10           31.4    17.4      7.0     5.0       0.0 (na)     53.2 (16.8)

2-Group Model
 0.287     na       4073   0.05           18.2     na       na      na       45.5 (32.9)   54.5 (37.1)
 0.252     na       4807   0.10           31.9     na       na      na       44.6 (28.2)   55.4 (33.5)

Single-Alpha Model
 0.452     na       3160   0.05            na      na       na      na        0.0 (na)    100.0 (na)
 0.434     na       3120   0.10            na      na       na      na        0.0 (na)    100.0 (na)
Notes: The fractions of managers in the population with specified values of zero, good or bad alphas are estimated using simulation. The symbol πg denotes the estimated fraction of good funds and π0 denotes the fraction of zero-alpha funds. Alphas are stated in monthly percentage units. The power parameters of the test are βg, the power to reject against the alternative of a good fund, and βb, the power to reject against the alternative of a bad fund. The confusion parameters are δb, the probability of finding a good fund when it is bad, and δg, the probability of finding a bad fund when it is good. All fractions except the test sizes are stated as percentages (the alphas as monthly percentages). Empirical standard errors for the π fractions are indicated in parentheses, except when a constraint is binding (na). ∗ denotes that the test size for the classical FDR estimator is 30% in each tail, while the size is 10% for our estimators. In Panel A, the sample period for mutual funds is January 1984–December 2011 (336 months). In Panel B, the sample period for hedge funds is January 1994–March 2012 (219 months).
the means, standard deviations and autocorrelations are quite similar. The cross-sectional distributions of the alpha estimates and t-alphas are presented for hedge funds using the Fama and French (1996) three-factor model. For ease of comparison, the results using the Fung and Hsieh seven-factor model as in the main text are shown here as well. However, the minimum number of observations here is 12, so comparing these figures with the main text shows the small impact of requiring more observations. The three-factor and
Table 108E.3: Estimated fractions of hedge funds with specified values of nonzero alphas: Using the Fama and French factors.

Monthly Alphas (%)         FDR Calcs:       Power            Confusions       Fractions
Good αg    Bad αb   γ/2    π0      πg       βg      βb       δg      δb       π0            πg
 0.317    −0.267   0.025  92.2     6.1     17.5    14.2      1.2     1.1      41.3 (7.5)    40.7 (5.2)
 0.317    −0.267   0.05   85.8     9.6     29.5    25.2      2.0     1.8      35.1 (15.7)   39.6 (9.8)
 0.317    −0.267   0.20   71.8    17.2     58.4    54.3      6.6     5.8      25.7 (38.7)   41.6 (16.0)
 0.317    −0.267   0.30   66.4    20.0     68.0    64.5     10.6     9.3      17.5 (60.0)   45.1 (17.1)
 0.317    −0.267   0.40   69.3    18.4     76.0    72.5     15.4    13.1      28.5 (121)    39.2 (17.6)
 0.040     0.020   0.05   85.9     9.6      6.9     4.8      6.1     4.2       0.0 (na)    100 (na)
 0.342    −0.307   0.05   86.4     9.6     31.8    28.6      1.8     1.7      45.2 (13.7)   35.5 (9.3)
 0.736    −0.813   0.05   85.9     9.6     62.2    68.2      0.8     0.9      76.9 (6.3)    16.1 (4.9)
 1.042    −1.296   0.05   85.8     9.6     75.1    83.1      0.5     0.7      81.2 (6.3)    13.1 (4.8)
Notes: The fractions of managers in the population with specified values of alphas, which are either zero, good or bad, are estimated using simulation as described in the text. πg denotes the estimated fraction of good funds, πb denotes the fraction of bad funds and π0 denotes the fraction of zero-alpha funds. All alphas are stated in monthly percentage units. The power parameters of the test are denoted as βg , the power to reject against the alternative of a good fund, and βb , the power to reject against the alternative of a bad fund. The confusion parameters are denoted as δb , the probability of finding a good fund when it is bad, and δg , the probability of finding a bad fund when it is good. All fractions are stated as percentages. The reported asymptotic standard errors are in parentheses. These are not applicable when a constraint is binding (na). The sample period for hedge funds is January 1994–March 2012 (219 months).
seven-factor models produce similar cross-sectional distributions of alphas and their t-ratios. Finally, the impact of return smoothing is examined by running the seven-factor model using the Scholes Williams beta estimator. The cross-sectional distribution is very similar to that of the seven-factor model without the lagged betas included. Table 108E.2 reports the results of two additional analyses mentioned in the main text. The first one examines the effects of the alpha locations on the estimation results; and the second reports the robustness of the joint estimation results to the use of the 5% test size.
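The Scholes Williams adjustment used for the "7-fac SW" results above can be sketched under the standard single-factor form of that estimator: sum the slopes on the lagged, contemporaneous and led factor returns, and divide by one plus twice the factor's first-order autocorrelation. This is our own illustration (the chapter applies the idea within the seven-factor model):

```python
import random

def _slope(x, y):
    """OLS slope of y on x (simple regression)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def scholes_williams_beta(fund, market):
    """Scholes-Williams beta from aligned return series (len >= 3)."""
    r = fund[1:-1]                            # trim ends so lead/lag align
    m_lag, m_con, m_lead = market[:-2], market[1:-1], market[2:]
    rho = _slope(market[:-1], market[1:])     # first-order autocorrelation (slope form)
    b = _slope(m_lag, r) + _slope(m_con, r) + _slope(m_lead, r)
    return b / (1.0 + 2.0 * rho)
```

For an unsmoothed fund that is exactly 1.5 times the market, the lead and lag slopes and the market autocorrelation are all near zero in a long i.i.d. sample, so the estimator recovers a beta near 1.5.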
In Panel A of Table 108E.2 we fix the size of the tests and vary the locations of the good and bad alphas. We use the preferred sizes: γ/2 = 0.10 for our estimator, and γ/2 = 0.30 for the classical estimator. We first examine mutual funds and then hedge funds. The first row examines the case where we set the alpha values to the median of the estimated alphas across all of the mutual funds, plus or minus 0.01%. For these values the power parameters of the tests are small (4.1% and 29.5%) because the null and alternatives are very close, and the δ errors can be large. The value of δb is 25.6%, indicating a high risk of rejecting the null in favor of a good fund when the fund is truly bad. Thus, in settings where the null and alternatives are close together, our modification of the model to account for nonzero confusion parameters could be important. Then, we examine values for the alphas that correspond to the estimates at the boundaries of the 25%, 10% and 5% tail areas in the actual sample, as shown in Table 108.1. The classical FDR estimates of the fraction of zero-alpha mutual funds are very similar across the rows, at 72–73%, because they do not depend on the locations of the good and bad fund alphas. Our estimates of the π fractions and the standard errors are highly sensitive to the alpha values. The π0 estimate

Table 108E.4: Simultaneous estimation of alpha parameters in three-distribution models.
γ/2      Good αg   Bad αb    βg      βb      δg      δb      πg      πb

Panel A: Mutual Funds: January 1984–December 2011 (336 months)
0.005   −0.074    −0.184   0.001   0.001   0.001   0.001   0.000   1.000
0.010   −0.074    −0.184   0.006   0.134   0.003   0.030   0.287   0.351
0.025   −0.034    −0.184   0.016   0.254   0.006   0.045   0.312   0.426
0.050   −0.087    −0.305   0.021   0.587   0.004   0.156   0.044   0.567
0.100   −0.034    −0.204   0.069   0.524   0.015   0.154   0.069   0.424
0.200   −0.064    −0.224   0.051   0.554   0.012   0.198   0.190   0.349
0.300   −0.064    −0.234   0.174   0.798   0.043   0.476   0.165   0.411
0.400   −0.094    −0.334   0.210   0.928   0.033   0.632   0.280   0.296

Panel B: Hedge Funds: January 1994–March 2012 (219 months)
0.005    0.260    −0.220   0.002   0.002   0.001   0.001   0.000   1.000
0.010    0.280    −0.030   0.056   0.009   0.011   0.006   0.341   0.266
0.025    0.380     0.030   0.219   0.225   0.030   0.010   0.502   0.000
0.050    0.237    −0.098   0.159   0.076   0.004   0.031   0.517   0.483
0.100    0.252    −0.108   0.314   0.174   0.070   0.050   0.532   0.468
0.200    0.230    −0.200   0.469   0.495   0.076   0.082   0.564   0.436
0.300    0.167    −0.123   0.508   0.449   0.195   0.163   0.652   0.348
0.400    0.230    −0.220   0.702   0.671   0.191   0.167   0.517   0.441
Table 108E.4: (Continued)

Alphas (%)                        BSW Calcs:      Power           Confusions      Fractions
Good αg    Bad αb    Fit    γ/2   π0      πg      βg      βb      δg      δb      π0            πg

Panel C: Hedge Funds, Baseline Case, Unconstrained Alpha Domains, 3-Group Model
 0.237    −0.098    2633   0.05   90.9    8.2    15.9     7.6     4.0     3.1      0.0 (na)    51.7 (24.5)
 0.252    −0.108    2225   0.10   84.1   13.4    31.4    17.4     7.0     5.0      0.0 (na)    53.2 (16.8)

Panel D: Results When Alphas are Associated with Active Management
 0.272    −0.373    1068   0.05   92.1    8.3    18.4    26.7     2.4     2.6     29.7 (23.3)  58.9 (23.8)
 0.282    −0.308    2081   0.10   85.8   13.2    35.0    39.0     4.2     4.5     43.8 (17.3)  51.4 (17.5)

Panel E: Results Accounting for Error Variance in Estimated Alphas
 0.209    −0.083    1596   0.05   95.2    6.1    19.3     7.5     3.6     2.6     12.4 (28.1)  58.6 (25.6)
 0.292    −0.038    1413   0.10   84.2   13.1    36.7    12.0     8.8     4.3     0.02 (29.0)  28.2 (13.1)
Notes: The true good and bad alpha parameters, αg and αb , are estimated for each test size γ/2, simultaneously with the fractions of managers in the population having the specified alphas. A grid search over the true good and bad alpha parameters looks from the median alpha value in the data from Table 108.1, to the upper or lower 5% tail values in the data, with a grid size of 0.001% for mutual funds and 0.005% for hedge funds. The best fitting good and bad alpha parameters are shown, where the fit is determined by minimizing the difference between the cross-section of fund alphas estimated in the actual data, versus the cross-section of alphas estimated from a mixture of return distributions, formed from the good and bad alpha parameters and the estimated π fractions for each of the three types. At each point in the grid, the π fractions are estimated using simulation as described in the text, with 100 simulation trials. πg denotes the fraction of good funds and πb denotes the fraction of bad funds. All alpha parameters are stated in monthly percentage units. The power parameters of the test are denoted as βg , the power to reject against the alternative of a good fund, and βb , the power to reject against the alternative of a bad fund. The confusion parameters are denoted as δb , the probability of finding a good fund when it is bad, and δg , the probability of finding a bad fund when it is good. In Panels C–E, the alphas use 100 trials in the grid search, but the other parameter estimates and the standard errors, conditioning on those alphas, use 1000 trials.
varies from 0.0 to 78% as the alpha values move from the center of the distribution to the extreme tails. In Panel B of Table 108E.2, we report the findings for the hedge fund sample, and the impression from hedge funds here is similar to that from mutual funds. The dependence of the inferences about the fractions on the choice of the alpha locations of the good and bad funds motivates our simultaneous estimation of the alphas and the π fractions.
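To make the mechanics behind the π fractions concrete, the following is a stylized sketch of how observed rejection frequencies, test size, powers and confusion parameters can be combined to back out the fractions. The moment conditions and all names below are our illustration under simplifying assumptions, not the chapter's exact estimator:

```python
def solve_pi(F_g, F_b, gamma_half, beta_g, beta_b, delta_g, delta_b):
    """Solve a 3x3 linear system for (pi0, pig, pib).

    Assumed moment conditions: expected right- and left-tail rejection
    fractions mix the test size, the powers and the confusions:
        F_g = pi0*(gamma/2) + pig*beta_g + pib*delta_b
        F_b = pi0*(gamma/2) + pig*delta_g + pib*beta_b
        1   = pi0 + pig + pib
    """
    A = [[gamma_half, beta_g, delta_b],
         [gamma_half, delta_g, beta_b],
         [1.0, 1.0, 1.0]]
    y = [F_g, F_b, 1.0]
    n = 3
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        y[col], y[piv] = y[piv], y[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            y[r] -= f * y[col]
    x = [0.0] * n
    # Back-substitution.
    for r in range(n - 1, -1, -1):
        x[r] = (y[r] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return tuple(x)
```

Feeding in rejection fractions generated from known population fractions recovers those fractions exactly, which illustrates why misspecified alpha locations (which change βg, βb, δg, δb) move the implied π estimates so much.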
Table 108E.5: Simultaneous estimation of true alpha parameters in the two-distribution model for mutual funds.

γ        Bad αb    βb      πb      π0
0.010   −0.136   0.073   1.000   0.000
0.025   −0.141   0.173   1.000   0.000
0.050   −0.156   0.296   1.000   0.000
0.060   −0.142   0.292   1.000   0.000
0.070   −0.158   0.353   1.000   0.000
0.100   −0.148   0.401   1.000   0.000
0.150   −0.158   0.507   1.000   0.000
0.200   −0.162   0.573   1.000   0.000
0.300   −0.187   0.723   1.000   0.000
0.400   −0.203   0.824   1.000   0.000
Notes: The true bad alpha parameters, αb, are estimated for each test size γ, simultaneously with the fractions of managers in the population having zero or bad alphas. A grid search over the bad alpha parameters runs from the median alpha value in the data from Table 108.1 to the lower 5% tail value in the data, with a grid size of 0.001%. The best-fitting bad alpha parameters are shown, where the fit is determined by minimizing the difference between the cross-section of fund alphas estimated in the actual data and the cross-section of alphas estimated from a mixture of return distributions, formed from the zero and bad alpha parameters and the estimated π fractions for each of the types. At each point in the grid, the π fractions are estimated using simulation as described in the text, with 100 simulation trials. πb denotes the fraction of bad funds. All alpha parameters are stated in monthly percentage units. The power parameter of the test is βb, the power to reject against the alternative of a bad fund.
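The grid search described in these notes can be sketched as follows. This is an illustration under simplifying assumptions (a two-group mixture, a sum-of-squared-quantile-gaps distance, and our own function names), not the authors' code:

```python
import random

def quantile_distance(actual, simulated):
    """Sum of squared differences between sorted cross-sections."""
    a, s = sorted(actual), sorted(simulated)
    return sum((x - y) ** 2 for x, y in zip(a, s))

def grid_search_bad_alpha(actual_alphas, grid, pi_bad, noise_sd,
                          n_trials=20, seed=7):
    """Pick the grid point whose simulated mixture cross-section of
    alphas best matches the actual cross-section."""
    rng = random.Random(seed)
    best_alpha, best_score = None, None
    for alpha_b in grid:
        score = 0.0
        for _ in range(n_trials):
            # Mixture of zero-alpha and bad-alpha funds plus estimation noise.
            sim = [(alpha_b if rng.random() < pi_bad else 0.0)
                   + rng.gauss(0.0, noise_sd) for _ in actual_alphas]
            score += quantile_distance(actual_alphas, sim)
        if best_score is None or score < best_score:
            best_alpha, best_score = alpha_b, score
    return best_alpha
```

When the actual cross-section is generated from a known bad alpha, the search recovers the generating grid point, which is the logic behind reporting the best-fitting αb for each test size.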
In Table 108E.2, we also examine the robustness of the joint estimation results to the choice of the test size, 5% versus 10%. As explained in the main text, the 10% test size is our preferred choice, and the results from the 10% test size are reported there. Here, we find that the results are generally similar when a test size of 5% is used. Table 108E.3 repeats the analysis of Table 108.4 using the Fama and French (1996) three-factor model for hedge funds, and finds results similar to those reported in the main paper. One difference is that the reported asymptotic standard errors, rather than the empirical standard errors from the simulations, are displayed here. This provides a feel for the biases in the reported standard errors and shows the increase in the reported standard errors as the test size increases. Table 108E.4 presents results for joint estimation of the three-group model, with mutual fund results in Panel A and hedge fund results in Panel B. The joint estimation here uses 100 simulation trials, but in a few untabulated experiments we find that the results for 100 and 1000 trials usually differ only in the last decimal place. Panels C–E summarize robustness checks. Panel C reproduces results from the baseline case in the main text for
comparison. Panel D presents results where the alpha parameters are associated with active management, measured by low R-squares in factor model regressions for the funds' returns. Funds are ranked on their R-squares and, when assigned to the three alpha groups, the low R-square funds go into the high-alpha group and the high R-square funds go into the low-alpha group. Panel E of Table 108E.4 presents results where we account for estimation error in mutual funds' alphas, as described in the main text. After we subtract the random alpha to account for estimation error, we rescale the adjusted simulated fund returns so that they still match the standard deviations of the actual fund returns in the data, and we add a constant to preserve the means of the transformed simulated fund returns. The transformation produces r* = wr + x, where r is the simulated return with the random alpha, w = [σ²(r)/(σ²(α) + σ²(r))]^(1/2), σ²(r) is the variance of the fund returns before adjustment, σ²(α) is the variance of the estimated alpha, and x = E(r)(1 − w). Table 108E.5 presents results from joint estimation of the two-group model for mutual funds, using a range of test sizes. The result that the fraction of mutual funds having the bad alpha is 100% in the two-group model is robust to the size of the test, and the estimate of the bad alpha varies between −0.136% per month and −0.203% per month as the size of the test is varied between 1% and 40%.
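The transformation r* = wr + x described above can be verified in a few lines: by construction it preserves the mean exactly and scales the standard deviation by w (the function name is ours):

```python
def adjust_for_alpha_error(r, var_r, var_alpha):
    """Apply r* = w*r + x with w = [var_r/(var_alpha + var_r)]**0.5
    and x = mean(r)*(1 - w), as in the Panel E adjustment."""
    w = (var_r / (var_alpha + var_r)) ** 0.5
    mean_r = sum(r) / len(r)
    x = mean_r * (1.0 - w)
    return [w * ri + x for ri in r]
```

Since r* − E(r) = w(r − E(r)), the adjusted series has the same mean and a variance of w²·σ²(r), which is how the adjusted simulated returns are made to match the dispersion targets described in the text.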
Appendix 108F Simulations of the Simulations

Panels A–D of Table 108F.1 present simulation results that evaluate the standard errors when the population values of the π fractions are each set to 1/3. The simulated samples are of the same size as the samples in our actual data, and 1000 simulation trials are used. The results are similar to those in Table 108.2 of the main paper. Our point estimates are typically within one empirical standard deviation of the true values of the π fractions at the 5% test size, and are even closer to the true values than in the example in the main text at all sizes above 10%. The standard errors are the most accurate at the 10% test size, but still understated, and the average reported standard errors get larger with the larger test sizes. Unlike the case in the main paper, there seems to be no benefit to using test sizes larger than 10%. The results show that the findings in the main paper are conservative, in the sense that the fractions of zero-alpha funds are likely even smaller than our estimates indicate, and the fractions of good and bad hedge funds are likely even larger. Panels E–H of Table 108F.1 present the results of experiments where we set the true π fractions equal to the banner values reported by
Table 108F.1: Finite sample properties of the estimators.

Panel A: γ/2 = 0.05
                               πzero    πgood    πbad
Population values               1/3      1/3      1/3
Our Avg. Estimates             0.548    0.197    0.255
Classical FDR Avg. Estimates   0.980    0.024   −0.003
Empirical SDs                  0.290    0.186    0.261
Avg. Reported SDs              0.054    0.357    0.330
Root MSE                       0.361    0.231    0.272
Classical Empirical SD         0.029    0.027    0.015
Classical Avg. Reported SD     0.006    0.007    0.011
Classical Root MSE             0.647    0.311    0.337

Panel B: γ/2 = 0.10
                               πzero    πgood    πbad
Population values               1/3      1/3      1/3
Our Avg. Estimates             0.417    0.286    0.298
Classical FDR Avg. Estimates   0.910    0.069   −0.020
Empirical SDs                  0.233    0.161    0.227
Avg. Reported SDs              0.241    0.354    0.295
Root MSE                       0.247    0.168    0.230
Classical Empirical SD         0.048    0.047    0.036
Classical Avg. Reported SD     0.009    0.011    0.017
Classical Root MSE             0.599    0.270    0.315

Panel C: γ/2 = 0.20
                               πzero    πgood    πbad
Population values               1/3      1/3      1/3
Our Avg. Estimates             0.380    0.298    0.322
Classical FDR Avg. Estimates   0.857    0.097    0.056
Empirical SDs                  0.230    0.145    0.237
Avg. Reported SDs              0.528    0.406    0.360
Root MSE                       0.235    0.149    0.237
Classical Empirical SD         0.061    0.063    0.060
Classical Avg. Reported SD     0.014    0.015    0.023
Classical Root MSE             0.527    0.244    0.293

Panel D: γ/2 = 0.30
                               πzero    πgood    πbad
Population values               1/3      1/3      1/3
Our Avg. Estimates             0.368    0.308    0.324
Classical FDR Avg. Estimates   0.830    0.115    0.056
Empirical SDs                  0.233    0.146    0.249
Avg. Reported SDs              0.772    0.461    0.488
Root MSE                       0.235    0.148    0.249
Classical Empirical SD         0.069    0.073    0.072
Classical Avg. Reported SD     0.020    0.020    0.030
Classical Root MSE             0.500    0.231    0.287
Table 108F.1: (Continued)

Panel E: γ/2 = 0.05
                               πzero    πgood    πbad
Population values              0.750    0.010    0.240
Our Avg. Estimates             0.829    0.027    0.143
Classical FDR Avg. Estimates   1.010   −0.010   −0.003
Empirical SDs                  0.285    0.085    0.253
Avg. Reported SDs              0.054    0.319    0.282
Root MSE                       0.293    0.087    0.271
Classical Empirical SD         0.024    0.017    0.017
Classical Avg. Reported SD     0.005    0.006    0.009
Classical Root MSE             0.264    0.026    0.244

Panel F: γ/2 = 0.10
                               πzero    πgood    πbad
Population values              0.750    0.010    0.240
Our Avg. Estimates             0.706    0.046    0.248
Classical FDR Avg. Estimates   0.988   −0.008    0.020
Empirical SDs                  0.301    0.097    0.274
Avg. Reported SDs              0.284    0.370    0.267
Root MSE                       0.304    0.104    0.274
Classical Empirical SD         0.053    0.038    0.044
Classical Avg. Reported SD     0.008    0.010    0.015
Classical Root MSE             0.244    0.042    0.224

Panel G: γ/2 = 0.20
                               πzero    πgood    πbad
Population values              0.750    0.010    0.240
Our Avg. Estimates             0.609    0.068    0.322
Classical FDR Avg. Estimates   0.959   −0.007    0.048
Empirical SDs                  0.314    0.118    0.294
Avg. Reported SDs              0.693    0.499    0.342
Root MSE                       0.344    0.132    0.306
Classical Empirical SD         0.083    0.066    0.076
Classical Avg. Reported SD     0.014    0.015    0.022
Classical Root MSE             0.225    0.068    0.206

Panel H: γ/2 = 0.30
                               πzero    πgood    πbad
Population values              0.750    0.010    0.240
Our Avg. Estimates             0.574    0.083    0.333
Classical FDR Avg. Estimates   0.946   −0.006    0.061
Empirical SDs                  0.311    0.124    0.301
Avg. Reported SDs              1.084    0.628    0.560
Root MSE                       0.357    0.144    0.318
Classical Empirical SD         0.100    0.084    0.095
Classical Avg. Reported SD     0.021    0.021    0.030
Classical Root MSE             0.220    0.086    0.203
Table 108F.1: (Continued)

Panel I: Maximizing the Use of Asymptotic Normality
                               πzero    πgood    πbad
Population values              0.100    0.600    0.300

γ/2 = 0.05
Our Avg. Estimates             0.382    0.404    0.214
Empirical SDs                  0.271    0.259    0.230
Avg. Reported SDs              0.058    0.416    0.426
Root MSE                       0.390    0.323    0.244

γ/2 = 0.10
Our Avg. Estimates             0.294    0.512    0.194
Empirical SDs                  0.212    0.194    0.163
Avg. Reported SDs              0.237    0.312    0.406
Root MSE                       0.286    0.212    0.193

γ/2 = 0.20
Our Avg. Estimates             0.186    0.548    0.265
Empirical SDs                  0.165    0.165    0.174
Avg. Reported SDs              0.537    0.288    0.523
Root MSE                       0.186    0.172    0.177

γ/2 = 0.30
Our Avg. Estimates             0.198    0.545    0.247
Empirical SDs                  0.160    0.145    0.151
Avg. Reported SDs              0.775    0.332    0.643
Root MSE                       0.187    0.151    0.159

γ/2 = 0.30, Using samples of size T = 5000
Our Avg. Estimates             0.082    0.595    0.323
Empirical SDs                  0.018    0.016    0.025
Avg. Reported SDs              0.054             0.058
Root MSE                       0.026    0.016    0.034

Panel J: γ/2 = 0.05, Using samples of size T = 5000
                               πzero    πgood    πbad
Population values              0.100    0.600    0.300
Our Avg. Estimates             0.075    0.595    0.329
Classical FDR Avg. Estimates   0.267    0.525    0.208
Empirical SDs                  0.016    0.013    0.023
Avg. Reported SDs              0.089    0.014    0.101
Root MSE                       0.030    0.014    0.038
Classical Empirical SD         0.009    0.012    0.015
Classical Avg. Reported SD     0.008    0.012    0.019
Classical Root MSE             0.167    0.076    0.093
Table 108F.1: (Continued)

Panel K: γ/2 = 0.10, Using samples of size T = 5000
                               πzero    πgood    πbad
Population values              0.100    0.600    0.300
Our Avg. Estimates             0.078    0.599    0.322
Classical FDR Avg. Estimates   0.257    0.521    0.222
Empirical SDs                  0.012    0.015    0.020
Avg. Reported SDs              0.059    0.021    0.079
Root MSE                       0.025    0.015    0.030
Classical Empirical SD         0.009    0.014    0.015
Classical Avg. Reported SD     0.009    0.012    0.019
Classical Root MSE             0.157    0.080    0.079

Panel L: γ/2 = 0.05, Using samples of size T = 5000
                               πzero    πgood    πbad
Population values              0.333    0.333    0.333
Our Avg. Estimates             0.199    0.345    0.456
Classical FDR Avg. Estimates   0.490    0.286    0.223
Empirical SDs                  0.049    0.020    0.030
Avg. Reported SDs              0.256    0.362    0.413
Root MSE                       0.143    0.023    0.126
Classical Empirical SD         0.016    0.017    0.022
Classical Avg. Reported SD     0.009    0.012    0.020
Classical Root MSE             0.158    0.050    0.112

Panel M: γ/2 = 0.05, Samples of Size T = 5000, Maximizing Use of Asymptotic Normality Assumption
                               πzero    πgood    πbad
Population values              0.100    0.600    0.300
Our Avg. Estimates             0.074    0.599    0.328
Classical FDR Avg. Estimates   0.291    0.503    0.206
Empirical SDs                  0.013    0.018    0.025
Avg. Reported SDs              0.062    0.038    0.098
Root MSE                       0.029    0.018    0.038
Classical Empirical SD         0.009    0.016    0.038
Classical Avg. Reported SD     0.008    0.011    0.019
Classical Root MSE             0.192    0.098    0.096

Panel N: γ/2 = 0.30, Samples of Size T = 5000, Maximizing Use of Asymptotic Normality Assumption
                               πzero    πgood    πbad
Population values              0.100    0.600    0.300
Our Avg. Estimates             0.082    0.595    0.323
Classical FDR Avg. Estimates   0.213    0.538    0.249
Empirical SDs                  0.018    0.016    0.025
Avg. Reported SDs              0.054    0.038    0.058
Root MSE                       0.026    0.016    0.034
Table 108F.1: (Continued)

Panel N (continued)            πzero    πgood    πbad
Classical Empirical SD         0.014    0.016    0.021
Classical Avg. Reported SD     0.012    0.014    0.024
Classical Root MSE             0.114    0.064    0.055
Notes: In each bootstrap simulation trial, artificial data are drawn randomly from a mixture of three fund distributions. The "population" values of the fractions of funds in each group, π, shown in the first row of each panel, determine the mixture, combined with the good, zero or bad alpha values that we estimate as the best-fitting values for the full sample period. Hedge fund data for January 1994–March 2012 are used, and the values of the bad and good alphas are −0.138% and 0.262% per month. For each simulation draw, we run the estimation by simulation with 1000 trials to generate the parameter and standard error estimates. Standard error estimates are removed when an estimated fraction is on the boundary of a constraint. The Avg. Estimates are the averages over the 1000 draws from the mixture distribution. The empirical SDs are the standard deviations taken across all of the 1000 trials. The root MSEs are the square root of the average, over the 1000 trials, of the squared difference between an estimated and true parameter value. γ/2 indicates the size of the tests (the area in one tail of the two-tailed tests). When the sample size is T = 5000, we use only 100 simulation trials.
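The summary statistics defined in these notes reduce to two simple computations over the cross-trial estimates. A generic sketch with our own function names:

```python
def empirical_sd(estimates):
    """Standard deviation of the estimates across simulation trials."""
    m = sum(estimates) / len(estimates)
    return (sum((e - m) ** 2 for e in estimates) / len(estimates)) ** 0.5

def root_mse(estimates, true_value):
    """Square root of the average squared difference between each
    estimate and the true parameter value."""
    return (sum((e - true_value) ** 2 for e in estimates)
            / len(estimates)) ** 0.5
```

The two agree only when the estimator is unbiased; a gap between the root MSE and the empirical SD, as in the πzero columns above, signals bias in the point estimates.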
BSW: π0 = 0.75, πg = 0.01 and πb = 0.24. When the test size is 5% in each tail, the BSW estimator of π0 is upwardly biased, with an average estimate of 1.01. The upward bias remains at the larger test sizes, and the average estimate is 0.93 at the 30% size. Our point estimate is also upwardly biased, averaging 0.829 at the 5% size, but it becomes more accurate at the 10% size, averaging 0.740 when the true value is 0.75. The average values of our estimates are always within one empirical standard deviation of the true value, and the patterns in both the reported and the empirical standard errors are similar to what we observed at the other parameter values. Our standard errors are understated for the larger π0 estimates and overstated for the other two fractions. Moving to test sizes beyond 10%, our point estimates do not improve, but the empirical standard errors get larger and the reported standard errors get much larger, resulting in dramatic overstatement of the standard errors at the largest test sizes. Overall, the results confirm the patterns reported in the main text, and suggest that the 10% test size results in the best overall performance of the estimators. Panel I presents the results of experiments in which we extend the use of the asymptotic normality assumption in computing the standard errors, using it also for the computation of the point estimates of the π fractions. We
use the expectations of the x's described above in place of the simulation-generated β and δ parameters. This simpler approach has the advantage that joint estimation of the π fractions and alpha parameters can be conducted using only one simulation instead of three. The results show that the point estimates are similar, and the standard error patterns are similar to what we report for the base case in the main paper. The last part of the panel runs the sample size up to T = 5000, but with only 100 simulation trials. This suggests that the point estimates making use of the asymptotic normality assumption are consistent estimators, but the standard errors under the simpler approach remain overstated relative to the variation across the bootstrap simulation trials. Panels J and K of Table 108F.1 present the results of experiments where we increase the size of the time-series samples used in the simulations to 5000 observations for the base simulation results. We again use only 100 simulation trials in these experiments, given the computational requirements. In Panel J the test size is 5% in each tail and the true π fractions are set equal to (0.10, 0.60, 0.30). In Panel L each of the true fractions is set equal to 1/3. Our point estimates are within 3% of the true values. This is about the magnitude of the simulation errors that we experience using 100 trials in the simulations. The average BSW estimate of π0 is 26.7% when the true value is 10%, indicating that the upward bias in the estimator remains in large samples, and the estimated fractions of bad funds are about 10% too low. All of the standard errors approach zero as the sample sizes grow, of course, and at these sample sizes the classical standard errors are quite accurate. Our standard errors remain overstated at the large sample sizes when γ/2 = 0.05. In Panel L, where the true values of the π fractions are each 1/3, the classical estimates of π0 remain overstated, averaging 49%.
Our average estimates are closer, but can be off by as much as 13%. Our standard errors remain overstated at the 5% test size, and the other results are similar. Panels M and N report the results of experiments where we use the large sample sizes, T = 5000, and the estimators that maximize the use of the asymptotic normality assumption. The results are similar to those for the baseline estimator. Our point estimates are within about 3% of the true values, while the classical estimates of π0 are overstated, and our standard errors remain overstated.

Appendix 108G Trading Strategies

The analysis of BSW (2010) finds that by controlling for the rate of false discoveries it is possible — at least early in the sample — to better identify
page 3817
July 6, 2020
16:4
3818
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch108
W. Ferson & Y. Chen
positive-performance mutual funds and profit thereby. The False Discovery Rate (FDR) is the expected fraction of lucky funds in the set of funds where the tests reject the null of zero alphas in favor of a good fund. BSW compute the false discovery rate as FDR = π0,BSW(γ/2)/Fg, in our notation. The false discovery rate is a natural extension of the idea of the size of a test to a multiple-comparisons setting. Controlling the FDR involves using simulations, searching for the value of γ that delivers the desired value of the FDR, where both π0,BSW and Fg depend on the chosen γ. Bootstrapping under the null hypothesis of zero alphas determines a critical value for the alpha t-statistic for a test of the optimal size, and all funds with t-statistics in the sample above this critical value are selected as good funds. Portfolios of good funds are formed in this way during a series of rolling formation periods, and their performance is examined during subsequent holding periods.

We implement a version of this strategy using our model. The evidence in BSW for economic significance when selecting good mutual funds using the FDR approach is not very strong (alphas of 1.45% per year or less, with p-values for the alphas of 4% or larger). Given our evidence that there are no significant positive-alpha mutual funds, such weak results are not surprising. However, we do find that not all mutual funds have zero alphas. There are plenty of bad mutual funds, and investors might benefit from attempting to avoid them. Previous studies suggest that the most significant persistence in fund performance is that of the bad funds (e.g., Carhart, 1997). We therefore also examine the performance of strategies that use our results to detect and avoid bad mutual funds. We modify the FDR approach to detecting bad funds by controlling the expected fraction of funds that the tests find to be bad, but which are not really bad.
The fraction of these unlucky funds, as a ratio to the total number of funds that the tests find to be bad, is the false discovery rate for bad funds: FDRb = [(γ/2)π0 + δb πg ]/Fb .
(108G.1)
This modification of the FDR considers the “unlucky” funds with zero alphas, as in the classical FDR approach, and also the “very unlucky” funds with positive alphas, that were confused with bad funds by the tests. We also form strategies that attempt to find good funds by controlling the false discovery rate for good funds, which in our more general model takes the following form: FDRg = [(γ/2)π0 + δg πb ]/Fg .
(108G.2)
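To make the targeting step concrete, the following sketch evaluates equation (108G.1) on a grid of test sizes and picks the size whose implied FDR is closest to a 30% target. This is illustrative only, not the paper's code: in the paper, π, δ, and F are re-estimated by simulation at every grid point, and all numerical inputs below are hypothetical.

```python
# Hypothetical illustration of controlling the bad-fund FDR of equation
# (108G.1): FDRb = [(gamma/2)*pi0 + delta_b*pi_g] / Fb.

def fdr_bad(gamma_half, pi0, pi_g, delta_b, f_b):
    """Expected fraction of funds flagged bad that are not really bad:
    'unlucky' zero-alpha funds plus 'very unlucky' good funds."""
    return (gamma_half * pi0 + delta_b * pi_g) / f_b

def pick_test_size(grid, target=0.30):
    """Return the grid entry whose implied FDR is closest to the target."""
    return min(grid, key=lambda entry: abs(fdr_bad(*entry) - target))

# Each entry: (gamma/2, pi0, pi_g, delta_b, Fb) -- made-up values for
# illustration; in practice each quantity is re-estimated at the grid point.
grid = [
    (0.05, 0.60, 0.10, 0.02, 0.12),
    (0.10, 0.55, 0.10, 0.03, 0.20),
    (0.20, 0.50, 0.10, 0.05, 0.35),
]
best = pick_test_size(grid)
```

In this made-up grid the third test size hits the 30% target exactly; the same search applies to (108G.2) when selecting good funds.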
Following BSW, we pick a rolling, 60-month formation period. The first formation period ends in December of 1998 for hedge funds and in December
How Many Good and Bad Funds are There, Really?
of 1988 for mutual funds. The cross-section during a formation period includes every fund with at least eight observations (12 for hedge funds) during the formation period. We first jointly estimate the alphas and the π fractions for each formation period, and we hold these alphas fixed for that formation period.25 Figure 108.3 in the main text illustrates the rolling good and bad alpha values. For a given level of the false discovery rate, we run a grid search over the test size, (γ/2), estimating the π fractions at each point in the grid, along with the δ's and β's, to find the choice that makes the FDR expressions in equations (108G.1) and (108G.2) the closest to the target value of the FDRs, minimizing the absolute deviation between the target and the value of the expressions. These estimates by simulation over the formation period determine critical values for the alpha t-ratios, and a set of funds is chosen to be good or bad funds based on those critical values. An equally weighted portfolio of the selected funds is examined during a holding period. If a fund ceases to exist during a holding period, the portfolio allocates its investment equally among the remaining funds. The holding period is a one-year future period: either the first, second, third, or fourth year after formation. The 60-month formation period is rolled forward year by year. This gives us a series of holding period returns for each of the first four years after formation, starting in January of 1999 for the hedge funds and in January of 1989 for the mutual funds. The holding period returns for the fourth year after formation start in January of 2002 for the hedge funds, and in January of 1992 for the mutual funds. We show results for equally weighted portfolios of the selected good funds (Good), the zero-alpha funds (Zero), and the selected bad funds (Bad).
We also present the excess return difference between the good and the bad (G–B) using only those months where both good and bad funds are found. The results of the trading strategies are summarized in Table 108G.1. The first columns show the averages of the size of the test chosen to control the false discovery rates, where the FDR is chosen following BSW to be 30% in each tail. The test sizes that best match are between 10% and 26%. Also shown is the average over the formation periods of the number of funds in each portfolio. For the hedge funds in Panel A, many more good than bad funds are found. For the mutual funds in Panel B, there are many more zero-alpha funds and bad funds than there are good funds, but some good
25
This exploits our observation that the alphas we find with joint estimation are not very sensitive to the value of γ, and saves considerable computation time.
Table 108G.1: Holding period returns after fund selection with FDR control.

Panel A: Hedge Funds during January 1999–March 2012

                                Year 1              Year 2              Year 3              Year 4
Portfolio  (γ/2)     N       μ     α     T       μ     α     T       μ     α     T       μ     α     T
Good       0.11   360.6   0.34  0.23   2.9    0.26  0.22   2.7    0.23  0.13   1.4    0.30  0.19   2.1
Zero              193.7   0.26  0.17   2.1    0.21  0.12   1.3    0.31  0.17   1.8    0.26  0.12   1.0
Bad        0.26    76.6   0.19  0.10   1.1    0.15  0.08   0.7    0.29  0.14   1.4    0.41  0.27   2.3
G–B                       0.15  0.13   1.8    0.11  0.14   1.5   −0.06 −0.02  −0.2   −0.10 −0.10  −1.1

Panel B: Mutual Funds during January 1989–December 2011

Good       0.10   236.6   0.53 −0.02  −0.5    0.44 −0.09  −1.8    0.51 −0.11  −1.9    0.49 −0.04  −0.8
Zero              398.8   0.49 −0.06  −1.8    0.45 −0.07  −1.8    0.52 −0.09  −2.5    0.44 −0.09  −2.4
Bad        0.19   382.4   0.48 −0.10  −2.4    0.49 −0.04  −1.0    0.54 −0.07  −2.0    0.42 −0.11  −2.8
G–B                       0.06  0.08   1.4   −0.05 −0.05  −0.8   −0.03 −0.04  −0.8    0.07  0.07  −1.4

Notes: A 60-month rolling formation period is used to select good and bad funds, controlling the false discovery rate at 30% in each tail. (γ/2) is the average, over the formation periods, of the significance levels that best control the false discovery rates during the formation periods. Good is the portfolio of funds found to have high alphas, Bad is the portfolio of low alpha funds, and G–B is the difference between the Good and Bad alpha fund returns in the months when both exist. N is the average number of funds in each group, μ is the sample mean portfolio return, and α is the portfolio alpha, formed using the Fama–French factors for mutual funds and the Fung and Hsieh factors for hedge funds during the holding period. The holding period is one year in length and follows the formation period by one to four years. Mean returns and alphas are percent per month. T is the heteroskedasticity-consistent t-ratio.
funds are found. Early in the evaluation period there are more good than bad funds, while later in the sample there are more bad funds than good funds. For the hedge funds in Panel A, both the mean excess returns and the alphas are ordered roughly as expected across the good, zero, and bad groups during the first two years after portfolio formation, although many of the groups have positive point estimates of alpha. In the first year the G–B excess return alpha is 0.13% per month, or about 1.5% per year, with a t-ratio of 1.8. During the second year the G–B excess return is similar in magnitude, but the t-ratio is only 1.5. During the third and fourth years after portfolio formation there is no economically or statistically significant difference between the fund groups. The mutual funds in Panel B also display mean excess returns and alphas that are ordered as expected across the groups during the first year, but most of the alphas are negative. The bad group has an alpha t-ratio of −2.4, and the G–B excess alpha is 0.08% per month, with a t-ratio of 1.4. During the second through fourth years after portfolio formation some groups have statistically significant negative alphas, but there is no economically or statistically significant difference between the mutual fund groups.

We examine a trading strategy based on assigning funds to the various alpha groups in the simplest possible way. We form portfolios of funds based on the estimated fractions in each subpopulation, and on the alpha estimates for each fund, in the formation period. Funds are sorted each year from low to high on the basis of their formation-period alphas, and they are assigned to one of the three groups according to the current estimates of the π fractions. Equally weighted portfolios of the selected funds are examined during a subsequent holding period, just as in the previous exercise. The average returns and their alphas during the holding periods are shown in Table 108G.2.
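A minimal sketch of this assignment rule follows. This is illustrative code, not the authors' implementation, and the alphas and fractions are invented for the example.

```python
# Rank funds by formation-period alpha and split the ranking according to
# the estimated population fractions (pi_bad, pi_zero, pi_good).

def assign_groups(alphas, pi_bad, pi_zero, pi_good):
    """Return (bad, zero, good) lists of fund indices."""
    assert abs(pi_bad + pi_zero + pi_good - 1.0) < 1e-9
    order = sorted(range(len(alphas)), key=lambda i: alphas[i])
    n_bad = round(len(alphas) * pi_bad)
    n_zero = round(len(alphas) * pi_zero)
    return (order[:n_bad],
            order[n_bad:n_bad + n_zero],
            order[n_bad + n_zero:])

# Ten hypothetical formation-period alphas (percent per month):
alphas = [-0.3, 0.5, 0.1, -0.1, 0.4, 0.0, 0.2, -0.2, 0.3, -0.4]
bad, zero, good = assign_groups(alphas, pi_bad=0.3, pi_zero=0.5, pi_good=0.2)
```

With fractions (0.3, 0.5, 0.2), the three lowest-alpha funds are labeled bad and the two highest-alpha funds good, mirroring the sort-and-split rule described above.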
For the hedge funds in Panel A, many more good than bad funds are found. For the mutual funds in Panel B, there are many more zero-alpha funds and bad funds than there are good funds, but some good funds are found. Early in the evaluation period there are more good than bad funds, while later in the sample there are more bad funds than good funds. For the hedge funds in Panel A, both the mean excess returns and the alphas are ordered roughly as expected across the good, zero, and bad groups during the first two years after portfolio formation, although many of the groups have positive point estimates of alpha. In the first year the G–B excess return alpha is 0.33% per month, or about 4% per year, with a t-ratio
Table 108G.2: Holding period returns by fund groups after rolling estimation.

Panel A: Hedge Funds during January 1999–March 2012

                        Year 1              Year 2              Year 3              Year 4
Portfolio    N       μ     α     T       μ     α     T       μ     α     T       μ     α     T
Good      339.3   0.32  0.20   2.5    0.22  0.18   2.1    0.25  0.16   1.9    0.29  0.17   2.0
Zero      263.6   0.30  0.21   2.6    0.22  0.13   1.5    0.28  0.15   1.7    0.31  0.18   1.7
Bad        27.8   0.21 −0.01  −0.1    0.09  0.00   0.01   0.15  0.07   0.3    0.35  0.12   0.6
G–B               0.31  0.33   2.1    0.24  0.26   1.3   −0.01  0.03   0.1   −0.02  0.10   0.5

Panel B: Mutual Funds during January 1989–December 2011

Good      444     0.56  0.02  −0.3    0.41 −0.11  −2.4    0.49 −0.13  −2.2    0.48 −0.05  −1.0
Zero      294.7   0.52 −0.06  −1.6    0.84 −0.06  −1.7    0.29 −0.11  −2.8    0.61 −0.11  −2.5
Bad       277.5   0.43 −0.10  −2.2    0.51 −0.05  −1.2    0.53 −0.08  −2.2    0.42 −0.11  −2.6
G–B               0.08  0.12   1.8   −0.08 −0.06  −0.1   −0.04 −0.05  −0.8    0.06  0.06   1.2

Notes: A 60-month rolling formation period is used to select good, zero-alpha, and bad funds, using the basic probability model to estimate the π fractions when the size of the tests is 10% in each tail. The jointly estimated good and bad alpha values are those depicted in Figure 108.3 of the main paper. Good is the equal-weighted portfolio of funds detected to have positive alphas during the formation period, Zero is a portfolio of zero-alpha funds, Bad is the portfolio of low alpha funds, and G–B is the difference between the Good and Bad returns, in months where both have nonzero numbers of funds. The groups are formed by ranking funds on their formation period alphas and assigning their group membership according to the π fractions estimated for that formation period. N is the average number of funds over all of the formation periods, μ is the sample mean portfolio return during the holding periods, and α is the portfolio alpha, formed using the Fama–French factors for mutual funds and the Fung and Hsieh factors for hedge funds during the holding period. The holding period is one year in length and follows the formation period by one to four years. Mean returns and alphas are percent per month. T is the heteroskedasticity-consistent t-ratio.
of 2.1. During the second year the G–B excess return is 0.26% per month and t-ratio is 1.3. During the third and fourth years after portfolio formation there is no economic or statistically significant difference between the fund groups. The mutual funds in Panel B also display mean excess returns and alphas that are ordered as expected across the groups during the first year, but all of the alphas are negative. The bad group has an alpha t-ratio of −2.2, and the G–B excess alpha is 0.12% per month, with a t-ratio of 1.8. During the second through fourth years after portfolio formation some groups have statistically significant alphas, but there is no economic or statistically significant difference between the mutual fund groups.
Appendix 108H The Impact of Missing Data Values on the Simulations

Our baseline simulations use a cross-sectional bootstrap method similar to Fama and French (2010). There is a potential issue of inconsistency in this procedure. Since we draw a row from the data matrix at random, the missing values will be distributed randomly through "time" in the artificial sample, while they tend to occur in blocks in the original data. The number of missing values will be random and will differ across simulation trials. The bootstrap can be inconsistent under these conditions.

In this experiment, we exploit the fact that the beta and alpha estimates for funds are the results of a seemingly unrelated regression model (SURM), with the same right-hand side variables for each fund. Thus, equation-by-equation OLS produces the same point estimates of the alphas as does estimation of the full system. We bootstrap artificial data for each fund, i, separately, drawing rows at random from the data matrix (f, rf, ri), which concatenates the factors (f), the risk-free rate (rf), and the returns data for fund i, ri. If we encounter a missing value for a fund, we keep searching randomly over the rows until we find one that is not missing, and we include the nonmissing value with its associated monthly observation for (f, rf). In this way, we preserve the relation between ri, the risk-free rate, and the vector of factors for each fund. This continues until the time series of the proper length has been filled out for a particular fund, resulting in an artificial sample with no missing values. We then form a "Hole Matrix", H, which is the same size as the original fund sample, and which contains zeros where the original fund data are missing and ones elsewhere. We apply the H matrix to assign missing values for the same months in which they appear in the original data for each fund.
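The hole-preserving resampling can be sketched for a single fund as follows. The data are made up, and the real implementation works on full fund-by-month matrices; this only illustrates the two steps of resampling nonmissing rows and re-imposing the original missing-value pattern.

```python
import random

# Per-fund bootstrap that skips missing returns, then re-imposes the fund's
# original missing-value pattern via a 0/1 "hole" mask (the H matrix).

def bootstrap_fund(rows, rng):
    """rows: list of (factor, rf, ri) observations; ri is None if missing.
    Draw rows with replacement, using only rows with a nonmissing ri so the
    return stays attached to its own month's (factor, rf) observation."""
    non_missing = [row for row in rows if row[2] is not None]
    return [rng.choice(non_missing) for _ in rows]

def apply_holes(sample, hole_mask):
    """Set ri back to missing wherever the original data had a hole (0)."""
    return [(f, rf, ri if keep else None)
            for (f, rf, ri), keep in zip(sample, hole_mask)]

rng = random.Random(0)
rows = [(0.5, 0.01, 1.2), (0.3, 0.01, None), (-0.2, 0.01, 0.7),
        (0.1, 0.01, None), (0.4, 0.01, 0.9)]
hole_mask = [0 if row[2] is None else 1 for row in rows]
sample = apply_holes(bootstrap_fund(rows, rng), hole_mask)
```

After the mask is applied, every simulated sample has missing values in exactly the same months as the original fund data, which is the property the appendix is testing.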
We estimate the alphas treating this simulated data the same way we treat the original data and the baseline simulation data. The simulations using the H matrix guarantee that each fund has the same number of missing values in each simulation draw, appearing at each point in “time” as it does in the original data. The artificial data for each fund should replicate the statistical properties of the original data, on the assumption of independence over time. This approach also preserves cross-sectional dependence and conditional heteroscedasticity to some extent, through the common dependence of the fund returns on the factors. We compare the results of this approach with that of our baseline simulation method in Table 108H.1. Simulations are conducted under the null hypothesis that all of the alphas are zero. Baseline Sims. refers to the simulation method used in the main paper. Hole-preserving Sims. is based
Table 108H.1: Impact of missing value patterns on the simulations.

Panel A: Mutual Fund Statistics in Original Data and Two Simulation Methods

Residual standard deviations:
            Actual data   Baseline sims.   Hole-preserving sims.
Mean            1.55           1.46               1.48
Min             0.24           0.00               0.00
Max            21.02          16.58              13.20

Factor model R-squares:
Mean            90.6           90.8               90.0
Min              3.6            3.6                4.5
Max             99.8          100                100

Panel B: Average Values at Various Fractiles

                 Baseline sims.            Hole-preserving sims.
Fractile      Alphas      T-ratio         Alphas       T-ratio
0.010        0.673220     2.83576        0.669183      2.75717
0.050        0.344044     1.83518        0.336002      1.81665
0.100        0.239758     1.41382        0.230975      1.38223
0.250        0.111415     0.75920        0.104421      0.71177
0.500        0.006508     0.04764       −0.000549     −0.00393
0.750       −0.096898    −0.65971       −0.105605     −0.72110
0.900       −0.222724    −1.32077       −0.231732     −1.39057
0.950       −0.326849    −1.73972       −0.337738     −1.82626
0.990       −0.654136    −2.71572       −0.668404     −2.77866

Notes: Baseline Sims. refers to the simulation method used in the main paper. Hole-preserving Sims. is based on an alternative simulation methodology that exactly reproduces the number of missing values and their location in time for each mutual fund. In Panel A the statistics are drawn from the first simulation trial. Mean is the average across the mutual funds, Min is the minimum, and Max is the maximum, requiring at least eight observations during the January 1984–December 2011 sample period (336 months). In Panel B the values at each fractile of the distribution are the averages across 100 simulation trials. Alpha is the average alpha value at each fractile of the cross-sectional fund distribution of alphas. T-ratio is the heteroscedasticity-consistent t-ratio for alpha at each fractile of the cross-section of alpha t-ratios. Alphas are in percent per month.
on the alternative simulation methodology. Each case has the same number of fund and time-series observations as in the original data. In Panel A the statistics are drawn from the first simulation trial. Mean is the average across the mutual funds, Min is the minimum and Max is the maximum. A fund is required to have at least 8 observations in the simulated samples to be included in the summary statistics. Panel A of Table 108H.1 shows that the Baseline and Hole-preserving simulations deliver similar statistical properties for funds’ residual standard
deviations and factor model R-squares. Either method closely reproduces the statistical properties of the original data. In Panel B of Table 108H.1, the values at each fractile of the cross-sectional distributions of mutual funds' alphas and alpha t-ratios are shown for the two simulation methods. These are the averages across 100 simulation trials, thus estimating the expected outcomes. Alpha is the average alpha value at each fractile of the cross-sectional fund distribution of alphas. T-ratio is the heteroscedasticity-consistent t-ratio for alpha at each fractile of the cross-section of alpha t-ratios. Alphas are in percent per month. The results show that the cross-sectional distributions of alphas and alpha t-ratios produced by the two simulation methods are very similar.

Appendix 108I Analysis of the Impact of Variation in the δ and β Parameters on the Simulations

Our standard error calculations assume that the δ and β parameters are estimated without error. As the number of simulation trials gets large, these errors should be negligible, but it is useful to evaluate their impact when we use 1000 simulation trials. Consider the estimated fractions π = π(φ, F) as a function of φ = (δg, δb, βg, βb) and the fractions rejected, F. Note that we report and use the average values of the φ parameters over 1000 simulation trials, so it is the variance of the mean value that we are concerned with here. Our current estimate of the variance of π may be written as Q′Vf Q, where Q = Q(φ) is defined by equation (108A.4) and Vf is the variance matrix of the fractions rejected, F. To consider the impact of variation in the φ parameters, we expand π = π(φ, F) using the delta method and assume that the covariance matrix of (φ, F) is block diagonal. The standard errors in the main paper include the part due to Vf, but not the part due to V(φ).
Equivalently, we can assume that our current estimate is the expected value of the conditional variance of π given φ, and that we are missing the variance of the conditional mean, taken with respect to the variation in φ. Either approach leads to the same missing term due to variation in φ:

E(∂π/∂φ)′ V(φ̂ − φ) E(∂π/∂φ).
(108I.1)
The analytical derivatives are evaluated at the average parameter values across the 1000 simulation trials, and the variance matrix V(φ̂ − φ) is estimated from the covariances of the (δ, β) estimates across the 1000
trials. These covariances are estimated assuming independent simulation trials. The results for the hedge fund data with 1000 simulation trials and (γ/2) = 0.10, evaluated at the best-fitting alpha values, are:
Q′Vf Q = [  0.0327520   −0.0175781
           −0.0175781    0.0488524 ],

E(∂π/∂φ)′ V(φ̂ − φ) E(∂π/∂φ) = [ 0.000831679   0.000259822
                                 0.000259822   9.87724e−005 ].
We conclude that the variation in the δ and β parameters has a trivial impact on our results. Even if the covariance of (φ, F) is not block diagonal, the cross terms should be small compared with the included Q′Vf Q terms.
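In matrix form the missing term of (108I.1) is J′V(φ̂)J, where J is the matrix of mean derivatives E(∂π/∂φ). A numerical sketch follows; the Jacobian and covariance values are hypothetical and chosen only to show the computation, not taken from the paper.

```python
import numpy as np

# Delta-method term of equation (108I.1): the variance contribution from
# estimation error in phi = (delta_g, delta_b, beta_g, beta_b).

def missing_variance_term(jacobian, v_phi):
    """jacobian: (n_phi x n_pi) matrix of mean derivatives d(pi)/d(phi);
    v_phi: (n_phi x n_phi) covariance of the mean phi estimates."""
    return jacobian.T @ v_phi @ jacobian

# Hypothetical derivatives of (pi0, pi_g) with respect to the four phi
# parameters, and a tiny diagonal covariance for the simulation means.
J = np.array([[0.20, 0.10],
              [0.10, 0.30],
              [0.05, 0.00],
              [0.00, 0.05]])
V = np.diag([1e-4, 1e-4, 1e-4, 1e-4])
extra = missing_variance_term(J, V)
```

With covariances of this magnitude the resulting term is orders of magnitude smaller than the Q′Vf Q entries above, mirroring the conclusion that the correction is negligible.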
Appendix 108J Alternative Goodness of Fit Measures

The first alternative goodness-of-fit measure is the two-sample Kolmogorov–Smirnov distance: KS = Sup_x |F1(x) − F2(x)|, where F1(·) and F2(·) are the two empirical cumulative distribution functions (CDFs) of alpha t-ratios, one from the data and one from the model. This measure looks at the maximum vertical distance between the CDFs. The second alternative measure is the Cramer–von Mises distance: CvM = E_x{[F1(x) − F2(x)]²}, which looks at the mean squared distance. To implement the alternative measures, we combine the observations of the alpha t-ratios from the original data and from a model, rank the values, and calculate the two CDFs at each of the discrete points.

Table 108J.1 presents the analysis. The first row replicates the Pearson Chi-square case, because these exercises are based on 100 simulation trials at each point in the grid, whereas the results in the main text use 1000 trials. The difference between 100 and 1000 trials here is a maximum difference of two basis points per month in the alpha parameters. The difference across the goodness-of-fit measures is less than eight basis points per month in the alpha parameters, 7% in the power parameters, less than one percent in the confusions, and within about one standard error on the π parameters. While the alternative measures find a few more zero-alpha hedge funds, the results are otherwise very similar. We conclude that the results are robust to the use of the different goodness-of-fit measures.
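Both distances are straightforward to compute on the pooled sample points. The sketch below is illustrative (the inputs are invented), with the CvM version implemented as the discrete mean squared gap described above:

```python
# Two-sample distances between empirical CDFs F1 and F2, evaluated at the
# pooled, ranked observation points.

def ecdf(sample, x):
    return sum(1 for s in sample if s <= x) / len(sample)

def ks_distance(a, b):
    """Kolmogorov-Smirnov: maximum vertical gap between the two CDFs."""
    pooled = sorted(a + b)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in pooled)

def cvm_distance(a, b):
    """Cramer-von Mises type distance: mean squared gap between the CDFs."""
    pooled = sorted(a + b)
    return sum((ecdf(a, x) - ecdf(b, x)) ** 2 for x in pooled) / len(pooled)

a = [-1.0, 0.0, 1.0, 2.0]   # e.g., alpha t-ratios from the data
b = [-0.5, 0.5, 1.5, 2.5]   # e.g., alpha t-ratios from the model
```

For these two small samples the maximum CDF gap is 0.25 and the mean squared gap is 0.03125; in the grid search, either statistic replaces the Pearson Chi-square as the objective to minimize.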
Table 108J.1: Joint estimation of alphas and fractions in the populations with alternative goodness-of-fit measures.

Unconstrained Alpha Domains, 3-Group Model

            Alphas (%)           Power          Confusions          Fractions
Measure    Good αg   Bad αb     βg     βb      δg     δb       π0             πg
Pearson     0.237   −0.128     30.7   19.5    6.6    4.9     7.3 (20.6)    63.2 (20.1)
KS          0.227   −0.148     29.7   21.7    6.2    5.1    10.0 (20.3)    65.9 (21.3)
CvM         0.307   −0.113     37.5   18.3    6.9    4.3    26.5 (19.9)    46.0 (14.3)

Notes: All tests are conducted using a size of γ/2 = 0.10. Simulations use 100 trials at each point in the grid search of the good and bad alpha values. Given those alpha values, simulations use 1000 trials to estimate the other model parameters. The standard errors in parentheses are the asymptotic standard errors. Pearson denotes the Pearson Chi-square goodness-of-fit measure, KS denotes the Kolmogorov–Smirnov distance, and CvM denotes the Cramer–von Mises distance.
Chapter 109
Constant Elasticity of Variance Option Pricing Model: Integration and Detailed Derivation∗

Y. L. Hsu, T. L. Lin and Cheng Few Lee

Contents
109.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3830
109.2 The CEV Diffusion and its Transition Probability Density Function . . . 3830
109.3 Review of Noncentral Chi-Square Distribution . . . . . . . . . . . . . 3834
109.4 The Noncentral Chi-Square Approach to Option Pricing Model . . . . . . 3836
      109.4.1 Detailed derivations of C1 and C2 . . . . . . . . . . . . . . 3836
109.5 Some Computational Considerations . . . . . . . . . . . . . . . . . . 3842
109.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3844
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3844
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3844
Appendix 109A Proof of Feller's Lemma . . . . . . . . . . . . . . . . . . . 3845

Y. L. Hsu
National Chung Hsing University
e-mail: [email protected]

T. L. Lin
National Chiao Tung University

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

∗This chapter draws upon the paper, "Constant elasticity of variance (CEV) option pricing model: Integration and detailed derivation," which was published in Mathematics and Computers in Simulation, Vol. 79, No. 1, pp. 60–71, 2008.
Abstract

In this paper, we review the renowned constant elasticity of variance (CEV) option pricing model and give the detailed derivations. There are two purposes of this chapter. First, we show the details of the formulae needed in deriving the option pricing model and bridge the gaps in deriving the necessary formulae for the model. Second, we use a result by Feller to obtain the transition probability density function of the stock price at time T given its price at time t. In addition, some computational considerations are given to facilitate computing the CEV option pricing formula.

Keywords: Constant elasticity of variance model • Noncentral chi-square distribution • Option pricing.
109.1 Introduction

Cox (1975) has derived the renowned constant elasticity of variance (CEV) option pricing model, and Schroder (1989) has subsequently extended the model by expressing the CEV option pricing formula in terms of the noncentral chi-square distribution. However, neither of them has given the details of their derivations or of the mathematical and statistical tools needed to derive the formulae.

109.2 The CEV Diffusion and its Transition Probability Density Function

The CEV option pricing model assumes that the stock price is governed by the diffusion process

dS = μS dt + σS^{β/2} dZ,    β < 2,
(109.1)
where dZ is a Wiener process and σ is a positive constant. The elasticity is β − 2, since the return variance υ(S, t) = σ²S^{β−2} with respect to price S satisfies

[dυ(S, t)/dS] / [υ(S, t)/S] = β − 2,

which implies that dυ(S, t)/υ(S, t) = (β − 2) dS/S. Upon integration on both sides, we have log υ(S, t) = (β − 2) log S + log σ², or υ(S, t) = σ²S^{β−2}. If β = 2, then the elasticity is zero and the stock prices are lognormally distributed, as in the Black and Scholes model. If β = 1, then equation (109.1) is the model proposed by Cox and Ross (1976).
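The dynamics in (109.1) are easy to simulate directly. The Euler-Maruyama sketch below is illustrative only: the parameter values are invented, and the crude truncation at zero is a simplification of the absorbing boundary of the exact process.

```python
import math
import random

# Euler-Maruyama simulation of the CEV diffusion
#   dS = mu*S dt + sigma*S^(beta/2) dZ.
# With beta < 2, the return volatility sigma*S^(beta/2 - 1) falls as S rises,
# which is the negative price-volatility relation discussed below.

def simulate_cev(s0, mu, sigma, beta, horizon, n_steps, rng):
    dt = horizon / n_steps
    s = s0
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))
        s = s + mu * s * dt + sigma * s ** (beta / 2.0) * dz
        s = max(s, 0.0)   # crude handling of the absorbing barrier at zero
    return s

rng = random.Random(42)
paths = [simulate_cev(100.0, 0.05, 0.2, 1.5, 1.0, 252, rng) for _ in range(200)]
mean_terminal = sum(paths) / len(paths)
```

Setting beta = 2.0 in the same routine recovers lognormal (Black-Scholes) dynamics, and beta = 1.0 gives the Cox and Ross (1976) square-root case mentioned above.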
In this chapter, we will focus on the case of β < 2, since much empirical evidence (see Campbell (1987), Glosten et al. (1993), and Brandt and Kang (2004)) has shown that the relationship between the stock price and its return volatility is negative. The transition density for β > 2 is given by Emanuel and MacBeth (1982), and the corresponding CEV option pricing formula can be derived through a similar strategy. For more details, see Chen and Lee (1993).

In order to derive the CEV option pricing model, we need the transition probability density function f(S_T|S_t, T > t) of the stock price at time T given the current stock price S_t. For this transition probability density function we will start with the Kolmogorov forward and backward equations. Assume X_t follows the diffusion process

dX = μ(X, t)dt + σ(X, t)dZ,    (109.2)

and P = P(X_t, t) is a function of X_t and t; then P satisfies the partial differential equations of motion. From equation (109.2), we have the Kolmogorov backward equation,

(1/2)σ²(X₀, t₀) ∂²P/∂X₀² + μ(X₀, t₀) ∂P/∂X₀ + ∂P/∂t₀ = 0,    (109.3)

and the Kolmogorov forward (or Fokker–Planck) equation,

(1/2) ∂²[σ²(X_t, t)P]/∂X_t² − ∂[μ(X_t, t)P]/∂X_t − ∂P/∂t = 0.    (109.4)

Consider the following parabolic equation:

(P)_t = (axP)_xx − ((bx + h)P)_x,    0 < x < ∞,    (109.5)
where P = P(x, t), and a, b, and h are constants with a > 0; (P)_t is the partial derivative of P with respect to t, and ( )_x and ( )_xx are the first and second partial derivatives of ( ) with respect to x. This can be interpreted as the Fokker–Planck equation of a diffusion problem in which bx + h represents the drift and ax represents the diffusion coefficient.

Lemma 1 (Feller (1951)). Let f(t, x|x0) be the probability density function of x at time t conditional on x0. The explicit form of the fundamental
solution to the above parabolic equation is given by

f(t, x|x0) = [b/(a(e^{bt} − 1))] (e^{−bt}x/x0)^{(h−a)/(2a)} exp[−b(x + x0e^{bt})/(a(e^{bt} − 1))]
             × I_{1−h/a}[(2b/(a(1 − e^{−bt}))) (e^{−bt}xx0)^{1/2}],
(109.6)
where I_k(x) is the modified Bessel function of the first kind of order k, defined as

I_k(x) = Σ_{r=0}^{∞} (x/2)^{2r+k} / [r! Γ(r + 1 + k)].
(109.7)
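As a check on the series definition (109.7), a direct truncated-series implementation (illustrative only; 60 terms is ample for small arguments) reproduces standard values of I_k(x):

```python
import math

# Truncated series for the modified Bessel function of the first kind,
#   I_k(x) = sum_{r>=0} (x/2)^(2r+k) / (r! * Gamma(r+1+k)),
# exactly as written in equation (109.7).

def bessel_i(k, x, terms=60):
    return sum((x / 2.0) ** (2 * r + k) / (math.factorial(r) * math.gamma(r + 1 + k))
               for r in range(terms))
```

For example, bessel_i(0, 1.0) matches the tabulated I_0(1) ≈ 1.266066 and bessel_i(1, 2.0) matches I_1(2) ≈ 1.590637; this is the function that appears in the transition densities (109.6), (109.11), and (109.18).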
Proof. See Appendix 109A.

Before pursuing further, we will first consider the special case in which β = 1, which is the model considered by Cox and Ross (1976). In this situation we have

dS = μ(S, t)dt + σ(S, t)dZ,    (109.8)

where σ(S, t) = σ√S. Now suppose also that each unit of the stock pays out dividends in a continuous stream b(S, t), so that the required mean becomes μ(S, t) = rS − b(S, t) = rS − (aS + h), where b(S, t) = aS + h and r is the risk-free interest rate. Then dS = [(r − a)S − h]dt + σ√S dZ, and the differential option price equation becomes

(1/2)σ²S ∂²P/∂S² + [(r − a)S − h] ∂P/∂S + ∂P/∂t = rP,
(109.9)
and the corresponding Kolmogorov forward equation for the diffusion process (equation (109.8)) is

(1/2) ∂²(σ²S_T P)/∂S_T² − ∂[((r − a)S_T − h)P]/∂S_T − ∂P/∂t = 0,
(109.10)
which is obtained by using (109.4) with μ(S_t, t) = (r − a)S − h. Comparing with equation (109.6), we set a = σ²/2, x = S_T, x0 = S_t, b = r − σ²/2, h = −h, and t = τ = T − t. Thus, we have the following transition
probability density function for the Cox–Ross model:

f(S_T|S_t, T > t) = {2(r − σ²/2)/(σ²[e^{(r−σ²/2)τ} − 1])} (S_t e^{(r−σ²/2)τ}/S_T)^{(1+2h/σ²)/2}
                    × exp{−2(r − σ²/2)[S_T + S_t e^{(r−σ²/2)τ}]/(σ²[e^{(r−σ²/2)τ} − 1])}
                    × I_{1+2h/σ²}{4(r − σ²/2)(S_t S_T e^{(r−σ²/2)τ})^{1/2}/(σ²[e^{(r−σ²/2)τ} − 1])}.    (109.11)

We next consider the constant elasticity of variance diffusion,

dS = μ(S, t)dt + σ(S, t)dZ,
(109.12)
μ(S, t) = rS − aS,
(109.13)
where
and σ(S, t) = σS^{β/2},
0 ≤ β < 2.
(109.14)
Then
dS = (r − a)S dt + σS^{β/2} dZ.
(109.15)
Let Y = Y(S, t) = S^{2−β}. By Ito's lemma with
∂Y/∂S = (2 − β)S^{1−β},  ∂Y/∂t = 0,  ∂²Y/∂S² = (2 − β)(1 − β)S^{−β},
we have
dY = [(r − a)(2 − β)Y + (1/2)σ²(β − 1)(β − 2)]dt + σ(2 − β)√Y dZ.   (109.16)
The Kolmogorov forward equation for Y becomes
∂P/∂t = (1/2) ∂²[σ²(2 − β)²YP]/∂Y² − ∂{[(r − a)(2 − β)Y + (1/2)σ²(β − 1)(β − 2)]P}/∂Y.   (109.17)
Then f(S_T|S_t, T > t) = f(Y_T|Y_t, T > t)|J|, where J = (2 − β)S_T^{1−β}. By Feller's lemma with a = (1/2)σ²(2 − β)², b = (r − a)(2 − β), h = (1/2)σ²(2 − β)(1 − β), x = Y_T, x_0 = Y_t and t = τ = (T − t), we have
f(S_T|S_t, T > t) = (2 − β)k*^{1/(2−β)} (x z^{1−2β})^{1/(4−2β)} e^{−x−z} I_{1/(2−β)}(2(xz)^{1/2}),   (109.18)
where
k* = \frac{2(r − a)}{σ²(2 − β)[e^{(r−a)(2−β)τ} − 1]},  x = k* S_t^{2−β} e^{(r−a)(2−β)τ},  z = k* S_T^{2−β}.
Cox (1975) obtained the following option pricing formula:
C = S_t e^{−aτ} \sum_{n=0}^{\infty} \frac{e^{−x} x^n G(n + 1 + 1/(2 − β), k*K^{2−β})}{Γ(n + 1)} − Ke^{−rτ} \sum_{n=0}^{\infty} \frac{e^{−x} x^{n+1/(2−β)} G(n + 1, k*K^{2−β})}{Γ(n + 1 + 1/(2 − β))},   (109.19)
where G(m, v) = [Γ(m)]^{−1} ∫_v^∞ e^{−u} u^{m−1} du is the standard complementary gamma distribution function. For a proof of the above formula, see Chen and Lee (1993).
We next present the detailed derivations of the option pricing formula as presented by Schroder (1989). Since the option pricing formula is expressed in terms of the noncentral chi-square complementary distribution function, a brief review of the noncentral chi-square distribution is presented in the next section.
109.3 Review of Noncentral Chi-Square Distribution
If Z_1, …, Z_v are standard normal random variables, and δ_1, …, δ_v are constants, then
Y = \sum_{i=1}^{v} (Z_i + δ_i)²   (109.20)
has the noncentral chi-square distribution with v degrees of freedom and noncentrality parameter λ = \sum_{j=1}^{v} δ_j², and is denoted as χ'²_v(λ). When δ_j = 0 for all j, then Y is distributed as the central chi-square distribution with v degrees of freedom, and is denoted as χ²_v. The cumulative distribution
function of χ'²_v(λ) is
F(x; v, λ) = P(χ'²_v(λ) ≤ x) = e^{−λ/2} \sum_{j=0}^{\infty} \frac{(λ/2)^j}{j!\,2^{v/2+j}\,Γ(v/2 + j)} ∫_0^x y^{v/2+j−1} e^{−y/2} dy,  x > 0.   (109.21)
An alternative expression for F(x; v, λ) is
F(x; v, λ) = \sum_{j=0}^{\infty} \frac{(λ/2)^j e^{−λ/2}}{j!} P(χ²_{v+2j} ≤ x).   (109.22)
The complementary distribution function of χ'²_v(λ) is
Q(x; v, λ) = 1 − F(x; v, λ),   (109.23)
where F(x; v, λ) is given in either equation (109.21) or equation (109.22). The probability density function of χ'²_v(λ) can be expressed as a mixture of central chi-square probability density functions:
p_{χ'²_v(λ)}(x) = \sum_{j=0}^{\infty} \frac{e^{−λ/2}(λ/2)^j}{j!} p_{χ²_{v+2j}}(x) = \sum_{j=0}^{\infty} \frac{λ^j e^{−(x+λ)/2} x^{v/2+j−1}}{j!\,Γ(v/2 + j)\,2^{v/2+2j}}.   (109.24)
An alternative expression for the probability density function of χ'²_v(λ) is
p_{χ'²_v(λ)}(x) = \frac{1}{2}\left(\frac{x}{λ}\right)^{(v−2)/4} \exp\left[−\frac{1}{2}(λ + x)\right] I_{(v−2)/2}(\sqrt{λx}),  x > 0,   (109.25)
where I_k is the modified Bessel function of the first kind of order k and is defined as
I_k(z) = \left(\frac{z}{2}\right)^k \sum_{j=0}^{\infty} \frac{(z²/4)^j}{j!\,Γ(k + j + 1)}.   (109.26)
It is noted that for integer k,
I_k(z) = \frac{1}{π} ∫_0^π e^{z\cos θ} \cos(kθ) dθ = I_{−k}(z).   (109.27)
The noncentral chi-square distribution satisfies the reproductivity property with respect to v and λ. If X_1, …, X_k are independent random variables with X_i distributed as χ'²_{n_i}(λ_i), then
Y = \sum_{i=1}^{k} X_i \sim χ'²_{\sum_{i=1}^{k} n_i}\left(\sum_{i=1}^{k} λ_i\right).   (109.28)
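The Poisson-mixture representation (109.22) translates directly into code. The sketch below (plain Python; the power-series form of the regularized lower incomplete gamma function is an implementation choice, adequate for moderate arguments) evaluates the noncentral chi-square CDF:

```python
import math

def gamma_p(s, x, tol=1e-14, max_iter=100000):
    """Regularized lower incomplete gamma P(s, x) = gamma(s, x) / Gamma(s),
    computed from the standard power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    for n in range(1, max_iter):
        term *= x / (s + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-x + s * math.log(x) - math.lgamma(s))

def chi2_cdf(x, v):
    """Central chi-square CDF with v degrees of freedom."""
    return gamma_p(v / 2.0, x / 2.0)

def ncx2_cdf(x, v, lam, terms=500):
    """Noncentral chi-square CDF via the Poisson mixture (109.22):
    F(x; v, lam) = sum_j e^{-lam/2}(lam/2)^j / j! * P(chi2_{v+2j} <= x)."""
    if lam == 0.0:
        return chi2_cdf(x, v)
    half = lam / 2.0
    total = 0.0
    for j in range(terms):
        logw = -half + j * math.log(half) - math.lgamma(j + 1)
        total += math.exp(logw) * chi2_cdf(x, v + 2 * j)
    return total
```

The complementary function Q(x; v, λ) of (109.23) is then simply `1 - ncx2_cdf(x, v, lam)`.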
109.4 The Noncentral Chi-Square Approach to Option Pricing Model
Following Schroder (1989), with the transition probability density function given in (109.18), the option pricing formula under the CEV model is
C = e^{−rτ} E[max(0, S_T − K)],  τ = T − t,
  = e^{−rτ} ∫_K^∞ (S_T − K) f(S_T|S_t, T > t) dS_T
  = e^{−rτ} ∫_K^∞ S_T f(S_T|S_t, T > t) dS_T − e^{−rτ} K ∫_K^∞ f(S_T|S_t, T > t) dS_T
  = C_1 − C_2.   (109.29)
109.4.1 Detailed derivations of C1 and C2
Making the change of variable w = k*S_T^{2−β}, we have dS_T = (2 − β)^{−1} k*^{−1/(2−β)} w^{(β−1)/(2−β)} dw. Thus, with y = k*K^{2−β}, we have
C_1 = e^{−rτ} ∫_y^∞ e^{−x−w} (x/w)^{1/(4−2β)} I_{1/(2−β)}(2\sqrt{xw}) (w/k*)^{1/(2−β)} dw
   = e^{−rτ} (x/k*)^{1/(2−β)} ∫_y^∞ e^{−x−w} (x/w)^{1/(4−2β)} (w/x)^{1/(2−β)} I_{1/(2−β)}(2\sqrt{xw}) dw
   = e^{−rτ} S_t e^{(r−a)τ} ∫_y^∞ e^{−x−w} (w/x)^{1/(4−2β)} I_{1/(2−β)}(2\sqrt{xw}) dw
   = S_t e^{−aτ} ∫_y^∞ e^{−x−w} (w/x)^{1/(4−2β)} I_{1/(2−β)}(2\sqrt{xw}) dw,   (109.30)
and
C_2 = Ke^{−rτ} ∫_y^∞ (2 − β) k*^{1/(2−β)} (x w^{1−2β})^{1/(4−2β)} e^{−x−w} I_{1/(2−β)}(2\sqrt{xw}) · (2 − β)^{−1} k*^{−1/(2−β)} w^{(β−1)/(2−β)} dw
   = Ke^{−rτ} ∫_y^∞ x^{1/(4−2β)} w^{(1−2β+2β−2)/(4−2β)} e^{−x−w} I_{1/(2−β)}(2\sqrt{xw}) dw
   = Ke^{−rτ} ∫_y^∞ e^{−x−w} (x/w)^{1/(4−2β)} I_{1/(2−β)}(2\sqrt{xw}) dw.   (109.31)
Recall that the probability density function of the noncentral chi-square distribution with noncentrality λ and degrees of freedom υ is
p_{χ'²_υ(λ)}(x) = \frac{1}{2}(x/λ)^{(υ−2)/4} I_{(υ−2)/2}(\sqrt{λx}) e^{−(λ+x)/2} = P(x; υ, λ).
Let Q(x; υ, λ) = ∫_x^∞ p_{χ'²_υ(λ)}(y) dy. Then letting w' = 2w and x' = 2x, we have
C_1 = S_t e^{−aτ} ∫_y^∞ e^{−x−w} (w/x)^{1/(4−2β)} I_{1/(2−β)}(2\sqrt{xw}) dw
   = S_t e^{−aτ} ∫_{2y}^∞ \frac{1}{2} (w'/x')^{1/(4−2β)} e^{−(x'+w')/2} I_{1/(2−β)}(\sqrt{x'w'}) dw'
   = S_t e^{−aτ} Q(2y; υ, x')
   = S_t e^{−aτ} Q(2y; 2 + 2/(2 − β), 2x),   (109.32)
obtained by noting that (υ − 2)/2 = 1/(2 − β), implying υ = 2 + 2/(2 − β). Analogously, with w' = 2w, x' = 2x, and I_n(z) = I_{−n}(z), we have
C_2 = Ke^{−rτ} ∫_y^∞ e^{−x−w} (x/w)^{1/(4−2β)} I_{1/(2−β)}(2\sqrt{xw}) dw
   = Ke^{−rτ} ∫_{2y}^∞ \frac{1}{2} (x'/w')^{1/(4−2β)} e^{−(x'+w')/2} I_{1/(2−β)}(\sqrt{x'w'}) dw'
   = Ke^{−rτ} Q(2y; 2 − 2/(2 − β), 2x),   (109.33)
obtained by noting that (υ* − 2)/2 = −1/(2 − β), implying υ* = 2 − 2/(2 − β). Thus,
C = S_t e^{−aτ} Q(2y; 2 + 2/(2 − β), 2x) − Ke^{−rτ} Q(2y; 2 − 2/(2 − β), 2x).   (109.34)
y
∞
(zk)n dk n!Γ(n + υ − 1 + 1) n=0 ∞ −z n+υ−1 ∞ −k n e z e k dk = Γ(n + υ) y Γ(n + 1) y ×
=
∞
g(n + υ, z)G(n + 1, y)
n=0
=
∞
g(n + υ − 1, z)
n=0
∞
g(i, y).
(109.35)
i=1
Now we also have the result G(n, y) = \sum_{i=1}^{n} g(i, y), which can be shown by observing that
G(n, y) = ∫_y^∞ \frac{e^{−k} k^{n−1}}{Γ(n)} dk = −∫_y^∞ \frac{k^{n−1}}{Γ(n)} de^{−k}
= \frac{y^{n−1} e^{−y}}{Γ(n)} + ∫_y^∞ \frac{k^{n−2} e^{−k}}{Γ(n − 1)} dk
= ⋯ = \sum_{i=1}^{n} \frac{y^{i−1} e^{−y}}{Γ(i)} = \sum_{i=1}^{n} g(i, y).
The above result can also be expressed as
G(m + 1, t) = g(m + 1, t) + G(m, t).   (109.36)
Next, applying the monotone convergence theorem, we have
Q(z; υ, k) = ∫_z^∞ \frac{1}{2}(y/k)^{(υ−2)/4} I_{(υ−2)/2}(\sqrt{ky}) e^{−(k+y)/2} dy
= ∫_z^∞ \frac{1}{2}(y/k)^{(υ−2)/4} e^{−(k+y)/2} \left(\frac{\sqrt{ky}}{2}\right)^{(υ−2)/2} \sum_{n=0}^{\infty} \frac{(ky/4)^n}{n!\,Γ(\frac{υ+2n}{2})} dy
= \sum_{n=0}^{\infty} e^{−k/2} \frac{(k/2)^n}{Γ(n + 1)} ∫_z^∞ \frac{(1/2)^{(υ+2n)/2}}{Γ(\frac{υ+2n}{2})} e^{−y/2} y^{\frac{υ+2n}{2}−1} dy
= \sum_{n=0}^{\infty} e^{−k/2} \frac{(k/2)^n}{Γ(n + 1)} Q(z; υ + 2n, 0),   (109.37)
where
Q(z; υ + 2n, 0) = ∫_z^∞ \frac{(1/2)^{(υ+2n)/2}}{Γ(\frac{υ+2n}{2})} e^{−y/2} y^{\frac{υ+2n}{2}−1} dy = ∫_{z/2}^∞ \frac{1}{Γ(\frac{υ+2n}{2})} e^{−y} y^{\frac{υ+2n}{2}−1} dy = G(n + υ/2, z/2).
Furthermore, from the property of G(·, ·) as shown in equation (109.36), we have
Q(z; υ, k) = \sum_{n=0}^{\infty} g(n + 1, k/2) G(n + υ/2, z/2) = \sum_{n=1}^{\infty} g(n, k/2) G\left(n + \frac{υ − 2}{2}, z/2\right).
Hence
Q(2z; 2υ, 2k) = \sum_{n=1}^{\infty} g(n, k) G(n + υ − 1, z).   (109.38)
Again from the property of G(·, ·) as given by (109.36), we have
Q(2z; 2υ, 2k) = g(1, k)G(υ, z) + g(2, k)G(υ + 1, z) + g(3, k)G(υ + 2, z) + ⋯
= g(1, k)[G(υ − 1, z) + g(υ, z)] + g(2, k)[G(υ − 1, z) + g(υ, z) + g(υ + 1, z)] + g(3, k)[G(υ − 1, z) + g(υ, z) + g(υ + 1, z) + g(υ + 2, z)] + ⋯
= [G(υ − 1, z) + g(υ, z)] \sum_{n=1}^{\infty} g(n, k) + g(υ + 1, z) \sum_{n=2}^{\infty} g(n, k) + g(υ + 2, z) \sum_{n=3}^{\infty} g(n, k) + ⋯
= G(υ − 1, z) + g(υ, z) + g(υ + 1, z)[1 − g(1, k)] + g(υ + 2, z)[1 − g(1, k) − g(2, k)] + ⋯
= G(υ − 1, z) + \sum_{n=0}^{\infty} g(υ + n, z) − g(υ + 1, z)g(1, k) − g(υ + 2, z)[g(1, k) + g(2, k)] − ⋯
= 1 − g(υ + 1, z)g(1, k) − g(υ + 2, z)[g(1, k) + g(2, k)] − ⋯.
We conclude that
Q(2z; 2υ, 2k) = 1 − \sum_{n=1}^{\infty} g(n + υ, z) \sum_{i=1}^{n} g(i, k).   (109.39)
From (109.35) and (109.39) we observe that
∫_y^∞ P(2z; 2υ, 2k) dk = 1 − Q(2z; 2(υ − 1), 2y).   (109.40)
Thus, we can write C2 as
C_2 = Ke^{−rτ} ∫_y^∞ P(2x; 2 + 2/(2 − β), 2w) dw
= Ke^{−rτ} Q(2y; 2 − 2/(2 − β), 2x)
= Ke^{−rτ}[1 − Q(2x; 2/(2 − β), 2y)].   (109.41)
From (109.41) we immediately obtain
Q(2y; 2 − 2/(2 − β), 2x) + Q(2x; 2/(2 − β), 2y) = 1,   (109.42)
implying
Q(z; 2n, k) + Q(k; 2 − 2n, z) = 1,   (109.43)
where the degrees of freedom 2 − 2n of Q(k; 2 − 2n, z) can be a non-integer. From equation (109.42), the noncentral chi-square probability Q(2y; 2 − 2/(2 − β), 2x), with 2 − 2/(2 − β) degrees of freedom and noncentrality parameter 2x, can be represented by another noncentral chi-square probability, 1 − Q(2x; 2/(2 − β), 2y), with 2/(2 − β) degrees of freedom and noncentrality parameter 2y. The standard definition of the noncentral chi-square distribution in Section 109.3 has integer degrees of freedom; if the degrees of freedom are not an integer, we can use equation (109.43) to transfer the original noncentral chi-square distribution into another noncentral chi-square distribution. Thus, we obtain an option pricing formula for the CEV model in terms of the complementary noncentral chi-square distribution function Q(z; υ, k) which is valid for any value of β less than 2, as required by the model. Substituting equation (109.41) into equation (109.34), we obtain
C = S_t e^{−aτ} Q(2y; 2 + 2/(2 − β), 2x) − Ke^{−rτ}[1 − Q(2x; 2/(2 − β), 2y)],   (109.44)
where y = k*K^{2−β}, x = k*S_t^{2−β}e^{(r−a)(2−β)τ}, k* = 2(r − a)/(σ²(2 − β)[e^{(r−a)(2−β)τ} − 1]), and a is the continuous proportional dividend rate. The corresponding CEV option pricing formula for β > 2 can be derived in a similar manner. When β > 2 (see Emanuel and MacBeth 1982, Chen and Lee 1993), the call option formula is as follows:
C = S_t e^{−aτ} Q(2x; 2/(β − 2), 2y) − Ke^{−rτ}[1 − Q(2y; 2 + 2/(β − 2), 2x)].   (109.45)
We note from the evaluation of the option pricing formula C, especially C2 as given in (109.34), that
2k*S_T^{2−β} ∼ χ'²_υ(λ), where υ = 2 − 2/(2 − β), λ = 2k*S_t^{2−β}e^{(r−a)(2−β)τ}.   (109.46)
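Formula (109.44) can be evaluated numerically once the complementary function Q is available. The sketch below (plain Python) implements Q as a Poisson mixture of upper regularized gamma tails — equivalent to the series representation (109.38) — and then prices a call under the CEV model for β < 2. The parameter values in the tests are illustrative only, and the series forms used are implementation choices adequate for moderate arguments.

```python
import math

def gamma_q(s, x, tol=1e-14, max_iter=100000):
    """Regularized upper incomplete gamma Q(s, x) = 1 - P(s, x), series form."""
    if x <= 0.0:
        return 1.0
    term = 1.0 / s
    total = term
    for n in range(1, max_iter):
        term *= x / (s + n)
        total += term
        if term < tol * total:
            break
    return 1.0 - total * math.exp(-x + s * math.log(x) - math.lgamma(s))

def ncx2_sf(x, v, lam, terms=2000):
    """Survival function Q(x; v, lam) of the noncentral chi-square,
    as a Poisson mixture of central chi-square tails (cf. (109.37)-(109.38))."""
    if lam == 0.0:
        return gamma_q(v / 2.0, x / 2.0)
    half = lam / 2.0
    total = 0.0
    for j in range(terms):
        w = math.exp(-half + j * math.log(half) - math.lgamma(j + 1))
        total += w * gamma_q(v / 2.0 + j, x / 2.0)
        if j > half and w < 1e-16:   # Poisson weights are negligible past here
            break
    return total

def cev_call(S, K, r, a, sigma, beta, tau):
    """CEV call price for beta < 2, equation (109.44)."""
    kstar = 2.0 * (r - a) / (sigma ** 2 * (2.0 - beta)
                             * (math.exp((r - a) * (2.0 - beta) * tau) - 1.0))
    x = kstar * S ** (2.0 - beta) * math.exp((r - a) * (2.0 - beta) * tau)
    y = kstar * K ** (2.0 - beta)
    term1 = S * math.exp(-a * tau) * ncx2_sf(2.0 * y, 2.0 + 2.0 / (2.0 - beta), 2.0 * x)
    term2 = K * math.exp(-r * tau) * (1.0 - ncx2_sf(2.0 * x, 2.0 / (2.0 - beta), 2.0 * y))
    return term1 - term2
```

For β = 1 and σ = 2 the local volatility σS^{β/2}/S at S = 100 is 20%, so the resulting prices are comparable in magnitude to Black–Scholes prices at a 20% volatility.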
Thus, the option pricing formula for the CEV model as given in (109.44) can be obtained directly from the payoff function
max(S_T − K, 0) = S_T − K if S_T > K, and 0 otherwise,   (109.47)
by taking the expectation of (109.47), with S_T having the distribution given by (109.46).
Before concluding this subsection we note that the noncentral chi-square distribution approaches the log-normal as β tends to 2. When either υ or λ approaches infinity, the standardized variable
(χ'²_υ(λ) − (υ + λ)) / \sqrt{2(υ + 2λ)}
tends to N(0, 1). Using the fact that (x^a − 1)/a approaches ln x as a → 0, it can be verified that, with r• = r − a,
(2k*S_T^{2−β} − (υ + λ)) / \sqrt{2(υ + 2λ)} → (ln S_T − [ln S_t + (r• − σ²/2)τ]) / (σ\sqrt{τ}),
as β → 2−. Thus,
ln S_T | ln S_t ∼ N(ln S_t + (r• − σ²/2)τ, σ²τ)   (109.48)
as β → 2−. Similarly, equation (109.48) also holds when β → 2+: from equation (109.45), we have 2k*S_T^{2−β} ∼ χ'²_υ(λ) with υ = 2 + 2/(β − 2) if β > 2. This clarifies the result of equation (109.48).
109.5 Some Computational Considerations
As noted by Schroder (1989), equation (109.39) allows the following iterative algorithm to be used in computing the infinite sum when z and k are not
large. First initialize the following four variables (with n = 1):
gA = e^{−z}z^{υ}/Γ(1 + υ) = g(1 + υ, z),
gB = e^{−k} = g(1, k),
Sg = gB,
R = 1 − (gA)(Sg).
Then repeat the following loop, beginning with n = 2 and incrementing n by one after each iteration. The loop is terminated when the contribution to the sum, R, is declining and very small:
gA = gA · z/(n + υ − 1) = g(n + υ, z),
gB = gB · k/(n − 1) = g(n, k),
Sg = Sg + gB = g(1, k) + ⋯ + g(n, k),
R = R − (gA)(Sg) = the nth partial sum.
At each iteration, gA equals g(n + υ, z), gB equals g(n, k) and Sg equals g(1, k) + ⋯ + g(n, k). The computation is easily done.
As for an approximation, Sankaran (1963) showed that the distribution of (χ'²_υ(k)/(υ + k))^h is approximately normal with expected value μ = 1 + h(h − 1)P − h(2 − h)mP²/2 and variance σ² = 2h²P(1 + mP), where h = 1 − (2/3)(υ + k)(υ + 3k)(υ + 2k)^{−2}, P = (υ + 2k)/(υ + k)² and m = (h − 1)(1 − 3h). Using the approximation, we have approximately
Q(z; υ, k) = Pr(χ'² > z) = Pr\left(\left(\frac{χ'²}{υ + k}\right)^h > \left(\frac{z}{υ + k}\right)^h\right) ≈ Φ\left(\frac{1 − hP[1 − h + 0.5(2 − h)mP] − (z/(υ + k))^h}{h\sqrt{2P(1 + mP)}}\right).
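Both computational devices of this section — the iterative evaluation of the partial sums of (109.39) and Sankaran's normal approximation — can be sketched as follows (plain Python; the termination rule for the loop is an implementation choice):

```python
import math

def g(n, x):
    """g(n, x) = e^{-x} x^{n-1} / Gamma(n), the weights in (109.38)-(109.39)."""
    if x == 0.0:
        return 1.0 if n == 1 else 0.0
    return math.exp(-x + (n - 1.0) * math.log(x) - math.lgamma(n))

def schroder_Q(two_z, two_v, two_k, max_iter=5000, tol=1e-15):
    """Schroder's iteration for Q(2z; 2v, 2k), i.e. the partial sums
    R_n = 1 - sum_{m<=n} g(m+v, z) * [g(1,k) + ... + g(m,k)] from (109.39)."""
    z, v, k = two_z / 2.0, two_v / 2.0, two_k / 2.0
    gA = g(1.0 + v, z)
    gB = g(1.0, k)
    Sg = gB
    R = 1.0 - gA * Sg
    n = 2
    while n < max_iter:
        gA *= z / (n + v - 1.0)   # gA = g(n+v, z)
        gB *= k / (n - 1.0)       # gB = g(n, k)
        Sg += gB                  # Sg = g(1,k) + ... + g(n,k)
        contrib = gA * Sg
        R -= contrib              # R = the nth partial sum
        if contrib < tol and n > z + k:
            break
        n += 1
    return R

def sankaran_Q(z, v, k):
    """Sankaran's (1963) normal approximation to Q(z; v, k)."""
    h = 1.0 - (2.0 / 3.0) * (v + k) * (v + 3.0 * k) / (v + 2.0 * k) ** 2
    P = (v + 2.0 * k) / (v + k) ** 2
    m = (h - 1.0) * (1.0 - 3.0 * h)
    num = 1.0 - h * P * (1.0 - h + 0.5 * (2.0 - h) * m * P) - (z / (v + k)) ** h
    den = h * math.sqrt(2.0 * P * (1.0 + m * P))
    return 0.5 * (1.0 + math.erf(num / den / math.sqrt(2.0)))
```

When k = 0 the iteration collapses to a central chi-square tail, which gives a convenient exactness check; the approximation can then be compared against the iteration for noncentral cases.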
109.6 Conclusions
The option pricing formula under the CEV model is quite complex because it involves the cumulative distribution function of the noncentral chi-square distribution Q(z, υ, k). Some computational considerations are given in this chapter which will facilitate the computation of the CEV option pricing formula. Hence, the computation will not be a difficult problem in practice.
Acknowledgments
We gratefully acknowledge the editor and an anonymous referee for their insightful comments and suggestions on the paper. The research was supported in part by NSC grant 95-2118-M-005-003.
Bibliography
Brandt, M.W. and Kang, Q. (2004). On the Relationship Between the Conditional Mean and Volatility of Stock Returns: A Latent VAR Approach. Journal of Financial Economics 72, 217–257.
Campbell, J. (1987). Stock Returns and the Term Structure. Journal of Financial Economics 18, 373–399.
Chen, R.R. and Lee, C.F. (1993). A Constant Elasticity of Variance (CEV) Family of Stock Price Distributions in Option Pricing: Review and Integration. Journal of Financial Studies 1, 29–51.
Cox, J. (1975). Notes on Option Pricing I: Constant Elasticity of Variance Diffusions. Unpublished note, Stanford University, Graduate School of Business. Also, Journal of Portfolio Management (1996) 23, 5–17.
Cox, J. and Ross, S.A. (1976). The Valuation of Options for Alternative Stochastic Processes. Journal of Financial Economics 3, 145–166.
Emanuel, D. and MacBeth, J. (1982). Further Results on the Constant Elasticity of Variance Call Option Pricing Formula. Journal of Financial and Quantitative Analysis 17, 533–554.
Feller, W. (1951). Two Singular Diffusion Problems. Annals of Mathematics 54, 173–182.
Glosten, L., Jagannathan, R. and Runkle, D. (1993). On the Relation Between the Expected Value and the Volatility of the Nominal Excess Returns on Stocks. Journal of Finance 48, 1779–1802.
Sankaran, M. (1963). Approximations to the Non-central Chi-square Distribution. Biometrika 50, 199–204.
Schroder, M. (1989). Computing the Constant Elasticity of Variance Option Pricing Formula. Journal of Finance 44, 211–219.
Appendix 109A Proof of Feller's Lemma
We need some preliminary results in order to prove equation (109.6).
Proposition 109A.1. f(z) = e^{Aυ/z}z^{−1} is the Laplace transform of I_0(2(Aυx)^{1/2}), where I_k(x) is the Bessel function
I_k(x) = \sum_{r=0}^{\infty} \frac{(x/2)^{2r+k}}{r!\,Γ(r + 1 + k)}.
Proof. By the definition of the Laplace transform and the monotone convergence theorem, we have
f(z) = ∫_0^∞ e^{−zx} I_0(2(Aυx)^{1/2}) dx
= ∫_0^∞ e^{−zx} \sum_{r=0}^{\infty} \frac{(Aυx)^r}{r!\,Γ(r + 1)} dx
= ∫_0^∞ e^{−zx} \left[1 + \frac{Aυx}{Γ(2)} + \frac{(Aυx)²}{2!\,Γ(3)} + ⋯ + \frac{(Aυx)^n}{n!\,Γ(n + 1)} + ⋯\right] dx
= \frac{1}{z} + \frac{Aυ}{z²} + \frac{(Aυ)²}{2!\,z³} + ⋯ + \frac{(Aυ)^n}{n!\,z^{n+1}} + ⋯
= \frac{1}{z}\left[1 + \frac{Aυ}{z} + \frac{(Aυ)²}{2!\,z²} + ⋯ + \frac{(Aυ)^n}{n!\,z^n} + ⋯\right]
= e^{Aυ/z} z^{−1}.
Proposition 109A.2. Consider the parabolic differential equation
P_t = (axP)_{xx} − ((bx + h)P)_x,
0T ] + [1 − ψ(log(Vτ (T ) /Kτ (T ) ))]1[τ (T )>T ] .
(110.4)
Here the writedown function ψ(log x) reports the writedown fraction of the bond that is lost due to bankruptcy costs or other frictions. (Zhou (2001) considered ψ of the affine-exponential form ψ(y) = a − be^y.) In the literature, 1 − ψ is also referred to as the recovery rate. It is worth noting that, in empirical studies, even for the same class of bond issues the recovery rate 1 − ψ differs significantly over different time periods and different firms; see Altman (1995) and Franks and Torous (1994). Based on the choice of writedown functions, Zhou (2001) provided a Monte Carlo simulation scheme. With his simulation results, he found that, by manipulating the parameters, various shapes of credit spreads, default probabilities, and other characteristics of risky bonds found in empirical studies can be recovered. On the other hand, Hilberink and Rogers (2002) generalized Leland's model (1994). They assumed that the firm's log asset value process is the sum of a Brownian motion with drift and a downward-jump compound Poisson process. Although there is no closed-form solution for the price of the perpetual debt, they found its Fourier transform; by inverting the Fourier transform, they also found numerically that credit spreads do not tend to zero as time to maturity goes to zero. Note that the unique (up to indistinguishability) solution to (110.3) is given by
V_t = V_0 \exp\left[\left(r − \frac{1}{2}σ² − λν\right)t + σW_t\right] \prod_{j=1}^{N_t}(1 + U_j).
page 3851
July 6, 2020
16:4
3852
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
Y.-T. Chen, C. F. Lee & Y.-C. Sheu
Set Xt = log(Vt /Kt ) and put τ (T ) = inf{c ≤ s ≤ t; Xs ≤ 0}. Then we have
1 2 Xt = log(Vt /Kt ) + r − σ − λν − κ t + σWt − Zt 2 = X0 + ct + σWt − Zt ,
t ∈ R+ ,
(110.5)
where X0 = x = log(V0 /K0 ), c = r − 12 σ 2 − λν − κ, and Zt Nt n=1 − log(1 + Un ). It follows from (110.4) that the no arbitrage price of the bond is given by D(V0 , T ) = e−rT − e−rT Ex [ψ(Xτ (T ) )1{τ (T ) J1 ≥ τ (T )] (Jump occurs up to maturity and default occurs before J1 ), C = [T > τ (T ) = J1 ] (Jump occurs up to maturity and default occurs at J1 ), D = [T > τ (T ) > J1 ] (Jump occurs up to maturity and default occurs after J1 ). Note that {A, B, C, D} is a partition of {τ (T ) ≤ T }. With these, we define GA (x, T ) = Ex [ψ(Xτ (T ) ); A] and similarly for GB , GC and GD . Before stating our results, we recall some facts about the joint distribution of Brownian motion with drift and its maximum process. For details and proofs, see Shreve (2003). Theorem 110.1. Let α ∈ R, T > 0, W (t, α) = αt + W (t) and M (T ; α) = M ax0≤t≤T W (t; α). Then the joint density of M (T ; α) and W (t; α) is given by
1 2 1 2 2(2m−w) √ eαw− 2 α T − 2T (2m−w) , w ≤ m, m ≥ 0, T 2πT fM (T ;α),W (T ;α) (m, w) = 0, otherwise (110.9)
page 3853
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
Y.-T. Chen, C. F. Lee & Y.-C. Sheu
3854
Therefore, the density of M (T ; α) is given by
1 2 √ 2 e− 2T (2m−w) − 2αe2αm N −m−αT √ , 2πT T fM (T ;α) (m) = 0, and P [M (T ; α) ≤ m] =
m ≥ 0, otherwise (110.10)
2αm N −m−αT √ √ ) − 2αe , N ( −m−αT T T 0,
m ≥ 0, m < 0. (110.11)
Here N (·) is the cumulative distribution function of standard normal distribution. Proposition 110.2. We have the following representations of GA , GB and GC : −ˆ x + cˆT −ˆ x − cˆT −λT −2ˆ cx ˆ √ √ +e , N N GA (x, T ) = ψ(0)e T T (110.12) T −ˆ x − cˆt √ dFJ1 (t) N GB (x, T ) = ψ(0) t 0 T −ˆ x + cˆt −2ˆ cx ˆ √ dFJ1 (t) , e N (110.13) + t 0 T ∞ y dFJ1 (t) dF (y) dwψ(w − y)H(x, w, t), GC (x, T ) = 0
0
0
(110.14) where g(μ; σ 2 ) =
√1 2πσ
μ2
exp{− 2σ2 } and
H(x, w, t) = g(x − w + ct; tσ 2 ) − e−2ˆcxˆ g(x + w − ct; tσ 2 ).
(110.15)
Proof. Note that ψ(Xτ (T ) ) = ψ(0) on A and B. By independence of {Wt ; t ∈ R+ } and J1 , we obtain GA (x, T ) = P [J1 > T ]ψ(0)P min x + cs + σWs ≤ 0 = P [J1 > T ]ψ(0)P = P [J1 > T ]ψ(0)P
s≤T
max −x − cs − σWs ≥ 0 s≤T
max −ˆ cs + Ws ≥ x ˆ , s≤T
page 3854
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
An Integral Equation Approach for Bond Prices with Applications
3855
where the last equality follows from the symmetry of standard Brownian motion. By (110.11), we have cs + Ws ≥ x ˆ] = 1 − P [max −ˆ cs + Ws ≤ x ˆ] P [max −ˆ s≤T
s≤T
−ˆ x + cˆT x ˆ + cˆT −2ˆ cx ˆ √ √ −e N =1− N T T −ˆ x + cˆT −ˆ x − cˆT √ √ + e−2ˆcxˆ N . (110.16) = T T
This completes the proof of (110.12). We next turn to the proof of (110.13). Again by the independence of {Wt ; t ∈ R+ } and J1 , and the summary of standard Brownian motion, we get
GB (x, T ) = ψ(0)P min x + cs + σWs ≤ 0, J1 ≤ T s≤J1
= ψ(0) = ψ(0)
T 0 T 0
P min x + cs + σWs ≤ 0 dFJ1 (t)
s≤t
P min −cs + Ws ≥ x ˆ dFJ1 (t). s≤t
Then replacing T with t for (110.16), we get (110.13). Finally, from independence of {Wt ; t ∈ R+ }, Y1 − log(1 + U1 ) and J1 , GC (x, T ) =
0
T
dFJ1 (t)E ψ(x + Xtc − Y1 )1
× = 0
×
min x + Xsc > 0, x + Xsc − Y1 < 0 s≤t
T
dFJ1 (t)
∞ 0
dF (y)E ψ(x + Xtc − y)1
min x + Xsc > 0, x + Xsc − y < 0 ,
0≤s≤t
where in the last line we use the fact that P [min0≤s≤t x + Xsc > 0, x + Xsc − y < 0] = 0 for y < 0. Also, observe that, using the symmetry of
page 3855
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
Y.-T. Chen, C. F. Lee & Y.-C. Sheu
3856
Brownian motion, E[ψ(x + Xtc − y)1( min x + Xsc > 0, x + Xsc − y < 0)] 0≤s≤t
cs = E[ψ(x − σ(−ˆ ct + Wt ) − y)1(min −ˆ s≤t
ˆ, x ˆ − (−ˆ ct + Wt ) − yˆ ≤ 0)]. + Wt ≤ x Now, applying the formula of the joint distribution of W (α; t) and M (α; t) with α = −ˆ c, we get, for all t, y > 0, E ψ(x − σ(−ˆ ct + Wt ) − y)1 ×
cs + Wt ≤ x ˆ, x ˆ − (−ˆ ct + Wt ) − yˆ ≤ 0 min −ˆ s≤t
dw
=
∞ w+
dmψ(x − σw − y)1(m ≤ x ˆ, x ˆ − w − yˆ ≤ 0)
1 2
1
2
× e−ˆcw− 2 cˆ t− 2π (2m−w) xˆ −ˆ cw− 21 cˆ2 t = dwψ(x − σw − y)e
x ˆ−ˆ y x ˆ
=
x ˆ−ˆ y x ˆ
= x ˆ−ˆ y
2(2m − w) √ t 2πt
1 2 t
dwψ(x − σw − y)e−ˆcw− 2 cˆ
−ˆ cw− 21 cˆ2 t
dwψ(x − σw − y)e
x ˆ w+
dm
2ˆ x−w |w|
2(2m − w) − 1 (2m−w)2 √ e 2t t 2πt
m −1m e 2t dm √ t 2πt
2 (2ˆ x−w)2 1 − w2t − 2t √ e −e . 2πt
cw + cˆ2 t2 = (2ˆ x − cˆt − w)2 + 4ˆ xcˆt. Therefore, Note that (2ˆ x − 2)2 + 2tˆ xˆ (2ˆ x−w)2 1 2 w2 1 e− 2t − e− 2t dwψ(x − σw − y)e−ˆcw− 2 cˆ t √ 2πt x ˆ−ˆ y xˆ (w+ˆ ct)2 (2ˆ x−ˆ ct−w)2 1 − 2t −2ˆ xcˆ − 2t e dwψ(x − σw − y) √ −e e = 2πt x ˆ−ˆ y y 2 2 1 − (x−w+ct) − (x+w−ct) −2ˆ x c ˆ 2 2 2tσ 2tσ e dwψ(w − y) √ −e e = σ 2πt 0 y ψ(w − y)[g(x − w + ct; tσ 2 ) − e−2ˆxcˆg(x + w − ct; tσ 2 )]dw. = 0
This gives (110.14). To calculate GD , we use the Strong Markov Property of Levy process.
page 3856
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
An Integral Equation Approach for Bond Prices with Applications
3857
Proposition 110.3. We have GD (x, T ) = Ex [1D Φ(XJ1 , T − J1 )] T dFJ1 (t) dF (y) = 0
∞ y+
dwΦ(w − y, T − t)H(x, w, t),
where H is defined in (110.15) ˜ = {J1 ≤ T, mins≤J Xs > 0} ∈ FJ . Therefore, by the Proof. Note that D 1 ! Strong Markov Property, we have GD (x, T ) = Ex [Ex [1D˜ ψ(Xτ (T ) )1(τ (T ) ≤ T )|FJ1 ]] = Ex [1D˜ Φ(XJ1 , T − J1 )]. Recall that Y1 = − log(1 + U1 ). Therefore, by the independence of J1 , Y1 and {Wt ; t ≥ 0}, we get c c GD (x, T ) = E 1 J1 ≤ T, min x + Xs > 0, x + XJ1 − Y1 > 0 x≤J1
× Φ(x +
T
= 0
XJc1
dFJ1 (t)
− Y1 , T − J1 ) c c dF (y)E 1 min x + Xs > 0, x + Xt − y > 0
c Φ(x + Xt − y, T − t) .
x≤t
We compute the integrand. Using the symmetry of Brownian motion, we get c c E 1 min x + Xs > 0, x + Xt − y > 0 x≤t
× Φ(x +
Xtc
− y, T − t)
= E 1 max −x − cs + σWs < 0, −x − cs + σWt + y < 0 x≤t
× Φ(x + ct − σWt − y, T − t) x − cˆs + Ws < 0, −ˆ x − cˆs + Wt + y < 0 = E 1 max −ˆ x≤t
× Φ(x − σ(−ˆ ct + Wt ) − y, T − t) .
page 3857
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
Y.-T. Chen, C. F. Lee & Y.-C. Sheu
3858
Using the joint distribution of M (−ˆ c, t) and −ctˆ + Wt , we get x − cˆs + Ws < 0, −ˆ x − cˆs + Wt + y < 0 E 1 max −ˆ x≤t
× Φ(x − σ(−ˆ ct + Wt ) − y, T − t)
(ˆ x−ˆ y )∧ˆ x
= −∞
dw
x ˆ
w+
dvΦ(x − σw − y, T − t)
2(2v − w) √ t 2πt
1 1 × exp −ˆ cw − cˆ2 t − (2v − z)2 2 2t
(ˆx−ˆy)∧ˆx 1 2 dwΦ(x − σw − y, T − t) exp −ˆ cw − cˆ t = 2 −∞
xˆ 1 2(2v − w) √ exp − (2v − z)2 dv. × 2t t 2πt w+ Similar to the calculation of GC , we have
xˆ (ˆx−ˆy )∧ˆx 1 2 2(2v − w) √ dwΦ(x − σw − y, T − t) exp −ˆ cw − cˆ t 2 t 2πt −∞ w+
1 2 × exp − (2v − z) dv 2t (ˆx−ˆy)∧ˆx Φ(x − σw − y, T − t) = −∞
× [g(w + cˆt; t) − g(w + tˆ c − 2ˆ x; t)e−2ˆcxˆ ]dw ∞ Φ(x − y, T − t)[g(w − x − ct; tσ 2 ) = y+
− g(w + x − ct; tσ 2 )e−2ˆcxˆ ]dw, where we use the change of variable x − σw → w in the last equation. This completes the proof of the proposition. Theorem 110.4. For every writedown function ψ, the function Φ defined in (110.7) satisfies the following integral equation: T dFJ! (s) Φ(x, T ) = G(x, T ) + 0
×
dF (y)
∞ y+
dwΦ(w − y, T − s)H(x, w, s),
(110.17)
page 3858
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
An Integral Equation Approach for Bond Prices with Applications
b3568-v4-ch110
3859
where H is defined in (110.15), G = GA + GB + GC and GA , GB , and GC are given as in (110.12), (110.13), and (110.14), respectively. Remark. The decomposition of Φ into GA , GB , GC and GD is quite intuitive actually. Given the interest rate is zero, there is a financial security with time to maturity T which pays ψ(Xτ (T ) ) upon the time τ (T ) ≤ T . Assume Ex is the “right” measure we can use to compute prices. So that the price of such security is Φ. We divide this security into four and classify the possibilities of cash flow given X ≤ 0 up to time T . Up to time T , we check whether a jump has occurred. If it has not, namely, T ≤ J1 , then the cause of cash flow must arise from diffusion. Namely, Xτ (T ) = 0. The price of cash flow on this event is given by GA . On the other hand, suppose it has: J1 ≤ T . We further classify the possibilities of cash flow. If τ (T ) < J1 , then the cause must arise from diffusion again. The price of cash flow on this event is given by GB . Suppose τ (t) = J1 . Then cause of cash flow is jump and the price of cash flow is GC . Otherwise, we have J1 < τ (T ) ≤ T . But, at the time J1 , from the renewal property of X, the security can be seen as a “new: security almost the same as the old one, except that the time to maturity is T − J1 , the price of this “new” security at time 0 is GD . We will further extend this idea of decomposition to an infinite series expansion of the aforementioned bond price in the appendix. 110.3 Analytical Properties of Bind Prices First, to fix idea, we adopt from Lando (2004) and Merton (1974) the following definition of yield spreads and credit spreads. Definition 110.5. For the bond price defined in (110.6), the promised yield for maturity T is given by the formula y(V0 , T ) = T1 log( D(V10 , T ) ) and the credit spread for maturity T by s(V0 , T ) = y(V0 , T ) − r. Note the 1 is the face value of the bond and one get immediately from the definition that D(V0 , T )ey(V0 , T )T = 1. Lemma 110.6. 
For all x > 0 and y > 0, the function c c c t → E ψ(x + Xt − y)1 min x + Xs > 0, x + Xt − y < 0 s≤t
is continuous on R++ .
(110.18)
page 3859
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
Y.-T. Chen, C. F. Lee & Y.-C. Sheu
3860
Proof. Recall that in the proof of Proposition 110.2, we have c c c E ψ(x + Xt − y)1 min x + Xs > 0, x + Xt − y < 0
s≤t
y
= 0
ψ(w − y)H(x, w, t)dw,
where H is given by (110.15). From this, it follows easily that the function in (110.8) is continuous. This completes the proof. Lemma 110.7. Assume x > 0. Then ∂ P min x + cs + σWs = 0. lim T →0+ ∂T x≤T Also, for all n ∈ N , we have P min x + cs + σWs < 0 = 0(Tn ), as T → 0 + .
(110.19)
(110.20)
x≤T
Proof. Firstly, we prove (110.19). By the symmetry of Brownian motion, we have P min x + cs + σWs ≤ 0 = P max −cs + σWs ≥ x x≤T
x≤T
cs + Ws ≥ x ˆ . = P max −ˆ x≤T
Note that x > 0. Therefore, by (110.11), we have x ˆ + cˆT −ˆ x + cˆT −2ˆ cx ˆ √ +e , cs + Ws ≥ x ˆ =N − √ N P min −ˆ x≤T T T 2
x which converges to 0 as T → 0+. Recall that g(x; σ 2 ) = √ 1 2 exp{− 2σ 2 }. 2πσ We have x ˆ + cˆT −ˆ x + cˆT ∂ −2ˆ cx ˆ √ +e N − √ N ∂T T T 1 √ T cˆ − (ˆ x + cˆT ) 12 T − 2 − (ˆ x + cˆT ) √ ;1 = −g T T 1 √ T cˆ − (−ˆ x + cˆT ) 12 T − 2 −ˆ x + cˆT −2ˆ cx ˆ √ ;1 g +e T T cˆT − x ˆ cˆT + x ˆ − (ˆ x + cˆT ) x ˆ + cˆT 1 √ √ ;1 ;1 −g +g = 2 T 3/2 T 3/2 T T
√ 2 x ˆ 1 x ˆ 1 1 x ˆ + cˆT √ √ + cˆ T ;1 ˆ exp − =√ x . =g 3/2 3/2 2 T T 2 T T
page 3860
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
An Integral Equation Approach for Bond Prices with Applications
3861
Observe that for all T > 0
√ 2 1 x ˆ 1 x ˆ 1 1 √ + cˆ T ˆ exp − ≤ C exp − 0≤ √ x 3/2 2 2T T 3/2 T 2π T for some constant C > 0 independent of T > 0. This implies that (110.19) holds. We prove (110.20). Let n ∈ N . Then, by L’Hospital’s rule, ∂ ∂T
P [mins≤t x + cs + σWs < 0] nT n−1
1 x ˆ ≤ lim C exp − = 0. (n−1)+3/2 T →0+ 2T nT
P [mins≤t x + cs + σWs < 0] ≤ lim T →0+ T →0+ Tn lim
We have completed the proof. Theorem 110.8. Let D(V0 , T ) be the bond price as defined in (110.6) and set x = log(V0 /K0 ). We have the following analytical properties of bond prices: (a) For all T > 0, we have limV0 →∞ D (V0 , T ) = e−rT . (b) For all V0 > K0 , limT →0+ Px [τ (T ) ≤ T ] = 0. (c) Assume that the writedown function ψ is continuous and 0 ≤ ψ ≤ 1. Then ψ(x − y)dF (y). λ ≥ lim sup {s (V0 , T )} ≥ lim inf {s (V0 , T )} ≥ λ T →0+
T →0+
y>x
(110.21) In particular, if ψ > 0 and P [Y1 > x] > 0, we have a strictly positive credit spread for zero maturity. Proof. We prove (a) first. Since the writedown function ψ is bounded, by (110.6) and (110.7), it suffices to show that limx→∞ Px [τ (T ) ≤ T ] = 0. Now since X = (Xt ) is cadlag, it is clear that, for fixed T , τ (T, x) inf{0 ≤ t ≤ T ; x + Xtc − Xtd ≤ 0} → ∞ as x ↑ ∞. This implies that limx→∞ Px [τ (T ) < T ] = 0. Next, consider (b). Write Px [τ (T ) ≤ T ] = Px (A) + Px (B ∪ C ∪ D),
(110.22)
where {A, B, C, D} is the partition of {τ (T ) ≤ T } in (110.8). For the second term on the right-hand side of (110.22), we have Px (B ∪ C ∪ D) = Px [τ (T ) ≤ T, J1 ≤ T ] ≤ P {J1 ≤ T } = 1 − e−λT → 0,
T →0+.
(110.23)
page 3861
July 6, 2020
16:4
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch110
Y.-T. Chen, C. F. Lee & Y.-C. Sheu
3862
On the other hand, we write Px(A) = Px[τ(T) ≤ T, J1 > T] = P[J1 > T] P[min_{s≤T} (x + cs + σWs) ≤ 0]. By (110.20), we get lim_{T→0+} Px(A) ≤ lim_{T→0+} P[min_{s≤T} (x + cs + σWs) ≤ 0] = 0. Combining these results with (110.22) gives (b).

Finally, consider (c). We first estimate lim inf_{T→0+} s(V0, T). Since 0 ≤ ψ ≤ 1, we have 1 ≥ Φ(x, T) ≥ GC(x, T). Therefore,

lim inf_{T→0+} s(V0, T) = lim inf_{T→0+} [ (1/T) log( e^{rT}/(1 − Φ(x, T)) ) − r ]
= lim inf_{T→0+} (1/T) log( 1/(1 − Φ(x, T)) )
≥ lim inf_{T→0+} (1/T) log( 1/(1 − GC(x, T)) ).

Note that by (b), Φ(x, T) → 0 as T → 0+. Hence we get 0 ≤ GC(x, T) ≤ Φ(x, T) → 0 as T → 0+. By L'Hôpital's rule, we obtain

lim_{T→0+} (1/T) log( 1/(1 − GC(x, T)) ) = lim_{T→0+} (∂GC/∂T)(x, T)/(1 − GC(x, T)) = lim_{T→0+} (∂GC/∂T)(x, T).

By (110.14), Lemma 110.6 and the Fundamental Theorem of Calculus, we obtain

(∂GC/∂T)(x, T) = λ e^{−λT} ∫_0^∞ dF(y) E[ ψ(x + X_T^c − y) 1{ min_{s≤T} (x + X_s^c) > 0, x + X_T^c − y < 0 } ].

Since ψ is continuous and 0 ≤ ψ ≤ 1, we obtain

lim inf_{T→0+} s(V0, T) ≥ lim_{T→0+} (∂GC/∂T)(x, T) ≥ λ ∫_0^∞ ψ(x − y) 1(x − y < 0) dF(y) = λ ∫_{y>x} ψ(x − y) dF(y).

This proves the lower bound of (110.21). Next, we show that λ ≥ lim sup_{T→0+} s(V0, T). Using the partition {A, B, C, D} of {τ(T) ≤ T} as in (110.8), we have Φ(x, T) = Ex[ψ(X_{τ(T)}); B ∪ C ∪ D] + GA(x, T). Since P(B ∪ C ∪ D) ≤ P[J1 ≤ T] = 1 − e^{−λT} and ψ ≤ 1, we have Ex[ψ(X_{τ(T)}); B ∪ C ∪ D] ≤ 1 − e^{−λT}. This implies
An Integral Equation Approach for Bond Prices with Applications
that Φ(x, T) ≤ (1 − e^{−λT}) + GA(x, T). Therefore, we obtain

lim sup_{T→0+} s(V0, T) = lim sup_{T→0+} (1/T) log( 1/(1 − Φ(x, T)) )
≤ lim_{T→0+} (1/T) log( 1/(e^{−λT} − GA(x, T)) ).

Note that GA(x, T) ≤ Φ(x, T) → 0 as T → 0+ and, hence, lim_{T→0+} (e^{−λT} − GA(x, T)) = 1. By L'Hôpital's rule, we have

lim_{T→0+} (1/T) log( 1/(e^{−λT} − GA(x, T)) ) = lim_{T→0+} [ −( −λ e^{−λT} − (∂GA/∂T)(x, T) )/( e^{−λT} − GA(x, T) ) ] = lim_{T→0+} [ λ + (∂GA/∂T)(x, T) ].

To compute lim_{T→0+} (∂GA/∂T)(x, T), we note that GA(x, T) = ψ(0) e^{−λT} P[min_{s≤T} (x + cs + σWs) ≤ 0]. By Lemma 110.7, we get lim_{T→0+} (∂GA/∂T)(x, T) = 0. We have obtained the upper bound in (110.21). This completes the proof.
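A quick numerical reading of the bounds in (110.21): if the writedown function is total (ψ ≡ 1) and the jump sizes Y follow distribution F, the lower bound is λ·P[Y > x]. The toy figures below use exponential jump sizes with mean 1 and λ = 2 — hypothetical choices, not taken from the chapter — so the lower bound becomes λe^{−x}:

```python
import math

lam = 2.0   # jump intensity (hypothetical value)
# With psi == 1 and Y ~ Exp(1), the lower bound in (110.21) is
# lam * P[Y > x] = lam * exp(-x); the upper bound is lam itself.
bounds = {x: (lam * math.exp(-x), lam) for x in (0.1, 0.5, 1.0, 2.0)}
for x, (lo, hi) in bounds.items():
    print(f"x = {x}: {lo:.4f} <= zero-maturity spread <= {hi}")
```

As x = log(V0/K0) grows (the firm moves further from the default barrier), the lower bound decays, but the spread at zero maturity remains strictly positive whenever P[Y > x] > 0, as the theorem states.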
Appendix 110A Infinite Series Expression for Bond Price

Now, consider the space E_T = R+ × [0, T], T > 0. Take E = R+ × [0, ∞) for notational convenience. For every bounded measurable function f on E, the operator L is defined by the formula

Lf(x, t) = ∫_0^t dF_{J1}(s) ∫_0^∞ dF(y) ∫_{y+}^∞ dw f(w − y, t − s) H(x, w, s),

where H is defined in (110.15). Then by Theorem 110.4, we can write Φ as

Φ(x, t) = G(x, t) + LΦ(x, t),  ∀(x, t) ∈ E.  (110A.1)
Now, after the nth iteration of (110A.1), we have

Φ(x, t) = Σ_{k=0}^{n} L^k G(x, t) + L^{n+1} Φ(x, t),  (110A.2)

where L^0 = Id and L^{k+1} f = L(L^k f). From the definition of L, one sees that

Lf(x, t) = Ex[ f(X_{J1}, t − J1) 1{ J1 ≤ t, min_{0≤s≤J1} Xs > 0 } ].
In general, after the nth iteration, we have the following result.

Lemma 110A.1. Let f be a nonnegative measurable function. Then, for any n ∈ N,

L^n f(x, t) = Ex[ f(X_{Jn}, t − Jn) 1{ Jn ≤ t, min_{0≤s≤Jn} Xs > 0 } ].  (110A.3)

Moreover, if f is bounded, L^n f converges to zero as n → ∞, uniformly on E_T.

Proof. The proof proceeds by induction. We already have the case n = 1 by the definition of L. Assume that the conclusion of the lemma holds for n = k. Then for n = k + 1,

L^{k+1} f(x, t) = Ex[ L^k f(X_{J1}, t − J1) 1{ J1 ≤ t, min_{0≤s≤J1} Xs > 0 } ]
= Ex[ E_{X_{J1}}[ f(X_{Jk}, v − Jk) 1{ Jk ≤ v, min_{0≤s≤Jk} Xs > 0 } ]|_{v = t−J1} · 1{ J1 ≤ t, min_{0≤s≤J1} Xs > 0 } ],
where we have applied the case of (110A.3) for n = k in the last line. On the other hand, by the strong Markov property of X, we have

Ex[ f(X_{J_{k+1}}, t − J_{k+1}) 1{ J_{k+1} ≤ t, min_{0≤s≤J_{k+1}} Xs > 0 } ]
= Ex[ Ex[ f(X_{J_{k+1}}, t − J_{k+1}) 1{ J_{k+1} ≤ t, min_{0≤s≤J_{k+1}} Xs > 0 } | F_{J1} ] · 1{ J1 ≤ t, min_{0≤s≤J1} Xs > 0 } ]
= Ex[ E_{X_{J1}}[ f(X_{Jk}, v − Jk) 1{ Jk ≤ v, min_{0≤s≤Jk} Xs > 0 } ]|_{v = t−J1} · 1{ J1 ≤ t, min_{0≤s≤J1} Xs > 0 } ].
Hence, we have proved that (110A.3) holds for n = k + 1. By induction, we have proved the first part of the lemma.

Note that for any (x, t) ∈ E_T, we have

0 ≤ L^n f(x, t) = Ex[ f(X_{Jn}, t − Jn) 1{ Jn ≤ t, min_{0≤s≤Jn} Xs > 0 } ] ≤ ‖f‖_∞ P[Jn ≤ t] ≤ ‖f‖_∞ P[Jn ≤ T].

Since Jn is the nth jump time of the compound Poisson process, we have P[Jn ≤ T] = Σ_{m=n}^{∞} e^{−λT} (λT)^m / m!. This implies that L^n f converges to zero uniformly on E_T as n → ∞.

From this, we obtain from (110A.2) that for every bounded ψ,

Φ(x, T) = lim_{n→∞} [ Σ_{k=0}^{n} L^k G(x, T) + L^{n+1} Φ(x, T) ] = Σ_{k=0}^{∞} L^k G(x, T).  (110A.4)

Write x = log(V0/K0). By (110.6), we get the following infinite series expression for bond prices.

Theorem 110A.2. For a general nonnegative bounded writedown function ψ, the bond price (110.6) is given by the formula

D(V0, T) = e^{−rT} − e^{−rT} Σ_{m=0}^{∞} L^m G(log(V0/K0), T),  (110A.5)
where G = GA + GB + GC and GA, GB, and GC are given as in (110.12), (110.13), and (110.14), respectively. Moreover, the series converges uniformly on E_T.

Remark. The result of Theorem 110A.2 is quite intuitive if we view Φ as an infinite sum of securities. As discussed in the remark following Theorem 110.4, the expected penalty part Φ(x, T) can be seen as a sum of four securities. In particular, the part G = GA + GB + GC contributes the expected cash flow ψ(X_{τ(T)}) arising from all possibilities except that J1 < τ(T) ≤ T. From (110A.1) and the definition of GD, GD(x, T) = LΦ(x, T) can be seen as the time-0 price of a security issued at J1 which is almost the same as Φ except that the time to maturity is T − J1; this new security has time-J1 price Φ(log(V_{J1}/K_{J1}), T − J1). Following the similar decomposition of Φ(x, T), at time J1, Φ(log(V_{J1}/K_{J1}), T − J1) can be decomposed into two securities: one whose payment is made after the next jump, with time-J1 price GD(log(V_{J1}/K_{J1}), T − J1) = LΦ(log(V_{J1}/K_{J1}), T − J1), and the other covering the remaining possibilities, with time-J1 price G(log(V_{J1}/K_{J1}), T − J1). That is, the price Φ has the decomposition Φ(x, T) = G(x, T) + LΦ(x, T) = G(x, T) + L[G + LΦ](x, T) = G(x, T) + LG(x, T) + L²Φ(x, T). Continuing in this way, after n steps we obtain equation (110A.2). Since the number of jumps in a finite period of time is finite (Jm → ∞ as m → ∞), this step-by-step analysis covers all possibilities of τ(T) ≤ T. We are led to the conclusion (110A.4).
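The tail bound ‖Lⁿf‖∞ ≤ ‖f‖∞ P[Jn ≤ T] = ‖f‖∞ Σ_{m≥n} e^{−λT}(λT)^m/m! used in Lemma 110A.1 also gives a practical stopping rule when the series (110A.5) is truncated in computation. A small sketch, with illustrative parameter values only:

```python
import math

def poisson_tail(n, lam, T):
    """P[J_n <= T] = sum_{m >= n} e^{-lam*T} (lam*T)^m / m!,
    i.e., the upper tail of a Poisson(lam*T) count of jumps."""
    mu = lam * T
    head = sum(math.exp(-mu) * mu ** m / math.factorial(m) for m in range(n))
    return 1.0 - head

def truncation_order(lam, T, tol):
    """Smallest n such that the remainder bound P[J_n <= T] <= tol."""
    n = 0
    while poisson_tail(n, lam, T) > tol:
        n += 1
    return n

n_star = truncation_order(2.0, 1.0, 1e-8)
print(n_star)  # a modest number of terms of sum_m L^m G already suffice
```

Because the Poisson tail decays factorially, the required truncation order grows only slowly as the tolerance is tightened, which is why the series expression is usable in practice.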
Chapter 111
Sample Selection Issues and Applications

Hwei-Lin Chuang and Shih-Yung Chiu

Contents
111.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 3868
111.2 Sample Selection Issues . . . . . . . . . . . . . . . . . . . . . 3869
111.2.1 Sample selection bias . . . . . . . . . . . . . . . . . . 3869
111.2.2 Correction for sample selection bias . . . . . . . . . . 3873
111.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . 3876
111.3.1 Female labor supply–probit correction approach . . . 3876
111.3.2 Employability and wage compensation–multinomial logit correction approach . . . 3880
111.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 3883
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3884
Abstract

On many occasions in regression analysis, researchers may encounter the problem of a non-random sample, which leads to a biased estimator when using the OLS method. This study thus examines some related issues of sample selection bias due to non-random sampling. We first explain the source of bias caused by non-random sampling and then demonstrate that the direction of such bias in most cases cannot be ascertained based on prior information. By treating the sample selection as informative sampling, we can formulate the sample selection bias issue as an omitted variable problem in the regression model. Heckman (1979) proposed a two-stage estimation procedure to correct for selection bias. The first stage applies the Probit model to produce the estimated value of the inverse Mill's ratio and
Hwei-Lin Chuang
National Tsing Hua University
e-mail: [email protected]

Shih-Yung Chiu
Soochow University
e-mail: [email protected]
then includes it in the second-stage regression model as an explanatory variable to yield unbiased estimators. As the sample selection rule may not always be derived from a yes–no choice, our study further utilizes Lee's (1983) extension by applying the Multinomial Logit model in the first-stage estimation procedure to allow for its application with a multi-choice sample selection rule. Since the pioneering works related to sample selection issues are mostly in the field of labor economics, we give two examples of empirical studies in labor economics to demonstrate, respectively, applications of the Probit correction approach and the Multinomial Logit correction approach. Finally, we point out that the problem of a non-random sample is not limited to applications in economics. In the past 20 years, quite a few researchers have taken the issue of sample selection into account in studies of finance and management issues.

Keywords: Sample selection bias • Heckman's two-stage estimation • Probit model • Multinomial logit model.
111.1 Introduction

Regression is a main tool of econometrics and is widely applied in many disciplines of social science, especially for empirical analyses in the fields of economics and finance. Regression deals with the dependence of one variable (commonly termed the dependent variable) on other variables (commonly termed explanatory variables or independent variables). In addition, regression analysis is largely concerned with estimating the population mean value of the dependent variable on the basis of known or fixed values of the explanatory variable(s). In terms of parameter estimation, the method of ordinary least squares (OLS) is one of the most powerful and popular techniques used by researchers when applying regression analysis, because the OLS estimator, under certain assumptions, has very attractive statistical properties such as BLUE (best linear unbiased estimator). One of the key assumptions to assure the BLUE property of the OLS estimator is a zero mean value of the disturbance term, which implies that the dependent variable is randomly sampled along with the explanatory variables. This assumption suggests that there is no specification bias or specification error in the model used in the regression analysis.

In some applications, there are missing values for the dependent variable, but at the same time one can observe the exogenous explanatory variables. For example, in the estimation of the wage equation for women, the wage variable is missing for those women who do not work, but we can observe explanatory variables such as age and schooling for non-working women. The
pioneering works of estimating the wage equation for women by Gronau (1974), Lewis (1974), and Heckman (1974) indicate that the observed distribution of wages is a censored distribution, because only those women choosing to work can have observed wages. The term "selection bias" refers to the fact that if we estimate the wage equation by OLS based on the sample of working women, then we get inconsistent estimates of the parameters, because the assumption of a zero mean value of the disturbance term in the wage regression model is violated. Heckman (1976, 1979) proposes a two-stage estimation method to resolve the problem of sample selection bias based on the idea of correcting the specification error due to omitted variables in the regression model.

The purpose of our study is to discuss some related issues of sample selection bias. Section 111.2 introduces the idea of sample selection bias in more detail and presents some commonly used approaches to deal with the problem of sample selection bias. Section 111.3 offers some applications in labor economics. Section 111.4 concludes the study by giving some examples of applications in the fields of finance and management.

111.2 Sample Selection Issues

A non-random sample is the key reason for selection bias. In a random sample, the sample mean of the dependent variable should mirror the population mean. This is the basis for the unbiasedness of the OLS estimator. On the one hand, a non-random sample leads to the problem of selection bias. On the other hand, it can be considered as informative sampling; that is, the mere observability of the dependent variable implies information about the regression residual. Based on this line of thinking, we can better understand the source of the sample selection bias and the rationale behind the procedure of correcting the selection bias.

111.2.1 Sample selection bias

Let us start with the problem of estimating the wage equation for women.
The market wage equation can be specified as follows: y1 = a + bx + u1 , where y1 denotes the wage variable, and x denotes the explanatory variable such as years of schooling. We do not always observe y1 since only those who work have observed wages.
Let y2 be the index function for observability, i.e.:

y2 = 1 if u2 > c0 + c1x1, with u2 ~ N(0, 1); y2 = 0 otherwise.  (111.1)

We only observe y1 if y2 = 1. It implies that

Pr(y2 = 1) = 1 − F(c0 + c1x1) = 1 − Φ(c0 + c1x1).  (111.2)

If we let y2* denote the unobserved reservation wage for women, then we observe wages for working women when their market wage is higher than their reservation wage according to the theory of labor supply. In other words, we observe y1 if y1 > y2*, where y2* = d0 + d1x + u3. The condition y1 > y2* amounts to

a + bx + u1 > d0 + d1x + u3,  i.e.,  u1 − u3 > (d0 − a) + (d1 − b)x,  (111.3)

which implies that

(u1 − u3)/√Var(u1 − u3) > [(d0 − a) + (d1 − b)x]/√Var(u1 − u3),  (111.4)

where the left-hand side plays the role of u2 and the right-hand side plays the role of c0 + c1x1 in (111.1).
Here, by simple transformation, we can rewrite the labor supply decision rule (market wage > reservation wage) as an index function model as specified in equation (111.1). We can further demonstrate the bias due to sample selection by examining the observability of y1 based on the sample selection rule: y1 > y2*. Since we do not observe y1 below y2*, the regression residual will be sampled disproportionately from the upper or lower tail of the distribution. As a result, the residual will no longer have a zero mean.

Let us consider a simple case where the threshold is fixed, i.e., y2* = d0. We can compare the sample regression line (dotted line) and the population regression line (solid line) for this case in Figure 111.1. As shown in Figure 111.1, at x1, we sample from the left (upper) tail of the distribution. At x2, we sample from the left (upper) half of the distribution. Thus, the sample mean at both x1 and x2 will be higher than the population mean. At x3, we sample from almost the whole distribution. The observed mean is very close to the population mean at x3. As a result, when we try to fit a linear regression using the OLS method, we under-estimate the effect of x on y1 in this simple case, as Figure 111.1 illustrates.
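Both the fixed-threshold under-estimation just described and the sign ambiguity of the stochastic-threshold case discussed next are easy to reproduce by simulation. The sketch below uses an illustrative data-generating process (true slope b = 1; all other numbers hypothetical), keeps only observations satisfying the selection rule y1 > y2*, and fits OLS on the selected sample:

```python
import numpy as np

def selected_slope(threshold_fn, rho=0.0, sigma1=1.0, sigma3=2.0, n=200_000, seed=1):
    """True model y1 = x + u1 (slope b = 1); an observation is kept only if
    y1 exceeds the threshold y2* = threshold_fn(u3), where corr(u1, u3) = rho.
    Returns the OLS slope fitted on the selected sample."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, n)
    z = rng.normal(size=n)
    u3 = sigma3 * z
    u1 = sigma1 * (rho * z + np.sqrt(1.0 - rho ** 2) * rng.normal(size=n))
    y1 = x + u1
    keep = y1 > threshold_fn(u3)                  # the sample selection rule
    X = np.column_stack([np.ones(keep.sum()), x[keep]])
    return float(np.linalg.lstsq(X, y1[keep], rcond=None)[0][1])

fixed = selected_slope(lambda u3: -1.0)           # fixed threshold: slope under-estimated
over = selected_slope(lambda u3: u3, rho=+0.8)    # rho > 0, sigma_u3 > sigma_u1: over-estimated
under = selected_slope(lambda u3: u3, rho=-0.8)   # rho < 0: under-estimated
print(fixed, over, under)
```

Flipping only the sign of corr(u1, u3) flips the direction of the slope bias, which is exactly why the direction cannot be predicted a priori in the stochastic-threshold case.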
Figure 111.1: Comparing the sample regression line and population regression line for the fixed threshold case.
Figure 111.2: Observability of y1.
In the fixed threshold case as shown in Figure 111.1, we can predict the direction of selection bias. However, if we allow the threshold to be stochastic, then the direction of bias will be very difficult to predict. For example, let y2* = d0 + u3, where u3 denotes a stochastic error term. As shown in Figure 111.2, we are more likely to observe y1 when y2* is in the lower tail than when it is in the upper tail, since the condition y1 > y2* is then easier to satisfy. We also
Figure 111.3: Comparing the sample regression line and population regression line for a stochastic threshold case where u1 and u3 are positively correlated and σu3 > σu1 .
tend to observe most of y1 when x is large, since the selection criterion is more likely to be satisfied as x increases.

Suppose u1 and u3 are positively correlated and σu3 > σu1; we will then tend not to observe y1 in the upper tail, because of the smaller variance in u1. We are more likely to observe y1 when its residual is negative than when it is positive. The sample mean value of the observed y1 (dotted line in Figure 111.3) tends to be smaller than the population mean (solid line in Figure 111.3). In other words, we will over-estimate the slope of the regression line in this specific case, as presented in Figure 111.3. Note, conversely, that if u1 and u3 are negatively correlated, then we will tend to under-estimate the slope, because we are more likely to observe y1 when its residual is positive than when it is negative. In other words, in the stochastic threshold case, there is no a priori supposition that we will under-estimate or over-estimate the slope. Thus, it is impossible to predict, a priori, the direction of selection bias. The bias depends on the correlation between u1 and u3 and on how the probability of observing y1 changes as x changes.

The basic problem in the sample selection model is that the sample mean does not represent the population mean as it does in the random sampling regression model. The deviation of the sample mean from the population mean results in biased parameter estimators. In most cases, the direction of bias on the OLS coefficient for the regression model is impossible to predict on the basis of prior information. Therefore, we need to adopt a proper
estimation approach to correct for the selection bias due to non-random sampling.

111.2.2 Correction for sample selection bias

The bias problem of the sample selection model can be attributed to non-random sampling, as discussed in Section 111.2.1. Heckman (1979) develops an estimation approach based on the idea of specification error to deal with the bias problem of sample selection. This is the well-known Heckman two-stage estimation approach, where the first stage applies a Probit model to specify the sample selection rule. As the sample selection rule may not always be derived from a yes–no choice, Lee (1983) extends the first-stage estimation procedure by applying the Multinomial Logit model for cases with multi-choice sample selection. We shall introduce these two estimation approaches for correcting the sample selection bias in the following.

111.2.2.1 Probit correction approach

If the sample selection rule can be transformed into an index function model as specified in equation (111.1), then we can apply the Probit approach to correct for sample selection bias. Following the problem of estimating the wage equation for women, the linear regression model of the wage equation and labor supply for women can be specified as follows:

y1i = x1iβ1 + u1i,  (111.5)
y2i = x2iβ2 + u2i,  (111.6)

where y1i denotes the wage variable, and y2i denotes the desired hours of work. The error terms satisfy the following conditions:

E(uji) = 0,  E(uji uji') = σj² if i = i', and E(uji uji') = 0 if i ≠ i'.  (111.7)

The sample selection rule indicates that y1i is observed if y2i > 0, and y1i is not available for y2i ≤ 0. As a result, the expected value of the error term u1i based on the observed sample becomes:

E(u1i | x1i, sample selection rule) = E(u1i | x1i, y2i > 0) = E(u1i | x1i, u2i > −x2iβ2).  (111.8)

The expected value of the dependent variable y1i then becomes:

E(y1i | x1i, y2i > 0) = x1iβ1 + E(u1i | x1i, u2i > −x2iβ2).  (111.9)
The bias that results from using non-randomly selected samples to estimate behavioral relationships arises from the ordinary problem of omitted variables. Heckman (1979) thus suggests a simple method to deal with the specification error due to the omitted variable E(u1i | u2i > −x2iβ2): an appropriate estimate of the omitted variable can be used to solve the problem of sample selection bias. Assuming that u1 and u2 are bivariate-normally distributed, we can evaluate the omitted variable in the regression model as

E(u1i | u2i > −x2iβ2) = (σ12/√σ22) λi,  (111.10)
E(u2i | u2i > −x2iβ2) = (σ22/√σ22) λi,  (111.11)

where λi = φ(zi)/Φ(−zi) and zi = −x2iβ2/√σ22. Here, λ is the inverse Mill's ratio. Based on these results, we have

E(y1i | x1i, y2i > 0) = x1iβ1 + (σ12/√σ22) λi,  (111.12)
E(y2i | x2i, y2i > 0) = x2iβ2 + (σ22/√σ22) λi,  (111.13)

y1i = E(y1i | x1i, y2i ≥ 0) + v1i,  y2i = E(y2i | x2i, y2i ≥ 0) + v2i,  (111.14)

where

E(v1i | x1i, λi, u2i ≥ −x2iβ2) = 0,  E(v2i | x2i, λi, u2i ≥ −x2iβ2) = 0.  (111.15)
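The estimator built from these moments can be sketched end-to-end on simulated data. Every number in the data-generating process below (coefficients, the error correlation, the sample size) is a hypothetical choice for illustration, and the probit is fitted with a hand-rolled Fisher-scoring loop rather than any particular library routine:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
x1 = rng.normal(size=n)                       # wage-equation regressor
x3 = rng.normal(size=n)                       # appears only in the selection equation
rho = 0.6                                     # corr(u1, u2)
u2 = rng.normal(size=n)                       # selection error, variance 1
u1 = rho * u2 + math.sqrt(1.0 - rho ** 2) * rng.normal(size=n)

wage = 1.0 + 0.5 * x1 + u1                    # latent wage; true slope is 0.5
work = (0.2 + 0.4 * x1 + 0.8 * x3 + u2) > 0   # wage observed only when work is True

_erf = np.vectorize(math.erf)
ncdf = lambda t: 0.5 * (1.0 + _erf(t / math.sqrt(2.0)))            # standard normal CDF
npdf = lambda t: np.exp(-t ** 2 / 2.0) / math.sqrt(2.0 * math.pi)  # standard normal pdf

# First stage: probit for Pr(work | 1, x1, x3), fitted by Fisher scoring.
X2 = np.column_stack([np.ones(n), x1, x3])
g = np.zeros(3)
for _ in range(30):
    p, d = ncdf(X2 @ g), npdf(X2 @ g)
    w = d / np.clip(p * (1.0 - p), 1e-10, None)
    g += np.linalg.solve((X2 * (w * d)[:, None]).T @ X2, X2.T @ (w * (work - p)))

imr = npdf(X2 @ g) / np.clip(ncdf(X2 @ g), 1e-10, None)   # estimated inverse Mill's ratio

# Second stage: OLS of wage on [1, x1, imr], using workers only.
Xw = np.column_stack([np.ones(work.sum()), x1[work], imr[work]])
beta = np.linalg.lstsq(Xw, wage[work], rcond=None)[0]

# Naive OLS on workers only (no correction), for comparison.
Xn = np.column_stack([np.ones(work.sum()), x1[work]])
beta_naive = np.linalg.lstsq(Xn, wage[work], rcond=None)[0]
print(beta[1], beta_naive[1])   # corrected slope close to 0.5; naive slope noticeably biased
```

Note the exclusion restriction: x3 shifts the selection probability but not the wage, which keeps the inverse Mill's ratio from being collinear with the wage regressors in the second stage.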
As shown above, the new error terms v1 and v2 recover the property of zero mean, ensuring the unbiasedness of the OLS estimator. Therefore, if we knew zi, and hence λi, we could enter λi as a regressor and estimate the equation by OLS. In practice, we do not know λi. If we have the case of a censored sample for y2 — i.e., we know x2i for observations with y2i ≤ 0 — then we can use the two-stage estimation procedure proposed in Heckman (1979) to solve the problem of sample selection bias as follows.

First stage: Estimate Pr(y2 > 0) by Probit to yield a consistent estimator of β2/√σ22. Next, estimate zi, and hence λi, based on these estimates.

Second stage: Use the estimated λi as a regressor; OLS will then yield unbiased estimators of β1 and σ12/√σ22.

In sum, by treating the sample selection as informative sampling, we can formulate the sample selection bias issue as an omitted variable problem in
the regression model. By correcting this specification error due to the omitted variable problem, we can yield an unbiased estimator for the coefficient of the regression model based on the two-stage procedure proposed by Heckman (1979).

111.2.2.2 Multinomial logit correction approach

The sample selection rule may not always be derived from a dichotomous choice.¹ In polychotomous cases with more than two alternatives, the outcome of the model is observed only if the specific selection rule is satisfied. Following the concept of Heckman's two-stage estimation approach, Lee (1983) extends the first-stage estimation procedure by applying the Multinomial Logit model for cases with multi-choice sample selection to obtain the correct selection term (similar to the inverse Mill's ratio in the Probit correction approach). We consider the following model with a polychotomous choice under M categories and M regression equations:

ys = xsβs + us,  ys* = zsγs + ηs,  (111.16)

where y is observed, and y* is the latent variable. All the variables xs and zs are exogenous, E(us|z) = 0 and V(us|z) = σ². The vector z represents the maximum set of explanatory variables for all alternatives, and the vector x contains all determinants of the variable of interest. The impact on the dependent variable is observed only for the case where alternative s is chosen, which happens when:

ys* > max_{j≠s}(yj*),  where εs = max_{j≠s}(yj*) − ηs.  (111.17)
We assume that the ηj's are independent and identically Gumbel distributed (the IIA hypothesis). The marginal distribution of εs is then Fs(ε). The bivariate distribution of (ηs, εs) is specified by the translation method. The γj can thus be easily obtained from the maximum likelihood estimates.

¹ For example, Chuang, Hsieh and Lin (2010) present four types of employment sector chosen by marriage immigrants in Taiwan. Therefore, a multi-choice sample selection correction approach should be adopted in the estimation of the wage equation for workers in a specific employment sector in this case.
The correction term, the inverse lambda, is based on the conditional mean of us:

E(us | εs < 0, Γ) = [ ∫_{−∞}^{∞} ∫_{−∞}^{0} us f(us, εs | Γ) dεs dus ] / P(εs < 0 | Γ) = λ(Γ),  (111.18)

where Γ = {zγ1, zγ2, ..., zγM}, and f(us, εs | Γ) is the conditional joint density of us and εs. In the econometrics literature, one of the best known and most widely used polychotomous choice models is the Conditional Multinomial Logit model. The probability that any alternative s is preferred in the Multinomial Logit model is

Ps = exp(zγs) / Σ_j exp(zγj).  (111.19)

The expected value of the disturbance term us can now be written as

E(us | εs < 0, Γ) = −σs ρs φ(J_{εs}(0|Γ)) / F_{εs}(0|Γ),  (111.20)

where ρs is the correlation coefficient, and F_{εs}(·|Γ) is the cumulative distribution function of εs. The transformation J_{εs}(·|Γ) is defined by

J_{εs}(·|Γ) = Φ^{−1}(F_{εs}(·|Γ)).  (111.21)

The consistent estimator of β is then obtained from the following equation:

ys = xsβs − σs ρs φ(J_{εs}(0|Γ)) / F_{εs}(0|Γ) + ηs.  (111.22)
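In implementations of this correction, F_{εs}(0|Γ) is read as the probability that alternative s is chosen, so the selectivity term in (111.22) reduces to φ(Φ⁻¹(Ps))/Ps with Ps the MNL probability from (111.19). A minimal sketch of this reading (helper names and the z'γ values are our own, for illustration):

```python
import math
from statistics import NormalDist

N = NormalDist()   # standard normal: pdf, cdf, inv_cdf

def mnl_probs(zg):
    """Multinomial-logit probabilities P_s = exp(z'g_s) / sum_j exp(z'g_j), as in (111.19)."""
    m = max(zg)
    e = [math.exp(v - m) for v in zg]           # shift for numerical stability
    s = sum(e)
    return [v / s for v in e]

def lee_term(p_s):
    """Lee (1983) selectivity term phi(J)/F with J = Phi^{-1}(P_s); it enters
    the outcome equation scaled by -sigma_s * rho_s, as in (111.22)."""
    J = N.inv_cdf(p_s)
    return N.pdf(J) / p_s

probs = mnl_probs([0.4, -0.1, 0.7])             # hypothetical z'gamma values
terms = [lee_term(p) for p in probs]
print(probs, terms)
```

In the two-alternative case this term collapses to the usual inverse Mill's ratio, since φ(Φ⁻¹(P))/P = φ(z)/Φ(z) when P = Φ(z); the less likely an alternative, the larger its correction term.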
The expression of the selection model proposed by Lee (1983) is very similar to Heckman's two-stage estimation method, except that the selection part in Lee's method utilizes the Multinomial Logit model for estimation in the first stage.

111.3 Applications

111.3.1 Female labor supply–probit correction approach

Because the female labor force participation rate ranges around 40–60% in most countries, we face a serious problem of sample selection when estimating the labor supply and wage equation for women: we are not able to observe the working hours and wages of non-working women. As mentioned earlier, labor supply theory suggests that an individual will participate in the labor market if her market wage is higher than her reservation wage. In other words, when a woman's reservation wage
is higher than her market wage, she will not enter the labor market. These non-working women's wages cannot be observed, and estimates based only on working women will tend to be biased due to the sample selection problem. The labor supply model can be set up as follows, according to Killingsworth (1983):

H* = α + βω + γV + ε,
H = (α + βω + γV + ε) · I(α + βω + γV + ε > 0),  (111.23)

where H* denotes the desired labor supply, H is the observed labor supply, and I is the indicator function for working. The condition for an individual to enter the labor market depends on the wage rate (ω) and the reservation wage (ω^r), i.e.:

H > 0 iff ω > Mi* = ω^r,  (111.24)

where M* denotes the MRS (marginal rate of substitution) function value corresponding to the reservation wage. If an individual's utility is set as follows, with exponents 1 − b and b:

u = [ω(H + ε) + V]^{1−b} [1 − (H + ε)]^{b},  (111.25)

then the reservation wage can be derived as

ω^r = [b/(1 − b)] · (εω + V)/(1 − ε).  (111.26)

The conditions for an individual to enter the labor market become:

H > 0 iff εH > −[(1 − b) − b(V/ω)],
H = 0 iff εH ≤ −[(1 − b) − b(V/ω)],  (111.27)

where εH = −ε. The empirical labor supply function can then be specified as

H = (1 − b) − b(V/ω) + εH.  (111.28)

Under the assumption of normality, we can specify the probabilities of working and of not working for individual i as follows:

Pr[i works] = Pr[εHi/σH > −Ji/σH] = 1 − Φ(−Ji/σH),
Pr[i does not work] = Φ(−Ji/σH),  (111.29)

where σH is the standard error of εH, and Ji = (1 − b) − b(Vi/ωi).
We can apply the Probit model to estimate the binary work choice based on the following likelihood function:

L = Π_{i∈E} {1 − Φ[−(1 − b)* + b*(Vi/ωi)]} × Π_{i∉E} Φ[−(1 − b)* + b*(Vi/ωi)],  (111.30)

where E denotes the set of working individuals. We then get

E[Hi | i works] = (1 − b) − b(Vi/ωi) + κi.  (111.31)

The conditional expectation κi is

κi = E[εHi | (εHi/σH) > −Ji/σH] = σH φ(−Ji/σH)/(1 − Φ(−Ji/σH)) = σH λi,  (111.32)

where λi denotes the inverse Mill's ratio as discussed earlier. Based on the expectation of the labor supply condition, we can get the regression model with sample selection correction as follows:

Hi = (1 − b) − b(Vi/ωi) + ηλi + υHi.  (111.33)
(111.33)
In order to simplify the complexity of the estimation process, most of the literature sets the models as linear functions. The complete linear regression model for labor supply can then be specified as follows: ωi = γXi + εW i , Mi∗ = a∗M + c∗M Vi + d∗M Zi + ε∗M i , Mi = aM + b1M ωi + b2M Hi + cM Vi + dM Zi + εM i , Hi = aH + bH ωi + cH Vi + dH Zi + εHi ,
(111.34)
where Xi denotes the factors affecting wage, Zi are the characteristic variables, such as age, race, and educational attainment that may influence MRS, εW i is the unobserved component related to individuals’ ability, and εH is the unobserved component related to individuals’ preference on working. The wage equation with the correction term of sample selection is thus: E[ωi |Hi > 0] = E[γXi + εωi |εHi > −Ji ] = γXi +
σω H λi . σH
(111.35)
Based on Heckman (1979), we can rewrite the regression model of wages, which allows the possibility of discontinuous labor supply, the endogeneity of wages in the estimation of the labor supply, and the estimation of the wage
equation with the correction of the sample selection bias. The regression model of wages is set as follows:

L = Π_{i∈E} [1 − Φ(−Ji/σD)] × Π_{i∉E} Φ(−Ji/σD),
ωi = γXi + δλ̂i + υi,
Hi = aH + bH γXi + cH Vi + dH Zi + η λ̂i + ξi,
Hi = aH + bH ω̂i + cH Vi + dH Zi + η λ̂i + νi,  (111.36)

where

Ji = γXi − (aM* + cM* Vi + dM* Zi),  εD = εωi − εMi*,  σD = (σω² + σM² − 2σωM)^{1/2}.  (111.37)
(111.37)
According to the setting of the above model, there are three stages in estimation. The first one is to obtain the inverse Mill’s ratio based on the Probit selection model. The second stage regresses the wage equation with the inverse Mill’s ratio. The final stage replaces the wage in the labor supply function with the estimated wage from the second stage and then estimates the labor supply function. Chuang and Lin (2006) apply the aforementioned model to estimate female labor supply in Taiwan based on data drawn from the “Manpower Utilization Survey” and the “Women’s Marriage, Fertility and Employment Survey” from 1979 to 2003. The samples of the study are married women aged from 15 to 64 years. Since the female labor force participation rate in Taiwan during that time period is around 40–50%, it means that more than half of the women are not in the labor market. As a result, the sample selection issue should be taken into account in order to yield an unbiased estimator for the labor supply of married women in Taiwan. As Chuang and Lin (2006) discuss, most findings from the first-stage Probit model for the labor force participation decision are in line with theoretical expectations and are similar to the implications from the literature. For example, the effect of age shows an inverse U-shape. Women’s educational attainment brings a positive influence, while the number of young children has a negative impact on women’s labor force participation decision. The estimated results of wages in the second stage are also consistent with findings in the literature. The estimated wage from the second stage is then used in the final stage for the estimation of the labor supply function.
page 3879
July 6, 2020
16:5
3880
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch111
H.-L. Chuang & S.-Y. Chiu
Part of the estimation results reported in Table 9 of Chuang and Lin (2006) are reproduced in Table 111.1. As shown in Table 111.1, the wage elasticity of women's labor supply ranges from a minimum of 0.026 in 1993 to a maximum of 0.158 in 2000. The age variable shows a positive effect on labor supply, although this positive influence declines as age increases. The presence of young children has a negative impact on married women's labor supply, but the number of children has no significant influence. The effect of firm size is positive before 2000 but reverses to a negative effect after 2000. Husband's income does not show any significant influence in most years; when it does, its negative impact is consistent with the theoretical expectation. Finally, the sample selection variables are significantly positive for most years, which implies that it is critical to consider the sample selection problem when estimating married women's labor supply in Taiwan.

111.3.2 Employability and wage compensation: multinomial logit correction approach

Chiu and Chuang (2016) analyze the influence of employability skills on wage compensation for female college graduates and consider two separate choices of respondents after graduation: the selection of labor force participation and the selection of occupations. They assume the two choices are independent, implying that the correlation coefficient of the error terms between the two choices is zero. The wage compensation equation can then be corrected by two IMRs (inverse Mill's ratios), which can be estimated by Probit and Multinomial Logit models for the labor force participation decision and the occupational choices, respectively. Based on the two selectivity terms, they rewrite the wage compensation equation as

$$E(\ln \omega_{S,i} \mid X_i, t_i, Q_i = 1, Y_i = S) = X_i \beta + t_i \alpha + \theta_Q \lambda_{Q,i} + \theta_S \lambda_{S,i}, \qquad (111.38)$$

where $\lambda_{Q,i}$ and $\lambda_{S,i}$ are the IMRs from the Probit estimation of labor force participation and the Multinomial Logit estimation of occupational choice, respectively. Here, $Q_i = 1$ and $Y_i = S$ mean that individual i chooses to work in occupation S. The control variables of the labor force participation equation include dummy variables for college/university type, major, grade point average, licensing status, college/university located in a metropolitan area, and student loan status, as well as continuous variables for her father's
Table 111.1: Estimation results of the labor supply model for married women in Taiwan, for the years 1979, 1986, 1990, 1993, 1997, 2000, 2001, and 2003. The reported variables are the constant, AGE, AGESQ, the estimated wage w̃, CHILD6, CHILD, NCHILD, GOV, FISIZE, LNHBINC, DU, and LAMBDA, together with R-squared, adjusted R-squared, and the number of observations.
Notes: 1. The content of this table is obtained from Table 9 in Chuang and Lin (2006). 2. t-ratios are in parentheses. 3. ∗ denotes statistical significance at the 10% level; ∗∗ denotes statistical significance at the 5% level; ∗∗∗ denotes statistical significance at the 1% level.
[The year-by-year coefficient estimates are omitted here; see Table 9 of Chuang and Lin (2006).]
Table 111.2: Effects of self-assessed employability skills on wage across occupations.

                       Salesperson           Associate             Technicians and        Clerks and
                                             professionals         professional           other staff
                                                                   assistants
KNOW                 0.0017     −0.0199    0.0729∗     0.0719∗∗    0.0618∗     0.0277     0.036∗     0.0195
ATTITUDE             0.0556∗∗   0.0132     −0.0345     0.1078      0.0775∗∗    0.0588     −0.0092    −0.0176
LEARN                0.0382     0.0534∗    −0.0432∗    −0.0064     0.0022      0.0152     0.0377∗∗   0.0453∗∗∗
CAREER               0.0793∗∗∗  0.0653∗∗   0.038∗      0.0166      0.0585∗∗    0.0552∗∗   0.0238∗    0.0226∗
Control Variables    No         Yes        No          Yes         No          Yes        No         Yes

Notes: 1. The content of this table is obtained from Table 3 in Chiu and Chuang (2016). 2. Selection here contains the selection of labor force participation and the selection of occupations. 3. ∗ denotes statistical significance at the 10% level; ∗∗ denotes statistical significance at the 5% level; ∗∗∗ denotes statistical significance at the 1% level.
educational level. The control variables in the Multinomial Logit model for the occupational choice include dummy variables for college/university type, double major status, licensing status, and college/university located in a metropolitan area, as well as continuous variables for parents' educational levels. The data they use are drawn from the Integrated Higher Education Database System in Taiwan. Their study extends the classification of Harvey et al. (2002) by dividing the work-related core competencies into four categories: “(1) professional knowledge and application capabilities (denoted by KNOW hereafter), (2) positive working attitudes and teamwork abilities (ATTITUDE hereafter), (3) aggressive and active learning (LEARN hereafter), and (4) career management skills (CAREER hereafter). There are 12 questions in total that relate to the aforementioned employability skills.”

According to the results of Table 111.2, which is reproduced from Table 3 of Chiu and Chuang (2016), employability skills play different roles in wage compensation across occupations; no single skill shows the same effect in all occupations. For example, the category of CAREER brings a positive association with the wages of salespersons. The category of KNOW is the most influential component for associate professionals. One of the important characteristics for technicians and professional assistants is the category of positive working attitudes and teamwork abilities (ATTITUDE). For the occupation of clerks and staff, the more useful employability skill is aggressive and active learning (LEARN). The implications of this study suggest that the effects of self-assessed employability skills vary across occupations.
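Lee's (1983) transformation behind the occupational-choice selectivity term maps each Multinomial Logit choice probability $P_s$ into a normal-based correction $\lambda_s = \phi(\Phi^{-1}(P_s))/P_s$. A minimal sketch, in which the three utility indices are purely hypothetical:

```python
import numpy as np
from scipy.stats import norm

# Lee (1983)-style selectivity terms from a Multinomial Logit model; the three
# utility indices below are hypothetical, purely for illustration.
V = np.array([0.2, 0.8, -0.1])        # one utility index per occupation
P = np.exp(V) / np.exp(V).sum()       # Multinomial Logit choice probabilities
lam = norm.pdf(norm.ppf(P)) / P       # IMR-type correction for each alternative
print(np.round(P, 3), np.round(lam, 3))
```

Each individual's correction term for her chosen occupation would then enter the wage equation alongside the Probit-based participation IMR.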
111.4 Conclusions

On many occasions in regression analysis, researchers may encounter a non-random sample that violates the assumption of a zero mean value for the regression disturbance term. Such a violation leads to a biased estimator when the OLS method is used. This study examines some related issues of sample selection bias due to non-random sampling. We first explain the source of the bias caused by non-random sampling and then demonstrate that the direction of the sample selection bias in most cases cannot be ascertained based on prior information. Because the direction of sample selection bias generally cannot be predicted in advance, an appropriate estimation approach should be adopted to take the selection bias into account.

By treating the sample selection as informative sampling, we can formulate the sample selection bias issue as an omitted variable problem in the regression model. Heckman (1979) proposes a two-stage estimation procedure to correct for the selection bias. The first stage applies the Probit model to produce the estimated value of the inverse Mill's ratio, which is then included in the second-stage regression model as an explanatory variable to yield unbiased estimators. As the sample selection rule may not always be derived from a yes–no choice, our study further utilizes Lee's (1983) extension by applying the Multinomial Logit model in the first-stage estimation procedure to accommodate multi-choice sample selection rules. Since the pioneering works related to sample selection issues are mostly in the field of labor economics, we give two empirical examples from labor economics to demonstrate the applications of the Probit correction approach and the Multinomial Logit correction approach, respectively.

Chuang and Lin (2006) adopt the Probit correction approach in the complete model of labor supply in order to produce an unbiased estimator of the wage elasticity for married women in Taiwan. Chiu and Chuang (2016) analyze the influence of employability skills on wage compensation for female college graduates and consider two separate choices of respondents after graduation: the selection of labor force participation and the selection of occupation. As the selection of labor force participation is a binary choice and the selection of occupation is a multinomial choice, Chiu and Chuang (2016) apply both the Probit and the Multinomial Logit models to correct for the sample selection bias.

The problem of a non-random sample is not limited to applications in economics. In the past 20 years, quite a few researchers have taken the issue of sample selection into account in their studies. Certo et al. (2016)
indicate that “the use of Heckman models by strategy scholars to resolve sample selection bias has increased by more than 700 percent over the last decade”, signifying the impact of sample selection issues in empirical studies of management science. Applications of sample selection corrections have also become more popular in the field of finance. For example, McCahery and Schwienbacher (2010) consider that selection bias may be caused by the non-randomness of the lender-borrower match in their sample when estimating “the impact of lead arrangers’ reputation on the design of loan contracts such as spread and fees charged.” Wang and Qian (2011) apply Heckman's two-stage approach in their empirical analysis of the effect of philanthropic activities on a firm's financial performance, dealing with the sample selection issue since “it is possible that the factors affecting whether a firm engages in corporate giving may be correlated with our dependent variable, firm financial performance”.

In sum, more and more researchers in the fields of economics, finance, and management have become aware of the sample selection issue due to non-random sampling. Since it is generally impossible to predict the direction of sample selection bias based on prior information, it is critical to apply an appropriate estimation procedure to resolve sample selection bias due to either a binary-choice or a multi-choice sample selection rule. This study provides useful information to researchers for dealing with sample selection issues.
Bibliography

Certo, S.T., Busenbark, J.R., Woo, H.S. and Semadeni, M. (2016). Sample Selection Bias and Heckman Models in Strategic Management Research. Strategic Management Journal 37(13), 2639–2657.
Chiu, S.Y. and Chuang, H.L. (2016). Employability and Wage Compensation in an Asian Economy: Evidence for Female College Graduates in Taiwan. Emerging Markets Finance and Trade 52(4), 853–868.
Chuang, H.L. and Lin, E.S. (2006). The Evolution of the Empirical Study of Taiwan's Female Labor Supply. Taiwan Economic Review 34(2), 119.
Chuang, H.L., Hsieh, N. and Lin, E.S. (2010). Labour Market Activity of Foreign Spouses in Taiwan: Employment Status and Choice of Employment Sector. Pacific Economic Review 15(4), 505–531.
Gronau, R. (1974). Wage Comparison: A Selectivity Bias. Journal of Political Economy 82(6), 1119–1143.
Heckman, J.J. (1974). Shadow Prices, Market Wages and Labor Supply. Econometrica 42(4), 679–694.
Heckman, J.J. (1976). The Common Structure of Statistical Models of Truncation, Sample Selection, and Limited Dependent Variables, and a Simple Estimator for Such Models. Annals of Economic and Social Measurement 5(4), 475–492.
Heckman, J.J. (1979). Sample Selection Bias as a Specification Error. Econometrica 47(1), 153–161.
Killingsworth, M.R. (1983). Labor Supply. Cambridge University Press, New York.
Lee, L.F. (1983). Generalized Econometric Models with Selectivity. Econometrica 51(2), 507–512.
Lewis, H.G. (1974). Comments on Selectivity Biases in Wage Comparisons. Journal of Political Economy 82(6), 1145–1155.
McCahery, J. and Schwienbacher, A. (2010). Bank Reputation in the Private Debt Market. Journal of Corporate Finance 16(4), 498–515.
Wang, H. and Qian, C. (2011). Corporate Philanthropy and Corporate Financial Performance: The Roles of Stakeholder Response and Political Access. Academy of Management Journal 54(6), 1159–1181.
Chapter 112
Time Series and Neural Network Analysis∗

K. C. Tseng, Ojoung Kwon and Luna C. Tjung

Contents
112.1 Introduction . . . 3888
112.2 Alternative Methods for Forecasting Stocks . . . 3892
112.3 Methodology . . . 3897
  112.3.1 Time-series decomposition . . . 3897
  112.3.2 Holt's exponential smoothing . . . 3898
  112.3.3 Winters' exponential smoothing . . . 3899
  112.3.4 Box–Jenkins methodology . . . 3899
  112.3.5 Neural networks . . . 3900
112.4 Data, Variables, and Normalization . . . 3901
  112.4.1 Data used . . . 3901
  112.4.2 Variables used . . . 3901
  112.4.3 Data normalization . . . 3901
112.5 Empirical Findings . . . 3902
112.6 Conclusions . . . 3912
Bibliography . . . 3913
Appendix 112A Neural Networks . . . 3916
  112A.1 Basic neural network model . . . 3916
  112A.2 BrainMaker . . . 3917
  112A.3 Alyuda NeuroIntelligence . . . 3918
  112A.4 Data: training data set, validation data set, and out-of-sample testing data set . . . 3918
  112A.5 Non-normalized data . . . 3919
  112A.6 Normalized data . . . 3919
  112A.7 Determination of the Numbers of Hidden Nodes (NH) for the Hidden Layer . . . 3921
  112A.8 Criterion of architecture searching . . . 3922
  112A.9 Training stop criterion and network selection . . . 3922
Appendix 112B List of Variables Used . . . 3923

K. C. Tseng, California State University at Fresno, e-mail: [email protected]
Ojoung Kwon, California State University at Fresno, e-mail: [email protected]
Luna C. Tjung, Credit Suisse AG, Singapore, e-mail: [email protected]

∗This chapter is an update and expansion of the paper “Time series and neural network forecasts of daily stock prices,” which was published in Investment Management and Financial Innovations, Vol. 9, Issue 1, pp. 32–54, 2012.
Abstract

This chapter discusses and compares the performance of traditional time-series models and a neural network (NN) model to see which does a better job of predicting changes in stock prices, and to identify critical predictors in forecasting stock prices in order to increase forecasting accuracy for professionals in the market. Time-series analysis is somewhat parallel to technical analysis, but it differs from the latter by using different statistical methods and models to analyze historical stock prices and predict future prices. Neural network approaches can make important contributions since they can incorporate a very large number of variables and observations into their models. In this study, the authors apply traditional time-series decomposition (TSD), the Holt/Winters (H/W) exponential smoothing models, the Box–Jenkins (B/J) methodology, and a neural network (NN) model to 50 randomly selected stocks from September 1, 1998 to December 31, 2010, with a total of 3105 observations of each company's closing stock price. This sample period covers the high-tech boom and bust, the historical 9/11 event, the housing boom and bust, and the recent serious recession and slow recovery. During this exceptionally uncertain period of global economic and financial crises, stock prices are expected to be extremely difficult to predict.

Keywords: Forecasting stock prices • Neural network model • Time-series decomposition • Holt/Winters exponential smoothing • Box–Jenkins ARIMA methodology • Technical analysis • Fundamental analysis
112.1 Introduction

History has shown that over long periods of time stocks have generated higher returns than most other assets such as fixed income securities
and real estate (Siegel, 2008). Forecasting stock prices has been an integral part of investing in the stock markets. Practitioners have designed many techniques and tools to predict stock prices, while academicians have developed all kinds of theories, methods, and models to evaluate basic stock values and prices. With the increasing globalization of financial markets and extremely rapid improvements in information technology and quantitative methods, news and rumors can travel quickly around the world, and stock prices can change drastically from second to second. Although different market participants can interpret new information very differently, the stock price at a given time reflects the equilibrium price of supply and demand at that particular moment. That equilibrium price may deviate greatly from the intrinsic value of the underlying stock, and market participants' expectations and mood may change quickly from moment to moment.

Large institutional investors typically have their own proprietary computer trading techniques, guided by their own mathematical and statistical models, for buying and selling stocks without human interference. Most investors apply both fundamental analysis and technical analysis to make their investment decisions. Since fundamental factors such as earnings, dividends, new products and markets, and economic data are not available daily, weekly, or even monthly, short-term price fluctuations are the domain of technical analysis, while fundamental analysis is more pertinent to long-term price changes. Over time, institutional trading has accounted for a higher and higher share of trading volume (Hendershott, Jones, and Menkveld, 2011), and high frequency traders trade based on analyzing data patterns in milliseconds without considering fundamental factors. Foucault, Hombert, and Rosu (2016) pointed out that high frequency traders have applied different strategies. Some specialize in market making, while others apply directional strategies and use aggressive marketable orders to establish positions in anticipation of future price movements. High frequency traders' aggressive orders have contributed to explosive trading volume and large short-term price reactions to a wide variety of news and announcements. Henry and Koski (2017) find that institutional trading concentrated around certain ex-dividend days is correlated with higher profits. Even though dividend capture traders account for less than 6% of all trades, they represent over 15% of abnormal returns. In addition, they find that dividend capture profitability is highly correlated with institutional trade execution skills. This development makes technical analysis and time-series analysis even more important.

Forecasting is both an art and a science. Sales of a basic necessity company can be forecast accurately; forecasting stock prices, however, is rather difficult. Proponents of efficient market hypotheses argue that stock prices
cannot be predicted since market prices have already reflected all known and expected fundamentals. However, precisely because the current market price reflects all relevant information, current and historical prices are extremely useful for predicting future prices, since relevant information enters the stock markets continuously. As we will discuss in more detail in the literature review, there is well-documented evidence that stock prices demonstrate strong momentum, short-term continuation and long-term reversal, and other identifiable patterns. When the market is in an upward movement, good news tends to be reinforced while bad news is discounted. Conversely, when the market suffers downward momentum, market participants discount the good news and exaggerate the bad news. Many professional analysts, investment advisers, investment news writers, mutual fund managers, and other so-called experts tend to be trend followers, and most individual investors tend to follow their advice and are trend followers as well. Contrarians are the minority. Indeed, stock prices may also demonstrate January, holiday, and year-end effects and other special patterns related to size, the market-to-book value ratio, and value versus growth stocks.

One interesting observation is the consistent return seasonality. Keloharju, Linnainmaa, and Nyberg (2016), based on data from 1963 to 2011, find that expected returns vary from month to month and that the return seasonality is not due to seasonal firm-specific events such as earnings reports and dividend announcements. In addition, return seasonality is very pervasive and permeates the entire cross-section of US stock returns. Seasonality is equally strong in periods of high and low sentiment. Finally, seasonality is persistent over time and significant enough to completely overwhelm the unconditional differences in expected asset returns. Finally, Loh and Stulz (2018), based on the I/B/E/S US detail file from 1983 to 2015, classify bad times into crisis, credit crisis, recession, and high uncertainty. They find that analysts' stock recommendations and earnings forecast changes have greater impact in bad times than in good times: analysts have greater incentive to work harder, and investors rely more on analysts, during bad times.

Even though it is hard to predict stock prices, both professional and individual investors have been exercising their best analyses and judgment to predict stock prices in quite diversified ways. There are many possible factors affecting stock prices in very complicated ways. Haugen (1999) provided a rather concise summary of important factors. First, the risk factors, which include the market beta from capital asset pricing models, betas derived from arbitrage pricing theory, the volatility of
total return, nonmarket-related variance, the standard error of earnings, the debt-to-equity ratio and its trend, times interest earned, and volatilities in earnings, dividends, and cash flow. Second, the liquidity factors: market capitalization, market price per share, the trailing 12-month average of monthly trading volume to market capitalization, and the five-year trend in monthly trading volume. Third, ratio and trend factors: the earnings-to-price ratio and trend, book-to-price ratio and trend, dividend-to-price ratio and trend, cash flow-to-price ratio and trend, and sales-to-price ratio and trend. Fourth, profitability ratios and trends: profit margin and trend, total asset turnover and trend, return on assets and trend, return on equity and trend, earnings growth, and earnings surprise. Fifth, returns on different industry sectors, which vary from one sector to another. Finally, technical factors, which measure the excess return over the S&P 500 in the previous 1, 2, 3, 6, 12, 24, and 60 months. Generally speaking, technical factors are far more complex than what Haugen indicated here (Pring, 1991). In recent years, with sophisticated computer programs and advanced cross-border communications, many more factors such as foreign exchange rates, interest rates, commodity prices, inflation fears, speculation and changes in expectations, changes in investors' sentiment, options, futures, and other derivative markets, domestic and foreign economic and political news, and more recently the financial and debt crises have had a significant impact on the global economy and stock markets.

Neural network approaches can make important contributions since they can incorporate a very large number of variables and observations into their models. The objective of most neural networks is to determine when to buy or sell stocks based on historical market indicators. The critical task is to find enough relevant indicators to use as input data to train the system properly. The indicators could be the technical and fundamental factors mentioned above, among others. Data normalization is common in neural networks since they generally use input data within the range of [0, 1] or [−1, +1]. For some neural networks with a large number of inputs, pruning techniques are required to reduce the network size and speed up the recall and training times. A common network architecture in financial forecasting is a multilayer feed-forward network trained by back-propagation, that is, by propagating errors backward through the system from the output layer to the input layer during training. Back-propagation is needed since the hidden units have no training target value and therefore must be trained from the errors of previous layers; the output layer has a target value for comparison. When the errors are back-propagated through the nodes, the connection weights
are changed. Training continues until the errors in the weights are small enough to be accepted. Deciding when to stop training can be a problem: over-training may happen if the system memorizes patterns and becomes unable to generalize. Over-training may also occur due to having too many hidden nodes or training for too many periods. Over-training can be avoided by training the network on a large percentage of the patterns and testing it on the remaining patterns. The network's performance on the test set is an important indication of its ability to generalize and handle data it has not been trained on. If performance on the test set is unsatisfactory, the network is retrained until the performance is satisfactory.

Given the increasing importance of high frequency trading and its unique focus on analyzing stock data patterns, it is particularly meaningful to apply statistical forecasting techniques and neural network models to forecast daily stock prices. All market participants are subject to behavioral, emotional, and psychological attributes such as prospect theory, the bandwagon effect, mental compartments, fear and greed, and self-attribution. These attributes may form certain predictable patterns reflected in stock price movements.

The chapter is divided into six sections. In Section 112.2, alternative methods for forecasting stock performance are reviewed. Various forecasting methods are discussed in Section 112.3, which is divided into five subsections: Section 112.3.1 discusses time-series decomposition; Section 112.3.2 discusses Holt's exponential smoothing; Section 112.3.3 discusses Winters' exponential smoothing; Section 112.3.4 discusses the Box–Jenkins methodology; and Section 112.3.5 discusses the neural network model. Section 112.4 discusses the data and variables used and the data normalization process. Section 112.5 presents a discussion of the empirical findings. Finally, in Section 112.6 we provide a summary and conclusion.
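The multilayer feed-forward/back-propagation scheme described in this introduction can be sketched in a few lines. This is a minimal illustration, not the networks used in the study: the layer sizes, learning rate, and toy target are assumptions, and the inputs are pre-scaled to [−1, +1] as the text recommends:

```python
import numpy as np

# Minimal one-hidden-layer feed-forward network trained by back-propagation on
# a toy target; layer sizes, learning rate, and target function are assumed.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 3))             # inputs normalized to [-1, +1]
y = np.sin(X @ np.array([1.0, -0.5, 0.3]))        # toy target series
W1, b1 = rng.normal(0, 0.5, (3, 8)), np.zeros(8)  # input -> hidden weights
W2, b2 = rng.normal(0, 0.5, (8, 1)), np.zeros(1)  # hidden -> output weights
lr = 0.05
for epoch in range(2000):
    H = np.tanh(X @ W1 + b1)                      # hidden-layer activations
    out = H @ W2 + b2                             # linear output layer
    err = out - y[:, None]                        # output-layer error
    # back-propagate the output error to the hidden layer, then update weights
    dW2 = H.T @ err / len(X)
    dH = (err @ W2.T) * (1 - H**2)                # tanh derivative
    dW1 = X.T @ dH / len(X)
    W2 -= lr * dW2; b2 -= lr * err.mean(0)
    W1 -= lr * dW1; b1 -= lr * dH.mean(0)
mse = float((err**2).mean())
print(mse)  # training error shrinks as the weights are adjusted
```

In practice a held-out validation set would monitor for the over-training discussed above, stopping when validation error starts to rise rather than training for a fixed number of epochs.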
112.2 Alternative Methods for Forecasting Stocks

One of the most significant contributions of market timing is to keep investors in the market during major bull markets and out during major market crashes. For example, an investment made at the market peak in October 1929 was not recovered until 1945, about 15 years later. The real stock return was negative from the end of 1966 to August 1982, another 15-plus years. The roller-coaster rides of the US stock market during our sample period make timing especially meaningful. For example, from January 14, 2000 to September 21, 2001, the DJIA fell 29.75% from 11,722.98 to 8,235.81, and again the index
plunged 53.78% from October 9, 2007, when the DJIA was 14,164.53, to 6,547.05 on March 9, 2009. The S&P 500 index declined 36.49% from 1520.77 on September 1, 2000 to 965.80 on September 21, 2001, and again plunged 56.78% from 1565.15 on October 9, 2007 to 676.53 on March 9, 2009. For the NASDAQ it was far worse, with a decline of 71.81% from 5,048.62 on March 10, 2000 to 1,423.19 on September 21, 2001. The NASDAQ plunged another 55.63% from 2859.12 on October 31, 2007 to 1,268.64 on March 9, 2009. All of these sharp declines occurred within about one and a half years. Even without such specific dating, over the first decade of this century market prices all went down: the DJIA was down 7.89%, the S&P 500 was down 22.99%, and the NASDAQ was down 45.50%.

Martin Pring (1991, p. 31) showed that an investment following the Dow Theory signals to buy and sell would have grown from an initial $100 in 1897 to $116,508 in January 1990, compared to $5,682 with a buy-and-hold strategy. Martin Zweig (1990, p. 121) stated that “I can’t overemphasize the importance of staying with the trend of the market, being in gear with the tape, and not fighting the major movements. Fighting the tape is an open invitation to disaster.” Professor Siegel (2008) tested the DJIA and NASDAQ using a 200-day moving average strategy and showed that the returns are higher than those of the buy-and-hold strategy before adjusting for transaction costs. When transaction costs were taken into account, the extra returns of the timing strategy became negligible, but the risk of the timing strategy was lower. In addition, the timing strategy can avoid major market crashes and prolonged market declines while participating in major bull markets and secular market advances. In recent years, traditional technical analysis and behavioral finance appear to reinforce each other from different perspectives.
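The peak-to-trough percentage declines quoted above can be recomputed directly from the index levels given in the text:

```python
# Recomputing the percentage declines from the peak and trough index levels
# quoted in the text (keys abbreviate the peak-to-trough periods).
peaks = {
    "DJIA 2000-2001":    (11722.98, 8235.81),
    "DJIA 2007-2009":    (14164.53, 6547.05),
    "S&P 500 2000-2001": (1520.77, 965.80),
    "S&P 500 2007-2009": (1565.15, 676.53),
    "NASDAQ 2000-2001":  (5048.62, 1423.19),
    "NASDAQ 2007-2009":  (2859.12, 1268.64),
}
drops = {k: round(100 * (p - t) / p, 2) for k, (p, t) in peaks.items()}
print(drops)  # matches the declines quoted in the text
```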
De Bondt and Thaler (1985, 1987) found that investors tend to overweight the most recent information and underweight the more fundamental base-rate information; the result is short-term continuation and long-term reversal in stock prices. Based on their two studies, one covering January 1926 to December 1982 and the other 1965 to 1984, they found that prior losers outperformed prior winners. This momentum pattern makes forecasting future stock prices feasible. Kahneman and Tversky (1979) discovered that people in general, and investors in particular, are inclined to weight memorable, salient, and vivid evidence more heavily than truly important information. Odean (1998, 1999) found that investors are prone to overestimate their own abilities and to be too optimistic about future conditions. Daniel, Hirshleifer, and Subrahmanyam (1998) pointed out that, because of investors’ self-attribution bias and the representativeness heuristic, their
confidence grows when public information agrees with their private information, but when public information differs from their private information, their confidence declines only slightly. They show that positive autocorrelation results from short-term overreaction and long-term correction. The findings of short-term continuation and long-term reversal are consistent with the study by Balvers, Wu, and Gilliland (2000) of 18 countries from 1969 to 1996. Hong and Stein (1999) found that news-watchers make forecasts based on private observations about future fundamentals, while momentum traders apply simple or univariate functions of past prices to make their forecasts. All these findings lead to the frequently observed short-term momentum and long-term reversal. More recently, Hong and Sraer (2016) find that high-beta stocks are overpriced relative to low-beta stocks and are more sensitive to disagreement about common cash flows, and are therefore more speculative. Daniel and Hirshleifer (2016) find that investors’ self-attribution bias is supported by cognitive processes documented in psychology. The empirical findings of return anomalies and strong patterns of return predictability challenge the hypothesis that investors are rational. Corgnet, Desantis, and Porter (2018), based on 2839 profit outcomes for 167 traders trading over 17 experimental market periods, find that successful traders possess a high level of fluid intelligence, are able to resist behavioral biases such as conservatism bias, and have the ability to infer other traders’ intentions. Cujean and Hasler (2017) find that market participants apply different forecasting models to predict fundamentals. Active fund managers typically apply continuous-state autoregressive processes, while others assume that economic variables shift discretely or nonlinearly over different phases of the business cycle. Time-series momentum strengthens during bad times.
In general, disagreement and lagged returns predict future returns, and the predictive power becomes stronger during bad times. Indeed, many technical analysts have incorporated these market patterns into their trading strategies to capture profitable opportunities. In recent years some 75–80% of all daily trading volume has been attributable to high-frequency or algorithmic trading. Since algorithmic trading is based on historical data and is programmed by human beings who embed their decisions into their programs, the stock markets become more predictable and friendlier to technical analysis (Baiynd, 2011). More recently, Gutierrez and Kelly (2008) found that both winners’ and losers’ portfolios experienced very short-term return reversal in the first two weeks and longer-term continuation from week 4 to week 52. Menzly and Ozbas (2010) found that returns of both individual stocks and
industries demonstrated strong cross-predictability with lagged returns in supplier and customer industries. They also found that the smaller the number of analysts or the institutional ownership, the greater the cross-predictability. Based on these empirical findings on short-term to intermediate-term return or price momentum, medium-term to long-term reversal, and cross-predictability, stock prices or returns appear to be predictable to some extent, and traditional technical analysis indeed has its merits. Brock, Lakonishok, and LeBaron (BLL) (1992) applied two simple trading rules (moving averages and trading range breakout) to the daily Dow Jones Industrial Average from the first trading day of 1897 to the last trading day of 1986 and found that the technical trading strategies could generate returns from buy signals 0.8 percent higher than from sell signals over the 10-day period. The returns from buy (sell) signals were higher (lower) than the normal returns. They pointed out that the return differentials between buy and sell signals cannot be explained by different risks. Sullivan, Timmermann, and White (1999) added the on-balance volume indicator to the moving-average, support-and-resistance, and breakout rules applied to DJIA daily data from 1897 to 1996. They found that the BLL results are robust to data-snooping and that technical trading rules are profitable. Finally, Lo, Mamaysky, and Wang (2000) used head-and-shoulders, double tops and bottoms, triangle tops and bottoms, and rectangular tops and bottoms, and applied the nonparametric kernel regression method to identify nonlinear patterns of stock price movements. They concluded that some patterns of technical analysis could provide incremental information and practical trading value. In recent years neural networks (NNs) have been applied to many kinds of forecasting.
For example, Mostafa (2004) applied NN to forecast Suez Canal traffic and Mostafa (2010) used NN to forecast stock market movements in Kuwait; Videnova et al. (2006) applied NN to forecast maritime traffic; Kohzadi et al. (1996) used NN to predict commodity prices; Ruiz-Suarez et al. (1995) applied NN to forecast ozone levels; Poh, Yao and Jasic (1998) used NN to predict advertising impact on sales; Aiken and Bsat (1999) applied NN to predict market trends; and Yu, Wang, and Lai (2009) used NN for financial time-series forecasting. There are a number of studies using NN to predict stock prices or returns. Kimoto et al. (1990) and Ferson and Harvey (1993) applied several macroeconomic variables to capture the predictable variations in stock returns. Kryzanowski, Galler, and Wright (1993) used historical accounting
and macroeconomic data to identify stocks that outperformed the overall market. McNelis (1996) used the Chilean stock market to predict returns on the Brazilian stock market. Yumlu, Gurgen, and Okay (2005) applied NN architectures to model the performance of the Istanbul Stock Exchange over the period 1990–2002. McGrath (2002) used book-to-market and price-to-earnings ratios to rank stocks with likelihood estimates. Leigh, Hightower, and Modani (2005) applied both NN and linear regression analyses to model the New York Stock Exchange Composite Index for the period 1981–1999. Many researchers have compared NN methods with various forecasting methods and techniques. Hammad, Ali, and Hall (2009) have shown that artificial NN (ANN) models provide fast convergence, high precision, and strong ability to forecast real stock prices. West, Brockett, and Golden (1997) have concluded that NN offers superior predictive capabilities over traditional statistical methods in forecasting consumer choice in both linear and nonlinear settings. NN can capture nonlinear relationships associated with the use of non-compensatory decision rules. Grudnitski and Osburn (1993) applied NN, using general economic conditions and traders’ expectations, to predict S&P and gold futures. Tokic (2005) has shown that political events such as the war on terror, fiscal policy changing taxes and spending, monetary policy changing short-term interest rates, and changes in the federal budget deficit can affect stock prices. Nofsinger and Sias (1999) have found a strong positive relationship between annual changes in institutional ownership and stock returns across different capitalizations. Dutta, Jha, Laha, and Mohan (2006) applied ANN models to forecast the Bombay Stock Exchange’s SENSEX weekly closing values. They compared two ANNs and used 250 weeks of data from January 1997 to December 2001 to forecast the period January 2002–December 2003.
Moshiri and Cameron (2000) compared the back-propagation network (BPN) with six traditional econometric models in forecasting inflation. Three of the six models are structural, including the well-known Ray Fair econometric forecasting model. The three time-series models are the Box–Jenkins autoregressive integrated moving average (ARIMA) model, the vector autoregressive (VAR) model, and the Bayesian vector autoregressive (BVAR) model. In one-period-ahead forecasts the BPN models provide more accurate forecasts. In three-period-ahead forecasts BPN is better than the VAR and structural models but less accurate than ARIMA and BVAR. In twelve-period-ahead forecasts the BPN models match ARIMA and BVAR and are superior to the structural and VAR models. Other related methods are data mining (DM) and Bayesian data mining (BDM). Giudici (2001) used BDM for benchmarking and credit
scoring in highly dimensional complex data sets. Jeong, Song, Shin, and Cho (2008) applied BDM to a process design. Ying, Kuo, and Seow (2008) applied a hierarchical Bayesian (HB) approach to forecast the stock prices of 28 companies included in the DJIA from the third quarter of 1984 to the first quarter of 1998. They found that HB can predict stock prices better than the classical models. Finally, Tsai and Wang (2009) applied ANN and decision tree (DT) models and found that the combination of ANN and DT can predict stock prices more accurately. The extant literature leads us to believe that some time-series forecasting methods, such as time-series decomposition, Holt/Winters models, NN, and ARIMA, are likely to help us identify stock price patterns and predict future stock prices. Realizing that no forecasting method or model is perfect, and that the stock markets are extremely complex and volatile, populated with diversified market participants, it is our intention to find some workable models and methods that may prove fruitful for practical application to the highly inter-connected global stock markets.

112.3 Methodology

112.3.1 Time-series decomposition

Time-series decomposition (TSD) is a traditional forecasting method that has been widely applied in business and economics. Any time series can be decomposed into four basic components: a long-term trend, an intermediate-term cyclical factor, a seasonal factor for a variable with data frequency greater than once a year, and an unpredictable irregular component. We follow the widely applied multiplicative model in which a variable Y = T × C × S × I, where T is the trend component, C is the cyclical component, S is the seasonal component, and I is the irregular component. The forecasting process takes the form of identifying each component except the irregular one and then multiplying the components together.
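The seasonal-index step of this multiplicative decomposition can be sketched in Python. The sketch below is illustrative only (the function names are ours, and quarterly data with p = 4 seasons is assumed); it is not the ForecastX implementation:

```python
# Estimate seasonal indexes for the multiplicative model Y = T x C x S x I.

def centered_ma(y, p=4):
    """Centered moving average: smooths out S and I, leaving T x C."""
    half = [sum(y[i:i + p]) / p for i in range(len(y) - p + 1)]
    # For an even p, average adjacent p-term MAs to center the estimate.
    return [(half[i] + half[i + 1]) / 2 for i in range(len(half) - 1)]

def seasonal_indexes(y, p=4):
    """Divide the original series by T x C to get S x I, then average the
    ratios season by season to smooth out the irregular component I."""
    cma = centered_ma(y, p)
    offset = p // 2              # first original index covered by the CMA
    ratios = {s: [] for s in range(p)}
    for i, c in enumerate(cma):
        t = i + offset           # position of cma[i] in the original series
        ratios[t % p].append(y[t] / c)
    return {s: sum(r) / len(r) for s, r in ratios.items()}
```

On a series built as a linear trend (10 + t) multiplied by seasonal factors (1.2, 0.8, 1.1, 0.9), the recovered indexes come out close to those factors.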
In general, after we take proper moving averages and centered moving averages of the original series, the seasonal and irregular components are smoothed out, leaving T × C. Dividing the original series by this centered moving average series then yields S × I. We average this new series season by season to smooth out the irregular component and obtain the seasonal indexes. The trend component is derived by fitting a linear or nonlinear trend to the centered moving average series mentioned previously. Finally, dividing the centered moving average series by the trend component gives the cyclical component. The trend factor can be extrapolated into future periods, while
cyclical and seasonal components are assumed to stay the same. To make a forecast any number of periods ahead, we simply multiply the three factors together, assuming the irregular component stays neutral at 1. Regression analyses have been widely used because of their ready economic interpretations and policy implications, but the method requires forecasting the future values of the explanatory variables before one can forecast the dependent variable(s), and in practical applications forecasting the future values of the independent variables is just as difficult as forecasting the dependent variable. Since there are so many possible variables affecting daily stock prices, we use those variables only in NN.

112.3.2 Holt’s exponential smoothing

Holt’s exponential smoothing (HES) can not only smooth the original data but also incorporate a linear trend into the forecast. The model requires two smoothing constants and three equations to smooth the data, update the trend, and make a forecast for any desirable horizon:

Ft+1 = aYt + (1 − a)(Ft + Tt),   (112.1)
Tt+1 = b(Ft+1 − Ft) + (1 − b)Tt,   (112.2)
Ht+n = Ft+1 + nTt+1,   (112.3)
where Ft+1 = smoothed value for period t + 1; a = smoothing constant for the smoothed value (level), with 0 < a < 1; Yt = actual value of the original data in period t; Tt+1 = trend estimate in period t + 1; b = smoothing constant for the trend estimate, with 0 < b < 1; and Ht+n = Holt’s forecast for a horizon of n periods, n = 1, 2, 3, . . .. The smoothed value and trend estimate stop changing beyond one period after the end of the actual data, even though n keeps increasing. For example, if the actual stock price of a company ends on December 31, 2010, Ft+1 and Tt+1 are the smoothed value and trend estimate one day after that date. To make a 2-day-ahead forecast, both stay the same, but n equals 2 in Ht+2. Therefore, Holt’s exponential smoothing is more suitable for short-term forecasts or for time series that demonstrate a linear trend.
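Equations (112.1)–(112.3) translate directly into code. The sketch below is ours; in particular, the initialization F1 = Y1, T1 = 0 is an assumption, since the text does not specify starting values:

```python
# Holt's exponential smoothing, following equations (112.1)-(112.3).

def holt_forecast(y, a=0.5, b=0.3, n=1):
    """Smooth y with level constant a and trend constant b, then return the
    n-period-ahead forecast H = F + n * T, per equation (112.3)."""
    F, T = y[0], 0.0                          # assumed initial level and trend
    for t in range(1, len(y)):
        F_new = a * y[t] + (1 - a) * (F + T)  # eq. (112.1): update the level
        T = b * (F_new - F) + (1 - b) * T     # eq. (112.2): update the trend
        F = F_new
    # As the text notes, F and T stop changing once the data end; only the
    # multiplier n grows with the forecast horizon.
    return F + n * T
```

For a constant series the trend stays at zero and every forecast equals the level; for a roughly linear series the one-step forecast extrapolates close to the next value.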
112.3.3 Winters’ exponential smoothing

When a time series demonstrates both trend and seasonality, Winters’ exponential smoothing (WES) is more suitable than HES. WES is an extension of HES that incorporates a seasonal adjustment and is captured by the following four equations:

Ft = aYt/St−p + (1 − a)(Ft−1 + Tt−1),   (112.4)
St = cYt/Ft + (1 − c)St−p,   (112.5)
Tt = b(Ft − Ft−1) + (1 − b)Tt−1,   (112.6)
Wt+n = (Ft + nTt)St+n−p,   (112.7)
where St = seasonality estimate in period t; c = smoothing constant for the seasonality estimate, with 0 < c < 1; p = the number of seasons in a year (e.g., p = 4 for quarterly data); and Wt+n = Winters’ forecast for a horizon of n periods. All other symbols are the same as in HES. Comparing equations (112.4)–(112.7) with (112.1)–(112.3) makes it clear that WES is an extension of HES with a seasonality adjustment. When the data end, Ft and Tt will not change, and the seasonal index for each season is fixed; the forecast for each season, however, is adjusted by its respective seasonal index. Some daily stock prices show seasonality, while others do not. Like HES, WES is more appropriate for short-term forecasts; the relevant length of time depends on whether the data are quarterly, monthly, or daily. For example, with quarterly data, n = 4 means a one-year-ahead forecast.

112.3.4 Box–Jenkins methodology

The Box–Jenkins methodology is statistically very sophisticated and complicated. Most forecasters apply the univariate autoregressive integrated moving average (ARIMA) method. The general form of an ARIMA model can be represented by the following equation:

Yt = A1Yt−1 + A2Yt−2 + · · · + ApYt−p + et + B1et−1 + B2et−2 + · · · + Bqet−q.   (112.8)
If a time series is nonstationary, the common method of making it stationary is taking differences. Most business and economic data usually require only first or second differences. If the data demonstrate some
seasonality, seasonal differences must be taken as well. The general form of the model is ARIMA(p, d, q)(P, D, Q), where p is the order of the autoregressive part, q is the order of the moving-average part, and d is the number of times differences are taken, while P, D, Q are the seasonal counterparts of p, d, q. A model is considered appropriate when the residuals after fitting are random, or white noise. The principal advantage of ARIMA is its ability to analyze and forecast various types of time series, whether stationary or nonstationary, linear or nonlinear, seasonal or not, given a large number of observations. The standard ARIMA procedure starts with the data: by analyzing the data and examining the autocorrelation and partial autocorrelation functions, a tentative model can be identified. The parameters of the tentative model are then estimated and diagnostically checked to see whether the model is appropriate. A model is considered suitable if the residuals are random (white noise), i.e., if the Ljung–Box–Pierce Q-statistic is insignificant. This Q-statistic follows the chi-square distribution and is used to test whether the autocorrelations of the residuals are random. The Q-statistic is defined as

Qm = n(n + 2)[r1²/(n − 1) + r2²/(n − 2) + · · · + rm²/(n − m)],   (112.9)
which is approximately distributed as chi-square with m − p − q degrees of freedom, where n = the number of observations in the time series, m = the number of time lags to be tested, and ri = the sample autocorrelation coefficient of the residuals at lag i. The Q-statistic tests whether the residual autocorrelations as a set are significantly different from zero. If they are, the model is considered inappropriate, and a new tentative model is identified and diagnostically tested again. If they are not significantly different from zero, the model is appropriate, and the estimated model can be used for forecasting.
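Equation (112.9) is simple to compute. The following is our own sketch (the helper names are ours), leaving the chi-square comparison with m − p − q degrees of freedom to a statistics table or library:

```python
# Ljung-Box-Pierce Q-statistic of equation (112.9) for model residuals.

def autocorr(x, k):
    """Lag-k sample autocorrelation coefficient r_k."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return cov / var

def ljung_box_q(residuals, m):
    """Q_m = n(n + 2) * sum_{i=1..m} r_i^2 / (n - i)."""
    n = len(residuals)
    return n * (n + 2) * sum(autocorr(residuals, i) ** 2 / (n - i)
                             for i in range(1, m + 1))
```

A strongly alternating residual series has lag-1 autocorrelation near −1 and hence a large Q, signaling that the fitted model is inappropriate; white-noise residuals give a small, insignificant Q.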
112.3.5 Neural networks

Appendix 112A covers neural networks (NNs) in detail, providing a brief introduction to NNs, the NN tools used, and the process of setting up and training the NN models.
112.4 Data, Variables, and Normalization

112.4.1 Data used

We randomly selected the daily stock prices of fifty companies from Yahoo Finance from September 1, 1998 through December 31, 2010. This period was chosen to reflect a very volatile stretch including the high-tech boom of the late 1990s, the high-tech bust from 2000 and the recession of 2001–2002, the historical event of 9/11, the housing boom that ended in early 2007, and the Great Recession and financial crisis beginning in late 2007. For time-series decomposition, Holt/Winters, and univariate ARIMA, only the daily stock prices (3105 observations) are needed to estimate the models and to make forecasts for 60 trading days beyond the sample period, i.e., the hold-out period from October 7, 2010 to December 31, 2010. To capture the effects of the countless relevant factors affecting stock prices, neural networks are also applied in this study.

112.4.2 Variables used

For NN we need predictor variables in addition to the stock prices of the 50 companies. The predictors are classified into seven groups: group 1 includes 26 major world stock market indexes; group 2 includes 14 commodities and currencies; group 3 includes 213 competing companies; group 4 consists of 4 major market indexes; group 5 includes CBOE volatility index (VIX) changes; group 6 is a market sentiment indicator represented by Franklin Resources Inc.; and group 7 includes daily and monthly dummy variables. These predictors are from the National Bureau of Economic Research (NBER), Yahoo Finance, the Federal Reserve Bank, Market Vane (MV), the NYSE, and FXStreet. Appendix 112B provides a detailed list of all variables and sources used in this study.

112.4.3 Data normalization

Since the daily stock price changes can be either positive or negative, all numbers are shifted to positive numbers to achieve better forecasting performance.
The normalization method in this study is based on previous research by Tjung, Kwon and Tseng (2012) and Kwon, Wu, and Zhang (2016), who analyzed the forecasting performance of NN models on the US stock markets and also used Alyuda NeuroIntelligence for processing. Furthermore, Tjung, Kwon, and Tseng (2012) and Kwon,
Wu, and Zhang (2016) pointed out that the NN models generate a lower standard deviation than traditional regression analysis, and that normalized data yield better performance than non-normalized data in terms of model learning and forecasting. For the normalization, we locate the minimum value (Min) of each company’s daily stock price changes, take its absolute value (ABS), and then add 0.1 to avoid a zero value in the dataset. Hence, for each company, the offset (ABS(Min) + 0.1) is added to every daily change. For instance, if Company X’s minimum daily stock change is −3.4, the offset is ABS(−3.4) + 0.1 = 3.5, and 3.5 is added to all of Company X’s data.

112.5 Empirical Findings

In this study we used ForecastX Wizard 7.1 by John Galt Solutions for the time-series decomposition, Holt/Winters, and Box–Jenkins models, and Alyuda NeuroIntelligence for the neural networks. ForecastX™ is a family of diversified forecasting methods that can perform complex forecast models. Since the sample size is 3105 observations, the R² and the adjusted R² are basically the same. For time-series decomposition the lowest R² is 99.46% for JPM and the highest is 99.98% for KMP. The mean absolute percentage error (MAPE) for the whole sample period is 0.00% to 0.01%. The within-sample fits for all 50 companies are extremely good. When the whole period is considered, the fitted prices are dominated by the long-term trend factor, T; the seasonal component, S, is basically neutral; and the cyclical factor, C, mostly lies between 0.95 and 1.1. Since we chose Holt’s model to forecast the trend factor, T, the trend is in turn dominated by the level, i.e., Ft+1 in equation (112.1); the Tt+1 factor in equation (112.2) is generally very small. For the Holt/Winters model the lowest R² is 98.38% for JPM, only five companies have R² below 99%, and the highest R² is 99.93% for KMP.
The MAPEs for all 50 companies range between 0.00% and 0.03%. Again, the whole-sample fits are exceptionally good for all 50 companies. For the Box–Jenkins model, the lowest R² is 98.37% for JPM and the highest is 99.93% for KMP; only four companies have R² below 99%. The MAPEs for all 50 firms are between 0.00% and 0.02%. In the H/W model the fitted values are dominated by the trend factor, Ft, in equation (112.4), and the seasonal factor of equation (112.5) is quite obvious for about half of the 50 stock prices in the sample. When we compare the seasonal factor of the H/W
model for all 50 stock prices with the B–J model, the results are quite similar. The B–J and H/W models are able to identify seasonality in more companies’ stock prices than the TSD model. In terms of R² and MAPE, H/W and B–J perform about the same, while TSD performs somewhat better than both. For such a large sample of 3105 observations, the widely used root-mean-squared error (RMSE) is essentially determined by the residual sum of squares: the higher the R², the smaller the RMSE. Stock prices vary widely among companies, so it is easier to compare MAPE across different stocks than RMSE; that is why we use MAPE instead of RMSE. The neural network model does not have comparable statistics, because its processes, from learning to testing to forecasting, differ from those of the three time-series models covered in this study. In the ARIMA(p, d, q)(P, D, Q) models we can detect clear seasonality in the stock prices of many of the 50 companies. ForecastX can identify the best model for a given set of data; Table 112.1 summarizes the best B–J models identified for the 50 stocks. Among the 50 company stock prices, the R-squared values typically lie between 99.5% and 99.95%, with three exceptions between 98.5% and 99% from both the time-series decomposition and Box–Jenkins methodologies. Sixteen of the 50 stocks do not demonstrate seasonality (BHP, CBSH, PCAR, HCP, SPG, AMGN, EME, HD, RF, HIBB, IBM, CERN, CSCO, AAPL, AA, and EXC), while 34 show some seasonality (SLB, GE, PG, JPM, MS, SCHW, JNJ, CT, VMC, KMP, WHR, KO, BTI, FCX, MMC, USB, MYL, BAX, BA, HON, CRH, MCD, VCO, MSFT, NOK, KYO, RIG, YHOO, SYMC, INTC, AMAT, MU, T, XOM, and NTT). The results differ from those identified by the H/W model, where only 25 stock prices show some seasonality.
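The two error measures being compared can be computed as follows. This is an illustrative sketch with our own function names; MAPE is expressed in percent, as in the tables:

```python
# MAPE is scale-free, so it can be compared across stocks whose price
# levels differ widely; RMSE is in the units of the series itself.

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(
        abs((a - f) / a) for a, f in zip(actual, forecast))

def rmse(actual, forecast):
    """Root-mean-squared error, in the units of the series."""
    return (sum((a - f) ** 2 for a, f in zip(actual, forecast))
            / len(actual)) ** 0.5
```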
The cumulative mean percent errors range from 0.00% to 0.02%, and the root-mean-squared errors are very small relative to the company stock prices. The mean errors are about zero for all companies and for all four models, which indicates that there is no forecasting bias in either direction. From Table 112.1 it is clear that over the sample period of more than 12 years, most stocks, particularly the emerging tech stocks, rose greatly to 2000, fell extremely fast from 2000 to 2002, and then fluctuated until the current financial crisis beginning in late 2007. As a result, the stock prices in our sample did not exhibit strong upward movement. As we pointed out before, the first decade of the 21st century was basically a lost decade as far as the stock markets are concerned. Since the stocks in our sample
Table 112.1: Summary of best Box–Jenkins model.

Tick symbol   ARIMA(p, d, q)(P, D, Q)
AA            (1, 1, 1)(0, 0, 0)
AAPL          (1, 0, 0)(0, 0, 0)
AMAT          (1, 0, 2)(2, 0, 2)
AMGN          (1, 0, 0)(0, 0, 0)
BAX           (2, 1, 1)(1, 0, 1)
BHP           (0, 1, 1)(0, 0, 0)
BTI           (0, 1, 2)(0, 0, 1)
CBSH          (1, 0, 0)(0, 0, 0)
CERN          (0, 1, 1)(0, 0, 0)
CRH           (1, 1, 2)(1, 0, 1)
CSCO          (0, 1, 1)(0, 0, 0)
CT            (0, 1, 0)(1, 0, 0)
EME           (0, 1, 1)(0, 0, 0)
EXC           (2, 1, 1)(0, 0, 0)
FCX           (0, 1, 0)(1, 0, 0)
GE            (0, 1, 0)(1, 0, 1)
HCP           (1, 0, 0)(0, 0, 0)
HD            (1, 1, 1)(0, 0, 0)
HIBB          (1, 1, 1)(0, 0, 0)
HON           (0, 1, 0)(1, 0, 0)
IBM           (1, 1, 0)(0, 0, 0)
INTC          (1, 0, 0)(1, 0, 0)
JNJ           (0, 1, 0)(1, 0, 1)
JPM           (1, 1, 1)(1, 0, 0)
KMP           (2, 1, 1)(1, 0, 1)
KO            (0, 1, 0)(2, 0, 1)
KYO           (0, 1, 0)(1, 0, 1)
MCD           (0, 1, 0)(1, 0, 1)
MMC           (0, 1, 0)(0, 0, 1)
MS            (2, 1, 1)(1, 0, 1)
MSFT          (1, 1, 2)(1, 0, 0)
MU            (0, 1, 0)(1, 0, 1)
MYL           (0, 1, 0)(0, 0, 1)
NOK           (0, 1, 0)(0, 0, 1)
NTT           (2, 1, 0)(1, 0, 1)
PCAR          (1, 0, 0)(0, 0, 0)
PG            (0, 1, 0)(0, 1, 1)
RF            (0, 1, 1)(0, 0, 0)
RIG           (2, 1, 0)(1, 0, 1)
SCHW          (2, 1, 1)(1, 0, 0)
SLB           (2, 1, 2)(1, 0, 1)
SPG           (1, 0, 0)(0, 0, 0)
SYMC          (1, 1, 1)(1, 0, 1)
T             (0, 1, 0)(1, 0, 1)
USB           (1, 1, 0)(1, 0, 0)
VCO           (0, 1, 0)(1, 0, 1)
VMC           (0, 1, 0)(1, 0, 0)
WHR           (0, 1, 0)(1, 0, 0)
XOM           (1, 0, 0)(2, 0, 0)
YHOO          (2, 1, 1)(1, 0, 1)
are large- or medium-capitalization companies, they do not demonstrate large upward or downward movements over the whole sample period. Nine of the 50 stock prices (AAPL, AMAT, AMGN, CBSH, HCP, INTC, PCAR, SPG, and XOM) were stationary; all others required only first differencing, showing either an upward or a downward linear trend. Although 34 stocks showed some seasonality, none except PG needed any seasonal differencing. Most stocks follow first-order autoregressive and/or moving-average processes, and none requires more than a second-order specification. This is why the H/W and TSD models also fit the data so well, since both capture the general linear trend. In forecasting, the true test of any model is its ability to forecast beyond the sample period. In this study we use the sample observations of daily stock prices from September 1, 1998 to October 6, 2010 to estimate the models and apply the resulting models to forecast the next 60 trading days, i.e., from October 7, 2010 to December 31, 2010. The mean, maximum, minimum, median, and standard deviation of MAPE from the B–J model for all companies are shown in Table 112.2. From Table 112.2, 43 out of 50 companies have mean MAPE less than 10% over the 60 trading days (about 2.9 months), with 22 below 5% and 21 between 5% and 10%; only 7 have mean MAPE greater than 10%. The standard errors of MAPE over the 60 trading days are mostly within 5%. Given that individual daily stock prices were extremely volatile for many stocks during the sample period, this table shows that the B–J model can predict future prices fairly accurately over the extended period of 60 trading days, or about 2.9 months. Similar statistics for the Holt/Winters model are shown in Table 112.3. The results from the H/W model are very close to those from the B–J model: forty-four out of 50 stocks have mean MAPE less than 10%, with 22 less than 5%, 22 between 5% and 10%, and only 6 greater than 10%.
Only 3 have
Table 112.2: Out of sample MAPE (in %) from Box–Jenkins model.

Company  Mean   Maximum  Minimum  Standard dev  Variance  Median
AA       9.51   19.76    1.43     5.19          0.27      7.82
AAPL     7.62   11.15    0.01     2.62          0.07      8.35
AMAT     8.15   17.54    0.04     5.06          0.26      7.15
AMGN     2.81    7.31    0.04     1.94          0.04      2.36
BAX      3.99    7.30    0.15     1.90          0.04      4.36
BHP      7.37   13.96    0.21     4.30          0.18      7.87
BTI      2.30    5.42    0.05     1.41          0.02      2.33
CBSH     2.72    5.68    0.22     1.67          0.03      2.69
CERN     3.81   11.29    0.00     3.63          0.13      2.02
CRH      7.17   14.96    0.44     4.53          0.21      7.02
CSCO     9.73   16.86    0.48     4.97          0.25     10.35
CT      15.33   42.66    0.07    10.31          1.06     15.54
EME      7.58   16.27    0.02     5.40          0.29      6.06
EXC      3.93    7.50    0.16     2.21          0.05      4.07
FCX     10.26   22.52    0.36     6.84          0.47      8.78
GE       4.33    8.50    0.37     2.27          0.05      4.62
HCP      4.26   11.49    0.03     3.71          0.14      3.08
HD       4.25   10.61    0.00     3.75          0.14      2.91
HIBB    19.22   37.72    2.67    12.22          1.49     11.76
HON      7.63   15.33    0.09     4.98          0.25      7.70
IBM      4.36    6.68    0.21     1.87          0.03      5.08
INTC     5.69    9.90    0.38     2.96          0.09      6.97
JNJ      0.81    2.38    0.02     0.54          0.00      0.76
JPM      3.41    7.84    0.06     2.53          0.06      3.21
KMP      2.26    3.81    0.75     0.70          0.00      2.26
KO       5.49    9.87    0.15     3.14          0.10      5.68
KYO      2.02    5.20    0.01     1.44          0.02      1.60
MCD      3.34    6.69    0.02     1.50          0.02      3.37
MMC      7.20   13.49    0.25     3.94          0.16      5.86
MS       3.33    8.89    0.04     2.55          0.06      2.66
MSFT     8.05   14.50    0.37     4.04          0.16      8.73
MU      10.81   19.73    2.09     3.92          0.15     11.22
MYL      5.90   12.27    0.21     3.70          0.14      6.53
NOK      4.09   12.59    0.65     2.61          0.07      3.92
NTT      1.88    5.41    0.38     1.01          0.01      1.75
PCAR     8.35   14.52    0.13     4.55          0.21      8.59
PG       4.47    7.13    0.21     1.68          0.03      4.63
RF      18.17   44.06    0.42    12.05          1.45     18.44
RIG      6.16   13.04    0.08     3.56          0.13      6.43
SCHW     9.25   18.65    0.30     5.42          0.29      7.79
SLB     15.25   24.91    0.02     8.17          0.67     16.68
SPG      3.63   10.07    0.10     2.34          0.05      3.65
SYMC     9.74   15.41    0.42     3.88          0.15     10.87
(Continued)
Table 112.2: (Continued)

Company  Mean   Maximum  Minimum  Standard dev  Variance  Median
T        1.34    3.29    0.00     0.83          0.01      1.16
USB      9.20   17.57    0.19     5.05          0.26      9.22
VCO      1.34    2.81    0.10     0.76          0.01      1.48
VMC      8.74   20.37    2.37     5.40          0.29      5.90
WHR      6.18   11.58    1.06     2.92          0.09      6.36
XOM      8.32   13.84    0.01     4.25          0.18      9.45
YHOO    14.31    0.62    3.33     0.11         11.07     10.26
Table 112.3: Out of sample MAPE (in %) from Holt/Winters model.

Company  Mean   Maximum  Minimum  Standard dev  Variance  Median
AA       9.52   19.79    1.46     5.19          0.27      7.86
AAPL     5.73    8.57    0.10     1.90          0.04      6.31
AMAT     8.06   17.52    0.08     5.06          0.26      7.04
AMGN     2.45    5.99    0.05     1.44          0.02      2.48
BAX      3.84    7.18    0.01     1.90          0.04      4.32
BHP      6.61   12.68    0.09     4.02          0.16      7.10
BTI      1.97    4.91    0.01     1.26          0.02      1.82
CBSH     2.73    5.83    0.16     1.63          0.03      2.62
CERN     3.47   10.54    0.00     3.28          0.11      1.78
CRH      6.99   14.47    0.45     4.27          0.18      6.84
CSCO     9.71   16.95    0.39     4.92          0.24     10.22
CT      15.26   42.23    0.07    10.31          1.06     15.64
EME      7.30   15.70    0.02     5.23          0.27      6.00
EXC      4.24    7.99    0.07     2.31          0.05      4.38
FCX      9.66   21.59    0.64     6.63          0.44      8.13
GE       4.34    8.43    0.46     2.27          0.05      4.63
HCP      4.67   12.43    0.02     4.31          0.19      2.31
HD       4.26   10.61    0.11     3.75          0.14      2.91
HIBB    19.01   37.42    2.67    12.13          1.47     11.54
HON      7.74   15.43    0.07     4.99          0.25      7.82
IBM      3.95    6.11    0.03     1.69          0.03      4.46
INTC     7.50   12.57    0.06     3.82          0.15      9.02
JNJ      0.93    2.19    0.00     0.54          0.00      0.82
JPM      3.38    7.56    0.05     2.53          0.06      3.21
KMP      1.42    3.16    0.15     0.66          0.00      1.49
KO       5.36    9.70    0.22     3.12          0.10      5.54
KYO      1.75    4.83    0.04     1.36          0.02      1.49
MCD      2.71    5.94    0.02     1.47          0.02      2.53
MMC      7.27   13.59    0.10     3.94          0.16      5.97
MS       3.19    8.31    0.08     2.37          0.06      2.74
(Continued)
Table 112.3: (Continued)

Company  Mean   Maximum  Minimum  Standard dev  Variance  Median
MSFT     7.85   14.20    0.43     3.96          0.16      8.42
MU      10.79   19.58    1.91     3.94          0.16     11.15
MYL      5.86   12.28    0.10     3.69          0.14      6.28
NOK      4.26   13.32    0.00     2.74          0.08      3.89
NTT      1.60    5.13    0.09     1.01          0.01      1.45
PCAR     8.27   14.34    0.01     4.48          0.20      8.49
PG       4.49    7.06    0.11     1.69          0.03      4.64
RF      17.94   43.78    0.41    12.01          1.44     17.97
RIG      6.16   13.05    0.06     3.56          0.13      0.42
SCHW     9.12   18.50    0.40     5.37          0.29      7.63
SLB     15.04   24.66    0.06     8.16          0.67     14.44
SPG      3.33   10.07    0.13     2.26          0.05      3.05
SYMC     9.73   15.39    0.42     3.88          0.15     10.86
T        1.35    3.54    0.01     0.82          0.01      1.26
USB      9.23   17.52    0.25     5.06          0.26      9.23
VCO      1.27    3.30    0.01     0.79          0.01      1.27
VMC      8.73   20.36    2.43     5.38          0.29      5.93
WHR      6.18   11.54    0.63     2.90          0.08      6.43
XOM      8.13   13.18    0.06     3.99          0.16      9.27
YHOO    10.76   14.69    0.21     3.28          0.11     11.46
a standard deviation of MAPE greater than 10%, and 39 have a standard deviation less than 5%. The results are marginally better than those from the B–J models. Since it is more difficult to identify the precise autoregressive and moving-average orders in B–J modeling, the simpler H/W models could be the preferred choice. Although ForecastX can help researchers identify some optimal model specifications, there is no guarantee that they are truly the best specifications. However, based on the excellent fits to both the within-sample and out-of-sample data, ForecastX has shown its accuracy and ease of use, so both the H/W and B–J methods can be applied in practice.

For time-series decomposition the corresponding statistics are shown in Table 112.4. We were surprised to discover that the TSD out-of-sample results are much worse than those from the B–J and H/W models, even though the within-sample fits are slightly better. From TSD only 9 stocks have mean MAPE under 5%, another 13 are between 5% and 10%, and 4 have mean MAPE over 40% (NOK 40.71%, KYO 41.40%, MU 41.69%, and RF 52.35%). From the B–J models the highest mean MAPE is 19.22% for HIBB, a sporting goods company, and from the H/W models HIBB also has the highest MAPE, at 19.01%. For practical applications the out-of-sample predictability is more relevant, and therefore one may opt for H/W or B–J.
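The mean absolute percentage error used throughout these tables can be computed directly. A minimal sketch in Python; the price and forecast series below are hypothetical illustrations, not values from the study's data:

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs((actual - forecast) / actual)) * 100.0

# Hypothetical 5-day hold-out: actual daily closes vs. model forecasts.
actual = [27.10, 27.45, 26.90, 27.80, 28.05]
forecast = [26.80, 27.60, 27.30, 27.50, 28.40]
print(round(mape(actual, forecast), 2))  # → 1.09
```

In the study the same calculation would be applied to each stock's 60-day hold-out window, then summarized (mean, maximum, minimum, standard deviation, median) across the window.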
Table 112.4: Out of sample MAPE (in %) from time-series decomposition.

Company  Mean   Maximum  Minimum  Standard dev  Variance  Median
AA       2.87    6.95    0.14     1.85          0.03      2.46
AAPL     2.73    7.58    0.26     1.44          0.02      2.46
AMAT     1.73    5.08    0.02     1.58          0.02      1.21
AMGN    14.28   28.27    0.14     8.94          0.80     16.37
BAX      8.35   18.93    0.07     6.29          0.40      7.62
BHP     16.26   30.06    0.23     9.84          0.97     17.45
BTI      6.31   12.76    0.79     3.78          0.14      5.75
CBSH     2.64    5.41    0.02     1.70          0.03      2.67
CERN    20.11   35.43    1.82     9.62          0.92     24.52
CRH     36.12   60.04    5.24    15.64          2.44     36.82
CSCO    16.61   31.46    0.16    12.94          1.68     23.89
CT      18.34   47.14    0.47    11.37          1.29     18.90
EME      3.43   11.56    0.00     2.92          0.09      2.36
EXC      7.24   12.30    0.00     3.71          0.14      7.83
FCX     37.01   60.76    1.66    18.66          3.48     44.95
GE      17.61   29.27    0.22     7.87          0.62     21.21
HCP     11.31   23.56    0.02     7.81          0.61     12.21
HD       4.91    8.90    1.43     2.31          0.05      4.72
HIBB    22.72   42.70    2.68    13.71          1.88     15.96
HON     11.93   23.28    0.37     6.30          0.40     13.30
IBM     11.20   23.40    0.08     7.43          0.55     10.88
INTC     7.71   12.70    0.02     3.74          0.14      9.20
JNJ      9.72   21.69    0.09     6.79          0.46      9.40
JPM     32.64   52.79    3.05    14.66          2.15     32.93
KMP      5.11   11.50    0.21     3.67          0.13      4.90
KO       8.48   15.88    1.91     4.08          0.17      7.92
KYO     41.40   84.52    1.26    25.15          6.33     42.02
MCD      8.93   21.69    0.51     6.66          0.44      8.02
MMC     12.48   23.13    0.15     6.68          0.45     11.50
MS      23.59   39.91    1.75    11.77          1.38     25.30
MSFT     8.73   15.04    1.36     3.92          0.15      9.30
MU      41.69   71.86    4.25    18.22          3.32     39.30
MYL      5.81   12.27    0.02     3.69          0.14      6.18
NOK     40.71   76.55    0.35    26.43          6.99     41.39
NTT      4.78   11.75    0.01     3.74          0.14      3.75
PCAR    12.80   25.28    0.28     7.12          0.51     14.06
PG       4.68   10.96    0.02     3.93          0.15      3.52
RF      52.35   91.15    2.81    27.24          7.42     63.58
RIG     20.04   35.47    0.14    10.44          1.09     20.64
SCHW     9.91   19.39    0.87     5.39          0.29     11.94
SLB      1.95    6.65    0.07     1.66          0.03      1.52
SPG     17.23   34.20    2.78    10.31          1.06     18.37
SYMC    21.19   33.88    0.32     9.87          0.97     23.94
(Continued)
Table 112.4: (Continued)

Company  Mean   Maximum  Minimum  Standard dev  Variance  Median
T       16.50   27.29    4.18     7.53          0.57     18.17
USB     13.35   24.70    1.53     8.03          0.64     13.01
VCO     11.90   25.13    0.87     6.40          0.41     10.70
VMC      9.10   21.31    1.28     6.00          0.36      6.87
WHR     15.05   32.46    0.00     9.05          0.82     16.07
XOM     12.15   25.75    0.69     7.48          0.82     16.07
YHOO     6.15   15.89    0.08     4.64          0.21      4.96

Table 112.5: Mean MAPE (in %) for all models.

Company  Box–Jenkins  Holt/Winters  Normalized NN  Non-normalized NN  Time-series decomposition
AA        9.51         9.52          6.50           80.42              2.87
AAPL      7.62         5.73         18.37           55.38              2.73
AMAT      8.15         8.06          1.44           57.21              1.73
AMGN      2.81         2.45          4.51            2.55             14.28
BAX       3.99         3.84          8.77            4.52              8.35
BHP       7.37         6.61         25.13           46.48             16.26
BTI       2.30         1.97          6.71           24.85              6.31
CBSH      2.72         2.73          4.55           20.20              2.64
CERN      3.81         3.47         12.50           45.39             20.11
CRH       7.17         6.99         11.40           39.32             36.12
CSCO      9.73         9.71          2.33           57.89             16.61
CT       15.33        15.26          1.38         1215.72             18.34
EME       7.58         7.30          9.22           22.04              3.43
EXC       3.93         4.24          3.25           12.04              7.24
FCX      10.26         9.66         26.65           38.53             37.01
GE        4.33         4.34          5.14           43.11             17.61
HCP       4.26         4.67          6.21           26.30             11.31
HD        4.25         4.26          6.06           19.65              4.91
HIBB     19.22        19.01         88.29           39.31             22.72
HON       7.63         7.74          5.13           25.82             11.93
IBM       4.36         3.95          9.26           26.80             11.20
INTC      5.69         7.50          3.05            7.15              7.71
JNJ       0.81         0.93          4.06            7.47              9.72
(Continued)
To facilitate comparisons among the five models (B–J, H/W, TSD, normalized NN, and non-normalized NN), the mean MAPEs are summarized in Table 112.5.
Table 112.5: (Continued)

Company         Box–Jenkins  Holt/Winters  Normalized NN  Non-normalized NN  Time-series decomposition
JPM              3.41         3.38          8.65           10.94             32.64
KMP              2.26         1.42          5.40           39.98              5.11
KO               5.49         5.36          6.39           21.53              8.48
KYO              2.02         1.75          4.61           15.47             41.40
MCD              3.34         2.71          8.77           45.55              8.93
MMC              7.20         7.27          4.46            3.96             12.48
MS               3.33         3.19          4.25           85.88             23.59
MSFT             8.05         7.85          5.55            3.78              8.73
MU              10.81        10.79          1.42           46.54             41.69
MYL              5.90         5.86          8.72           14.50              5.81
NOK              4.09         4.26          2.35           97.76             40.71
NTT              1.88         1.60          2.14           41.67              4.78
PCAR             8.35         8.27         10.93           38.99             12.80
PG               4.47         4.49          5.62            5.80              4.68
RF              18.17        17.94          2.79          178.08             52.35
RIG              6.16         6.16          6.69           19.95             20.04
SCHW             9.25         9.12          2.47           25.15              9.91
SLB             15.25        15.04         12.28           18.27              1.95
SPG              3.63         3.33          8.50           39.39             17.23
SYMC             9.74         9.73          6.24            3.27             21.19
T                1.34         1.35          5.08           14.34             16.50
USB              9.30         9.23          7.79            6.90             13.39
VCO              1.34         1.27         14.77           40.52             11.90
VMC              8.74         8.73          6.85           57.03              9.10
WHR              6.18         6.18         11.19            6.02             15.05
XOM              8.32         8.13          4.51           17.72             12.15
YHOO            10.26        10.76         26.65           38.53              6.15
Average          6.62         6.50          9.30           57.11             15.00
Standard error   4.16         4.17         12.83          169.92             11.93
Table 112.5 shows that the H/W models produce the lowest MAPE for 16 of the 50 stocks, normalized NN models for 11, B–J and TSD for 6 each, and non-normalized NN for 5. However, as pointed out before, the MAPEs from H/W and B–J are very close for the same stocks. Because NN models take so many variables into consideration, the stocks for which normalized NN models produce the lowest MAPEs differ from those with the lowest MAPEs from B–J and H/W. The most conspicuous observation is that the mean MAPEs from non-normalized NN are typically very large, with CT having a MAPE of 1215.72% and RF of 178.08%. When we take
the average and calculate the standard error across all 50 companies for each model, the H/W model has the smallest average, while B–J has the smallest standard error; the two models are very close. Normalized NN is close behind those two, and TSD does not trail normalized NN by much. Clearly, we would exclude the non-normalized NN models from consideration; again, this shows why normalizing the original data is common practice in NN. For HIBB, normalized NN also shows a very high MAPE of 88.29%, whereas the B–J and H/W models generate only moderate mean MAPEs. From Table 112.5 we conclude that our preferred models are H/W, B–J, and normalized NN, and they all perform very well.

112.6 Conclusions

In this study, we applied traditional time-series decomposition (TSD), Holt/Winters (H/W) models, the Box–Jenkins (B–J) methodology, and neural networks (NN) to 50 randomly selected stocks from September 1, 1998 to December 31, 2010, with a total of 3105 observations of each company's closing stock price. This sample period covers the high-tech boom and bust, the historic 9/11 event, the housing boom and bust, and the recent serious recession and slow recovery. During this exceptionally uncertain period of global economic and financial crises, stock prices were expected to be extremely difficult to predict. All three time-series approaches fit the within-sample data extremely well, with R2 around 0.995. For the hold-out period, i.e., out-of-sample forecasts over 60 trading days, the forecasting errors measured by mean absolute percentage error (MAPE) are low for the B–J, H/W, and normalized NN models, but quite large for the time-series decomposition and non-normalized NN models.
The stock markets are populated with day traders, high-frequency traders, speculators, institutional investors, retail investors, momentum chasers, contrarians, influential financial analysts who are biased in favor of buy recommendations, and other diversified market participants with heterogeneous views about current information and future expectations. The rapid advance of information technology spreads news and rumors at near light speed. Even though stock prices are extremely difficult to predict, market participants must make decisions based on their best judgment and the methods and models they apply. The true test of the value of the models and methods discussed in this study is whether they perform as well in other sample periods and on other samples in future research.
Bibliography

Aiken, M. and Bsat, M. (1999). Forecasting Market Trends with Neural Networks. Information Systems Management 16, 42–49.
Alyuda NeuroIntelligence (2010). Alyuda NeuroIntelligence Manual, http://www.alyuda.com/neural-networks-software.htm.
Baiyand, A. (2011). The Trading Book: A Complete Solution to Mastering Technical Systems and Trading Psychology. McGraw-Hill, New York.
Balvers, R., Wu, R. and Gilliand, E. (2000). Mean Reversion Across National Stock Markets and Parametric Contrarian Investment Strategies. Journal of Finance 55, 773–806.
Box, G. and Jenkins, G. (1976). Time Series Analysis: Forecasting & Control, Revised edn. Holden Day, San Francisco.
Brock, W., Lakonishok, J. and LeBaron, B. (1992). Simple Technical Trading Rules and Stochastic Properties of Stock Returns. Journal of Finance 47, 1731–1764.
Corgnet, B., Desantis, M. and Porter, D. (2018). What Makes a Good Trader? On the Role of Intuition and Reflection on Trader Performance. Journal of Finance 73, 1113–1137.
Cujean, J. and Hasler, M. (2017). Why Does Return Predictability Concentrate in Bad Times. Journal of Finance 72, 2717–2757.
Daniel, K., Hirshleifer, D. and Subrahmanyam, A. (1998). Investor Psychology and Security Market Under- and Overreactions. Journal of Finance 53, 1839–1885.
Daniel, K. and Hirshleifer, D. (2015). Overconfident Investors, Predictable Returns, and Excessive Trading. Journal of Economic Perspectives 29, 61–88.
DeBondt, W. and Thaler, R. (1985). Does the Stock Market Overreact? Journal of Finance 40, 793–805.
DeBondt, W. and Thaler, R. (1987). Further Evidence on Investor Overreaction and Stock Market Seasonality. Journal of Finance 42, 557–581.
Dutta, G., Jha, P., Laha, A. and Mohan, N. (2006). Artificial Neural Network Models for Forecasting Stock Price Index in the Bombay Stock Exchange. Journal of Emerging Market Finance 5, 283–295.
Ferson, W. and Harvey, C. (1993). The Risk and Predictability of International Equity Returns. Review of Financial Studies 6, 527–566.
Fletcher, D. and Goss, E. (1993). Forecasting with Neural Networks: An Application Using Bankruptcy Data. Information Management 24(3), 159–167.
Foucault, T., Hombert, J. and Rosu, I. (2016). News Trading and Speed. Journal of Finance 71, 335–381.
Garleanu, N. and Pedersen, L.H. (2018). Efficiently Inefficient Markets for Assets and Asset Management. Journal of Finance 73, 1663–1712.
Gazzaz, N.M., Yusoff, M.K., Aris, A.Z., Juahir, H. and Ramli, M.F. (2012). Artificial Neural Network Modeling of the Water Quality Index for Kinta River (Malaysia) Using Water Quality Variables as Predictors. Marine Pollution Bulletin 64(11), 2409–2420.
Giudici, P. (2001). Bayesian Data Mining, with Application to Benchmarking and Credit Scoring. Applied Stochastic Models in Business & Industry 17, 69–81.
Glickstein, D. and Wubbels, R. (1983). Dow Theory is Alive and Well! Journal of Portfolio Management, 28–32.
Gordon, W. (1968). The Stock Market Indicators. Investors Press, Palisades, NJ.
Grudnitski, G. and Osburn, L. (1993). Forecasting S&P and Gold Futures Prices: An Application of Neural Network. Journal of Futures Markets 13, 631–643.
Gutierrez, R. and Kelly, E. (2008). The Long-Lasting Momentum in Weekly Returns. Journal of Finance 63, 415–447.
Hammad, A., Ali, S. and Hall, E. (2009). Forecasting the Jordanian Stock Price Using Artificial Neural Network, http://www.min.uc.edu/robotics/papers/paper2007/Final%20ANNIE%2007%20Souma%20Alhaj%20Ali%206p.pdf.
Hanke, J.E. and Wichern, D.W. (2005). Business Forecasting, 8th edn. Pearson/Prentice-Hall, Upper Saddle River, NJ.
Haugen, R.A. (1999). The Inefficient Stock Market: What Pays and Why. Prentice-Hall, Upper Saddle River, NJ.
Hendershott, T., Jones, C.M. and Menkveld, A.J. (2011). Does Algorithmic Trading Improve Liquidity? Journal of Finance 66, 1–33.
Henry, T.R. and Koski, J.L. (2017). Ex-dividend Profitability and Institutional Trading Skills. Journal of Finance 72, 461–493.
Hong, H. and Stein, J. (1999). A Unified Theory of Underreaction, Momentum Trading, and Overreaction in Asset Markets. Journal of Finance 54, 2143–2184.
Hong, H. and Sraer, D. (2016). Speculative Betas. Journal of Finance 71, 2095–2144.
Huang, H.C. and Chen, C.C. (2013). A Study on the Construction of the Prediction Model for Exchange Rate Fluctuations. International Journal of Intelligent Information Processing (IJIIP) 4(4), 63–74.
Jegadeesh, N. and Titman, S. (1993). Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency. Journal of Finance 48, 65–91.
Jeong, H., Song, S., Shin, S. and Cho, B. (2008). Integrating Data Mining to a Process Design Using the Robust Bayesian Approach. International Journal of Reliability, Quality & Safety Engineering 15, 441–464.
Kahneman, D. and Tversky, A. (1979). Prospect Theory: An Analysis of Decision Under Risk. Econometrica 47, 263–291.
Keloharju, M., Linnaimaa, J.T. and Nyberg, P. (2016). Return Seasonalities. Journal of Finance 71, 1557–1589.
Kimoto, T., Asakawa, K., Yoda, M. and Takeoka, M. (1990). Stock Market Prediction System with Modular Neural Networks. Proceedings of the IEEE International Conference on Neural Networks, pp. 1–16.
Kohzadi, N., Boyd, M., Kemlanshahi, B. and Kaastra, I. (1996). A Comparison of Artificial Neural Network and Time Series Models for Forecasting Commodity Prices. Neurocomputing 10, 169–181.
Kwon, O., Wu, Z. and Zhang, L. (2016). Study of the Forecasting Performance of China Stock's Prices Using Business Intelligence (BI): Comparison Between Normalized and Denormalized Data. Academy of Information and Management Sciences Journal 20(1), 53–69.
Kryzanowski, L., Galler, M. and Wright, D. (1993). Using Artificial Neural Networks to Pick Stocks. Financial Analysts Journal 49, 21–27.
Leigh, W., Hightower, R. and Modani, N. (2005). Forecasting the New York Stock Exchange Composite Index with Past Price and Interest Rate on Condition of Volume Spike. Expert Systems with Applications 28, 1–8.
Lo, A., Mamaysky, H. and Wang, J. (2000). Foundations of Technical Analysis: Computational Algorithms, Statistical Inference, and Empirical Implementation. Journal of Finance 55, 1705–1765.
Loh, R.K. and Stulz, R. (2018). Is Sell-Side Research More Valuable in Bad Times? Journal of Finance 73, 959–1019.
McGrath, C. (2002). Terminator Portfolio. Kiplinger's Personal Finance 56, 56–57.
McNelis, P. (1996). A Neural Network Analysis of Brazilian Stock Prices: Tequila Effects vs. Pisco Sour Effects. Journal of Emerging Markets 1, 29–44.
Meng, D. (2008). A Neural Network Model to Predict Initial Return of Chinese SMEs Stock Market Initial Public Offerings. 2008 IEEE International Conference on Networking, Sensing and Control, pp. 394–398.
Menzly, L. and Ozbas, O. (2010). Market Segmentation and Cross-Predictability of Returns. Journal of Finance 65, 1555–1580.
Moshiri, S. and Cameron, N. (2000). Neural Network Versus Econometric Models in Forecasting Inflation. Journal of Forecasting 19, 201–217.
Moskowitz, T. and Grinblatt, M. (1999). Do Industries Explain Momentum? Journal of Finance 54, 1249–1290.
Mostafa, M. (2004). Forecasting the Suez Canal Traffic: A Neural Network Analysis. Maritime Policy and Management 31, 139–156.
Mostafa, M. (2010). Forecasting Stock Exchange Movements Using Neural Networks: Empirical Evidence from Kuwait. Expert Systems with Applications 37, 6302–6309.
Nofsinger, J. and Sias, R. (1999). Herding and Feedback Trading by Institutional and Individual Investors. Journal of Finance 54, 2263–2295.
Odean, T. (1998). Volume, Volatility, Price, and Profits When All Traders are Above Average. Journal of Finance 53, 1887–1934.
Odean, T. (1999). Do Investors Trade Too Much? American Economic Review 89, 1279–1298.
Palani, S., Liong, S. and Tkalich, P. (2008). An ANN Application for Water Quality Forecasting. Marine Pollution Bulletin 56(9), 1586–1597.
Poh, H., Yao, J. and Jasic, T. (1998). Neural Networks for the Analysis and Forecasting of Advertising Impact. International Journal of Intelligent Systems in Accounting, Finance and Management 7, 253–268.
Pring, M. (1991). Technical Analysis Explained, 3rd edn. McGraw-Hill.
Ruiz-Suarez, J., Mayora-Ibarra, O., Torres-Jimenez, J. and Ruiz-Suarez, L. (1995). Short-Term Ozone Forecasting by Artificial Neural Network. Advances in Engineering Software 23, 143–149.
Shefrin, H. (2000). Beyond Greed and Fear: Understanding Behavioral Finance and the Psychology of Investing. Harvard Business School Press.
Siegel, J.J. (2008). Stocks for the Long Run, 4th edn. McGraw-Hill.
Sullivan, R., Timmermann, A. and White, H. (1999). Data-Snooping, Technical Trading Rule Performance, and the Bootstrap. Journal of Finance 54, 1647–1691.
Tjung, L.C., Kwon, O. and Tseng, K.C. (2012). Comparison Study on Neural Network and Ordinary Least Squares Model to Stocks' Prices Forecasting. Financial Management 9(1), 32–54.
Tokic, D. (2005). Explaining US Stock Market Returns from 1980 to 2005. Journal of Asset Management 6, 418–432.
Tsai, C.-F. and Wang, S.-P. (2009). Stock Price Forecasting by Hybrid Machine Learning Techniques. Proceedings of International MultiConference of Engineers and Computer Scientists, p. 1.
Turban, E. (1992). Expert Systems and Applied Artificial Intelligence. Macmillan Publishing Company, New York.
Videnova, I., Nedialkova, D., Dimitrova, M. and Popova, S. (2006). Neural Networks for Air Pollution Forecasting. Applied Artificial Intelligence 20, 493–506.
West, P., Brockett, P. and Golden, L. (1997). A Comparative Analysis of Neural Networks and Statistical Methods for Predicting Consumer Choice. Marketing Science 16, 370–391.
Wilson, J.W., Keating, B. and John Galt Solutions, Inc. (2009). Business Forecasting with ForecastX™, 6th edn. McGraw-Hill Irwin, Boston.
Yu, L., Wang, S. and Lai, K. (2009). A Neural-Network-Based Nonlinear Metamodeling Approach to Financial Time Series Forecasting. Applied Soft Computing 9, 563–574. Yumlu, S., Gurgen, F. and Okay, N. (2005). A Comparison of Global, Recurrent and Smoothed-Piecewise Neural Models for Istanbul Stock Exchange (ISE) Prediction. Pattern Recognition Letters 26, 2093–2103. Zweig, M. (1990). Winning on Wall Street. Warner Books, New York.
Appendix 112A Neural Networks

One of the most significant advantages of Neural Networks (NNs) lies in their ability to handle a very large number of observations and variables. In this study we used eight major groups of indicators: aggregate indicators such as global market indices, individual competitors, political indicators such as presidential election date and party, US market indices, market sentiment indicators, institutional investors (Franklin Resources), and calendar anomalies. Data were collected from the National Bureau of Economic Research, Yahoo Finance, the Federal Reserve Banks, Market Vane, NYSE, and FXStreet. Altogether there are 213 variables, and the details can be found in Appendix 112B.
112A.1 Basic neural network model

A basic Neural Network (NN) model's framework is shown in Figure 112A.1. Input neurons (1 to n) are connected to an output neuron j, and each connection has an assigned weight (w_j0 to w_jn). In this example, the output of j becomes 1 (activated) when the total stimulus (S_j) becomes greater than 0 (zero). The activation function in this example is a simple unit step function (0 or 1), but other functions such as Gaussian, exponential, sigmoid, or hyperbolic functions can be used for complex networks.
S_j = \sum_{i=0}^{n} a_i w_{ji},

where
w_{ji} = the weight associated with the connection to processing unit j from processing unit i,
a_i = the value output by input unit i.

Output of unit j: X_j = 0 if S_j ≤ 0, and X_j = 1 if S_j > 0.

Figure 112A.1: Basic neural network model. Source: Dayhoff, J. (1990). Neural Network Architectures. Van Nostrand Reinhold, New York.
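The unit-step neuron described above can be sketched in a few lines of Python. A minimal illustration; the input values and weights below are hypothetical, not taken from the study:

```python
import numpy as np

def neuron_output(a, w):
    """Unit-step neuron: fires (1) when S_j = sum_i a_i * w_ji > 0."""
    s = np.dot(a, w)          # S_j, the total weighted stimulus
    return 1 if s > 0 else 0  # simple unit (step) activation

# Hypothetical inputs and weights; a[0] = 1 acts as a bias input for w_j0.
a = [1.0, 0.5, -0.2]
w = [-0.1, 0.6, 0.4]
print(neuron_output(a, w))  # → 1  (S_j = 0.12 > 0)
```

Replacing the step function with a sigmoid or hyperbolic activation, as the text notes, turns this into the kind of unit used in the multi-layer networks discussed later.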
page 3916
July 6, 2020
16:5
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch112
Time Series and Neural Network Analysis
3917
Back-propagation is one of the most popular learning algorithms in NN and is derived to minimize the error using the following formula:

E = 0.5 \sum_{p} \sum_{k} (t_{pk} - O_{pk})^2,   (112A.1)
where p = the pattern, k = the output unit, t_{pk} = the target value of output unit k for pattern p, and O_{pk} = the actual output value of output layer unit k for pattern p.

112A.2 BrainMaker

First, we used the BrainMaker software to create an NN model for four companies (C, GS, JPM, and MS), but BrainMaker had a major limitation of 20 variables, so it was not adequate for the number of variables in our model. Because of this limit, we fed BrainMaker only the independent variables selected by stepwise regression. Even so, BrainMaker was unable to learn and failed to perform, as shown in Figure 112A.2.
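The error function in equation (112A.1) is a sum of squared errors over all patterns and output units. A minimal sketch with hypothetical target and output values (not from the study's networks):

```python
import numpy as np

def sse_error(targets, outputs):
    """E = 0.5 * sum over patterns p and output units k of (t_pk - O_pk)^2,
    as in equation (112A.1)."""
    t = np.asarray(targets, dtype=float)
    o = np.asarray(outputs, dtype=float)
    return 0.5 * np.sum((t - o) ** 2)

# Hypothetical case: 2 patterns, 2 output units each.
t = [[1.0, 0.0], [0.0, 1.0]]
o = [[0.8, 0.1], [0.3, 0.7]]
print(round(sse_error(t, o), 3))  # → 0.115
```

Back-propagation adjusts each weight in proportion to the negative gradient of E, which is what drives the training runs described in the following sections.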
Figure 112A.2: BrainMaker error distribution.
112A.3 Alyuda NeuroIntelligence

We searched and found Alyuda NeuroIntelligence (ANI), which allowed us to handle 272 independent variables (Tables 112B.1–112B.7 in Appendix 112B) and 50 dependent variables. ANI was used to create the second generation of NN models. The genetic algorithm (GA) has capabilities in pattern recognition, categorization, and association, and has therefore been widely applied in NN. Turban (1992) has shown that a genetic algorithm enables an NN to learn and adapt to changes through machine learning, automatically solving complex problems based on a set of repeated instructions. A GA enables an NN to produce improved solutions by selecting input variables with higher fitness ratings. Alyuda NeuroIntelligence uses a GA, which enables us to retain the best network. We used both the non-normalized and the normalized data, and followed the seven-step neural network design process to build the network. ANI was used to perform data analysis, data preprocessing, network design, training, testing, and query. The logistic function is applied to design the network; it has a sigmoid curve F(x) = 1/(1 + e^{-x}) with output range (0, 1). A batch back-propagation model with a stopping condition of 501 training iterations is used to find the best network during network training. We used the same model architecture of 272-41-1 for all normalized data and 272-1-1 for all non-normalized data. The 272-41-1 network architecture consists of 272 input neurons, 41 neurons in the hidden layer, and 1 output neuron. The number of iterations is intended to escape from local minima and reach a global minimum, achieving the lowest possible errors in training the network. The setup screen of ANI is shown in Figure 112A.3.

112A.4 Data — training data set, validation data set, and out-of-sample testing data set

Three sets of data are used in the neural network model: the training set, the validation set, and the testing set.
The training set is used to train the neural network and adjust the network weights. The validation set is used to tune network parameters other than weights, to calculate generalization loss, and to retain the best network. The testing set is used to test how well the neural network performs on new data after the network is trained. We used the training and validation data to train the network and arrive at a model. Finally, we used the out-of-sample testing data to test the forecasting errors between the actual and predicted values. That is, we have both training (80%) and
Figure 112A.3: Alyuda NeuroIntelligence setup screen.
validation (20%) data from September 1, 1998 to October 6, 2010, and testing data from October 7, 2010 to December 31, 2010.

112A.5 Non-normalized data

For the non-normalized data, we used the original data directly from the sources without any modification. The same data are used for running the time-series regressions. As the NN literature suggests, using non-normalized data generated bigger errors with high standard deviations, as shown in Table 112.5. Therefore, we normalized the data using the techniques discussed in the next section. A sample NN run using non-normalized data is shown in Figure 112A.4.

112A.6 Normalized data

We tried various data normalization techniques and compared the performance of the networks. We found that using the daily stock price difference rather than the actual daily stock price works much better. Then we looked at the numbers in our data: daily stock price changes include both positive and negative numbers. We wanted all positive numbers to
Figure 112A.4: Neural network non-normalized network architecture.
see how the neural network learns, so we wanted to shift up, or normalize, the data. First, we searched for the lowest negative number, intending to add an offset to all numbers to make them positive. Second, we took the absolute value of that lowest negative number; adding the negative number itself would only produce bigger negative numbers, e.g., −6 + (−6) = −12. Third, we accounted for rounding error by adding 0.1 to the absolute value of the lowest negative number. For example, to normalize the data of company A, we added the absolute value of its lowest negative number, |−6.7|, to 0.1, giving 6.8. We then added 6.8 to every number; for the lowest number, 6.8 + (−6.7) = 0.1. To sum up, the formula we used to normalize the data is: normalized value = |lowest negative number| + 0.1 + each number in the data set. After we normalized the data, both the mean and the standard deviation of the errors were lower for all NN models. According to the Alyuda NeuroIntelligence manual (2010), the "back-propagation algorithm is the most popular algorithm for training of multi-layer perceptrons and is often used by researchers and practitioners. The main drawbacks of back propagation are: slow convergence, need to tune up the learning rate
Figure 112A.5: Neural network normalized network architecture.
and momentum parameters, and high probability of getting caught in local minima." A Gaussian distribution of network inputs is used to retrain and restore the best network and to randomize the weights. By retraining and restoring the best network, over-training (memorizing the data instead of generalizing and encoding data relationships) can be prevented, thereby reducing network errors. A 10% jitter (random noise) was added to avoid over-training and local minima. Weight randomization can avoid sigmoid saturation, which causes slow training. A sample NN run for a normalized network is shown in Figure 112A.5. The results from normalized and non-normalized data can then be compared with those from the three time-series models discussed previously. The findings from normalized data may shed light on the possible improvement from normalization, since normalizing data has become common practice in NN.

112A.7 Determination of the Numbers of Hidden Nodes (NH) for the Hidden Layer

To determine the number of hidden nodes (NH) for the artificial neural network, we combined the rules from previous research, which discusses the relationship among the number of hidden nodes (NH), the number of inputs (I) in the input layer, and the number of outputs (O) in the output layer. According to Fletcher and Goss (1993), the number
K. C. Tseng, O. Kwon & L. C. Tjung
of hidden nodes (NH) should range from 2I^(1/2) + O to 2I + 1. Palani et al. (2008) suggest that NH should lie between I and 2I + 1 and also be larger than both I/3 and O, while Alyuda Research (2006) suggests that NH should range from I/2 to 4I. Gazzaz et al. (2012) combined the former three rules for neural network applications with Alyuda NeuroIntelligence, giving the rule that NH should range between I/3 and 4I. As a result, for this study, the number of hidden nodes should lie between one-third of the number of inputs and four times the number of inputs, and should also be larger than the number of outputs (I/3 < NH < 4I and NH > O).

112A.8 Criterion for Architecture Searching

Based on previous research, different criteria can be applied in Alyuda NeuroIntelligence to search for the best neural network architecture. Gazzaz et al. (2012) applied R-squared as the model selection criterion in forecasting a water quality index. Gaurang et al. (2010) discussed and pointed out the efficiency of AIC in neural network architecture searching. Huang and Chen (2013) used minimum testing error as the criterion to select a neural network architecture for an exchange rate prediction model. Considering the large number of inputs in this research, efficiency is important for our modeling and for practical operation in the future. Hence, AIC was applied as the architecture search criterion, and the architecture with the lowest AIC was selected for network training. Each company has its own architecture.
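The combined rule above amounts to a simple admissibility check on NH. The sketch below is our own illustration, not part of the original study; the function names are hypothetical.

```python
def hidden_node_bounds(n_inputs, n_outputs):
    """Bounds on the number of hidden nodes (NH) under the combined rule:
    I/3 < NH < 4I, and NH must also exceed the number of outputs O."""
    lower = max(n_inputs / 3.0, float(n_outputs))
    upper = 4.0 * n_inputs
    return lower, upper

def nh_is_admissible(nh, n_inputs, n_outputs):
    """True if NH lies strictly inside the admissible range."""
    lower, upper = hidden_node_bounds(n_inputs, n_outputs)
    return lower < nh < upper
```

For example, with I = 9 inputs and O = 1 output, any NH strictly between 3 and 36 is admissible under this rule.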
112A.9 Training Stop Criterion and Network Selection

There are several training stop criteria in previous papers using Alyuda NeuroIntelligence. Anwer and Watanabe (2010) set training to terminate after 20,000 iterations or when the mean squared error (MSE) fell below 0.000001, with the learning and momentum rates set at 0.1 for back-propagation. Gazzaz et al. (2012) used 0.000001 as the network MSE improvement threshold, 0.01 as the training set MSE, and a maximum of 10,000 iterations; they also retrained 10 times, following the Alyuda NeuroIntelligence manual. Meng (2008) applied 50,000 iterations and a network error (MSE) of 0.01 in predicting IPO returns in the Chinese stock market. Given the uncertainty of the training process, training the artificial neural network more times gives a better chance of achieving better results. For this research, training was set to stop when 10,000 iterations were finished (with 10 retrains), when the MSE improvement fell below 0.000001, or upon reaching a 0.01
training error. This training process was conducted three times for each stock, and the network with the lowest relative error was selected.

Appendix 112B: List of Variables Used

Table 112B.1: Macroeconomic indicators (World Indexes).
DJI | Dow Jones
IXIC | Nasdaq Composite
FCHI | France
AEX | Netherlands
GDAXI | Germany
N225 | Japan
FTSE | United Kingdom
SSMI | Switzerland
ATX | Austria
BFX | Belgium
KFX | Denmark
HEX | Finland
ATG | Greece
XU100 | Turkey
AORD | Australia
MERV | Argentina
BVSP | Brazil
MXX | Mexico
IGRA | Peru
BSESN | India
HIS | Hong Kong
KLSE | Malaysia
STI | Singapore
TWII | Taiwan
KSE | Pakistan
PSI | Philippines
Source: YahooFinance.
Table 112B.2: Market indicators.

GC | Gold
NG | Natural Gas
CL | Light Crude Oil
HG | Copper
PA | Palladium
PL | Platinum
SI | Silver
(Continued)
Table 112B.2: (Continued)

AD | Australian Dollar
BR | Brazil Real
BP | British Pound
CD | Canadian Dollar
JY | Japanese Yen
MP | Mexican Peso
SF | Swiss Franc
Source: Pifin.

Table 112B.3: Microeconomic indicators.

BASIC MATERIALS
1 Agricultural Chemicals | POTASH CP SASKATCHEWAN [POT]
2 Aluminum | ALCOA INC [AA]
3 Chemicals — Major Diversified | DOW CHEMICAL [DOW]
4 Copper | FREEPORT MCMORAN [FCX]
5 Gold | BARRICK GOLD [ABX]
6 Independent Oil & Gas | OCCIDENTAL PETROLEUM [OXY]
7 Industrial Metals & Minerals | BHP BILLITON [BHP]
8 Major Integrated Oil & Gas | EXXON MOBIL [XOM]
9 Nonmetallic Mineral Mining | HARRY WINSTON DIAMOND [HWD]
10 Oil & Gas Drilling & Exploration | TRANSOCEAN [RIG]
11 Oil & Gas Equipment & Services | SCHLUMBERGER [SLB]
12 Oil & Gas Pipelines | KINDER MORGAN ENERGY PARTNERS [KMP]
13 Oil & Gas Refining & Marketing | IMPERIAL OIL [IMO]
14 Silver | COEUR D'ALENE MINES CORP [CDE]
15 Specialty Chemicals | LUBRIZOL CORP [LZ]
16 Steel & Iron | RIO TINTO PLC [RTP]
17 Synthetics | PRAXAIR INC [PX]
CONGLOMERATES
18 Conglomerates | GENERAL ELECTRIC [GE]
CONSUMER GOODS
19 Appliances | WHIRLPOOL CORP [WHR]
20 Auto Manufacturers — Major | HONDA MOTOR CO. LTD [HMC]
21 Auto Parts | JOHNSON CONTROLS INC [JCI]
22 Beverages — Brewers | FOMENTO ECONOMICO MEXICANO [FMX]
23 Beverages — Soft Drinks | THE COCA-COLA CO. [KO]
24 Beverages — Wineries & Distillers | DIAGEO PLC [DEO]
25 Business Equipment | XEROX CORP. [XRX]
26 Cigarettes | BRITISH AMERICAN TOBACCO PLC [BTI]
27 Cleaning Products | ECOLAB INC [ECL]
(Continued)
Table 112B.3: (Continued)

28 Confectioners | CADBURY PLC [CBY]
29 Dairy Products | LIFEWAY FOODS INC [LWAY]
30 Electronic Equipment | SONY CORPORATION [SNE]
31 Farm Products | ARCHER-DANIELS-MIDLAND [ADM]
32 Food — Major Diversified | HJ HEINZ CO. [HNZ]
33 Home Furnishings & Fixtures | FORTUNE BRANDS INC [FO]
34 Housewares & Accessories | NEWELL RUBBERMAID INC [NWL]
35 Meat Products | HORMEL FOODS CORP. [HRL]
36 Office Supplies | ENNIS INC. [EBF]
37 Packaging & Containers | OWENS-ILLINOIS [OI]
38 Paper & Paper Products | INTERNATIONAL PAPER CO. [IP]
39 Personal Products | PROCTER & GAMBLE CO. [PG]
40 Photographic Equipment & Supplies | EASTMAN KODAK [EK]
41 Processed & Packaged Goods | PEPSICO INC. [PEP]
42 Recreational Goods, Other | FOSSIL INC. [FOSL]
43 Recreational Vehicles | HARLEY-DAVIDSON INC. [HOG]
44 Rubber & Plastics | GOODYEAR TIRE & RUBBER CO. [GT]
45 Sporting Goods | CALLAWAY GOLF CO. [ELY]
46 Textile — Apparel Clothing | VF CORP. [VFC]
47 Textile — Apparel Footwear & Accessories | NIKE INC. [NKE]
48 Tobacco Products, Other | UNIVERSAL CORP. [UVV]
49 Toys & Games | MATTEL INC. [MAT]
50 Trucks & Other Vehicles | PACCAR INC. [PCAR]
FINANCIAL
51 Accident & Health Insurance | AFLAC INC. [AFL]
52 Asset Management | T. ROWE PRICE GROUP INC. [TROW]
53 Closed-End Fund — Debt | ALLIANCE BERNSTEIN INCOME FUND INC. [ACG]
54 Closed-End Fund — Equity | DNP SELECT INCOME FUND INC. [DNP]
55 Closed-End Fund — Foreign | ABERDEEN ASIA-PACIFIC INCOME FUND INC. [FAX]
56 Credit Services | AMERICAN EXPRESS CO. [AXP]
57 Diversified Investments | MORGAN STANLEY [MS]
58 Foreign Money Center Banks | WESTPAC BANKING CORP [WBK]
59 Foreign Regional Banks | BANCOLOMBIA S.A. [CIB]
60 Insurance Brokers | MARSH & MCLENNAN [MMC]
61 Investment Brokerage — National | CHARLES SCHWAB CORP. [SCHW]
62 Investment Brokerage — Regional | JEFFERIES GROUP INC. [JEF]
(Continued)
Table 112B.3: (Continued)

63 Life Insurance | AXA [AXA]
64 Money Center Banks | JPMORGAN CHASE & CO. [JPM]
65 Mortgage Investment | ANNALY CAPITAL MANAGEMENT [NLY]
66 Property & Casualty Insurance | BERKSHIRE HATHAWAY [BRK-A]
67 Property Management | ICAHN ENTERPRISES, L.P. [IEP]
68 REIT — Diversified | PLUM CREEK TIMBER CO. INC. [PCL]
69 REIT — Healthcare Facilities | HCP INC. [HCP]
70 REIT — Hotel/Motel | HOST HOTELS & RESORTS INC. [HST]
71 REIT — Industrial | PUBLIC STORAGE [PSA]
72 REIT — Office | BOSTON PROPERTIES INC. [BXP]
73 REIT — Residential | EQUITY RESIDENTIAL [EQR]
74 REIT — Retail | SIMON PROPERTY GROUP INC. [SPG]
75 Real Estate Development | THE ST. JOE COMPANY [JOE]
76 Regional — Mid-Atlantic Banks | BB & T CORP. [BBT]
77 Regional — Midwest Banks | US BANCORP [USB]
78 Regional — Northeast Banks | STATE STREET CORP. [STT]
79 Regional — Pacific Banks | BANK OF HAWAII CORP. [BOH]
80 Regional — Southeast Banks | REGIONS FINANCIAL CORP. [RF]
81 Regional — Southwest Banks | COMMERCE BANCSHARES INC. [CBSH]
82 Savings & Loans | PEOPLE'S UNITED FINANCIAL INC. [PBCT]
83 Surety & Title Insurance | FIRST AMERICAN CORP. [FAF]
HEALTHCARE
84 Biotechnology | AMGEN INC. [AMGN]
85 Diagnostic Substances | IDEXX LABORATORIES INC. [IDXX]
86 Drug Delivery | ELAN CORP. [ELN]
87 Drug Manufacturers — Major | JOHNSON & JOHNSON [JNJ]
88 Drug Manufacturers — Other | TEVA PHARMACEUTICAL INDUSTRIES LTD [TEVA]
89 Drug Related Products | PERRIGO CO. [PRGO]
90 Drugs — Generic | MYLAN INC. [MYL]
91 Health Care Plans | UNITEDHEALTH GROUP INC. [UNH]
92 Home Health Care | LINCARE HOLDINGS INC. [LNCR]
93 Hospitals | TENET HEALTHCARE CORP. [THC]
94 Long-Term Care Facilities | EMERITUS CORP. [ESC]
95 Medical Appliances & Equipment | MEDTRONIC INC. [MDT]
96 Medical Instruments & Supplies | BAXTER INTERNATIONAL INC. [BAX]
(Continued)
Table 112B.3: (Continued)

97 Medical Laboratories & Research | QUEST DIAGNOSTICS INC. [DGX]
98 Medical Practitioners | TRANSCEND SERVICES INC. [TRCR]
99 Specialized Health Services | DAVITA INC. [DVA]
INDUSTRIAL GOODS
100 Aerospace/Defense — Major Diversified | BOEING CO. [BA]
101 Aerospace/Defense Products & Services | HONEYWELL INTERNATIONAL INC. [HON]
102 Cement | CRH PLC [CRH]
103 Diversified Machinery | ILLINOIS TOOL WORKS INC. [ITW]
104 Farm & Construction Machinery | CATERPILLAR INC. [CAT]
105 General Building Materials | VULCAN MATERIALS CO. [VMC]
106 General Contractors | EMCOR GROUP INC. [EME]
107 Heavy Construction | MCDERMOTT INTERNATIONAL INC. [MDR]
108 Industrial Electrical Equipment | EATON CORPORATION [ETN]
109 Industrial Equipment & Components | EMERSON ELECTRIC CO. [EMR]
110 Lumber, Wood Production | WEYERHAEUSER CO. [WY]
111 Machine Tools & Accessories | STANLEY WORKS [SWK]
112 Manufactured Housing | SKYLINE CORP [SKY]
113 Metal Fabrication | PRECISION CASTPARTS CORP. [PCP]
114 Pollution & Treatment Controls | DONALDSON COMPANY INC. [DCI]
115 Residential Construction | NVR INC. [NVR]
116 Small Tools & Accessories | THE BLACK & DECKER CORP. [BDK]
117 Textile Industrial | MOHAWK INDUSTRIES INC. [MHK]
118 Waste Management | WASTE MANAGEMENT INC. [WM]
SERVICES
119 Advertising Agencies | OMNICOM GROUP INC. [OMC]
120 Air Delivery & Freight Services | FEDEX CORP. [FDX]
121 Air Services, Other | BRISTOW GROUP INC. [BRS]
122 Apparel Stores | GAP INC. [GPS]
123 Auto Dealerships | CARMAX INC. [KMX]
124 Auto Parts Stores | AUTOZONE INC. [AZO]
125 Auto Parts Wholesale | GENUINE PARTS CO. [GPC]
126 Basic Materials Wholesale | AM CASTLE & CO. [CAS]
127 Broadcasting — Radio | SIRIUS XM RADIO INC. [SIRI]
128 Broadcasting — TV | ROGERS COMMUNICATIONS INC. [RCI]
129 Business Services | IRON MOUNTAIN INC. [IRM]
130 CATV Systems | COMCAST CORP. [CMCSA]
131 Catalog & Mail Order Houses | AMAZON.COM INC. [AMZN]
132 Computers Wholesale | INGRAM MICRO INC. [IM]
133 Consumer Services | MONRO MUFFLER BRAKE INC. [MNRO]
(Continued)
Table 112B.3: (Continued)

134 Department Stores | THE TJX COMPANIES INC. [TJX]
135 Discount, Variety Stores | WAL-MART STORES INC. [WMT]
136 Drug Stores | CVS CAREMARK CORP. [CVS]
137 Drugs Wholesale | MCKESSON CORP. [MCK]
138 Education & Training Services | DEVRY INC. [DV]
139 Electronics Stores | BEST BUY CO. INC. [BBY]
140 Electronics Wholesale | AVNET INC. [AVT]
141 Entertainment — Diversified | WALT DISNEY CO. [DIS]
142 Food Wholesale | SYSCO CORP. [SYY]
143 Gaming Activities | BALLY TECHNOLOGIES INC. [BYI]
144 General Entertainment | CARNIVAL CORP. [CCL]
145 Grocery Stores | KROGER CO. [KR]
146 Home Furnishing Stores | WILLIAMS-SONOMA INC. [WSM]
147 Home Improvement Stores | THE HOME DEPOT INC. [HD]
148 Industrial Equipment Wholesale | W.W. GRAINGER INC. [GWW]
149 Jewelry Stores | TIFFANY & CO. [TIF]
150 Lodging | STARWOOD HOTELS & RESORTS WORLDWIDE INC. [HOT]
151 Major Airlines | AMR CORP. [AMR]
152 Management Services | EXPRESS SCRIPTS INC. [ESRX]
153 Marketing Services | VALASSIS COMMUNICATIONS INC. [VCI]
154 Medical Equipment Wholesale | HENRY SCHEIN INC. [HSIC]
155 Movie Production, Theaters | MARVEL ENTERTAINMENT INC. [MVL]
156 Music & Video Stores | BLOCKBUSTER INC. [BBI]
157 Personal Services | H&R BLOCK INC. [HRB]
158 Publishing — Books | THE MCGRAW-HILL CO. INC. [MHP]
159 Publishing — Newspapers | WASHINGTON POST CO. [WPO]
160 Publishing — Periodicals | MEREDITH CORP. [MDP]
161 Railroads | BURLINGTON NORTHERN SANTA FE CORP. [BNI]
162 Regional Airlines | SOUTHWEST AIRLINES CO. [LUV]
163 Rental & Leasing Services | RYDER SYSTEM INC. [R]
164 Research Services | PAREXEL INTL CORP. [PRXL]
165 Resorts & Casinos | MGM MIRAGE [MGM]
166 Restaurants | MCDONALD'S CORP. [MCD]
167 Security & Protection Services | GEO GROUP INC. [GEO]
168 Shipping | TIDEWATER INC. [TDW]
169 Specialty Eateries | STARBUCKS CORP. [SBUX]
170 Specialty Retail, Other | STAPLES INC. [SPLS]
171 Sporting Activities | SPEEDWAY MOTORSPORTS INC. [TRK]
172 Sporting Goods Stores | HIBBETT SPORTS INC. [HIBB]
173 Staffing & Outsourcing Services | PAYCHEX INC. [PAYX]
174 Technical Services | JACOBS ENGINEERING GROUP INC. [JEC]
(Continued)
Table 112B.3: (Continued)

175 Trucking | JB HUNT TRANSPORT SERVICES INC. [JBHT]
176 Wholesale, Other | VINA CONCHA Y TORO S.A. [VCO]
TECHNOLOGY
177 Application Software | MICROSOFT CORP. [MSFT]
178 Business Software & Services | AUTOMATIC DATA PROCESSING INC. [ADP]
179 Communication Equipment | NOKIA CORP. [NOK]
180 Computer Based Systems | ADAPTEC INC. [ADPT]
181 Computer Peripherals | LEXMARK INTERNATIONAL INC. [LXK]
182 Data Storage Devices | EMC CORP. [EMC]
183 Diversified Communication Services | TELECOM ARGENTINA S A [TEO]
184 Diversified Computer Systems | INTERNATIONAL BUSINESS MACHINES CORP. [IBM]
185 Diversified Electronics | KYOCERA CORP. [KYO]
186 Healthcare Information Services | CERNER CORP. [CERN]
187 Information & Delivery Services | DUN & BRADSTREET CORP. [DNB]
188 Information Technology Services | COMPUTER SCIENCES CORPORATION [CSC]
189 Internet Information Providers | YAHOO! INC. [YHOO]
190 Internet Service Providers | EASYLINK SERVICES INTERNATIONAL CORP. [ESIC]
191 Internet Software & Services | CGI GROUP INC. [GIB]
192 Long Distance Carriers | TELEFONOS DE MEXICO, S.A.B. DE C.V. [TMX]
193 Multimedia & Graphics Software | ACTIVISION BLIZZARD INC. [ATVI]
194 Networking & Communication Devices | CISCO SYSTEMS INC. [CSCO]
195 Personal Computers | APPLE INC. [AAPL]
196 Printed Circuit Boards | FLEXTRONICS INTERNATIONAL LTD. [FLEX]
197 Processing Systems & Products | POLYCOM INC. [PLCM]
198 Scientific & Technical Instruments | THERMO FISHER SCIENTIFIC INC. [TMO]
199 Security Software & Services | SYMANTEC CORP. [SYMC]
200 Semiconductor — Broad Line | INTEL CORP. [INTC]
201 Semiconductor — Integrated Circuits | QUALCOMM INC. [QCOM]
202 Semiconductor — Specialized | XILINX INC. [XLNX]
203 Semiconductor Equipment & Materials | APPLIED MATERIALS INC. [AMAT]
(Continued)
Table 112B.3: (Continued)

204 Semiconductor — Memory Chips | MICRON TECHNOLOGY INC. [MU]
205 Technical & System Software | AUTODESK INC. [ADSK]
206 Telecom Services — Domestic | AT&T INC. [T]
207 Telecom Services — Foreign | NIPPON TELEGRAPH & TELEPHONE CORP. [NTT]
208 Wireless Communications | CHINA MOBILE LIMITED [CHL]
UTILITIES
209 Diversified Utilities | EXELON CORP. [EXC]
210 Electric Utilities | SOUTHERN COMPANY [SO]
211 Foreign Utilities | ENERSIS S.A. [ENI]
212 Gas Utilities | TRANSCANADA CORP. [TRP]
213 Water Utilities | AQUA AMERICA INC. [WTR]
Source: YahooFinance.

Table 112B.4: Market indicators.

GSPC | S&P 500's price changes
DJI | Dow Jones Industrial's price changes
DJT | Dow Jones Transportation's price changes
DJU | Dow Jones Utility's price changes
Source: YahooFinance.

Table 112B.5: Market sentiment indicators.

VIX | CBOE Volatility Index changes
Source: YahooFinance.

Table 112B.6: Institutional investor.

BEN | FRANKLIN RESOURCES INC.
Table 112B.7: Calendar anomalies.

Mon | Monday
Tue | Tuesday
Wed | Wednesday
Thurs | Thursday
Fri | Friday
Jan | January
(Continued)
Table 112B.7: (Continued)

Feb | February
Mar | March
Apr | April
May | May
Jun | June
Jul | July
Aug | August
Sep | September
Oct | October
Nov | November
Dec | December
Chapter 113
Covariance Regression Model for Non-Normal Data∗

Tao Zou, Ronghua Luo, Wei Lan and Chih-Ling Tsai

Contents

113.1 Introduction  3934
113.2 Covariance Regression Model  3935
113.3 Estimation and Inference  3936
  113.3.1 Estimation  3936
  113.3.2 Inference for non-normal response  3938
113.4 Real Data Analysis  3939
  113.4.1 Example I: Stock return comovement  3939
  113.4.2 Example II: Herding behavior of mutual funds  3941
113.5 Conclusions  3943
Bibliography  3943
Tao Zou
The Australian National University
e-mail: [email protected]

Ronghua Luo
Southwestern University of Finance and Economics
e-mail: [email protected]

Wei Lan
Southwestern University of Finance and Economics
e-mail: [email protected]

Chih-Ling Tsai
University of California
e-mail: [email protected]

∗This chapter is an update and expansion of the paper "Covariance Regression Analysis," which was published in Journal of the American Statistical Association, Vol. 112, pp. 266–281, 2017.
Abstract

Recently, Zou et al. (2017) proposed a novel covariance regression model to study the relationship between the covariance matrix of responses and their associated similarity matrices induced by auxiliary information. To estimate the covariance regression model, they introduced five estimators: the maximum likelihood, ordinary least squares, constrained ordinary least squares, feasible generalized least squares and constrained feasible generalized least squares estimators. Among these five, they recommended the constrained feasible generalized least squares estimator due to its estimation efficiency and computational convenience. Under the normality assumption, they further demonstrated the theoretical properties of these estimators. However, the data in the area of finance and accounting may exhibit heavy tails. Hence, to broaden the usefulness of the covariance regression model, we relax the normality assumption and employ Lee's (2004) approach to obtain inferences for covariance regression parameters based on the five estimators proposed by Zou et al. (2017). Two empirical examples are presented to illustrate the practical applications of the covariance regression model in analyzing stock return comovement and herding behavior of mutual funds.

Keywords: Covariance regression model • Herding behavior • Non-normal data • Stock return comovement.
113.1 Introduction

Estimating the covariance matrix and its inverse is a fundamental and important task across many areas, such as principal component analysis, Gaussian graphical modeling, discriminant analysis, portfolio management, and machine learning. Due to technological advancement and the availability of high-dimensional data in fields such as business, climatology, engineering, medicine, neurology, and science, a substantial amount of work has recently been contributed to covariance matrix estimation and testing; see the two seminal books by Pourahmadi (2013) and Yao et al. (2015). With high-dimensional data, i.e., when the number of dimensions p is comparable to or greater than the sample size n, covariance matrix estimation is a challenging task. This is because the number of unknown parameters, p(p + 1)/2, is too large, and the classical sample covariance matrix estimator is no longer applicable (see, e.g., Bai, 1999). To bring down the number of unknown parameters, constrained methods based on sparsity, graphical models, and factor models have been considered (see, e.g., Ledoit and Wolf, 2004; Bickel and Levina, 2008a,b; Cai and Liu, 2011; Fan et al., 2013; Pourahmadi, 2013; Yao et al., 2015). In addition, several structured covariance matrices have been considered for data with specific characteristics, such as autoregressive, moving average, or compound symmetry structures. Since those matrices are functions of a small number of unknown parameters, they can be estimated effectively even with a very limited sample size and large p.
It is worth noting that none of the structured covariance matrices mentioned above is directly linked to auxiliary information, such as explanatory variables, spatial information, or social networks. To this end, Zou et al. (2017) proposed a novel covariance regression model to study the relationship between the covariance matrix of responses and their associated similarity matrices induced by auxiliary information. Their method not only explains the structure of the covariance matrix via the auxiliary information, but also allows us to estimate the covariance matrix consistently. Accordingly, we employ the covariance regression model to investigate stock return comovement and the herding effect of mutual funds. Due to the complexity of the theoretical derivations, Zou et al. (2017) imposed the normality assumption on the response variable for inference in the covariance regression model. To broaden the usefulness of the covariance regression model in finance and accounting, we relax the normality assumption and apply Lee's (2004) approach to obtain inferences for covariance regression parameters based on the five estimators proposed by Zou et al. (2017): the maximum likelihood, ordinary least squares, constrained ordinary least squares, feasible generalized least squares, and constrained feasible generalized least squares estimators. The remainder of this chapter is organized as follows. Section 113.2 introduces the covariance regression model. Section 113.3 proposes the estimation and inference procedure, while Section 113.4 presents two real examples to illustrate the usefulness of the covariance regression model in analyzing stock return comovement and herding behavior of mutual funds. Section 113.5 concludes the chapter with discussions.
113.2 Covariance Regression Model

Let Y = (Y_1, . . . , Y_p)^T ∈ R^p be the response vector with p components. For each component j, let X_j = (X_{j1}, . . . , X_{jm})^T ∈ R^m be the associated m-dimensional auxiliary information vector. In finance and accounting, Y_j can be the jth firm's stock return, and X_j its corresponding attributes, including the industry, location, market value, book-to-market ratio, cash flow, etc. In addition, we assume that Y follows a multivariate normal distribution with mean 0 and covariance matrix Σ = cov(Y). Adopting Zou et al.'s (2017) approach, we consider the covariance regression model given below,

Σ = Σ(β) = β_0 I_p + β_1 W_1 + · · · + β_m W_m,
(113.1)
where I_p is the identity matrix of dimension p, β = (β_0, β_1, . . . , β_m)^T are unknown regression coefficients, and the W_k's are similarity matrices for k = 1, . . . , m. Specifically, W_k = W_k(X_k) = (w(X_{j1,k}, X_{j2,k})) ∈ R^{p×p} is induced by the auxiliary information X_k = (X_{1k}, . . . , X_{pk})^T ∈ R^p, where w(X_{j1,k}, X_{j2,k}) is a similarity measure for j1 = 1, . . . , p and j2 = 1, . . . , p. For continuous X_k, we follow Zou et al.'s (2017) suggestion and define w(X_{j1,k}, X_{j2,k}) = exp{−(X_{j1,k} − X_{j2,k})^2}. If X_k is discrete, we can follow Johnson and Wichern (1992) to define a similarity measure between discrete X_{j1,k} and X_{j2,k}. For example, if X_{jk} represents the industry of firm j, we define w(X_{j1,k}, X_{j2,k}) = 1 if firms j1 and j2 belong to the same industry, and 0 otherwise. Based on the similarity matrices defined above, the regression coefficient β_k in model (113.1) has a practical interpretation. For instance, for any two units j1 and j2, β_k > 0 leads to the conclusion that a smaller distance between X_{j1,k} and X_{j2,k} implies a larger covariance between Y_{j1} and Y_{j2}. Accordingly, β_k measures the effect of covariate X_k on the covariance structure.

113.3 Estimation and Inference

113.3.1 Estimation

To ensure the positive definiteness of the covariance matrices modeled in (113.1), Zou et al. (2017) considered the following parameter space:

B := {β : Σ(β) > 0},
(113.2)
where G_1 > G_2 if the difference between any two generic matrices, G_1 − G_2, is positive definite. They then recommended the constrained feasible generalized least squares (FGLS) estimator, obtained via the following four steps.

Step I. Obtain the unconstrained ordinary least squares (OLS) estimator β̂_p,OLS by minimizing

D_p(β) := ||YY^T − Σ(β)||_F^2,
(113.3)
where ||G||_F = {tr(G^T G)}^{1/2} denotes the Frobenius norm of any generic matrix G. After solving ∂D_p(β)/∂β = 0, the resulting OLS estimator is

β̂_p,OLS = {(tr(W_{k1} W_{k2}))_{(m+1)×(m+1)}}^{−1} (Y^T W_k Y)_{(m+1)×1},    (113.4)

where (g_{k1,k2})_{m1×m2} denotes an m1 × m2 matrix whose (k1, k2)th element is g_{k1,k2}, with W_0 = I_p.
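As a concrete illustration, the closed form (113.4) can be computed directly. The NumPy sketch below is our own (it is not code from Zou et al., 2017); it prepends W_0 = I_p and then solves the (m+1)-dimensional linear system.

```python
import numpy as np

def covariance_regression_ols(Y, W_list):
    """Unconstrained OLS estimator of (113.4): solve A beta = b, where
    A[k1, k2] = tr(W_{k1} W_{k2}) and b[k] = Y^T W_k Y, with W_0 = I_p.
    Y is a length-p response vector; W_list holds W_1, ..., W_m."""
    p = Y.shape[0]
    Ws = [np.eye(p)] + list(W_list)  # prepend W_0 = I_p
    A = np.array([[np.trace(Wa @ Wb) for Wb in Ws] for Wa in Ws])
    b = np.array([Y @ W @ Y for W in Ws])
    return np.linalg.solve(A, b)
```

As a sanity check, with no similarity matrices (m = 0) the estimator reduces to β̂_0 = Y^T Y / p, the average squared response.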
Step II. To ensure the positive definiteness of Σ(β), we employ Zou et al.'s (2017) Algorithm 1 to obtain the constrained OLS (OLS+) estimator β̂_p,OLS+ = arg min_{β∈B} D_p(β). Note that one can use β̂_p,OLS as the initial value in this iterative algorithm.

Step III. Obtain the unconstrained FGLS estimator β̂_p,FGLS by minimizing

D̃_p(β) := vec^T(YY^T − Σ(β)) (Σ̂^{−1} ⊗ Σ̂^{−1}) vec(YY^T − Σ(β)),    (113.5)

where Σ̂^{−1} = Σ^{−1}(β̂_p,OLS+) is a consistent estimator of Σ^{−1}, ⊗ represents the Kronecker product of two matrices, and vec(G) denotes the vectorization of any generic matrix G. After simple algebraic simplification, (113.5) can be re-expressed as

D̃_p(β) = ||Σ̂^{−1/2} YY^T Σ̂^{−1/2} − Σ̂^{−1/2} Σ(β) Σ̂^{−1/2}||_F^2.    (113.6)

Then consider the following transformations:

Ỹ = Σ̂^{−1/2} Y,  W̃_k = Σ̂^{−1/2} W_k Σ̂^{−1/2},  and  Σ̃(β) = Σ_{k=0}^{m} β_k W̃_k.
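The whitening transformations of Step III can be made concrete: form Σ̂^{−1/2}, transform Y and the W_k's, and reuse the OLS formula. The sketch below is our own illustration (not code from Zou et al., 2017) and assumes Σ̂ is symmetric positive definite.

```python
import numpy as np

def _ols(Y, Ws):
    # OLS formula of (113.4) applied to an arbitrary list of matrices Ws
    A = np.array([[np.trace(Wa @ Wb) for Wb in Ws] for Wa in Ws])
    b = np.array([Y @ W @ Y for W in Ws])
    return np.linalg.solve(A, b)

def covariance_regression_fgls(Y, W_list, Sigma_hat):
    """FGLS via Step III: with S = Sigma_hat^{-1/2}, set Y~ = S @ Y and
    W~_k = S @ W_k @ S, then apply the OLS formula to the transformed data."""
    p = Y.shape[0]
    vals, vecs = np.linalg.eigh(Sigma_hat)      # eigendecomposition of SPD matrix
    S = vecs @ np.diag(vals ** -0.5) @ vecs.T   # symmetric inverse square root
    Ws = [np.eye(p)] + list(W_list)             # W_0 = I_p
    return _ols(S @ Y, [S @ W @ S for W in Ws])
```

When Σ̂ = I_p the transformation is the identity, so FGLS reduces to OLS, which serves as a convenient sanity check of the equivalence stated next.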
It can be demonstrated that the unconstrained FGLS estimator β̂_p,FGLS, obtained from the objective function (113.5) via {Y, Σ(β)}, is the same as the unconstrained OLS estimator β̂_p,OLS obtained from the objective function (113.3) via the transformed {Ỹ, Σ̃(β)}.

Step IV. To ensure the positive definiteness, the resulting constrained FGLS (FGLS+) estimator, β̂_p,FGLS+ = arg min_{β∈B} D̃_p(β), obtained from the objective function (113.5) via {Y, Σ(β)}, is the same as the constrained OLS estimator β̂_p,OLS+ obtained from the objective function (113.3) via the transformed {Ỹ, Σ̃(β)}. Note that one can use β̂_p,FGLS as the initial value in the iterative algorithm.

As an alternative to the constrained FGLS estimator, Zou et al. (2017) also considered the maximum likelihood estimation (MLE) approach. Specifically, under the assumption that Y follows a multivariate normal distribution with mean 0 and covariance Σ(β) = cov(Y), the maximum likelihood estimator β̂_p,MLE can be obtained by maximizing the following log-likelihood function,

ℓ_p(β) = −(p/2) log(2π) − (1/2) Σ_{j=1}^{p} log λ_j(Σ(β)) − (1/2) Y^T Σ^{−1}(β) Y,    (113.7)

where λ_j(Σ(β)) is the jth largest eigenvalue of Σ(β) for j = 1, . . . , p. Although the MLE is asymptotically efficient under the normality assumption, it becomes computationally infeasible when p gets large. This
is mainly due to the O(p^4) computational burden pointed out by Zou et al. (2017). It is worth noting that the OLS method alleviates the computational complexity, but the estimators β̂_p,OLS and β̂_p,OLS+ are not efficient. Accordingly, Zou et al. (2017) proposed the constrained FGLS estimator mentioned above, which not only mitigates the computational complexity, but also achieves the same estimation efficiency as the MLE. This is exactly why they recommended the constrained FGLS estimator over the MLE in real practice.

113.3.2 Inference for non-normal response

One limitation of Zou et al. (2017) is that inference for the aforementioned five estimators relies on the normality assumption for the responses. However, in the areas of finance and accounting, one may have heavy-tailed data, which are not necessarily normally distributed. This motivates us to extend the application of the covariance regression model by relaxing this assumption. To this end, we employ Lee's (2004) approach to obtain inference for all five estimators when the response is not necessarily normally distributed. Note that the unconstrained/constrained OLS and FGLS estimators are still applicable, since they do not rely on the normality assumption. Moreover, maximizing the log-likelihood function (113.7), evaluated under the normality assumption, still leads to a consistent estimator even if the responses are not truly normally distributed. Lee (2004) called this method quasi-maximum likelihood estimation (QMLE), and we denote the resulting estimator by β̂_p,QMLE. Applying techniques similar to those used in Zou et al. (2017) and Lee (2004), we obtain the asymptotic distributions of the unconstrained/constrained OLS and FGLS estimators and the QMLE for non-normal responses.
Assume that, for d = −1, 0, 1,

(1/p) (tr(Σ^d W_{k1} Σ^d W_{k2}))_{(m+1)×(m+1)} → Q_d ∈ R^{(m+1)×(m+1)}    (113.8)

and

(1/p) (tr{(Σ^{d/2} W_{k1} Σ^{d/2}) ◦ (Σ^{d/2} W_{k2} Σ^{d/2})})_{(m+1)×(m+1)} → P_d ∈ R^{(m+1)×(m+1)},    (113.9)

where ◦ represents the Hadamard product of two matrices, and Q_d are (m+1) × (m+1) positive definite matrices for d = −1, 0, 1. In addition, let Z = Σ^{−1/2} Y = (Z_j)_{p×1} ∈ R^p and define the fourth-order moment μ^(4) = E(Z_j^4) for j = 1, . . . , p. Then, under some mild conditions, one can prove that the unconstrained/constrained
OLS estimator is asymptotically normal with mean 0 and covariance matrix 2p^{−1} Q_0^{−1} Q_1 Q_0^{−1} + p^{−1}(μ^(4) − 3) Q_0^{−1} P_1 Q_0^{−1} as p → ∞. In addition, the unconstrained/constrained FGLS estimator is asymptotically normal with mean 0 and covariance matrix 2p^{−1} Q_{−1}^{−1} + p^{−1}(μ^(4) − 3) Q_{−1}^{−1} P_{−1} Q_{−1}^{−1} as p → ∞. Moreover, the QMLE β̂_p,QMLE is asymptotically normal with mean 0 and covariance matrix 2p^{−1} Q_{−1}^{−1} + p^{−1}(μ^(4) − 3) Q_{−1}^{−1} P_{−1} Q_{−1}^{−1} as p → ∞. Consequently, the constrained FGLS estimator and the QMLE still have the same asymptotic efficiency for non-normal responses; but, as pointed out at the end of Section 113.3.1, the constrained FGLS estimator is computationally more feasible. Hence, we recommend using the constrained FGLS estimator in real practice.

To make inferences about the above five covariance regression parameter estimators, one needs to estimate their corresponding asymptotic covariance matrices. To this end, we propose to estimate them via (113.8) and (113.9) evaluated at Σ̂ = Σ(β̂), where β̂ can be any one of the five parameter estimators. In addition, μ^(4) can be estimated by the empirical fourth-order moment of the vector Ẑ = Σ̂^{−1/2} Y. Accordingly, one can test the significance of the regression parameters and assess the effect of the covariates on the covariance structure.

113.4 Real Data Analysis

In this section, we present two empirical examples to illustrate the usefulness of the covariance regression model using the US and Chinese stock markets, respectively. Example I employs the covariance regression model to study stock return comovement in the US stock market, while Example II investigates the herding behavior of mutual funds in the Chinese stock market. In both studies, we apply the constrained FGLS approach, recommended by Zou et al. (2017), to estimate the covariance regression model and obtain the associated standard errors without assuming normality of the responses.
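The plug-in standard errors used in both examples can be sketched as follows. This is our own illustration of the d = −1 formulas evaluated at Σ̂ (it is not code from Zou et al., 2017); the function name is hypothetical.

```python
import numpy as np

def fgls_asymptotic_cov(Y, W_list, Sigma_hat):
    """Plug-in estimate of the asymptotic covariance of the FGLS estimator,
    2 p^{-1} Q_{-1}^{-1} + p^{-1} (mu4 - 3) Q_{-1}^{-1} P_{-1} Q_{-1}^{-1},
    with Q_{-1}, P_{-1}, and the fourth moment mu4 evaluated at Sigma_hat."""
    p = Y.shape[0]
    vals, vecs = np.linalg.eigh(Sigma_hat)
    S = vecs @ np.diag(vals ** -0.5) @ vecs.T               # Sigma_hat^{-1/2}
    Wt = [S @ W @ S for W in [np.eye(p)] + list(W_list)]    # transformed W_k's
    k = len(Wt)
    Q = np.array([[np.trace(Wt[i] @ Wt[j]) for j in range(k)] for i in range(k)]) / p
    # tr(A o B) depends only on the diagonals of A and B
    P = np.array([[np.diag(Wt[i]) @ np.diag(Wt[j]) for j in range(k)] for i in range(k)]) / p
    mu4 = np.mean((S @ Y) ** 4)                             # empirical 4th moment of Z-hat
    Qinv = np.linalg.inv(Q)
    return (2.0 / p) * Qinv + ((mu4 - 3.0) / p) * Qinv @ P @ Qinv
```

The square roots of the diagonal entries then give standard errors for the components of β̂_p,FGLS+ that do not require normality of the responses.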
T. Zou et al.

113.4.1 Example I: Stock return comovement

Finding highly associated stocks whose returns move together is essential not only for studying asset pricing theory and asset allocation strategies (see, e.g., Green and Hwang, 2009), but also for implementing pairs trading strategies (see, e.g., Gatev et al., 2006). "Moving together" refers to the similar response of these stock returns to market-wide information, as in the capital asset pricing model (CAPM) of Sharpe (1964) and the three- and five-factor models of Fama and French (1993, 2015). Recently, a number of authors have documented that stock return comovement may be influenced by economic fundamentals (e.g., size and price; see Shiller, 1989 and Green and Hwang, 2009) or by specific sources unrelated to fundamentals (e.g., geographical distance; see Pirinsky and Wang, 2006). In this example, we employ the covariance regression model to find important factors that can affect stock return comovement in the US stock market.

We collect the closing prices of the component stocks of the Standard and Poor's (S&P) 500 index via http://quote.yahoo.com/ using the R package "tseries". Specifically, the R command "get.hist.quote" in the "tseries" package is used to acquire the data (see, e.g., Chang et al., 2018). Let Z_{j,2015} and Z_{j,2016} be the jth stock's prices at the end of 2015 and 2016, respectively. The response vector Y = (Y_j)_{p×1} consists of the log returns of the p stocks, namely log(Z_{j,2016}) − log(Z_{j,2015}), standardized by subtracting the sample mean. To study the comovement of stocks, we consider three similarity matrices induced by their corresponding covariates, in the spirit of the empirical findings of Shiller (1989). The three covariates are IND (the industry or sector classified by the Global Industry Classification Standard, GICS, listed in the S&P 500 component stocks), SIZE (the logarithm of the market capitalization in 100 million at the end of 2015), and PRICE (the logarithm of the stock price at the end of 2015), and they are denoted by X_k = (X_{1k}, . . . , X_{pk})^T for k = 1, 2, 3, respectively. We eliminate stocks with missing values in the response or one of the covariates, leaving p = 415 securities in total.

We next construct the similarity matrices from the covariates X_k. For the categorical covariate IND, we set the off-diagonal elements of the similarity matrix W_1 to be 1 if the two stocks belong to the same sector, and 0 otherwise.
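The two kinds of similarity matrices described here (an indicator matrix for a categorical covariate, and the Gaussian kernel exp{−(x_i − x_j)²} for a standardized continuous covariate) can be built as follows. This is an illustrative sketch; the function names are ours, not the authors'.

```python
import numpy as np

def similarity_categorical(labels):
    """W[i, j] = 1 if observations i and j share the category (e.g., the GICS
    sector for IND), 0 otherwise; the diagonal is set to zero."""
    x = np.asarray(labels)
    W = (x[:, None] == x[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

def similarity_continuous(values):
    """Standardize to zero mean / unit variance, then W[i, j] =
    exp{-(x_i - x_j)^2} with a zero diagonal, as for SIZE and PRICE."""
    x = np.asarray(values, dtype=float)
    z = (x - x.mean()) / x.std()
    W = np.exp(-np.subtract.outer(z, z) ** 2)
    np.fill_diagonal(W, 0.0)
    return W

W_ind = similarity_categorical(["Tech", "Tech", "Energy", "Tech"])
W_size = similarity_continuous([3.2, 5.1, 4.4, 2.7])
```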
For the two continuous covariates, SIZE and PRICE, we standardize them to have zero mean and unit variance, define the off-diagonal elements of the similarity matrix W_k by exp{−(X_{j_1 k} − X_{j_2 k})²} for covariates k = 2, 3 and stocks 1 ≤ j_1 ≠ j_2 ≤ p, and set the diagonal elements to zero. Table 113.1 presents the parameter estimates, their associated standard errors, t-statistics and p-values for the covariance regression model Σ(β) = β_0 I_p + Σ_{k=1}^{3} β_k W_k. It is worth noting that the p-values are computed via a one-sided test of the null hypothesis H_0: β_k = 0 versus the alternative hypothesis H_1: β_k > 0, for k = 1, 2, 3, respectively. Table 113.1 shows two interesting findings. First, IND and PRICE are positively and significantly related to the covariance between annual returns at the 10% significance level. Such empirical results are expected, and they are consistent
Table 113.1: The covariance regression results for all three covariates.

         Estimate (×10^{-2})   Standard-Error (×10^{-2})   t-Statistic   p-Value
Ip       5.3212                1.4745                      3.6087        0.0002
IND      1.4288                0.7617                      1.8758        0.0303
SIZE     0.0592                0.0844                      0.7015        0.2415
PRICE    1.3449                0.9465                      1.4210        0.0777
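The one-sided p-values in Table 113.1 can be reproduced from the reported t-statistics under a standard normal reference distribution (a plausible assumption given the large p = 415; the chapter does not state the reference distribution explicitly):

```python
from math import erfc, sqrt

def one_sided_p(t_stat):
    """Upper-tail probability P(Z > t) of a standard normal, matching the
    one-sided test H0: beta_k = 0 versus H1: beta_k > 0."""
    return 0.5 * erfc(t_stat / sqrt(2.0))

# t-statistics from Table 113.1; the computed p-values agree with the
# tabulated ones up to rounding.
for name, t in [("Ip", 3.6087), ("IND", 1.8758), ("SIZE", 0.7015), ("PRICE", 1.4210)]:
    print(f"{name:6s} p = {one_sided_p(t):.4f}")
```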
with finance theory. If two stocks belong to the same industry, they are usually affected by common policies and events, and hence behave similarly. Second, SIZE and PRICE have been shown to be fundamental ingredients in recent asset pricing models (Fama and French, 2015; Hou et al., 2015); i.e., these two covariates represent the fundamentals of a stock. Naturally, stocks with similar SIZE and PRICE tend to comove. However, the coefficient of SIZE is positive but not significant. This indicates that stock returns with similar SIZEs do not comove significantly when the other variables are controlled for in the model. This may be because SIZE is measured as the logarithm of the market capitalization, where the market capitalization equals the stock price multiplied by the number of outstanding shares. Thus, the similarities measured by SIZE can be partly explained by PRICE, which weakens the effect of SIZE on the covariance.

113.4.2 Example II: Herding behavior of mutual funds

In this example, we employ the covariance regression model to explore the herding behavior of mutual funds. In a seminal paper, Scharfstein and Stein (1990) pointed out that fund managers tend to mimic the investment decisions of other managers, which results in herding behavior in financial markets. Since then, many studies have attempted to explain and clarify the mechanism of herding behavior among mutual fund managers (see, e.g., Wermers, 1999; Nofsinger and Sias, 1999; Hong et al., 2005). These works showed that, due to the compensation mechanism, the relationship between cash flow and performance, and the common impact of the market or industry, fund managers employ similar investment strategies, especially funds with the same investment style. As expected, there does exist herding behavior among fund managers, and the performances of these herding funds are similar; i.e., their return series are likely to be correlated.
To study the herding behavior in the Chinese stock market, we collect the annual returns of 491 actively managed open-ended mutual funds in
2016 from the WIND database (one of the most authoritative financial databases in China). The response vector Y consists of the corresponding annual returns of the 491 funds, standardized by subtracting the sample mean. To verify the herding behavior, we consider the following five variables to represent the fundamentals of each fund: STY, the investment style category classified according to size and the book-to-market ratio (BM) (see Fama and French, 1993, 2015), including high size-high BM, high size-middle BM, high size-low BM, middle size-high BM, middle size-middle BM, middle size-low BM, low size-high BM, low size-middle BM, and low size-low BM; TNA, the total net assets under management with logarithm transformation; CON, the concentration of the fund portfolio, calculated as the ratio of the market value of the top five invested stocks to the total market value (see Goldman et al., 2016); DEV, the fund portfolio's deviation from the market portfolio, measured as the Euclidean distance between the fund portfolio and the market portfolio (see Petajisto, 2013; Hunter et al., 2014); and STD, the return volatility, measured by the standard deviation of the daily return (see Brown et al., 1996). All five variables are measured from the data at the end of 2015. We denote the above covariates by X_k = (X_{1k}, . . . , X_{pk})^T ∈ R^p for k = 1, . . . , 5. We then construct the similarity matrices induced by the covariates X_k. For the discrete variable STY (k = 1), let the off-diagonal element of its associated similarity matrix be 1 if two funds have the same investment style, and 0 otherwise. For the other four continuous variables, we standardize them to have zero mean and unit variance. Finally, let the off-diagonal elements of the similarity matrices be exp{−(X_{j_1 k} − X_{j_2 k})²} for funds j_1 ≠ j_2 and k = 2, . . . , 5, and set the diagonal elements to zero.
To study the effect of these five covariates on the covariance of the annual returns of the 491 mutual funds, we consider the covariance regression model Σ(β) = β_0 I_p + Σ_{k=1}^{5} β_k W_k. Table 113.2 reports its parameter estimates and their associated standard errors, t-statistics and p-values. Analogous to Example I, the p-values are computed based on the null hypothesis H_0: β_k = 0 versus the alternative hypothesis H_1: β_k > 0 (or β_k < 0), for k = 0, 1, . . . , 5, respectively. From Table 113.2, we have three interesting findings. First, STY and STD are positively and significantly related to the covariance of the annual returns of the 491 mutual funds at the 10% significance level. Second, the two covariates TNA and CON are positively related to the covariance of annual returns, although they are not significant. These first two findings indicate that if two funds have the same investment objective (i.e., the investment style STY) and similar basic characteristics,
Table 113.2: The covariance regression results for all five covariates.

       Estimate (×10^{-2})   Standard-Error (×10^{-2})   t-Statistic   p-Value
Ip     1.2482                0.2333                      5.3489        0.0000
STY    0.3112                0.2066                      1.5062        0.0660
TNA    0.0199                0.0244                      0.8150        0.2075
CON    0.0024                0.0087                      0.2751        0.3916
DEV    −0.0069               0.0069                      −1.0118       0.1558
STD    0.1472                0.1081                      1.3621        0.0866
including fund size (TNA), portfolio concentration (CON) and historical return volatility (STD), they tend to perform similarly and to have a large correlation between their return series. This is consistent with the empirical evidence in Grinblatt et al. (1995), Brown et al. (1996), Carhart (1997), Daniel et al. (1997), and Goldman et al. (2016). Third, the coefficient of DEV is negative but not significant, with p-value 0.16. This finding indicates that, after controlling for the impact of the other four covariates, the activeness of a mutual fund does not significantly influence the covariance structure, which differs from the findings of Petajisto (2013) and Hunter et al. (2014) for the US financial market. A possible explanation is that the Chinese stock market operates differently from the US market, and this deserves further investigation.

113.5 Conclusions

In this paper, we extend the inference of Zou et al.'s (2017) covariance regression model to non-normal data. Two empirical applications in the US and Chinese stock markets are presented to illustrate the usefulness of the extended model. Both examples show that the similarity matrices induced by some covariates can be partly explained by the other covariates, which can weaken their effects on the covariance. We consider this a multicollinearity effect of similarity matrices in the covariance regression model, which has not been well addressed in Zou et al. (2017). Hence, we believe finding a proper measure to assess the multicollinearity of similarity matrices is worthy of future research, and this effort should broaden the use of the covariance regression model.

Bibliography

Bai, Z. (1999). Methodologies in Spectral Analysis of Large-Dimensional Random Matrices: A Review. Statistica Sinica 9, 611–677.
Bickel, P.J. and Levina, E. (2008a). Covariance Regularization by Thresholding. Annals of Statistics 36, 2577–2604.
Bickel, P.J. and Levina, E. (2008b). Regularized Estimation of Large Covariance Matrices. Annals of Statistics 36, 199–227.
Brown, K., Harlow, W. and Starks, L. (1996). Of Tournaments and Temptations: An Analysis of Managerial Incentives in the Mutual Fund Industry. Journal of Finance 51, 85–110.
Cai, T.T. and Liu, W. (2011). Adaptive Thresholding for Sparse Covariance Matrix Estimation. Journal of the American Statistical Association 106, 672–684.
Carhart, M. (1997). On Persistence in Mutual Fund Performance. Journal of Finance 52, 57–82.
Chang, J., Qiu, Y., Yao, Q. and Zou, T. (2018). Confidence Regions for Entries of a Large Precision Matrix. Journal of Econometrics 206, 57–82.
Daniel, K., Grinblatt, M., Titman, S. and Wermers, R. (1997). Measuring Mutual Fund Performance with Characteristic-Based Benchmarks. Journal of Finance 52, 1035–1058.
Fama, E.F. and French, K.R. (1993). Common Risk Factors in the Returns on Stocks and Bonds. Journal of Financial Economics 33, 3–56.
Fama, E.F. and French, K.R. (2015). A Five-Factor Asset Pricing Model. Journal of Financial Economics 116, 1–22.
Fan, J., Liao, Y. and Mincheva, M. (2013). Large Covariance Estimation by Thresholding Principal Orthogonal Complements. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75, 603–680.
Gatev, E., Goetzmann, W.N. and Rouwenhorst, K.G. (2006). Pairs Trading: Performance of a Relative-Value Arbitrage Rule. Review of Financial Studies 19, 797–827.
Goldman, E., Sun, Z. and Zhou, X. (2016). The Effect of Management Design on the Portfolio Concentration and Performance of Mutual Funds. Financial Analysts Journal 72, 1–13.
Green, T. and Hwang, B. (2009). Price-Based Return Comovement. Journal of Financial Economics 93, 37–50.
Grinblatt, M., Titman, S. and Wermers, R. (1995). Momentum Investment Strategies, Portfolio Performance, and Herding: A Study of Mutual Fund Behavior. American Economic Review 85, 1088–1105.
Hong, H., Kubik, J. and Stein, J. (2005). Thy Neighbor's Portfolio: Word-of-Mouth Effects in the Holdings and Trades of Money Managers. Journal of Finance 60, 2801–2824.
Hou, K., Xue, C. and Zhang, L. (2015). Digesting Anomalies: An Investment Approach. Review of Financial Studies 28, 650–705.
Hunter, D., Kandel, E., Kandel, S. and Wermers, R. (2014). Mutual Fund Performance Evaluation with Active Peer Benchmarks. Journal of Financial Economics 112, 1–29.
Johnson, R.A. and Wichern, D.W. (1992). Applied Multivariate Statistical Analysis. Prentice-Hall, Englewood Cliffs, NJ.
Ledoit, O. and Wolf, M. (2004). A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices. Journal of Multivariate Analysis 88, 365–411.
Lee, L.F. (2004). Asymptotic Distributions of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive Models. Econometrica 72, 1899–1925.
Nofsinger, J. and Sias, R. (1999). Herding and Feedback Trading by Institutional and Individual Investors. Journal of Finance 54, 2263–2295.
Petajisto, A. (2013). Active Share and Mutual Fund Performance. Financial Analysts Journal 69, 73–93.
Pirinsky, C. and Wang, Q. (2006). Does Corporate Headquarters Location Matter for Stock Returns? Journal of Finance 61, 1991–2015.
Pourahmadi, M. (2013). High-Dimensional Covariance Estimation. John Wiley & Sons, New York.
Scharfstein, D. and Stein, J. (1990). Herd Behavior and Investment. American Economic Review 80, 465–479.
Sharpe, W.F. (1964). Capital Asset Prices: A Theory of Market Equilibrium Under Conditions of Risk. Journal of Finance 19, 425–442.
Shiller, R.J. (1989). Comovements in Stock Prices and Comovements in Dividends. Journal of Finance 44, 719–729.
Wermers, R. (1999). Mutual Fund Herding and the Impact on Stock Prices. Journal of Finance 54, 581–622.
Yao, J., Zheng, S. and Bai, Z. (2015). Large Sample Covariance Matrices and High-Dimensional Data Analysis. Cambridge University Press, New York.
Zou, T., Lan, W., Wang, H. and Tsai, C.L. (2017). Covariance Regression Analysis. Journal of the American Statistical Association 112, 266–281.
Chapter 114
Impacts of Time Aggregation on Beta Value and R² Estimations Under Additive and Multiplicative Assumptions: Theoretical Results and Empirical Evidence

Yuanyuan Xiao, Yushan Tang and Cheng Few Lee

Yuanyuan Xiao, Rutgers University, e-mail: [email protected]
Yushan Tang, Rutgers University, e-mail: [email protected]
Cheng Few Lee, Rutgers University, e-mail: cfl[email protected]

Contents

114.1 Introduction
114.2 Time Aggregation on Systematic Risk Coefficient
114.2.1 Impacts of time aggregation on beta under additive assumption
114.2.2 Impacts of time aggregation on beta under multiplicative assumption
114.2.3 Aggressive stock
114.3 Time Aggregation on Estimated R²
114.4 Empirical Evidence
114.4.1 Sample and data
114.4.2 Static model
114.4.3 Dynamic model
114.5 Summary
Bibliography
Appendix 114A
Abstract: Data for big and small market-value firms are used to evaluate the effects of temporal aggregation on beta estimates, t-values, and R² estimates. In addition to our analysis of the standard market model within the additive rates-of-return framework, the standard model under the assumption of multiplicative rates of return is also discussed. Furthermore, a dynamic model is estimated in this study to evaluate differences in the short-term and long-term dynamic relationships between the market and each type of firm. It is found that temporal aggregation has important effects on both the specification of a market model and the stability of beta and R² estimates.

Keywords: Temporal aggregation • Additive and multiplicative rates of return • Random coefficient model • Coefficient of determination • Estimation stability.
114.1 Introduction

The relationship between the investment horizon and systematic risk, as well as the estimation of R², has been investigated in great detail. Lee and Morimune (1978) used the time aggregation method of Zellner and Montmarquette (1971) to show that the estimate of systematic risk and the estimated coefficient of determination (R²) are generally not independent of the length of the investment horizon. Prior research has attributed the horizon dependence of the estimated beta coefficient and R² to different causes. Levhari and Levy (1977) showed that it is the multiplicative nature of the simple rate of return that leads the estimates of systematic risk and R² to vary with the length of the investment horizon. Furthermore, Chen (1980) analyzed the autocorrelation and variance in unaggregated market rates of return and found significant impacts on the magnitudes of the estimated aggregated systematic risk for neutral, aggressive and defensive securities, separately. Hawawini (1983) found that the relationships between beta and the return interval differ for firms with smaller and larger market values: the betas of securities with a smaller market value increase as the return interval is lengthened, whereas the betas of securities with a larger market value decrease. Moreover, Gilbert found that "opaque" firms have high-frequency betas smaller than their low-frequency
betas, while it is the opposite for "transparent" firms. In this paper, we apply the standard market model to analyze the relationship between the unaggregated beta and the aggregated beta within the framework of additive rates of return as well as multiplicative rates of return. We find empirical evidence (using both static and dynamic models) supporting our belief that the estimation of systematic risk and R² is frequency-dependent. In Section 114.2, we derive the relationship between the unaggregated beta and the aggregated beta with additive rates of return and show the results for multiplicative rates of return from Chen (1980). In Section 114.3, we show the derivation of the impact of time aggregation on the estimated coefficient of determination, R². Empirical results are shown in Section 114.4. Finally, Section 114.5 concludes.

114.2 Time Aggregation on Systematic Risk Coefficient

114.2.1 Impacts of time aggregation on beta under additive assumption

Following Schwartz and Whitcomb (1977a,b), the market model for any ith firm or portfolio for a T-year period is defined as

    R_{Tij} = α_{Ti} + β_{Ti} R_{TMj} + U_{Tij},   i = 1, 2, . . . , I and j = 1, 2, . . . , J,   (114.1)

where R_{TMj} = log_e(I_{Tj}/I_{Tj−1}) is the "market" continuously compounded rate of return per annum over the jth subperiod of length T, and R_{Tij} = log_e[(P_{Tj} + D_{Tj})/P_{Tj−1}] is the continuously compounded rate of return per annum over the jth subperiod of length T for the ith firm or portfolio. Then for any tth short period of duration n years (n < T), write the model (dropping the firm index i and the observation index j for compactness) as follows:

    r_t = α_n + β_n r_{Mt} + u_t,   t = 1, 2, . . . , N, where N = T/n.   (114.2)

The relationships between R_T and r_t, and between R_{MT} and r_{Mt}, are defined as follows:

    R_T = Σ_{t=1}^{T/n} r_t,   (114.3a)

    R_{MT} = Σ_{t=1}^{T/n} r_{Mt}.   (114.3b)
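The additivity property behind (114.3a) can be verified directly: continuously compounded returns over disjoint subperiods telescope, so they sum to the long-period log return. A minimal check with made-up month-end prices:

```python
import math

# Monthly continuously compounded returns aggregate by summation (T/n = 3
# months per quarter): the quarterly log return equals the sum of the three
# monthly log returns.
prices = [100.0, 103.0, 101.5, 106.0]          # illustrative month-end prices
monthly = [math.log(prices[t + 1] / prices[t]) for t in range(3)]
quarterly = math.log(prices[-1] / prices[0])
assert abs(sum(monthly) - quarterly) < 1e-12
```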
Equations (114.3a) and (114.3b) show that the return over a long period can be represented by the summation of the rates of return of the individual disjoint subperiods. This property is called additivity. For example, if r_t and r_{Mt} represent monthly rates of return and T/n = 3, then R_T and R_{MT} represent quarterly rates of return. To simplify the analysis, the market model in deviation-from-the-mean form, in terms of monthly rates of return, is defined as follows¹:

    Y_t = βX_t + u_t,   t = 1, 2, . . . , N,   (114.4)

where Y_t = r_t − r̄_t, X_t = r_{Mt} − r̄_{Mt}, β is a scalar parameter, and u_t is a non-autocorrelated error term with E(u_t) = 0 and E(u_t²) = σ² for all t. Following Zellner and Montmarquette (1971) and the definitions in (114.3), the market model without intercept in terms of N-period rates of return is given as follows:

    AY = AXβ + AU,   (114.5a)

where Y = (Y_1, Y_2, . . . , Y_{J×N})′, X = (X_1, X_2, . . . , X_{J×N})′, U = (u_1, u_2, . . . , u_{J×N})′, and A is a J × (J × N) matrix of the form

        [ I  o  . . .  o ]
    A = [ o  I  . . .  o ]
        [ . . . . . . . .]
        [ o  o  . . .  I ],

where I = (1, 1, . . . , 1) and o = (0, 0, . . . , 0) are row vectors of dimension 1 × N. In addition, we specify

    A_j Y = βA_j X + A_j U,   (114.5b)

where A_j is the jth row of A. Given the assumptions made about the elements of U in connection with (114.4), we have E(AU) = 0, and the J × J covariance matrix of AU in (114.5a) is

    E(AUU′A′) = σ² diag(N, N, . . . , N).   (114.6)

¹ A model with autocorrelated residuals is developed in Appendix A.
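The structure of A and the covariance result (114.6) can be checked numerically. A compact way to build A is a Kronecker product of an identity matrix with the 1 × N summing vector (an implementation convenience, not the chapter's notation):

```python
import numpy as np

# A stacks J copies of the 1xN summing vector (1, ..., 1) along the block
# diagonal, i.e., A = I_J kron (1, ..., 1). Then AU has covariance
# sigma^2 * A A' = sigma^2 * N * I_J, the diagonal matrix in (114.6).
J, N = 4, 5
ones_row = np.ones((1, N))
A = np.kron(np.eye(J), ones_row)           # J x (J*N)
assert A.shape == (J, J * N)
assert np.allclose(A @ A.T, N * np.eye(J))
```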
The OLS estimator of β is defined as

    β̃ = (X′A′AX)^{−1} X′A′AY.   (114.7)

The minimum variance linear unbiased (MVLU) estimator for β is defined as

    β* = [X′A′(AA′)^{−1}AX]^{−1} X′A′(AA′)^{−1}AY = [X′A′AX]^{−1} X′A′AY = β̃.   (114.8)

This result indicates that the ordinary least squares (OLS) estimator of systematic risk is equivalent to the generalized least squares (GLS) estimator; hence the OLS estimator is an MVLU estimator. Equations (114.4) and (114.5) are simple regressions without intercepts. The slope associated with the aggregated data can be defined as
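The equivalence in (114.8) can be confirmed numerically: since AA′ = N I_J, the GLS weighting matrix is a scalar multiple of the identity and cancels, so the aggregated-data GLS estimator equals the aggregated-data OLS estimator. A simulated check (all numbers illustrative):

```python
import numpy as np

# Check of (114.8): with AA' = N * I_J, GLS on the aggregated data reduces
# to OLS on the aggregated data.
rng = np.random.default_rng(1)
J, N = 50, 4
A = np.kron(np.eye(J), np.ones((1, N)))
X = rng.standard_normal(J * N)
Y = 1.2 * X + 0.5 * rng.standard_normal(J * N)
AX, AY = A @ X, A @ Y
ols = (AX @ AY) / (AX @ AX)
V = A @ A.T                                   # = N * I_J
gls = (AX @ np.linalg.solve(V, AY)) / (AX @ np.linalg.solve(V, AX))
assert abs(ols - gls) < 1e-8
```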
    β_a = Cov(A_j Y, A_j X) / Var(A_j X) = r_a [Var(A_j Y)/Var(A_j X)]^{1/2},   (114.9)

where r_a is the correlation coefficient between A_j Y and A_j X. Equation (114.9) can be used to examine whether the ex ante systematic risk associated with additive rates of return is affected by time aggregation when there exists autocorrelation in the market rates of return. Following equation (114.9), it can be shown that

    Var(A_j X) = A_j Var(X) A_j′ = I Var(X_j) I′ = Var(X_t)[N + 2 Σ_{s=1}^{N−1} (N − s)ρ_X^s],   (114.10)

where X_j is the column vector containing the N market rates of return in the jth aggregated period. The above equation is derived by assuming that Var(X_t) = Var(X_{t−s}) for s = 1, 2, . . . , N − 1, and that the lag-s autocovariances of X_t are all equal, i.e., Cov(X_t, X_{t−s}) = Cov(X_{t−k}, X_{t−k−s}) for all k and s. Here ρ_X^s denotes the autocorrelation coefficient of order s of the explanatory variable, which is the excess expected return of the market. For simplicity, we denote P_N = 2 Σ_{i=1}^{N−1} (N − i)ρ_X^i. Furthermore, as shown in the appendix to this chapter, it can be obtained that

    Cov(A_j Y, A_j X) = Cov(X_t, Y_t)[N + Γ_N],   (114.11)

where

    Γ_N = Σ_{s=1}^{N−1} (N − s)(ρ_XY^s + ρ_XY^{−s})/ρ_XY = Σ_{s=1}^{N−1} (N − s) q_XY^s.   (114.12)
Equation (114.11) is the Covariance–Time Function (C–T Function), as denoted in Hawawini (1980). The ratio (ρ_XY^s + ρ_XY^{−s})/ρ_XY is called the q-ratio of order s: it is the sum of the lead (ρ_XY^s) and lag (ρ_XY^{−s}) intertemporal cross-correlation coefficients of order s in unit (unaggregated) returns, divided by the contemporaneous cross-correlation coefficient ρ_XY. We denote this ratio by q_XY^s.

From equations (114.10) and (114.11) and the definition of the coefficient of systematic risk, we obtain

    β_a = Cov(X_t, Y_t)[N + Γ_N] / {Var(X_t)[N + P_N]}
        = β_u × [N + Σ_{s=1}^{N−1} (N − s)q_XY^s] / [N + 2 Σ_{s=1}^{N−1} (N − s)ρ_X^s].   (114.13)

Denote the time aggregation factor as Φ(N) = [N + Σ_{s=1}^{N−1} (N − s)q_XY^s] / [N + 2 Σ_{s=1}^{N−1} (N − s)ρ_X^s], so that β_a = β_u Φ(N). Therefore, the ex ante systematic risk associated with additive rates of return is frequency-dependent. The direction and magnitude of the time-aggregation effect on the estimated systematic risk coefficient are given by the sign and absolute value of the difference between Φ(N) and 1: the effect is positive (the aggregated beta is larger than the unaggregated one) if Φ(N) is larger than 1, and negative if Φ(N) is smaller than 1. Furthermore, as the time interval varies, a larger difference between the Phi multiplier Φ(N) and 1 indicates a bigger gap between the aggregated and unaggregated estimates of beta. In Section 114.4, we test this result by calculating Φ for each firm and each time frequency and comparing it with the actual direction of change of the estimated beta.
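The time-aggregation factor Φ(N) is straightforward to compute once the unit-period autocorrelations ρ_X^s and q-ratios q_XY^s are estimated; a small helper (our own, for illustration):

```python
def phi(N, rho_x, q_xy):
    """Time-aggregation factor Phi(N) from (114.13). rho_x[s-1] is the
    order-s autocorrelation of unit-period market returns; q_xy[s-1] is the
    order-s q-ratio. Both lists need at least N-1 entries."""
    num = N + sum((N - s) * q_xy[s - 1] for s in range(1, N))
    den = N + 2 * sum((N - s) * rho_x[s - 1] for s in range(1, N))
    return num / den

# With no autocorrelation and no lead/lag cross-correlation, Phi(N) = 1 and
# the aggregated beta equals the unaggregated beta.
print(phi(4, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))   # 1.0
# Positive q-ratios with zero market autocorrelation push Phi above 1.
print(phi(4, [0.0, 0.0, 0.0], [0.1, 0.1, 0.1]))
```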
114.2.2 Impacts of time aggregation on beta under multiplicative assumption

We continue our discussion under the multiplicative rates-of-return assumption with Schwartz and Whitcomb's (1977a,b) T-year market model for any ith firm or portfolio. In the multiplicative framework, R_{TMj} = (I_{Tj} − I_{Tj−1})/I_{Tj−1} is the "market" simple rate of return over the jth subperiod of length T, and R_{Tij} = (P_{Tj} + D_{Tj} − P_{Tj−1})/P_{Tj−1} is the simple rate of return over the jth subperiod of length T
for the ith firm or portfolio. Sometimes (1 + R) is used to denote the gross rate of return. Then for any tth short period of duration n years (n < T), the n-year model is the same model used in the previous section. The relationships between R_T and r_t, and between R_{MT} and r_{Mt}, are defined, respectively, as follows:

    (1 + R_T) = Π_{t=1}^{T/n} (1 + r_t),   (114.14a)

and

    (1 + R_{MT}) = Π_{t=1}^{T/n} (1 + r_{Mt}).   (114.14b)
βN =
(114.15)
where rt = the security rate of return, rM t = the market rate of return,
N 2 N −t 2 t 2 = Var(r (μM ) μi = E(rt ), μM = E(rM t ), σM M t ); at = t (σM )
−r and α = (βu − 1) μM with r = riskless interest rate as defined μM by Levhari–Levy; bN and cN represent the effects of autocorrelation in the market rates of return on Cov(r1 , r2 , . . . , rN , rM 1 , rM 2 , . . . , rM N ) and Var(rM 1 , rM 2 , . . . , rM N ), respectively. Equation (114.15) reduces to Levhari– Levy equation if the market rates of return assumed to be independently distributed (bN = cN = 0). To investigate the impact of autocorrelation
page 3953
July 6, 2020
16:5
Handbook of Financial Econometrics,. . . (Vol. 4)
3954
9.61in x 6.69in
b3568-v4-ch114
Y. Xiao, Y. Tang & C. F. Lee
in one-period systematic risk associated with multiplicative rates of return, equation (114.15) is explored as given in Sections 114.2.2.1–114.2.2.3.

114.2.2.1 Neutral stock

The N-period systematic risk for a neutral stock (β_u = 1) can be shown to be

    β_N* = [Σ_{t=0}^{N−1} a_t + b_N] / [Σ_{t=0}^{N−1} a_t + c_N].   (114.16)

Thus, β_N* = 1 = β_u only if b_N = c_N = 0. In other words, β_N* cannot be equal to one unless the terms b_N and c_N, associated with autocorrelation in the one-period market rates of return, are zero, which requires the assumption of independently distributed market rates of return. Therefore, β_N* is not equal to one for a neutral stock if there exists autocorrelation in the market rates of return.
114.2.2.2 Aggressive stock

A stock which is more volatile than the market is called an aggressive stock; i.e., β_u > 1. In this case, α > 0 and β_u^{N−i} ≥ β_u, i = 0, 1, . . . , N − 1. Then, following equation (114.15), it is easy to show that

    β_N* ≥ β_u [Σ_{t=0}^{N−1} a_t + b_N/β_u] / [Σ_{t=0}^{N−1} a_t + c_N].   (114.17)

As indicated by Levhari and Levy, β_N* > β_u if the rates of return are assumed to be identically and independently distributed. However, equation (114.17) implies that β_N* has a lower bound greater or less than β_u, depending on the magnitudes of b_N and c_N. If b_N/β_u is greater than c_N, β_N* has a lower bound greater than β_u; in this case, the Levhari–Levy conclusion, β_N* > β_u, is still valid. However, if b_N/β_u is less than c_N, β_N* has a lower bound less than β_u. This means that β_N* could be less than β_u for some N. This finding explains why, for some stocks, β_N* is not greater than β_u for some N in the Levhari–Levy empirical results. Thus, the existence of autocorrelation in the market rates of return can increase or decrease the magnitude of the N-period systematic risk associated with an aggressive stock.
114.2.2.3 Defensive stock

In this case, β_u < 1, α < 0, and β_u^{N−i} ≤ β_u, i = 0, 1, . . . , N − 1. Then, it can be shown that

    β_N* ≤ β_u [Σ_{t=0}^{N−1} a_t + b_N/β_u] / [Σ_{t=0}^{N−1} a_t + c_N].   (114.18)

If b_N/β_u is less than c_N, the same result, β_N* < β_u, as found by Levhari and Levy, is obtained, since β_N* has an upper bound less than β_u. In addition, β_N* may even become negative if, for some N, b_N or c_N is negative such that either the numerator or the denominator of equation (114.18) becomes negative. Specifically, the effect of autocorrelation in the market rates of return may decrease the magnitude of β_N* associated with a defensive stock, and, for some N, the N-period systematic risk may become negative. This result helps explain why some estimated N-period systematic risks associated with defensive stocks are negative in the Levhari–Levy empirical results. Therefore, the existence of autocorrelation in the market rates of return can be used to interpret the empirical results found by Levhari and Levy.
114.3 Time Aggregation on Estimated R2

Jacob (1971) has found that the $R^2$ estimated from monthly data is smaller than that from both quarterly and annual data; Altman, Jacquillat and Levasseur (1974) have found that the $R^2$ estimated from quarterly data is smaller than that from both semiannual and annual data; and McDonald (1974) has found that the $R^2$ obtained from monthly mutual fund data is smaller than that from both quarterly and annual mutual fund data. Schwartz and Whitcomb (1977a,b) have tried to explain these findings by the time-variance relationship. Here, a new approach is used to explain the impact of time aggregation on the estimated $R^2$. Given (114.5)–(114.8), to derive the relationship between the $R^2$ in terms of daily rates of return and the $R^2$ in terms of weekly rates of return, we first calculate the variance of the aggregated returns,

$$\operatorname{Var}(AY) = \beta_{\text{daily}}^2\, I'\operatorname{Var}(X)\, I + I'I\,\sigma^2, \qquad (114.19)$$

where $I = (1, 1, 1, 1, 1)'$
and

$$\operatorname{Var}(X) = \begin{bmatrix}
\operatorname{Var}(X_t) & \operatorname{Cov}(X_{t,t-1}) & \operatorname{Cov}(X_{t,t-2}) & \operatorname{Cov}(X_{t,t-3}) & \operatorname{Cov}(X_{t,t-4}) \\
\operatorname{Cov}(X_{t,t-1}) & \operatorname{Var}(X_{t-1}) & \operatorname{Cov}(X_{t-1,t-2}) & \operatorname{Cov}(X_{t-1,t-3}) & \operatorname{Cov}(X_{t-1,t-4}) \\
\operatorname{Cov}(X_{t,t-2}) & \operatorname{Cov}(X_{t-1,t-2}) & \operatorname{Var}(X_{t-2}) & \operatorname{Cov}(X_{t-2,t-3}) & \operatorname{Cov}(X_{t-2,t-4}) \\
\operatorname{Cov}(X_{t,t-3}) & \operatorname{Cov}(X_{t-1,t-3}) & \operatorname{Cov}(X_{t-2,t-3}) & \operatorname{Var}(X_{t-3}) & \operatorname{Cov}(X_{t-3,t-4}) \\
\operatorname{Cov}(X_{t,t-4}) & \operatorname{Cov}(X_{t-1,t-4}) & \operatorname{Cov}(X_{t-2,t-4}) & \operatorname{Cov}(X_{t-3,t-4}) & \operatorname{Var}(X_{t-4})
\end{bmatrix}$$

$$= \operatorname{Var}(X_t) \begin{bmatrix}
1 & \rho_X^1 & \rho_X^2 & \rho_X^3 & \rho_X^4 \\
\rho_X^1 & 1 & \rho_X^1 & \rho_X^2 & \rho_X^3 \\
\rho_X^2 & \rho_X^1 & 1 & \rho_X^1 & \rho_X^2 \\
\rho_X^3 & \rho_X^2 & \rho_X^1 & 1 & \rho_X^1 \\
\rho_X^4 & \rho_X^3 & \rho_X^2 & \rho_X^1 & 1
\end{bmatrix}, \qquad (114.20)$$
where the $\rho_X^i$'s ($i = 1, 2, 3, 4$) are the autocorrelation coefficients of the daily market index. In addition, according to equation (114.4), the variance of $Y_t$ is defined as follows:

$$\operatorname{Var}(Y_t) = \beta_{\text{daily}}^2 \operatorname{Var}(X_t) + \sigma^2. \qquad (114.21)$$
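Equation (114.20) says that $\operatorname{Var}(X)$ is Toeplitz in the autocorrelation lags, which is what makes the quadratic form $I'\operatorname{Var}(X)I$ used below collapse to $\operatorname{Var}(X_t)(5 + 8\rho_X^1 + 6\rho_X^2 + 4\rho_X^3 + 2\rho_X^4)$. A minimal NumPy sketch of this identity (the $\rho$ values are illustrative assumptions, not estimates from the chapter):

```python
import numpy as np

# Illustrative daily autocorrelations rho_X^1..rho_X^4 (assumed values, not
# estimates from the chapter) and a normalized Var(X_t).
rho = [0.10, 0.05, 0.02, 0.01]
var_x = 1.0

# Build the 5x5 matrix of equation (114.20): entry (s, t) is rho^{|s-t|}.
corr = np.array([[1.0 if s == t else rho[abs(s - t) - 1] for t in range(5)]
                 for s in range(5)])
var_X = var_x * corr              # Var(X) of five consecutive daily returns
I = np.ones(5)

# The quadratic form I' Var(X) I equals the closed form used in (114.24).
quad = I @ var_X @ I
closed = var_x * (5 + 8 * rho[0] + 6 * rho[1] + 4 * rho[2] + 2 * rho[3])
print(np.isclose(quad, closed))   # prints True
```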
Equations (114.19) and (114.21) imply that there exists a relationship between the variance of daily data and the variance of weekly data. Based on the definition of $R^2$, the "daily" and "weekly" population goodness-of-fit measures, $R_{\text{daily}}^2$ and $R_{\text{weekly}}^2$, can be defined as

$$R_{\text{daily}}^2 = \frac{\beta_{\text{daily}}^2 \operatorname{Var}(X_t)}{\beta_{\text{daily}}^2 \operatorname{Var}(X_t) + \sigma^2}, \qquad (114.22)$$
$$R_{\text{weekly}}^2 = \frac{\beta_{\text{daily}}^2\, I'\operatorname{Var}(X)\, I}{\beta_{\text{daily}}^2\, I'\operatorname{Var}(X)\, I + 5\sigma^2}. \qquad (114.23)$$

Substituting (114.20) into (114.23) yields

$$R_{\text{weekly}}^2 = \frac{1}{1 + \dfrac{5\sigma^2}{\operatorname{Var}(X_t)\left(5 + 8\rho_X^1 + 6\rho_X^2 + 4\rho_X^3 + 2\rho_X^4\right)\beta_{\text{daily}}^2}}. \qquad (114.24)$$
Note that from equation (114.22) we have

$$\frac{\sigma^2}{\beta_{\text{daily}}^2 \operatorname{Var}(X_t)} = \frac{1 - R_{\text{daily}}^2}{R_{\text{daily}}^2}. \qquad (114.25)$$
Finally, we substitute (114.25) into (114.24) and obtain

$$R_{\text{weekly}}^2 = \frac{R_{\text{daily}}^2}{R_{\text{daily}}^2 + k_{d\text{-}w}\left(1 - R_{\text{daily}}^2\right)}, \qquad (114.26)$$
where $k_{d\text{-}w}$ is the daily-to-weekly multiplier for the $R^2$'s estimated by using daily data and weekly data:

$$k_{d\text{-}w} = \frac{1}{1 + \frac{8}{5}\rho_X^1 + \frac{6}{5}\rho_X^2 + \frac{4}{5}\rho_X^3 + \frac{2}{5}\rho_X^4}. \qquad (114.27)$$
Thus, if $k_{d\text{-}w} < 1$, then $R_{\text{weekly}}^2 > R_{\text{daily}}^2$, which implies that the estimated goodness of fit grows as the time interval increases. On the other hand, if $k_{d\text{-}w} > 1$, then $R_{\text{weekly}}^2 < R_{\text{daily}}^2$, meaning that the $R^2$ estimated using weekly data would be smaller than that estimated using daily data. In Section 114.4, we will provide empirical results concerning the change in the estimated $R^2$'s as the investment horizon increases.
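Equations (114.26)-(114.27) can be coded directly as a quick sanity check; the autocorrelation values below are illustrative assumptions, not estimates from the chapter:

```python
def k_daily_to_weekly(rho):
    """Daily-to-weekly multiplier of equation (114.27); rho = [rho1, ..., rho4]."""
    r1, r2, r3, r4 = rho
    return 1.0 / (1.0 + 8/5 * r1 + 6/5 * r2 + 4/5 * r3 + 2/5 * r4)

def r2_weekly_from_daily(r2_daily, k):
    """Equation (114.26): map the daily R^2 into the weekly R^2."""
    return r2_daily / (r2_daily + k * (1.0 - r2_daily))

# Positive daily autocorrelation in the index (illustrative values) gives
# k < 1, so aggregation raises R^2; zero autocorrelation gives k = 1 and
# leaves R^2 unchanged.
k_pos = k_daily_to_weekly([0.10, 0.05, 0.02, 0.01])
print(k_pos < 1.0, r2_weekly_from_daily(0.40, k_pos) > 0.40)  # prints True True
print(round(r2_weekly_from_daily(0.40, k_daily_to_weekly([0, 0, 0, 0])), 12))
```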
Following the derivation above, a similar result can be obtained for the relationship between the estimated $R^2$ associated with weekly data and that associated with monthly data. To obtain it, notice that the transformation matrix is now

$$A = \begin{bmatrix} I & o & \ldots & o \\ o & I & \ldots & o \\ \ldots & \ldots & \ldots & \ldots \\ o & o & \ldots & I \end{bmatrix},$$

where $I = (1, 1, 1, 1)$. Under this circumstance, the $R^2$ associated with monthly data ($R_{\text{monthly}}^2$) and the $R^2$ associated with weekly data ($R_{\text{weekly}}^2$) are related as follows:

$$R_{\text{monthly}}^2 = \frac{R_{\text{weekly}}^2}{R_{\text{weekly}}^2 + k_{w\text{-}m}\left(1 - R_{\text{weekly}}^2\right)}, \qquad (114.28)$$
where $k_{w\text{-}m}$ is the weekly-to-monthly multiplier for the $R^2$'s estimated by using weekly data and monthly data:

$$k_{w\text{-}m} = \frac{1}{1 + \frac{3}{2}\bar{\rho}_X^1 + \bar{\rho}_X^2 + \frac{1}{2}\bar{\rho}_X^3}. \qquad (114.29)$$
The $\bar{\rho}_X^i$'s ($i = 1, 2, 3$) here represent the autocorrelation coefficients of the weekly market index.

114.4 Empirical Evidence

114.4.1 Sample and data

The daily, weekly, and monthly rates of return from July 1, 2008 to July 1, 2018 for a sample of Dow Jones 30 companies and 30 Small-Cap companies listed on NASDAQ and NYSE are used in this section for the empirical analysis. Based on the finding of Hawawini (1983) that the impact of the investment horizon on beta estimation differs for firms of different sizes, we decide to use two groups of samples, big and small companies in terms of their market capitalization. For simplicity, we use the Dow Jones 30 companies to represent "big" companies. We then rank all the companies in the NASDAQ and NYSE Small-Cap lists in ascending order and choose the top 30 companies, which compose the sample of "small" companies.
114.4.2 Static model

Daily, weekly, and monthly data for "big" and "small" firms are used to estimate equation (114.1). First, we test the Phi multiplier, $\Phi(N)$, under the circumstance that additive rates of return are applied. To do this, we calculate the Phi multiplier derived in Section 114.2 and compare it with the actual changing trend of beta. The comparison is shown in Table 114.1. It can be found in Table 114.1 that the Phi multipliers predict well the changing trend of aggregated betas when betas are estimated using additive rates of return, and this holds for both the big-firm and the small-firm samples. For example, Caterpillar Inc., whose Phi multiplier connecting daily and weekly betas, $\Phi(5)$, is larger than 1 (1.15), should have a larger beta estimate with weekly data than with daily data, and this can be observed in Table 114.1. In addition, Transcontinental Realty Investors, Inc. has a Phi multiplier connecting weekly and monthly betas, $\Phi(4)$, of 0.17, and is therefore expected to have a smaller beta estimate when using monthly data than when using weekly data; supporting evidence can again be found in Table 114.1. These results provide solid evidence for our derivation in Section 114.2, which asserts that, after taking the effect of lead and lag cross-correlations into consideration, the Phi multipliers for unaggregated and aggregated betas can truly predict the changing pattern of betas. Moreover, there is sufficient evidence for the claim that the Phi multiplier can describe the magnitude of the change in aggregated betas. To see this, take the Phi multiplier connecting weekly and monthly betas, $\Phi(4)$, as an example. DowDuPont Inc. has a larger multiplier (1.19) than United Technologies Corporation (1.08), and the actual growth in the beta estimate for DowDuPont Inc. is 18.42%, whereas the growth rate for United Technologies Corporation is 8.33%.
Thus, we believe the Phi multiplier can depict well the magnitude of the change in betas as the time interval increases. Hence, it is plausible to say that the Phi multiplier is reliable for predicting both the direction and the magnitude of the change in the beta estimate when the time interval increases. To further analyze the changing trends of betas and $R^2$'s, we display the summary of market model estimates of betas and $R^2$'s obtained by using additive and multiplicative rates of return in panels (I) and (II) of Table 114.2, respectively. It can be seen from both panels that, for large firms (30 Dow Jones firms), the beta estimate declines as the investment horizon grows according
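The direction check described above can be scripted directly from the Table 114.1 entries (the three firms below are copied from the table):

```python
# (beta_daily, Phi(5), beta_weekly, Phi(4), beta_monthly), copied from Table 114.1.
firms = {
    "CAT":  (1.244, 1.146, 1.392, 1.162, 1.739),
    "DWDP": (1.334, 1.094, 1.518, 1.186, 1.795),
    "UTX":  (0.955, 1.046, 0.961, 1.082, 1.044),
}

for name, (b_d, phi5, b_w, phi4, b_m) in firms.items():
    # A multiplier above 1 predicts a larger beta at the longer horizon,
    # below 1 a smaller one; check the prediction against the actual direction.
    ok_weekly = (phi5 > 1.0) == (b_w > b_d)
    ok_monthly = (phi4 > 1.0) == (b_m > b_w)
    print(name, ok_weekly, ok_monthly)   # each line prints: <firm> True True
```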
Table 114.1: Multipliers and actual changing trend of betas.

Firms | βdaily  Φ(5)   βweekly  Φ(4)   βmonthly || Firms | βdaily  Φ(5)   βweekly  Φ(4)   βmonthly
AAPL  | 0.942  1.084  1.034  1.028  0.861 || MEN   | 0.942  1.084  1.034  1.028  0.861
AXP   | 1.426  1.000  1.420  1.065  1.503 || JOF   | 1.426  1.000  1.420  1.065  1.503
BA    | 1.031  1.126  1.111  1.031  1.236 || NYNY  | 1.031  1.126  1.111  1.031  1.236
CAT   | 1.244  1.146  1.392  1.162  1.739 || HURC  | 1.244  1.146  1.392  1.162  1.739
CSCO  | 1.044  1.029  1.103  0.958  1.129 || SFST  | 1.044  1.029  1.103  0.958  1.129
CVX   | 1.065  0.964  1.017  0.865  0.859 || PMF   | 1.065  0.964  1.017  0.865  0.859
DIS   | 1.066  1.042  1.143  1.009  1.252 || MCI   | 1.066  1.042  1.143  1.009  1.252
DWDP  | 1.334  1.094  1.518  1.186  1.795 || MLR   | 1.334  1.094  1.518  1.186  1.795
GS    | 1.389  1.010  1.376  0.975  1.424 || TCI   | 1.389  1.010  1.376  0.975  1.424
HD    | 0.930  1.096  0.984  1.001  0.880 || SMMF  | 0.930  1.096  0.984  1.001  0.880
IBM   | 0.764  1.080  0.817  0.918  0.683 || PFL   | 0.764  1.080  0.817  0.918  0.683
INTC  | 1.012  1.009  0.993  0.975  1.045 || GPX   | 1.012  1.009  0.993  0.975  1.045
JNJ   | 0.582  0.965  0.526  1.066  0.641 || CSWC  | 0.582  0.965  0.526  1.066  0.641
JPM   | 1.590  0.931  1.611  0.807  1.447 || FNHC  | 1.590  0.931  1.611  0.807  1.447
KO    | 0.566  0.996  0.508  0.920  0.532 || CULP  | 0.566  0.996  0.508  0.920  0.532
MCD   | 0.572  0.877  0.526  0.791  0.474 || ACER  | 0.572  0.877  0.526  0.791  0.474
MMM   | 0.865  1.000  0.774  1.113  0.848 || PHX   | 0.865  1.000  0.774  1.113  0.848
MRK   | 0.785  1.009  0.795  0.872  0.785 || GUT   | 0.785  1.009  0.795  0.872  0.785
MSFT  | 0.967  0.948  0.910  0.944  0.968 || BGT   | 0.967  0.948  0.910  0.944  0.968
NKE   | 0.899  0.975  1.003  0.815  0.847 || SENEB | 0.899  0.975  1.003  0.815  0.847
PFE   | 0.768  0.913  0.674  0.957  0.875 || INBK  | 0.768  0.913  0.674  0.957  0.875
PG    | 0.574  0.822  0.480  0.976  0.473 || PCQ   | 0.574  0.822  0.480  0.976  0.473
TRV   | 1.017  0.789  0.753  0.923  0.803 || PTH   | 1.017  0.789  0.753  0.923  0.803
UNH   | 1.001  1.051  0.973  0.960  1.049 || AVK   | 1.001  1.051  0.973  0.960  1.049
UTX   | 0.955  1.046  0.961  1.082  1.044 || BLE   | 0.955  1.046  0.961  1.082  1.044
V     | 0.969  0.936  0.903  0.955  0.721 || INSG  | 0.969  0.936  0.903  0.955  0.721
VZ    | 0.682  0.909  0.570  0.899  0.534 || RLH   | 0.682  0.909  0.570  0.899  0.534
WBA   | 0.767  1.024  0.840  1.150  0.893 || RFI   | 0.767  1.024  0.840  1.150  0.893
WMT   | 0.506  0.888  0.468  0.839  0.410 || MEIP  | 0.506  0.888  0.468  0.839  0.410
XOM   | 0.938  0.839  0.790  0.783  0.663 || FC    | 0.938  0.839  0.790  0.783  0.663
Table 114.2: Summary for market model estimates of betas and R2's.

                       Arithmetic mean        Median
Aggregation            β        R2            β        R2

Panel (I) Additive Returns
Dow Jones 30 companies
Daily                  0.942    0.495         0.949    0.514
Weekly                 0.932    0.470         0.936    0.478
Monthly                0.947    0.476         0.868    0.507
30 Small-cap companies
Daily                  0.661    0.152         0.639    0.085
Weekly                 0.772    0.172         0.783    0.113
Monthly                0.878    0.225         0.957    0.156

Panel (II) Multiplicative Returns
Dow Jones 30 companies
Daily                  0.945    0.494         0.952    0.513
Weekly                 0.936    0.470         0.931    0.480
Monthly                0.954    0.469         0.883    0.499
30 Small-cap companies
Daily                  0.666    0.153         0.622    0.087
Weekly                 0.783    0.174         0.783    0.123
Monthly                0.818    0.213         0.894    0.149
to the median value: (i) when additive returns are applied in estimation, it drops from 0.949 using daily returns to 0.936 with weekly returns, and decreases further to 0.868 when estimated using monthly data; (ii) when applying multiplicative returns, it falls from 0.952 with daily data to 0.931 with weekly data, and then drops further to 0.883 when estimated using monthly returns. On the other hand, for small firms (30 Small-Cap firms), the changing trend of the estimated beta is exactly the opposite in terms of median values: (i) when estimated with additive returns, it increases from 0.639 with daily returns to 0.783 with weekly returns, and grows further to 0.957 when estimated with monthly data; (ii) when estimated with multiplicative returns, the estimate rises from 0.622 with daily data to 0.783 with weekly data, and then increases to 0.894 when estimated with monthly returns. Note that while the decreasing trend for big firms and the increasing trend for small firms are apparent when analyzing median values, the arithmetic means do not provide equally solid evidence. We believe this might be because our sample is not large, and that we should rely more on median values when dealing with samples as small as 30 observations. When it comes to the variation of the estimated $R^2$'s as the time interval increases, a general pattern can be concluded from the median values.
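For reference, the difference between the two return conventions used in the two panels can be illustrated for one week of daily data (a common construction; the chapter's exact definitions are given in Section 114.2, and the numbers below are illustrative):

```python
import numpy as np

# Five daily rates of return making up one week (illustrative numbers).
daily = np.array([0.010, -0.004, 0.006, 0.002, -0.003])

additive_weekly = daily.sum()                        # simple sum of daily rates
multiplicative_weekly = np.prod(1.0 + daily) - 1.0   # compounded (geometric) rate

# The two conventions agree to first order but differ for nonzero returns.
print(round(additive_weekly, 6), round(multiplicative_weekly, 6))
```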
With either additive or multiplicative returns, the estimated $R^2$'s of big firms tend to decline as the time interval grows, whereas for small firms the $R^2$ estimates increase. Results in greater detail are available in Tables 114.3–114.6. The results for big companies using additive and multiplicative rates of return are shown in Tables 114.3 and 114.4, respectively, and Tables 114.5 and 114.6 show the results for small companies using additive and multiplicative rates of return. With the help of these tables, we can clearly observe how beta estimates change with the time interval for aggressive and defensive companies. First, analyzing the results obtained with additive rates of return (Tables 114.3 and 114.5), we find that 11 out of 18 aggressive companies (with unit beta larger than 1) have increasing betas as the length of the time interval becomes longer, while 7 of them have decreasing beta estimates with an increasing time interval. At the same time, 21 out of 42 defensive companies (with unit beta smaller than 1) have ever smaller beta estimates as the investment horizon increases. Moreover, it can be noticed that among the 24 defensive small companies, 18 have increasing betas as the time interval increases. This implies that the frequency-dependency pattern of defensive companies differs for companies of different sizes, which is an interesting further finding beyond what is predicted in Section 114.2.2.3. When it comes to the results obtained with multiplicative rates of return (shown in Tables 114.4 and 114.6), similar results can be obtained as well. Specifically, 9 out of 18 aggressive companies have higher beta estimates as the investment horizon gets longer; on the other hand, 11 of them have smaller estimates. Meanwhile, 19 out of 42 defensive companies have declining beta estimates as the time interval increases.
Furthermore, we can also see the interesting phenomenon that the majority of defensive small companies have larger beta estimates as the time interval increases; specifically, 18 of the 24 defensive small companies show this tendency.

114.4.3 Dynamic model

Besides the static market model, to measure the effect of a continuous one-unit change in $R_m$ on the value of $R_i$, a dynamic model is estimated with daily, weekly, monthly, and quarterly data. To compute the long-run multiplier, the following dynamic model is estimated:

$$R_{i,t} = \alpha + \beta_{i1} R_{i,t-1} + \beta_{i2} R_{m,t} + \varepsilon_{it}. \qquad (114.30)$$
Table 114.3: Estimated β coefficient for the static model with additive rates of return: Dow Jones 30 companies (July 1, 2008–July 1, 2018).

Firms | Daily (1): β, t-value, DW, R2 | Weekly (2): β, t-value, DW, R2 | Monthly (3): β, t-value, DW, R2
AAPL  | 0.942  41.206  1.935  0.403 | 1.034  18.361  2.117  0.393 | 0.861   6.995  2.038  0.293
AXP   | 1.426  59.044  2.064  0.581 | 1.420  25.529  2.109  0.556 | 1.503  13.006  1.980  0.589
BA    | 1.031  52.837  1.986  0.526 | 1.111  24.887  2.116  0.544 | 1.236  13.295  1.940  0.600
CAT   | 1.244  58.930  1.934  0.580 | 1.392  27.481  2.075  0.592 | 1.739  14.904  2.172  0.653
CSCO  | 1.044  52.612  2.011  0.524 | 1.103  25.988  2.040  0.565 | 1.129  11.459  2.211  0.527
CVX   | 1.065  65.504  1.999  0.630 | 1.017  27.941  2.068  0.600 | 0.859  10.766  1.912  0.496
DIS   | 1.066  65.173  2.058  0.628 | 1.143  31.861  2.284  0.661 | 1.252  18.528  2.060  0.744
DWDP  | 1.334  52.323  2.032  0.521 | 1.518  26.657  1.851  0.577 | 1.795  13.198  2.005  0.596
GS    | 1.389  54.360  2.044  0.540 | 1.376  23.281  1.892  0.510 | 1.424  12.705  1.556  0.578
HD    | 0.930  54.144  2.047  0.538 | 0.984  25.375  2.115  0.553 | 0.880  11.532  1.933  0.530
IBM   | 0.764  49.223  1.919  0.491 | 0.817  22.251  2.183  0.488 | 0.683   9.128  1.942  0.414
INTC  | 1.012  52.346  2.098  0.521 | 0.993  23.465  1.927  0.514 | 1.045  10.895  1.961  0.501
JNJ   | 0.582  49.462  2.061  0.493 | 0.526  20.054  2.021  0.436 | 0.641  11.679  1.897  0.536
JPM   | 1.590  58.633  2.036  0.578 | 1.611  25.619  2.306  0.558 | 1.447  12.657  2.045  0.576
KO    | 0.566  40.094  2.131  0.390 | 0.508  17.518  1.960  0.371 | 0.532   7.748  2.585  0.337
MCD   | 0.572  40.540  2.018  0.395 | 0.526  16.851  2.203  0.353 | 0.474   7.167  1.898  0.303
MMM   | 0.865  64.844  2.005  0.626 | 0.774  23.237  2.271  0.509 | 0.848  12.315  1.976  0.562
MRK   | 0.785  41.642  1.954  0.408 | 0.795  18.397  2.078  0.394 | 0.785   8.963  2.305  0.405
MSFT  | 0.967  50.853  2.026  0.507 | 0.910  21.406  2.320  0.468 | 0.968  11.124  2.203  0.512
NKE   | 0.899  42.568  2.032  0.419 | 1.003  21.367  2.216  0.468 | 0.847   8.922  2.093  0.403
PFE   | 0.768  48.456  1.929  0.483 | 0.674  17.606  2.252  0.373 | 0.875  12.414  1.983  0.566
PG    | 0.574  43.986  1.938  0.435 | 0.480  15.662  2.066  0.321 | 0.473   7.440  1.756  0.319
TRV   | 1.017  52.676  2.105  0.525 | 0.753  18.222  2.440  0.390 | 0.803  11.316  2.097  0.520
UNH   | 1.001  39.126  2.038  0.378 | 0.973  17.033  1.879  0.358 | 1.049   9.686  2.275  0.443
UTX   | 0.955  69.872  2.068  0.660 | 0.961  33.279  2.084  0.680 | 1.044  17.777  2.107  0.728
V     | 0.969  44.925  2.122  0.445 | 0.903  18.614  2.252  0.400 | 0.721   7.984  2.327  0.351
VZ    | 0.682  43.144  1.848  0.425 | 0.570  16.172  1.903  0.335 | 0.534   6.497  2.219  0.263
WBA   | 0.767  33.986  2.021  0.315 | 0.840  16.862  1.976  0.354 | 0.893   8.145  2.034  0.360
WMT   | 0.506  30.505  2.074  0.270 | 0.468  13.353  2.073  0.255 | 0.410   4.915  2.081  0.170
XOM   | 0.938  63.201  1.988  0.614 | 0.790  24.589  2.044  0.538 | 0.663   9.174  1.851  0.416

Notes: (1) Estimated with daily rates of return; there are 2517 observations. (2) Estimated with weekly rates of return; there are 522 observations. (3) Estimated with monthly rates of return; there are 120 observations.
Table 114.4: Estimated β coefficient for the static model with multiplicative rates of return: Dow Jones 30 companies (July 1, 2008–July 1, 2018).

Firms | Daily (1): β, t-value, DW, R2 | Weekly (2): β, t-value, DW, R2 | Monthly (3): β, t-value, DW, R2
AAPL  | 0.943  41.298  1.935  0.404 | 1.042  18.443  2.124  0.395 | 0.875   7.260  1.980  0.309
AXP   | 1.425  58.310  2.064  0.575 | 1.436  25.150  2.092  0.549 | 1.489  11.240  1.919  0.517
BA    | 1.035  52.820  1.986  0.526 | 1.114  24.690  2.118  0.540 | 1.222  12.801  1.893  0.581
CAT   | 1.245  58.681  1.934  0.578 | 1.395  27.660  2.057  0.595 | 1.743  14.599  2.178  0.644
CSCO  | 1.047  53.080  2.011  0.528 | 1.105  26.145  2.041  0.568 | 1.144  11.435  2.234  0.526
CVX   | 1.070  65.420  1.999  0.630 | 1.020  27.952  2.059  0.600 | 0.880  10.833  1.952  0.499
DIS   | 1.070  65.136  2.058  0.628 | 1.148  31.382  2.280  0.654 | 1.249  18.039  2.067  0.734
DWDP  | 1.329  52.014  2.032  0.518 | 1.527  26.412  1.860  0.573 | 1.793  11.361  2.106  0.522
GS    | 1.393  53.847  2.044  0.536 | 1.361  22.790  1.888  0.500 | 1.419  12.555  1.559  0.572
HD    | 0.936  54.089  2.047  0.538 | 0.996  25.499  2.101  0.556 | 0.911  11.574  1.934  0.532
IBM   | 0.766  49.442  1.919  0.493 | 0.825  22.469  2.185  0.493 | 0.690   9.121  1.964  0.413
INTC  | 1.016  52.406  2.098  0.522 | 0.998  23.314  1.923  0.511 | 1.062  10.832  1.953  0.499
JNJ   | 0.585  49.527  2.061  0.494 | 0.526  19.989  2.024  0.435 | 0.641  11.355  1.888  0.522
JPM   | 1.592  57.504  2.036  0.568 | 1.650  25.372  2.265  0.553 | 1.413  12.122  2.034  0.555
KO    | 0.572  40.459  2.131  0.394 | 0.512  17.614  1.956  0.374 | 0.542   7.691  2.570  0.334
MCD   | 0.574  40.521  2.018  0.395 | 0.528  16.806  2.205  0.352 | 0.481   7.058  1.899  0.297
MMM   | 0.867  65.051  2.005  0.627 | 0.778  23.350  2.278  0.512 | 0.867  12.375  2.005  0.565
MRK   | 0.788  41.673  1.954  0.408 | 0.802  18.422  2.090  0.395 | 0.808   9.025  2.307  0.408
MSFT  | 0.973  51.033  2.026  0.509 | 0.909  21.190  2.320  0.463 | 0.975  10.766  2.214  0.496
NKE   | 0.901  42.236  2.032  0.415 | 1.004  21.357  2.212  0.467 | 0.863   8.877  2.092  0.400
PFE   | 0.770  48.454  1.929  0.483 | 0.676  17.603  2.248  0.373 | 0.886  12.426  2.027  0.567
PG    | 0.575  44.074  1.938  0.436 | 0.483  15.714  2.063  0.322 | 0.466   7.206  1.756  0.306
TRV   | 1.019  51.933  2.105  0.517 | 0.759  18.453  2.435  0.396 | 0.844  11.465  2.093  0.527
UNH   | 1.014  39.008  2.038  0.377 | 0.950  16.801  1.883  0.352 | 1.042   9.917  2.182  0.455
UTX   | 0.959  70.096  2.068  0.661 | 0.961  32.934  2.089  0.676 | 1.056  17.689  2.119  0.726
V     | 0.970  44.768  2.122  0.443 | 0.912  18.792  2.238  0.404 | 0.720   7.787  2.374  0.339
VZ    | 0.686  43.165  1.848  0.426 | 0.579  16.285  1.902  0.338 | 0.556   6.639  2.219  0.272
WBA   | 0.772  34.217  2.021  0.318 | 0.842  16.829  1.973  0.353 | 0.902   7.932  2.054  0.348
WMT   | 0.507  30.573  2.074  0.271 | 0.469  13.397  2.072  0.257 | 0.412   4.902  2.053  0.169
XOM   | 0.941  63.207  1.988  0.614 | 0.786  24.382  2.046  0.533 | 0.672   9.256  1.862  0.421

Notes: (1) Estimated with daily rates of return; there are 2517 observations. (2) Estimated with weekly rates of return; there are 522 observations. (3) Estimated with monthly rates of return; there are 120 observations.
Table 114.5: Estimated β coefficient for the static model with additive rates of return: 30 Small-cap companies (July 1, 2008–July 1, 2018).

Firms | Daily (1): β, t-value, DW, R2 | Weekly (2): β, t-value, DW, R2 | Monthly (3): β, t-value, DW, R2
MEN   | 0.205  12.368  1.886  0.057 | 0.080   2.093  2.199  0.008 | 0.089   1.393  1.944  0.016
JOF   | 0.745  48.787  2.106  0.486 | 0.758  21.973  2.151  0.481 | 0.756  10.857  1.734  0.500
NYNY  | 1.012  12.374  2.084  0.057 | 1.409   7.319  2.308  0.093 | 1.158   3.385  2.330  0.089
HURC  | 1.404  37.533  2.006  0.359 | 1.661  19.713  2.051  0.428 | 1.096   6.082  2.434  0.239
SFST  | 0.178   4.042  2.386  0.006 | 0.515   6.245  2.426  0.070 | 0.555   3.829  1.959  0.111
PMF   | 0.303  14.656  1.795  0.079 | 0.279   5.172  2.128  0.049 | 0.281   2.804  1.679  0.062
MCI   | 0.296  11.330  2.076  0.049 | 0.390   7.038  2.513  0.087 | 0.409   4.402  2.510  0.141
MLR   | 0.751  23.454  2.213  0.179 | 0.808  11.827  2.143  0.212 | 0.641   4.603  2.519  0.152
TCI   | 0.253   3.900  2.038  0.006 | 0.390   2.600  2.267  0.013 | 0.079   0.284  2.159  0.001
SMMF  | 0.178   3.280  2.431  0.004 | 0.357   3.476  2.146  0.023 | 0.762   3.909  2.063  0.115
PFL   | 0.547  24.120  1.630  0.188 | 0.690  11.185  1.903  0.194 | 1.073   8.845  2.275  0.399
GPX   | 0.773  18.945  2.082  0.125 | 0.841   8.988  2.383  0.134 | 1.127   7.546  2.087  0.326
CSWC  | 1.009  23.532  1.913  0.180 | 0.814   7.689  2.034  0.102 | 1.190   5.792  1.786  0.221
FNHC  | 0.635  12.294  2.158  0.057 | 1.062   9.624  2.243  0.151 | 0.933   4.663  2.220  0.156
CULP  | 0.643  14.630  2.145  0.078 | 0.728   7.187  2.003  0.090 | 1.294   6.106  1.667  0.240
ACER  | 0.389   2.496  2.253  0.002 | 1.179   3.694  2.156  0.026 | 1.879   2.867  2.205  0.065
PHX   | 1.765  39.245  2.057  0.380 | 1.662  16.758  2.147  0.351 | 1.189   6.043  2.361  0.236
GUT   | 0.565  20.424  1.985  0.142 | 0.650  11.215  2.312  0.195 | 0.724   7.393  2.303  0.317
BGT   | 0.461  25.615  2.127  0.207 | 0.506  13.844  2.261  0.269 | 0.624   9.073  1.913  0.411
SENEB | 0.149   3.512  2.121  0.005 | 0.174   2.145  2.214  0.009 | 0.488   3.252  2.297  0.082
INBK  | 0.172   3.582  2.126  0.005 | 0.342   3.419  2.139  0.022 | 0.453   2.415  1.761  0.047
PCQ   | 0.204  10.623  1.769  0.043 | 0.200   4.301  1.904  0.034 | 0.414   4.662  1.553  0.156
PTH   | 0.860  56.953  2.065  0.563 | 0.941  26.740  2.123  0.579 | 0.981  13.779  2.127  0.617
AVK   | 0.766  42.402  1.872  0.417 | 0.964  21.781  1.945  0.477 | 1.150  16.469  2.317  0.697
BLE   | 0.221  11.423  1.910  0.049 | 0.038   0.845  2.212  0.001 | 0.128   1.747  1.668  0.025
INSG  | 1.010  15.814  2.005  0.090 | 1.289   8.556  1.953  0.123 | 1.367   3.742  2.106  0.106
RLH   | 0.738  18.276  1.962  0.117 | 0.952  10.532  1.874  0.176 | 1.128   5.167  2.343  0.185
RFI   | 0.953  39.339  2.189  0.381 | 1.093  23.785  2.316  0.521 | 1.165  13.261  2.092  0.598
MEIP  | 1.713  14.657  2.140  0.079 | 1.389   5.681  2.173  0.058 | 1.946   3.796  2.398  0.109
FC    | 0.936  22.962  2.112  0.173 | 0.998  11.242  2.230  0.196 | 1.247   7.889  2.195  0.345

Notes: (1) Estimated with daily rates of return; there are 2517 observations. (2) Estimated with weekly rates of return; there are 522 observations. (3) Estimated with monthly rates of return; there are 120 observations.
Table 114.6: Estimated β coefficient for the static model with multiplicative rates of return: 30 Small-cap companies (July 1, 2008–July 1, 2018).

Firms | Daily (1): β, t-value, DW, R2 | Weekly (2): β, t-value, DW, R2 | Monthly (3): β, t-value, DW, R2
MEN   | 0.206  12.408  1.886  0.058 | 0.078   2.026  2.214  0.008 | 0.088   1.372  1.957  0.016
JOF   | 0.748  48.988  2.106  0.488 | 0.750  21.806  2.152  0.478 | 0.766  10.505  1.739  0.483
NYNY  | 1.012  12.284  2.084  0.057 | 1.618   7.285  2.279  0.093 | 1.220   3.031  2.351  0.072
HURC  | 1.402  37.198  2.006  0.355 | 1.683  19.717  2.072  0.428 | 1.031   5.549  2.492  0.207
SFST  | 0.172   3.868  2.386  0.006 | 0.528   6.237  2.434  0.070 | 0.539   3.639  1.956  0.101
PMF   | 0.310  14.849  1.795  0.081 | 0.263   4.801  2.126  0.042 | 0.241   2.381  1.730  0.046
MCI   | 0.307  11.605  2.076  0.051 | 0.390   7.040  2.516  0.087 | 0.400   4.123  2.482  0.126
MLR   | 0.746  23.272  2.213  0.177 | 0.816  11.990  2.148  0.217 | 0.655   4.568  2.540  0.150
TCI   | 0.254   3.786  2.038  0.006 | 0.385   2.387  2.259  0.011 | 0.075   0.254  2.185  0.001
SMMF  | 0.177   3.206  2.431  0.004 | 0.354   3.402  2.139  0.022 | 0.755   3.737  2.075  0.106
PFL   | 0.542  23.941  1.630  0.186 | 0.683  10.978  1.932  0.188 | 0.998   8.878  2.088  0.400
GPX   | 0.757  18.191  2.082  0.116 | 0.836   8.675  2.399  0.126 | 1.110   7.235  2.078  0.307
CSWC  | 1.009  28.635  1.913  0.246 | 0.898  11.032  2.042  0.190 | 1.180   6.665  1.826  0.274
FNHC  | 0.621  12.361  2.158  0.057 | 1.124  10.370  2.232  0.171 | 0.968   4.633  2.205  0.154
CULP  | 0.622  14.045  2.145  0.073 | 0.703   6.705  2.027  0.080 | 1.236   5.495  1.671  0.204
ACER  | 0.618   3.206  2.253  0.004 | 1.262   2.923  2.168  0.016 | 0.883   1.301  1.866  0.014
PHX   | 1.773  39.025  2.057  0.377 | 1.714  16.861  2.137  0.353 | 1.198   5.882  2.377  0.227
GUT   | 0.578  20.938  1.985  0.148 | 0.641  11.670  2.347  0.208 | 0.698   7.376  2.297  0.316
BGT   | 0.463  25.482  2.127  0.205 | 0.496  13.580  2.253  0.262 | 0.612   8.733  1.909  0.393
SENEB | 0.145   3.357  2.121  0.004 | 0.172   2.113  2.211  0.009 | 0.517   3.341  2.289  0.086
INBK  | 0.168   3.461  2.126  0.005 | 0.333   3.223  2.117  0.020 | 0.495   2.379  1.698  0.046
PCQ   | 0.200  10.471  1.769  0.042 | 0.193   4.212  1.924  0.033 | 0.389   4.452  1.568  0.144
PTH   | 0.857  56.786  2.065  0.562 | 0.931  26.659  2.127  0.577 | 0.968  13.476  2.135  0.606
AVK   | 0.764  42.236  1.872  0.415 | 0.959  21.615  1.968  0.473 | 1.135  16.430  2.246  0.696
BLE   | 0.227  11.607  1.910  0.051 | 0.029   0.649  2.231  0.001 | 0.117   1.586  1.686  0.021
INSG  | 1.008  15.945  2.005  0.092 | 1.278   8.384  1.945  0.119 | 1.415   3.733  2.083  0.106
RLH   | 0.737  18.089  1.962  0.115 | 0.921  10.191  1.906  0.166 | 0.904   4.532  2.285  0.148
RFI   | 0.961  39.661  2.189  0.385 | 1.093  23.708  2.316  0.519 | 1.121  12.552  2.101  0.572
MEIP  | 1.673  13.525  2.140  0.068 | 1.350   5.224  2.132  0.050 | 1.564   2.009  2.276  0.033
FC    | 0.928  22.388  2.112  0.166 | 1.004  11.084  2.224  0.191 | 1.251   7.481  2.195  0.322

Notes: (1) Estimated with daily rates of return; there are 2517 observations. (2) Estimated with weekly rates of return; there are 522 observations. (3) Estimated with monthly rates of return; there are 120 observations.
Based on equation (114.30), we can substitute the lagged firm return infinitely and find that the coefficient on the market return can be expressed as $\beta_{i2}(1 + \beta_{i1} + \beta_{i1}^2 + \beta_{i1}^3 + \cdots)$. Thus, provided that $\beta_{i1}$ is significantly different from 0, we can define the long-run multiplier as follows:

$$\text{LRM} = \beta_{i2}/(1 - \beta_{i1}). \qquad (114.31)$$
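Equation (114.31), together with the fallback rule for an insignificant or constraint-violating $\beta_{i1}$ described next, can be sketched as follows (the 1.96 significance cutoff is our assumption, not a value stated in the chapter):

```python
def long_run_multiplier(beta1, beta2, t_beta1, t_crit=1.96):
    """Long-run multiplier of equation (114.31), with a fallback to beta2
    when the lagged coefficient is insignificant or lies in (-1, 0].
    The t_crit = 1.96 cutoff is an assumption."""
    insignificant = abs(t_beta1) <= t_crit
    if insignificant or -1.0 < beta1 <= 0.0:
        return beta2              # fall back to the short-run coefficient
    return beta2 / (1.0 - beta1)  # beta2 * (1 + b1 + b1^2 + ...), summed

# With beta1 = 0.2 the geometric series sums to beta2 / (1 - 0.2).
print(long_run_multiplier(0.2, 0.8, t_beta1=3.0))   # prints 1.0
print(long_run_multiplier(0.2, 0.8, t_beta1=0.5))   # insignificant lag: prints 0.8
```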
If $\beta_{i1}$ is not significantly different from 0, or violates the constraint by falling in the interval $(-1, 0]$, then the LRM is $\beta_{i2}$. Summarized results for the dynamic model are shown in Tables 114.7 and 114.8. Again, due to the small size of our sample, we discuss the tendency of betas based on the median values. Panels (I) and (II) of Table 114.7 imply that, whether additive or multiplicative rates of return are used, the beta estimate generally declines for big companies and increases for small companies as the time interval increases. Specifically, the median estimated beta of big companies drops from 0.925 with daily data to 0.864 with monthly data when additive rates of return are used, and decreases from 0.927 with daily data to 0.878 with monthly returns when multiplicative returns are applied in the estimation. For small companies, the beta estimate rises from 0.653 with daily data to 0.970 with monthly data when

Table 114.7:
Summary for long run multipliers.

Aggregation     Arithmetic mean   Median

Panel (I) Additive Returns
Dow Jones 30 companies
Daily           0.933             0.925
Weekly          0.910             0.870
Monthly         0.941             0.864
30 Small-cap companies
Daily           0.661             0.653
Weekly          0.759             0.757
Monthly         0.901             0.970

Panel (II) Multiplicative Returns
Dow Jones 30 companies
Daily           0.936             0.927
Weekly          0.916             0.876
Monthly         0.937             0.878
30 Small-cap companies
Daily           0.666             0.657
Weekly          0.757             0.748
Monthly         0.828             0.980
Table 114.8: Summary of LRMs for each company.

                 Additive                    Multiplicative
Firms            Daily   Weekly  Monthly     Daily   Weekly  Monthly

Panel (I) Dow Jones 30 Companies
AAPL             0.943   1.035   0.852       0.943   1.043   0.862
AXP              1.428   1.421   1.781       1.427   1.437   1.779
BA               1.063   1.110   1.246       1.066   1.112   1.234
CAT              1.316   1.393   1.584       1.318   1.396   1.578
CSCO             1.044   1.106   0.990       1.047   1.109   0.996
CVX              1.065   1.013   0.847       1.069   1.016   0.868
DIS              1.066   1.074   1.250       1.070   1.076   1.249
DWDP             1.337   1.629   2.053       1.332   1.629   1.772
GS               1.510   1.386   1.413       1.518   1.370   1.417
HD               0.930   0.978   0.882       0.935   0.989   0.912
IBM              0.766   0.809   0.691       0.769   0.816   0.699
INTC             0.940   0.994   1.051       0.944   1.000   1.067
JNJ              0.554   0.530   0.645       0.557   0.531   0.646
JPM              1.523   1.423   1.464       1.523   1.465   1.425
KO               0.533   0.512   0.430       0.538   0.516   0.437
MCD              0.571   0.483   0.477       0.573   0.519   0.484
MMM              0.864   0.775   0.852       0.866   0.779   0.870
MRK              0.785   0.790   0.641       0.788   0.796   0.655
MSFT             0.933   0.821   0.964       0.939   0.819   0.969
NKE              0.898   0.838   0.849       0.900   0.837   0.864
PFE              0.768   0.611   0.876       0.770   0.612   0.885
PG               0.574   0.478   0.475       0.575   0.480   0.470
TRV              0.900   0.625   0.665       0.898   0.632   0.698
UNH              1.001   0.972   1.042       1.014   0.948   1.037
UTX              0.954   0.968   1.049       0.959   0.969   1.061
V                0.920   0.894   0.731       0.919   0.903   0.725
VZ               0.683   0.571   0.453       0.687   0.580   0.470
WBA              0.764   0.847   0.896       0.770   0.849   0.906
WMT              0.478   0.465   0.426       0.479   0.466   0.424
XOM              0.892   0.735   0.656       0.892   0.776   0.665

Panel (II) 30 Small-Cap Companies
MEN              0.228   0.078   0.085       0.228   0.076   0.085
JOF              0.698   0.756   0.751       0.701   0.747   0.762
NYNY             1.011   1.202   1.176       1.011   1.410   1.250
HURC             1.405   1.649   1.095       1.403   1.666   1.023
SFST             0.169   0.429   0.559       0.166   0.442   0.549
PMF              0.365   0.273   0.376       0.373   0.256   0.315
MCI              0.297   0.315   0.384       0.309   0.312   0.374
MLR              0.692   0.801   0.529       0.688   0.809   0.534
TCI              0.251   0.291   0.125       0.253   0.280   0.124
SMMF             0.134   0.348   0.763       0.132   0.345   0.753
PFL              0.720   0.745   1.091       0.713   0.679   1.025
GPX              0.742   0.759   1.113       0.723   0.749   1.096
CSWC             1.012   0.820   1.260       1.008   0.905   1.219
FNHC             0.597   1.049   0.947       0.579   1.117   0.979
CULP             0.610   0.739   1.456       0.593   0.711   1.172
ACER             0.376   1.214   1.923       0.581   1.267   1.019
PHX              1.671   1.644   1.178       1.676   1.575   1.007
GUT              0.615   0.647   0.724       0.626   0.581   0.699
BGT              0.461   0.500   0.726       0.463   0.490   0.720
SENEB            0.133   0.159   0.498       0.130   0.158   0.527
INBK             0.161   0.340   0.478       0.158   0.331   0.535
PCQ              0.228   0.202   0.539       0.224   0.196   0.502
PTH              0.895   0.940   0.994       0.892   0.929   0.980
AVK              0.962   1.069   1.158       0.953   1.043   1.144
BLE              0.245   0.035   0.117       0.251   0.027   0.108
INSG             1.008   1.293   1.358       1.006   1.283   1.406
RLH              0.736   1.089   1.260       0.736   1.032   0.998
RFI              0.952   1.086   1.164       0.961   1.022   1.122
MEIP             1.580   1.322   1.936       1.572   1.290   1.555
FC               0.878   0.979   1.272       0.874   0.984   1.266
estimated by additive rates of return. It grows from 0.657 with daily data to 0.980 with monthly data when multiplicative rates of return are used. Summaries of the estimated LRMs for the Dow Jones 30 companies and the 30 small-cap companies are displayed in Panels (I) and (II) of Table 114.8, respectively. There are 16 aggressive companies and 44 defensive ones. Among the 16 aggressive companies, 8 have support from the models with both additive and multiplicative rates of return for an increasing estimate of beta as the time interval increases, and 7 have sufficient evidence, from estimation results with both additive and multiplicative rates of return, of a declining estimate of beta as the investment horizon increases. What is interesting is that one company, Capital Southwest Corporation (CSWC), shows a different beta tendency depending on which rates of return are used. Moreover, 16 out of the 44 defensive companies have a decreasing estimate of beta as the time interval increases. It can also be found that the majority of
Table 114.9: Estimated β coefficients for the dynamic model with additive rates of return: Dow Jones 30 companies (July 1, 2008–July 1, 2018).

        Daily(1)                          Weekly(2)                         Monthly(3)
        β1 (t)          β2 (t)            β1 (t)          β2 (t)            β1 (t)          β2 (t)
AAPL    0.064 (2.471)   −0.140 (−3.661)   0.018 (0.312)   −0.134 (−1.442)   0.050 (0.461)    0.049 (0.284)
AXP    −0.113 (−3.673)   0.069 (1.201)    0.049 (0.742)   −0.250 (−2.001)  −0.068 (−0.481)   0.518 (1.875)
BA      0.046 (1.582)   −0.099 (−2.409)  −0.009 (−0.136)  −0.126 (−1.295)  −0.184 (−1.267)   0.415 (1.799)
CAT     0.053 (1.719)   −0.105 (−2.087)   0.139 (2.038)   −0.255 (−2.056)   0.216 (1.376)   −0.558 (−1.659)
CSCO    0.015 (0.510)   −0.118 (−2.845)  −0.079 (−1.195)  −0.024 (−0.245)  −0.153 (−1.152)   0.016 (0.075)
CVX    −0.052 (−1.596)  −0.056 (−1.267)   0.014 (−0.209)  −0.160 (−1.771)  −0.022 (−0.174)  −0.071 (−0.462)
DIS     0.007 (0.210)   −0.102 (−2.326)  −0.137 (−1.833)  −0.022 (−0.207)  −0.072 (−0.388)   0.199 (0.743)
DWDP   −0.041 (−1.435)   0.004 (0.069)    0.174 (2.595)   −0.401 (−3.001)   0.080 (0.562)    0.306 (0.919)
GS     −0.144 (−4.915)   0.259 (4.693)   −0.019 (−0.297)  −0.162 (−1.346)   0.347 (2.497)   −0.446 (−1.711)
HD      0.063 (2.138)   −0.108 (−2.902)  −0.042 (−0.651)  −0.130 (−1.513)  −0.012 (−0.086)  −0.087 (−0.531)
IBM     0.055 (1.987)   −0.109 (−3.574)  −0.179 (−2.956)   0.013 (0.177)   −0.038 (−0.313)   0.098 (0.765)
INTC   −0.018 (−0.624)  −0.173 (−4.313)   0.100 (1.593)   −0.223 (−2.575)   0.008 (0.056)   −0.051 (−0.262)
JNJ    −0.002 (−0.076)  −0.093 (−4.010)  −0.007 (−0.113)  −0.012 (−0.252)  −0.077 (−0.571)   0.039 (0.331)
JPM    −0.049 (−1.604)  −0.165 (−2.592)  −0.198 (−3.072)  −0.035 (−0.248)  −0.390 (−2.866)   0.664 (2.562)
KO     −0.018 (−0.717)  −0.076 (−3.282)   0.003 (0.048)   −0.061 (−1.326)  −0.296 (−2.684)   0.079 (0.787)
MCD    −0.019 (−0.748)  −0.078 (−3.358)  −0.162 (−3.003)   0.004 (0.081)    0.003 (0.031)   −0.086 (−0.900)
MMM    −0.013 (−0.404)  −0.077 (−2.156)  −0.010 (−0.153)  −0.059 (−0.872)   0.103 (0.736)   −0.115 (−0.726)
MRK     0.027 (1.058)   −0.123 (−3.879)  −0.061 (−1.084)  −0.063 (−0.881)  −0.294 (−2.582)   0.118 (0.843)
MSFT    0.000 (−0.003)  −0.142 (−3.702)  −0.195 (−3.286)   0.038 (0.483)   −0.153 (−1.164)   0.189 (1.062)
NKE     0.007 (0.254)   −0.110 (−3.022)  −0.194 (−3.318)  −0.132 (−1.547)  −0.056 (−0.461)  −0.007 (−0.042)
PFE     0.008 (0.294)   −0.112 (−3.663)  −0.088 (−1.600)  −0.066 (−1.092)  −0.132 (−0.939)   0.049 (0.302)
PG     −0.002 (−0.079)  −0.100 (−4.356)  −0.036 (−0.677)  −0.106 (−2.371)   0.129 (1.145)   −0.031 (−0.330)
TRV    −0.103 (−3.626)  −0.162 (−4.083)  −0.221 (−4.030)  −0.015 (−0.225)  −0.103 (−0.787)  −0.168 (−1.160)
UNH    −0.026 (−1.021)  −0.039 (−0.940)   0.095 (1.744)   −0.194 (−2.199)  −0.361 (−3.027)   0.463 (2.463)
UTX    −0.030 (−0.884)  −0.060 (−1.487)  −0.029 (−0.369)  −0.029 (−0.318)  −0.137 (−0.774)   0.162 (0.748)
V      −0.067 (−2.518)  −0.058 (−1.501)  −0.098 (−1.742)   0.000 (0.005)   −0.091 (−0.811)   0.061 (0.449)
VZ      0.111 (4.247)   −0.191 (−7.008)   0.069 (1.296)   −0.169 (−3.226)  −0.052 (−0.495)  −0.238 (−2.189)
WBA    −0.016 (−0.664)  −0.100 (−3.036)  −0.017 (−0.302)   0.002 (0.027)    0.006 (0.050)    0.134 (0.778)
WMT    −0.041 (−1.756)  −0.073 (−3.242)  −0.046 (−0.907)  −0.075 (−1.605)   0.013 (0.133)   −0.199 (−1.997)
XOM    −0.065 (−2.053)  −0.102 (−2.671)  −0.103 (−1.611)  −0.104 (−1.521)   0.031 (0.258)   −0.088 (−0.720)

Notes: (1) Estimated with daily rates of return; 2516 observations. (2) Estimated with weekly rates of return; 521 observations. (3) Estimated with monthly rates of return; 119 observations. t-values in parentheses.
Table 114.10: Estimated β coefficients for the dynamic model with multiplicative rates of return: Dow Jones 30 companies (July 1, 2008–July 1, 2018).

        Daily(1)                          Weekly(2)                         Monthly(3)
        β1 (t)          β2 (t)            β1 (t)          β2 (t)            β1 (t)          β2 (t)
AAPL    0.064 (2.494)   −0.140 (−3.653)   0.009 (0.157)   −0.135 (−1.442)   0.069 (0.626)    0.028 (0.158)
AXP    −0.102 (−3.341)   0.052 (0.907)    0.049 (0.748)   −0.258 (−2.042)   0.046 (0.354)    0.335 (1.240)
BA      0.043 (1.489)   −0.100 (−2.414)  −0.016 (−0.255)  −0.135 (−1.381)  −0.114 (−0.799)   0.287 (1.259)
CAT     0.051 (1.669)   −0.104 (−2.069)   0.144 (2.093)   −0.280 (−2.254)   0.165 (1.066)   −0.527 (−1.572)
CSCO    0.014 (0.486)   −0.120 (−2.875)  −0.076 (−1.139)  −0.034 (−0.343)  −0.158 (−1.192)  −0.011 (−0.053)
CVX    −0.056 (−1.727)  −0.056 (−1.284)  −0.013 (−0.195)  −0.168 (−1.851)  −0.018 (−0.141)  −0.099 (−0.621)
DIS     0.008 (0.232)   −0.106 (−2.413)  −0.138 (−1.872)  −0.029 (−0.281)  −0.056 (−0.309)   0.143 (0.542)
DWDP   −0.039 (−1.347)  −0.003 (−0.061)   0.182 (2.730)   −0.434 (−3.233)   0.059 (0.450)    0.327 (1.000)
GS     −0.140 (−4.822)   0.261 (4.701)   −0.009 (−0.153)  −0.190 (−1.595)   0.320 (2.304)   −0.490 (−1.881)
HD      0.063 (2.143)   −0.110 (−2.957)  −0.043 (−0.664)  −0.136 (−1.563)  −0.013 (−0.100)  −0.113 (−0.669)
IBM     0.053 (1.910)   −0.108 (−3.543)  −0.180 (−2.964)   0.005 (0.073)   −0.048 (−0.396)   0.090 (0.692)
INTC   −0.018 (−0.619)  −0.176 (−4.370)   0.099 (1.592)   −0.227 (−2.606)   0.012 (0.089)   −0.094 (−0.475)
JNJ    −0.003 (−0.112)  −0.094 (−4.038)  −0.010 (−0.165)  −0.013 (−0.274)  −0.065 (−0.484)   0.014 (0.115)
JPM    −0.046 (−1.524)  −0.173 (−2.725)  −0.161 (−2.509)  −0.122 (−0.857)  −0.345 (−2.585)   0.490 (1.935)
KO     −0.022 (−0.855)  −0.076 (−3.267)   0.005 (0.089)   −0.064 (−1.371)  −0.291 (−2.649)   0.069 (0.673)
MCD    −0.020 (−0.795)  −0.078 (−3.358)  −0.164 (−3.047)   0.006 (0.116)    0.001 (0.009)   −0.092 (−0.947)
MMM    −0.013 (−0.389)  −0.079 (−2.229)  −0.010 (−0.167)  −0.062 (−0.904)   0.100 (0.713)   −0.128 (−0.789)
MRK     0.026 (0.995)   −0.124 (−3.892)  −0.061 (−1.093)  −0.071 (−0.992)  −0.286 (−2.501)   0.078 (0.542)
MSFT   −0.002 (−0.056)  −0.143 (−3.703)  −0.196 (−3.319)   0.035 (0.445)   −0.141 (−1.089)   0.147 (0.819)
NKE     0.006 (0.214)   −0.111 (−3.039)  −0.191 (−3.260)  −0.147 (−1.720)  −0.049 (−0.408)  −0.045 (−0.274)
PFE     0.009 (0.333)   −0.114 (−3.736)  −0.084 (−1.526)  −0.078 (−1.287)  −0.149 (−1.069)   0.036 (0.217)
PG     −0.003 (−0.123)  −0.101 (−4.384)  −0.036 (−0.674)  −0.109 (−2.426)   0.120 (1.074)   −0.032 (−0.345)
TRV    −0.100 (−3.559)  −0.174 (−4.373)  −0.217 (−3.948)  −0.020 (−0.300)  −0.103 (−0.783)  −0.194 (−1.276)
UNH    −0.025 (−0.997)  −0.046 (−1.093)   0.089 (1.655)   −0.200 (−2.316)  −0.311 (−2.552)   0.332 (1.763)
UTX    −0.029 (−0.847)  −0.063 (−1.569)  −0.035 (−0.460)  −0.031 (−0.342)  −0.145 (−0.823)   0.150 (0.685)
V      −0.070 (−2.642)  −0.057 (−1.484)  −0.095 (−1.681)  −0.009 (−0.106)  −0.124 (−1.117)   0.087 (0.629)
VZ      0.117 (4.480)   −0.197 (−7.198)   0.072 (1.346)   −0.172 (−3.240)  −0.051 (−0.486)  −0.251 (−2.244)
WBA    −0.017 (−0.707)  −0.103 (−3.127)  −0.014 (−0.249)  −0.005 (−0.064)  −0.007 (−0.057)   0.131 (0.748)
WMT    −0.042 (−1.827)  −0.074 (−3.275)  −0.046 (−0.912)  −0.076 (−1.618)   0.024 (0.239)   −0.208 (−2.077)
XOM    −0.068 (−2.139)  −0.105 (−2.743)  −0.106 (−1.670)  −0.104 (−1.522)   0.038 (0.316)   −0.100 (−0.801)

Notes: (1) Estimated with daily rates of return; 2516 observations. (2) Estimated with weekly rates of return; 521 observations. (3) Estimated with monthly rates of return; 119 observations. t-values in parentheses.
Table 114.11: Estimated β coefficients for the dynamic model with additive rates of return: 30 Small-cap companies (July 1, 2008–July 1, 2018).

        Daily(1)                          Weekly(2)                         Monthly(3)
        β1 (t)          β2 (t)            β1 (t)          β2 (t)            β1 (t)          β2 (t)
MEN     0.065 (3.735)    0.048 (2.358)   −0.094 (−2.142)   0.014 (0.373)    0.036 (0.387)    0.002 (0.026)
JOF    −0.084 (−2.848)  −0.051 (−1.833)  −0.142 (−2.353)   0.038 (0.570)    0.173 (1.330)   −0.149 (−1.075)
NYNY    0.145 (1.672)   −0.038 (−1.868)  −0.163 (−3.570)   0.122 (0.581)   −0.164 (−1.752)   1.009 (2.766)
HURC   −0.048 (−0.826)   0.013 (0.508)   −0.092 (−1.598)  −0.075 (−0.512)  −0.266 (−2.695)   0.942 (4.251)
SFST    0.086 (1.984)   −0.190 (−9.695)  −0.204 (−4.562)   0.076 (0.878)    0.020 (0.202)    0.263 (1.627)
PMF     0.081 (3.631)    0.081 (3.952)   −0.086 (−1.907)   0.052 (0.918)    0.149 (1.647)    0.294 (2.889)
MCI     0.174 (6.379)   −0.036 (−1.769)  −0.262 (−5.915)   0.222 (3.791)   −0.331 (−3.837)   0.501 (5.331)
MLR     0.078 (2.000)   −0.121 (−5.537)  −0.084 (−1.698)  −0.052 (−0.605)  −0.258 (−2.666)   0.227 (1.429)
TCI    −0.063 (−0.965)  −0.020 (−1.024)  −0.148 (−3.372)  −0.049 (−0.323)  −0.075 (−0.817)  −0.273 (−0.972)
SMMF    0.107 (2.023)   −0.219 (−11.264) −0.073 (−1.655)   0.170 (1.633)   −0.053 (−0.542)   0.160 (0.723)
PFL     0.048 (1.771)    0.176 (8.100)    0.099 (2.026)   −0.019 (−0.251)  −0.289 (−2.571)   0.760 (3.984)
GPX    −0.040 (−0.860)  −0.017 (−0.796)  −0.149 (−3.200)   0.309 (2.899)    0.107 (0.955)    0.152 (0.690)
CSWC   −0.305 (−5.872)   0.037 (1.679)   −0.036 (−0.787)   0.139 (1.176)   −0.035 (−0.340)   0.258 (0.986)
FNHC    0.167 (3.065)   −0.078 (−3.822)  −0.151 (−3.204)   0.320 (2.490)   −0.099 (−0.985)   0.190 (0.799)
CULP    0.208 (4.396)   −0.096 (−4.645)  −0.012 (−0.268)   0.177 (1.596)    0.249 (2.374)   −0.049 (−0.178)
ACER   −0.001 (−0.009)  −0.125 (−6.314)  −0.067 (−1.514)   0.352 (1.077)   −0.109 (−1.155)   1.172 (1.688)
PHX    −0.275 (−3.810)  −0.029 (−1.167)  −0.075 (−1.381)  −0.207 (−1.369)  −0.225 (−2.162)   0.229 (0.899)
GUT     0.207 (6.498)    0.002 (0.096)   −0.137 (−2.819)   0.165 (2.311)   −0.235 (−2.161)   0.391 (2.800)
BGT     0.043 (1.902)   −0.042 (−1.877)  −0.158 (−3.100)   0.114 (2.300)    0.057 (0.485)    0.165 (1.435)
SENEB  −0.008 (−0.197)  −0.064 (−3.213)  −0.105 (−2.404)   0.103 (1.267)   −0.147 (−1.530)   0.066 (0.408)
INBK   −0.063 (−3.169)   0.087 (1.801)   −0.069 (−1.553)   0.044 (0.426)    0.083 (0.883)    0.039 (0.196)
PCQ     0.128 (6.347)    0.013 (0.676)    0.051 (1.151)   −0.028 (−0.580)   0.219 (2.228)    0.040 (0.384)
PTH    −0.028 (−0.916)   0.016 (0.474)   −0.029 (−0.437)  −0.090 (−1.079)  −0.315 (−2.161)   0.460 (2.530)
AVK     0.038 (1.486)    0.164 (5.362)    0.054 (0.898)   −0.041 (−0.481)  −0.478 (−3.019)   0.810 (3.712)
BLE     0.031 (1.546)    0.105 (5.190)   −0.106 (−2.426)   0.072 (1.607)    0.198 (2.182)    0.108 (1.470)
INSG    0.023 (1.089)    0.007 (0.100)    0.031 (0.652)   −0.105 (−0.609)  −0.060 (−0.614)   0.057 (0.139)
RLH     0.039 (1.843)    0.000 (−0.004)   0.056 (1.155)    0.109 (0.997)   −0.006 (−0.058)   0.126 (0.468)
RFI    −0.017 (−0.671)   0.104 (2.652)   −0.080 (−1.267)  −0.026 (−0.273)   0.049 (0.343)    0.204 (0.946)
MEIP   −0.072 (−3.461)  −0.210 (−1.662)  −0.115 (−2.567)   0.498 (1.938)   −0.207 (−2.150)   0.832 (1.466)
FC     −0.048 (−2.211)  −0.122 (−2.493)  −0.142 (−2.900)   0.094 (0.859)   −0.043 (−0.373)  −0.043 (−0.175)

Notes: (1) Estimated with daily rates of return; 2516 observations. (2) Estimated with weekly rates of return; 521 observations. (3) Estimated with monthly rates of return; 119 observations. t-values in parentheses.
Table 114.12: Estimated β coefficients for the dynamic model with multiplicative rates of return: 30 Small-cap companies (July 1, 2008–July 1, 2018).

        Daily(1)                          Weekly(2)                         Monthly(3)
        β1 (t)          β2 (t)            β1 (t)          β2 (t)            β1 (t)          β2 (t)
MEN     0.063 (3.596)    0.043 (2.112)   −0.102 (−2.316)   0.011 (0.297)    0.028 (0.296)    0.009 (0.130)
JOF    −0.085 (−2.868)  −0.051 (−1.855)  −0.144 (−2.387)   0.036 (0.549)    0.145 (1.130)   −0.146 (−1.032)
NYNY    0.128 (1.472)   −0.024 (−1.180)  −0.143 (−3.138)   0.112 (0.459)   −0.166 (−1.787)   1.208 (2.873)
HURC   −0.053 (−0.913)   0.013 (0.509)   −0.096 (−1.667)  −0.113 (−0.762)  −0.278 (−2.849)   0.868 (3.933)
SFST    0.083 (1.898)   −0.188 (−9.615)  −0.204 (−4.576)   0.060 (0.670)    0.008 (0.078)    0.265 (1.623)
PMF     0.078 (3.446)    0.081 (3.941)   −0.085 (−1.900)   0.049 (0.857)    0.123 (1.366)    0.305 (3.016)
MCI     0.169 (6.081)   −0.035 (−1.734)  −0.265 (−5.969)   0.208 (3.549)   −0.311 (−3.606)   0.510 (5.245)
MLR     0.074 (1.909)   −0.120 (−5.470)  −0.086 (−1.734)  −0.055 (−0.640)  −0.271 (−2.815)   0.248 (1.529)
TCI    −0.074 (−1.102)  −0.013 (−0.650)  −0.143 (−3.276)  −0.087 (−0.541)  −0.089 (−0.969)  −0.247 (−0.835)
SMMF    0.103 (1.916)   −0.218 (−11.231) −0.070 (−1.584)   0.156 (1.480)   −0.060 (−0.613)   0.112 (0.492)
PFL     0.050 (1.818)    0.171 (7.871)    0.084 (1.723)   −0.042 (−0.545)  −0.178 (−1.536)   0.540 (2.962)
GPX    −0.054 (−1.147)  −0.021 (−0.998)  −0.156 (−3.359)   0.290 (2.664)    0.098 (0.886)    0.147 (0.666)
CSWC   −0.264 (−5.690)   0.008 (0.361)   −0.049 (−0.998)   0.071 (0.710)   −0.094 (−0.878)   0.059 (0.244)
FNHC    0.192 (3.617)   −0.089 (−4.367)  −0.131 (−2.740)   0.253 (1.952)   −0.094 (−0.933)   0.179 (0.722)
CULP    0.207 (4.369)   −0.092 (−4.454)  −0.019 (−0.414)   0.170 (1.492)    0.228 (2.220)   −0.104 (−0.376)
ACER    0.099 (0.513)   −0.085 (−4.279)  −0.083 (−1.875)   0.191 (0.435)    0.042 (0.462)    1.453 (2.149)
PHX    −0.278 (−3.836)  −0.031 (−1.241)  −0.081 (−1.496)  −0.260 (−1.671)  −0.225 (−2.172)   0.128 (0.492)
GUT     0.200 (6.229)   −0.003 (−0.143)  −0.159 (−3.247)   0.139 (2.024)   −0.220 (−2.021)   0.360 (2.660)
BGT     0.045 (1.979)   −0.052 (−2.311)  −0.153 (−3.031)   0.101 (2.064)    0.053 (0.454)    0.158 (1.387)
SENEB  −0.010 (−0.231)  −0.064 (−3.184)  −0.104 (−2.373)   0.095 (1.157)   −0.140 (−1.455)   0.071 (0.416)
INBK   −0.059 (−2.969)   0.087 (1.787)   −0.058 (−1.301)   0.048 (0.451)    0.117 (1.243)    0.041 (0.191)
PCQ     0.126 (6.260)    0.015 (0.748)    0.041 (0.913)   −0.035 (−0.733)   0.214 (2.181)    0.021 (0.206)
PTH    −0.030 (−1.010)   0.017 (0.497)   −0.033 (−0.495)  −0.098 (−1.195)  −0.309 (−2.141)   0.406 (2.265)
AVK     0.031 (1.220)    0.164 (5.377)    0.038 (0.626)   −0.052 (−0.617)  −0.414 (−2.573)   0.672 (3.068)
BLE     0.026 (1.281)    0.103 (5.039)   −0.116 (−2.651)   0.065 (1.434)    0.183 (2.025)    0.126 (1.729)
INSG    0.027 (1.301)   −0.006 (−0.089)   0.033 (0.714)   −0.158 (−0.915)  −0.045 (−0.455)   0.003 (0.006)
RLH     0.035 (1.652)    0.005 (0.108)    0.034 (0.719)    0.099 (0.917)   −0.013 (−0.132)   0.049 (0.209)
RFI    −0.035 (−1.379)   0.116 (2.943)   −0.086 (−1.370)  −0.039 (−0.413)   0.074 (0.531)    0.121 (0.584)
MEIP   −0.053 (−2.551)  −0.264 (−1.991)  −0.095 (−2.114)   0.393 (1.453)   −0.138 (−1.481)   0.620 (0.772)
FC     −0.044 (−2.019)  −0.127 (−2.570)  −0.137 (−2.812)   0.078 (0.701)   −0.072 (−0.641)  −0.043 (−0.179)

Notes: (1) Estimated with daily rates of return; 2516 observations. (2) Estimated with weekly rates of return; 521 observations. (3) Estimated with monthly rates of return; 119 observations. t-values in parentheses.
small defensive companies show an increasing estimate of beta, which is opposite to what was predicted earlier. However, we do find that one company, BlackRock Municipal Income Trust II (BLE), has a negative long-run multiplier (−0.013) when the aggregation interval is quarterly, and this agrees with what we expected in the previous text. For more detailed information on the dynamic model, see Tables 114.9–114.12.

114.5 Summary

In this paper, we have derived that, in theory, the systematic risk associated with additive rates of return should be frequency-dependent under time aggregation, and that the changing trend can be well captured by the Phi multiplier derived in Section 114.2. On the other hand, due to the nature of multiplicative rates of return, beta estimated using multiplicative rates of return is also frequency-dependent under time aggregation. We then conducted an empirical analysis and found that, with both additive and multiplicative rates of return, the systematic risk (β) is frequency-dependent in the real world. Furthermore, we observed that the Phi multipliers describe well the changing direction and magnitude of betas as the time interval grows. In addition, we analyzed beta's frequency dependence for different types of companies, aggressive and defensive. The results agree with the expectations obtained from the work of Chen (1980). Moreover, we find that for small defensive companies it is more likely that their estimate of beta will increase as the time interval increases.

Bibliography

Chen, S.-N. (1980). Time Aggregation, Autocorrelation, and Systematic Risk Estimates – Additive Versus Multiplicative Assumptions. Journal of Financial and Quantitative Analysis 15(1).
Gilbert, T., Hrdlicka, C., Kalodimos, J. and Siegel, S. (2014). Daily Data is Bad for Beta: Opacity and Frequency-Dependent Betas. Review of Asset Pricing Studies 4(1).
Hawawini, G.A. (1980). Intertemporal Cross-Dependence in Securities Daily Returns and the Short-Run Intervaling Effect on Systematic Risk. Journal of Financial and Quantitative Analysis 15(1).
Hawawini, G.A. (1983). Why Beta Shifts as the Return Interval Changes. Financial Analysts Journal 39(3), 73–77.
Lee, C.F. and Morimune, K. (1978). Time Aggregation, Coefficient of Determination and Systematic Risk of the Market Model. The Financial Review, 131–143.
Lee, C.F. and Cartwright, P.A. (1987). Time Aggregation and the Estimation of the Market Model: Empirical Evidence. Journal of Business and Economic Statistics 5(1), 131–143.
Levhari, D. and Levy, H. (1977). The Capital Asset Pricing Model and the Investment Horizon. The Review of Economics and Statistics 59, 92–104.
Schwartz, R.A. and Whitcomb, D.K. (1977a). The Time–Variance Relationship: Evidence on Autocorrelation in Common Stock Returns. Journal of Finance 32(1), 41–55.
Schwartz, R.A. and Whitcomb, D.K. (1977b). Evidence on the Presence and Causes of Serial Correlation in Market Model Residuals. Journal of Financial and Quantitative Analysis 12(2), 291–313.
Zellner, A. and Montmarquette, C. (1971). A Study of Some Aspects of the Temporal Aggregation Problem in Economic Analysis. Review of Economics and Statistics 53(4), 335–342.
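The chapter's central empirical exercise, estimating beta from additive versus multiplicative rates of return at several aggregation intervals, can be sketched in a few lines of Python. The data below are synthetic (a simulated market series plus a stock that loads on the market with a one-day lag, which is one mechanism that produces an intervaling effect), so the numbers are illustrative only, not the chapter's estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "daily" data: a market return series and a stock that responds
# to the market with a one-day lag, inducing frequency-dependent beta.
n = 2500
mkt = rng.normal(0.0004, 0.01, n)
stock = 0.6 * mkt + 0.4 * np.roll(mkt, 1) + rng.normal(0.0, 0.01, n)

def beta(y, x):
    """OLS slope of y on x."""
    xd, yd = x - x.mean(), y - y.mean()
    return (xd @ yd) / (xd @ xd)

def aggregate(r, k, multiplicative=False):
    """Aggregate k consecutive one-period returns into one k-period return."""
    r = r[: len(r) // k * k].reshape(-1, k)
    if multiplicative:
        return np.prod(1.0 + r, axis=1) - 1.0   # compounded k-period return
    return r.sum(axis=1)                         # additive k-period return

for mult in (False, True):
    daily = beta(stock, mkt)
    weekly = beta(aggregate(stock, 5, mult), aggregate(mkt, 5, mult))
    monthly = beta(aggregate(stock, 21, mult), aggregate(mkt, 21, mult))
    print(mult, round(daily, 3), round(weekly, 3), round(monthly, 3))
# Daily beta sits near the contemporaneous loading 0.6; the aggregated
# betas move toward the full long-run response 0.6 + 0.4 = 1.0.
```

With daily returns the lagged response is missed, so beta is understated; at weekly and monthly intervals most of the lagged cross-dependence falls inside one observation interval and beta rises, the pattern the Phi multiplier is designed to capture.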
Appendix 114A

Note that

Cov(Aj Y, Aj X) = Aj Cov(Y, X) Aj′ = ι′ Cov(Yj, Xj) ι,  (114A.1)

where ι is an N × 1 vector of ones. Note that Cov(Yj, Xj) in equation (114A.1) is an N × N matrix. Due to stationarity, its N diagonal elements are all equal to the contemporaneous covariance of the unaggregated returns, Cov(Xt, Yt). The off-diagonal elements are intertemporal cross-covariances. Let s be a positive integer indicating the order of the intertemporal cross-covariance. Under the assumption of stationarity, all intertemporal cross-covariances with the same lead, s, are equal, and there are N − s of them; likewise, all intertemporal cross-covariances with the same lag, −s, are equal, and there are N − s of them. Denote the lead and lag intertemporal cross-correlation coefficients of the unaggregated returns as ρ^s_XY and ρ^{−s}_XY. Based on the sign of i, we can write the following:

Cov(Xt, Yt+i) = Cov(Xt, Yt),      when i = 0,          (114A.2)
Cov(Xt, Yt+i) = ρ^s_XY σX σY,     when i > 0 (s = i),  (114A.3)
Cov(Xt, Yt+i) = ρ^{−s}_XY σX σY,  when i < 0 (s = −i), (114A.4)

where σX and σY are the standard deviations of Xt and Yt, respectively. It follows that

Cov(Aj Y, Aj X) = N Cov(Xt, Yt) + Σ_{i=1}^{N−1} (N − i) ρ^s_XY σX σY
                                + Σ_{i=1}^{N−1} (N − i) ρ^{−s}_XY σX σY.  (114A.5)
Since

Cov(Xt, Yt) = ρXY σX σY,  (114A.6)

it follows that

Cov(Aj Y, Aj X) = Cov(Xt, Yt) [N + ΓN],  (114A.7)

where

ΓN = Σ_{i=1}^{N−1} (N − i) (ρ^s_XY + ρ^{−s}_XY)/ρXY = Σ_{i=1}^{N−1} (N − i) qi.
Chapter 115
Large-Sample Theory

Sunil Poshakwale and Anandadeep Mandal

Sunil Poshakwale
Cranfield School of Management, Cranfield University
e-mail: sunil.poshakwale@cranfield.ac.uk

Anandadeep Mandal
University of Birmingham
e-mail: [email protected]

Contents
115.1 Introduction . . . 3986
115.2 Properties of Ordinary Least Squares . . . 3987
  115.2.1 The Gauss–Markov conditions . . . 3987
  115.2.2 Condition for error and regressor . . . 3988
  115.2.3 The Gauss–Markov theorem . . . 3988
115.3 Probability Limits (Plims) . . . 3989
  115.3.1 Limiting distributions . . . 3989
  115.3.2 Asymptotic distributions . . . 3990
  115.3.3 Cramer's theorem . . . 3991
  115.3.4 The Mann–Wald theorem . . . 3991
115.4 The Distribution of the OLS Estimator . . . 3992
115.5 Maximum Likelihood Estimation . . . 3993
  115.5.1 Importance of MLE . . . 3993
  115.5.2 MLE in outline . . . 3993
  115.5.3 MLE and OLS . . . 3994
  115.5.4 A warning about sample sizes . . . 3994
115.6 The MLE in More Detail . . . 3995
  115.6.1 The linear model as an example . . . 3995
  115.6.2 MLE in the general case . . . 3995
  115.6.3 Statistical properties of the MLE . . . 3997
115.7 Summary . . . 3998
Bibliography . . . 3998
Abstract
In this chapter, we discuss large-sample theory, which can be applied under conditions that are quite likely to be met in large samples even when the Gauss–Markov conditions are broken. There are two reasons for using large-sample theory. First, there may be some problems that corrupt our estimators in small samples but tend to diminish as the sample gets bigger. Thus, if we cannot get a perfect small-sample estimator, we will usually want to choose the one that will be best in large samples. Second, in some circumstances, the theory used to derive the properties of estimators in small samples just does not work, and working out the properties of the estimators can be impossible. This makes it very hard to choose between alternative estimators. In these circumstances we judge different estimators on their "large-sample properties" because their "small (or finite) sample properties" are unknown.

Keywords Large-sample theory • Gauss–Markov conditions • Sample properties • Sample estimators.
115.1 Introduction

Large-sample theory provides a framework for assessing the properties of estimators and statistical tests. The properties of estimators and tests are evaluated under the assumption that the sample size "n" grows indefinitely. In practice, a limit evaluation, n → ∞, is treated as being approximately valid for large finite sample sizes. There are two reasons for using large-sample theory. First, there may be some problems that corrupt our estimators in small samples but these tend to diminish as the sample gets bigger. Thus, if we cannot get a perfect small-sample estimator, we usually want to choose the one that will be best in large samples. Second, in some circumstances, the theory used to derive the properties of estimators in small samples just does not work, and working out the properties of the estimators can be impossible. This makes it very hard to choose between alternative estimators. Fortunately, we have a theory that can be applied under conditions that are quite likely to be met in large samples even when the Gauss–Markov (GM) conditions are broken. In these circumstances, we judge different estimators on their "large sample properties" because their "small (or finite) sample properties" are unknown. The least squares estimator can always be found, of course, but if the regressors are not independent of the errors at all leads and lags, we will not usually be able to work out statistical properties for small samples. In
this case, the large-sample theory can get us the large-sample properties of the least squares estimator too. However, before we discuss the techniques used in large-sample theory, we first explain the properties of ordinary least squares and the GM theorem.

115.2 Properties of Ordinary Least Squares

Ordinary least squares (OLS) is not the only way that we could estimate the model, so we will want to know the conditions under which it is the best of the available estimators; this will lead us to consider the Gauss–Markov conditions. Another issue we will be interested in is the relationship between the estimated coefficients and the true coefficients. This will allow us to test a range of ideas that we may have about the true coefficients by looking at their estimates; this leads us to look for the statistical properties of the estimators.

115.2.1 The Gauss–Markov conditions

OLS is quite easy to compute, so we have a natural incentive to use it if it is likely to get us reliable estimates of the true model. However, we need to know the conditions under which it will perform well. These conditions are called the Gauss–Markov conditions. There is no presumption at this stage that these conditions actually hold. All we do here is look at the conditions under which OLS is well behaved, because if we can argue later that these conditions hold in our sample of data it will save us a lot of work that would otherwise be required to find another method of estimation. Later on we will look at ways to test whether or not the conditions are actually met.

The five Gauss–Markov (GM) conditions are:
1. Functional form: The true population relationship is linear.
2. The error term I: The distribution of the population error has a zero mean, i.e., the expected value of the errors is zero.
3. The error term II: The distribution of the population error has a variance that does not change over time.
4. The error term III: The error in time t is uncorrelated with the error in any other period.
5. The error and the regressor "x": This condition comes in two different forms:
(a) the regressors are non-stochastic and
(b) the regressors and the errors are uncorrelated:
    i. when they occur in the same period or in different periods;
    ii. when they occur only in the same period.

It is also convenient to assume that the population errors are normally distributed, even though this is not stated as a GM condition.

115.2.2 Condition for error and regressor

Note that if condition "5(a)" above holds, then so does "5(b)-i". Similarly, if "5(b)-i" holds, then so does "5(b)-ii". But it is possible for only "5(b)-ii" to be satisfied. When we say that the GM conditions are satisfied, we usually mean that the regressors are non-stochastic, i.e., GM condition "5(a)" holds. We also describe a model for which the GM conditions hold as a "classical model". However, just about everything that is true about the usefulness of least squares when "5(a)" holds is also true when it does not, provided that "5(b)-i" does; it is just harder to prove mathematically that least squares is still useful. So you should not be too concerned about the difference between "5(a)" and "5(b)-i". It is when neither of them holds that we run into some problems, which we discuss later.

Standard notation: The complete, maintained hypothesis is often written as yt = α + βxt + εt, where ε ∼ i.i.d.(0, σ²); the term "i.i.d." means that the population errors are assumed to be independently and identically distributed. With the extra assumption of normality it would be written as ε ∼ n.i.d. We will use the following notation for the population moments of the error: the mean will be μ = 0, while the variance will be "σ²", i.e., for the error we will not use a subscript.

115.2.3 The Gauss–Markov theorem

If the GM conditions hold, then (1) the least squares estimator of "β" is unbiased; and (2) the least squares estimator of "β" has a smaller variance than any other linear unbiased estimator. As discussed earlier, when we say the GM conditions "hold", we usually mean conditions "1–4", plus 5(a).
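Both claims of the theorem can be illustrated by simulation. The sketch below is not from the chapter: it draws repeated samples from a classical model with a non-stochastic regressor (true α = 1, β = 2) and compares OLS with one convenient alternative linear unbiased estimator, the grouping (Wald-type) estimator based on the low-x and high-x halves of the sample:

```python
import numpy as np

rng = np.random.default_rng(7)
n, alpha, beta_true = 50, 1.0, 2.0
x = np.linspace(0.0, 1.0, n)        # non-stochastic regressor: GM condition 5(a)

ols, grouped = [], []
for _ in range(5000):
    y = alpha + beta_true * x + rng.normal(0.0, 1.0, n)  # GM conditions 1-4
    xd = x - x.mean()
    ols.append((xd @ (y - y.mean())) / (xd @ xd))        # OLS slope
    # Rival linear unbiased estimator: slope between the mean of the
    # low-x half of the sample and the mean of the high-x half.
    lo, hi = y[: n // 2], y[n // 2:]
    grouped.append((hi.mean() - lo.mean()) /
                   (x[n // 2:].mean() - x[: n // 2].mean()))

ols, grouped = np.array(ols), np.array(grouped)
print(ols.mean(), grouped.mean())   # both centered on the true beta = 2
print(ols.var(), grouped.var())     # OLS shows the smaller sampling variance
```

Averaged over many samples, both estimators hit the true slope (unbiasedness), but the spread of the OLS estimates is smaller, which is exactly the second claim of the theorem.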
In fact, GM “5(a)” is not necessary for the GM theorem to hold — it just makes the proof easier. As long as we
have GM "5(b)-i", the regressors can be stochastic and the theorem will still apply.

115.3 Probability Limits (Plims)

The least squares estimators are unbiased when the GM conditions hold. The probability limit is a similar concept, but it applies only to estimators based on large samples. An estimator is "consistent" if its probability limit is the true coefficient. This means that, as the sample size increases, the probability that the estimator is not equal to the true value goes to zero. We write this as

plim(b) = β.  (115.1)
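A quick simulation sketch of this idea (synthetic data, true β = 2): the OLS slope b is unbiased at every sample size, while a variant deliberately distorted by a term that shrinks with n is biased in small samples yet consistent:

```python
import numpy as np

rng = np.random.default_rng(42)
alpha, beta_true = 1.0, 2.0

def mean_estimates(n, reps=2000):
    """Average, over many samples of size n, of the OLS slope b
    and of the deliberately biased estimator b + 1/n."""
    b = np.empty(reps)
    for r in range(reps):
        x = rng.normal(0.0, 1.0, n)
        y = alpha + beta_true * x + rng.normal(0.0, 1.0, n)
        xd = x - x.mean()
        b[r] = (xd @ (y - y.mean())) / (xd @ xd)
    return b.mean(), b.mean() + 1.0 / n

for n in (20, 200, 2000):
    b_bar, btilde_bar = mean_estimates(n)
    print(n, round(b_bar, 4), round(btilde_bar, 4))
# b averages about 2.0 at every n (unbiased); the 1/n bias of the second
# estimator is visible at n = 20 but has almost vanished by n = 2000.
```

The second estimator here is exactly the "trivial example" discussed next in the text: biased for any finite n, but with a bias that disappears in the limit, hence consistent.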
An estimator can be biased (i.e., its expected value might not equal the true value when it is calculated using a small sample), but can be consistent (if the bias disappears as the sample size increases). A rather trivial example of this is the following, where we assume the GM conditions to hold:

β̃ = b + 1/n,  (115.2)

where "β̃" is our new estimator and "b" is the ordinary least squares estimator. In this case the expected value of our new estimator is biased. However, the bias goes to zero as "n" tends to infinity, so the estimator is consistent. We do not, of course, put trivial biases into our estimators, but under some conditions they do exist even though we cannot measure them. In these circumstances, we will want to know whether or not biases disappear as the sample size increases.

Properties of plims: One of the problems with random variables is that

E(1/x) ≠ 1/E(x).  (115.3)

This is why we cannot easily take expectations of the OLS estimator when "X" is stochastic. However, plims do have this very helpful property, along with all the other properties that expected values have:

plim(1/x) = 1/plim(x).  (115.4)

115.3.1 Limiting distributions

Suppose you were asked to choose between two consistent estimators in circumstances where the small-sample properties were unknown. How would
S. Poshakwale & A. Mandal
you do it? When the GM conditions were met we were able to establish that the OLS estimator had the smallest variance of any unbiased estimator, regardless of sample size. Now, by relying on consistency rather than expectation, we are doing something different: we are letting the sample size tend to infinity for both estimators, and in the limit both are exactly equal to the true value, i.e., they both have a variance of zero. Thus we cannot use variance as a criterion this time. The reason is that, while there is an element of variability in the estimator, it is swamped by the division by the sample size. However, if we do want a measure of variability, we can instead multiply the estimator’s deviation from the true value by √n. We can then find a “limiting distribution”.

Limiting distributions for sample means: The clearest example of this is provided by the Central Limit Theorem: If {x1, . . . , xn} are random samples from any distribution with a finite mean μ and finite variance σ², then for x̄n = (1/n) Σᵢ xi we have

E(x̄n) = μ,   (115.5)
var(x̄n) = σ²/n.   (115.6)

Hence the variance of the sample estimate of the mean goes to zero as the sample gets larger. However, if we multiply every one of our x observations by √n, the variance of the mean of the sample of [x√n] would be σ². In addition, as n becomes larger the distribution comes to resemble a normal form:

√n(x̄n − μ) →d N[0, σ²].   (115.7)

Consider taking a lot of samples of size n. If n = 1, the means of the samples will be distributed in the same way as x. Whatever the initial distribution of x may be, as n gets bigger the means will start to look as if they were drawn from a normal distribution. Hence the appearance of the “N” in the last equation: the limiting distribution is the normal, N(0, σ²).

115.3.2 Asymptotic distributions

We are interested in the distribution of our estimator x̄n, where n is finite. The problem is that we cannot find this exactly, so we approximate it with something called the “asymptotic distribution”, i.e., the approximate
distribution that our estimator would have if we had a large (but finite) sample. In nearly all cases in econometrics, we base the asymptotic distribution on the limiting distribution.

Asymptotic Distributions for Sample Means: The limiting distribution of x̄n is

√n(x̄n − μ) →d N[0, σ²].   (115.8)

We divide by √n, add μ, and write the asymptotic distribution as

x̄n →a N[μ, σ²/n]  or  x̄n ∼ AN[μ, σ²/n].   (115.9)

This allows us to state that p lim(x̄n) = μ and a var(x̄n) = σ²/n. Thus, even though the variance of x̄n goes to zero, we have its approximate variance when n is large but finite. We describe x̄n as being “asymptotically normally distributed” with “asymptotic variance” σ²/n.

115.3.3 Cramer’s theorem

Cramer’s theorem will be very useful for later discussion. Suppose that ηn = Hnψn, where ψn is an n × 1 vector of random variables with a limiting (multivariate) normal distribution with mean μ and variance Ω, and Hn is an m × n matrix with the property

p lim Hn = H.   (115.10)

Then

ηn →d N(Hμ, HΩH′).   (115.11)

Thus we can find the limiting distribution of the product of these variables without knowing the distribution of Hn; all we need is its plim.

115.3.4 The Mann–Wald theorem

This is the last bit of large-sample theory we will need. Let Xn be the matrix of data on the regressors with sample size n, and assume that

p lim (X′nXn/n) = Q.   (115.12)

Note that (X′nXn) grows without limit, which is why we divide by n. Even so, if Xn were to grow fast enough, the limit might not converge, so this is an assumption and not a condition that is always true. What it means is that the variance–covariance matrix of the regressors is finite.
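The assumption in (115.12) is easy to visualize with a quick simulation (our own sketch, not from the chapter; the two-regressor design and the value of Q are invented for illustration): the sample moment matrix X′X/n settles down to a fixed matrix Q as n grows.

```python
import numpy as np

rng = np.random.default_rng(1)

def xtx_over_n(n):
    # stochastic regressors: a constant and a serially dependent series
    z = rng.normal(size=n)
    x = np.column_stack([np.ones(n), z + 0.5 * np.roll(z, 1)])
    return x.T @ x / n

# Here plim X'X/n = Q = [[1, 0], [0, 1.25]]: the second column has
# mean 0 and variance 1 + 0.25 = 1.25.
for n in (100, 10_000, 1_000_000):
    print(n, np.round(xtx_over_n(n), 2))
```

At n = 100 the matrix still wobbles noticeably; by n = 1,000,000 it is essentially Q.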
Now let ε be a set of error terms that are i.i.d. (but not necessarily normal) with zero mean and finite variance. Finally, assume that E(xtεt) = 0 for all periods; all we assume now is GM “5(b)-ii”, so for the first time we are thinking about dropping GM “5(b)-i”. The Mann–Wald theorem states that under these conditions

p lim (X′nεn/n) = 0,   (115.13)

X′nεn/√n →d N[0, σ²Q].   (115.14)

115.4 The Distribution of the OLS Estimator

We are now in a position to work out the properties of the OLS estimator when Xn is stochastic and related to the errors at some lags or leads, but not contemporaneously. This section states the distribution of the OLS estimators when GM conditions “5(a)” and “5(b)-i” do not hold but “5(b)-ii” does. We start by revisiting the deviation of the estimator from the true value, for a sample of size n:

(β̂n − β) = (X′nXn)⁻¹X′nεn.   (115.15)

Now, by the Mann–Wald theorem, the last product on the RHS has the limiting distribution

X′nεn/√n →d N(0, σ²Q).   (115.16)

The LHS of (115.16) is now going to play the role of ψn in Cramer’s theorem. Note that the Mann–Wald theorem requires

p lim (X′nXn/n)⁻¹ = Q⁻¹.   (115.17)

Applying Cramer’s theorem we get

√n(β̂ − β) = (X′nXn/n)⁻¹ (X′nεn/√n),   (115.18)

√n(β̂ − β) →d N(0, Q⁻¹(σ²Q)Q⁻¹).   (115.19)

Therefore, the asymptotic distribution is

β̂ − β ∼ AN[0, σ²Q⁻¹/n].   (115.20)
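A Monte Carlo check (our own sketch, not from the chapter; the single stochastic regressor, the coefficient, and the sample sizes are invented) shows the OLS estimator behaving exactly as (115.20) predicts: across replications, β̂ is centered on β with variance close to σ²Q⁻¹/n.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, beta, sigma = 400, 3_000, 1.5, 1.0

est = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)                  # stochastic regressor, E(x_t e_t) = 0
    y = beta * x + sigma * rng.normal(size=n)
    est[r] = (x @ y) / (x @ x)              # OLS slope, no intercept

# Q = plim x'x/n = Var(x) = 1, so the asymptotic variance is sigma^2/(Q n):
# est.var() * n should be close to sigma^2/Q = 1.
print(round(est.mean(), 3), round(est.var() * n, 2))
```

The replication mean sits on β and n times the replication variance sits on σ²Q⁻¹, as the asymptotic formula says.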
Finally, we replace Q by its natural estimate, X′nXn/n, so that the estimated asymptotic covariance matrix becomes σ²(X′nXn)⁻¹.

A Welcome Result: The asymptotic distribution therefore has the same formula as the exact distribution when the GM conditions are satisfied. When we have GM “5(b)-i” or better, we have

β̂ − β ∼ N(0, σ²(X′X)⁻¹),   (115.21)

and when GM “5(b)-i” breaks down but we still have GM “5(b)-ii”,

β̂ − β ∼ AN(0, σ²(X′X)⁻¹).   (115.22)

Gauss–Markov Theorem: We have found the distribution of the OLS estimator in large samples under much less strict conditions than the GM conditions. However, we have not established any sort of GM theorem for it, i.e., we do not know whether or not it is the best available estimator. To do this we need to find the Maximum Likelihood Estimator (MLE). The MLE has the general property of being the best estimator, and we can derive a lot of different estimators for special conditions from it. One of these is the OLS estimator, which will show that OLS is still best when we relax the GM conditions from “5(a)” all the way to “5(b)-ii”.

115.5 Maximum Likelihood Estimation

115.5.1 Importance of MLE

OLS performs well in specific circumstances. MLE, on the other hand, performs well in a much wider range of circumstances, even when we have limited correlation between the regressors and the error. From the OLS perspective, the best thing about MLE is that, although it has many different forms depending upon the sort of model to be estimated, when the other GM conditions (i.e., 1, 2, 3, and 4) hold it actually turns out to have the same form as OLS. So we can use the properties of MLE to give us the properties of OLS when we have a regressor–error problem. MLE also provides us with three test procedures, which will prove to be very useful later. There is a limitation, however: one of the things MLE cannot cope with is a failure of GM condition “5(b)-ii”.

115.5.2 MLE in outline

The essential idea of MLE is that the data we observe are generated by some model, and we know that the model has some stochastic elements in it, i.e., to
some extent the observations arise by chance. Provided that we are willing to decide on a functional form (GM condition (1)), MLE will set about finding the parameters of that form which make the data we actually observe most likely. This appears to make sense, but there is a problem here. For example, suppose that you have a die but you do not know what numbers are on it, and you are not allowed to see it. You have an assistant who rolls the die for you and tells you which number comes up. This is the only information you have, and you have to work out which numbers are on the die. If, after several thousand throws, only the number 1 has come up, what set of numbers would make that outcome more likely than any other? Suppose instead that you had only one throw, and got the number 1: what set would make that most likely? The answer is the same in both cases, but you would be much happier about the proposition that the die had only ones in the case where you had a thousand observations. One thing about MLE is that, unless the GM conditions hold, we need quite a lot of observations for it to work. This is where large-sample theory comes in.

115.5.3 MLE and OLS

MLE is the same as OLS when the GM conditions, including 5(a), hold. It is also the same when GM condition 5(a) is broken but 5(b)-i holds. It is still the same when 5(b)-i is broken but 5(b)-ii holds. So the properties of the MLE estimator can be used to find the properties of the OLS estimator in each of these cases. In particular, recall that OLS is the Best Linear Unbiased Estimator (BLUE) only when the GM conditions hold up to 5(b)-i. If these conditions do not hold, we do not know the small-sample properties of OLS; even though we may know its limiting distribution, we do not know how OLS compares with other potential estimators.

115.5.4 A warning about sample sizes

Until now, we have worked out the properties of the least squares estimator without talking much about the size of the sample. For example, E(bOLS) = β is true whether the sample has five observations or five thousand. When we lose GM 5(b)-i, however, we do not know what the properties of OLS are when the sample is “small”, nor have we yet discussed what they would be if the sample were large. We discuss both below. When condition GM (5) breaks down it is still possible to find the properties of OLS for large samples, but not for small ones; for this we use “large-sample” or “asymptotic” theory. The important point is that “large
sample” theory applies only to large samples and, when the GM conditions are not fully satisfied, large-sample theory is the best that we have. If the conditions are satisfied, however, the theory is applicable to all samples.

115.6 The MLE in More Detail

MLE can be performed for almost any error distribution, which is part of its attraction. The MLE is general enough for us to relax GM (1), GM (2), GM (3), GM (4), GM 5(a), and GM 5(b)-i. We cannot ignore these conditions, but we can replace each one of them by something less strict. For the time being, though, we keep GM (1–4) because it makes the algebra easier.

115.6.1 The linear model as an example

We have assumed all along that the errors are normally distributed. Thus the joint density function is given by

pdf(ε) = (2πσ²)^(−n/2) exp[−ε′ε/2σ²].   (115.23)

However, from GM condition (1) we have

y = Xβ + ε,   (115.24)

so we can substitute for ε. If we now take the set of observations as given, we can find the values of β and σ that would have made these observations more likely than any others. So we started with the probability density function (pdf) giving us probabilities for y as functions of β and σ, and we now think of it as a “likelihood function” giving us the parameters as a function of the observations, i.e.,

L(β, σ|y) = (2πσ²)^(−n/2) exp[−(y − Xβ)′(y − Xβ)/2σ²].   (115.25)

In practice we work with the log of the likelihood function because it is easier to differentiate.

115.6.2 MLE in the general case

An advantage that MLE has over OLS is that we do not have to assume that the model is linear. The other advantage is that it is more efficient than OLS in some circumstances. All we need is the equation for the ε’s so that we can construct the likelihood function. We then differentiate the log of the likelihood function with respect to the parameters (γ = (β, σ)) and set the first derivatives to zero. We can then solve for the MLE estimators of each
parameter. Furthermore, since we are trying to maximize the log-likelihood, we want the second derivative to be negative.

MLE: The Theory: Whatever the distribution of the random components, we can write the likelihood function, take its log, differentiate, and solve to find the estimator g:

D log L(gMLE) = ∂ log L(gMLE)/∂gMLE = 0  and  D² log L(gMLE) = ∂² log L(gMLE)/∂gMLE ∂g′MLE < 0.

MLE for the Linear Model: We now find the values of the parameters that maximize the likelihood function given y:

log L(β, σ²) = −(n/2) log 2π − (n/2) log σ² − (1/2σ²)(y − Xβ)′(y − Xβ).   (115.26)

Differentiating with respect to β gives

−(1/2σ²)[−2X′(y − Xβ)] = 0.

Similarly, differentiating equation (115.26) with respect to σ² gives

−n/2σ² + (1/2σ⁴)(y − Xβ)′(y − Xβ) = 0.

Solving these for the MLE estimators gives

bMLE = (X′X)⁻¹X′y  and  s²MLE = e′e/n.

The MLE estimator of β is just the OLS estimator. However, the MLE of σ² has n as the denominator rather than n − k. In large samples n and n − k are almost exactly the same, and since MLE is a large-sample estimator only, it cannot distinguish between them: both the OLS and MLE estimators are consistent. The second derivative is

D² log L(gMLE) = −[ σ⁻²(X′X)   0 ;  0   n/2σ⁴ ].
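These closed forms are easy to verify numerically (a minimal sketch on simulated data; the design matrix and the parameter values are invented). The code checks that the OLS coefficients together with s² = e′e/n maximize the Gaussian log-likelihood, and that the maximized value equals −(n/2)[1 + log(2πs²)]:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, -0.5]) + 0.8 * rng.normal(size=n)

def loglik(beta, s2):
    resid = y - X @ beta
    return -0.5 * (n * np.log(2 * np.pi * s2) + resid @ resid / s2)

b = np.linalg.solve(X.T @ X, X.T @ y)   # b_MLE = b_OLS = (X'X)^{-1} X'y
e = y - X @ b
s2 = e @ e / n                           # s2_MLE divides by n, not n - k

peak = loglik(b, s2)
# maximized value equals -(n/2)[1 + log(2*pi*s2)] ...
print(np.isclose(peak, -0.5 * n * (1 + np.log(2 * np.pi * s2))))   # True
# ... and any perturbation of the parameters lowers the likelihood
print(all(loglik(b + db, s2 + ds) < peak
          for db, ds in [(0.01, 0.0), (-0.01, 0.0), (0.0, 0.01), (0.0, -0.01)]))  # True
```

Note that dividing e′e by n − k instead of n would give a slightly larger s² that does not maximize the likelihood, which is exactly the OLS/MLE denominator difference discussed above.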
We can also calculate the maximized value of the log-likelihood as

max(log L) = −(n/2)[1 + log(2πs²MLE)].

115.6.3 Statistical properties of the MLE

The negative of the expected value of the second derivative of the log-likelihood is called the information matrix. The elements of the information matrix grow in magnitude as n gets larger, so we scale the matrix by dividing by n. It can then be shown that, as n tends to infinity, the inverse of the limit of the scaled information matrix is the variance of the limiting distribution of the parameter estimates, and that the limiting distribution is normal. It can also be shown that the lowest possible variance that a consistent estimator can achieve is the inverse of the information matrix; this limit is called the “Cramer–Rao lower bound”. Because the MLE is asymptotically normally distributed, and since its limiting variance attains the inverse of the information matrix, it is “asymptotically efficient”. We have, by definition,

I(gMLE) = −E[D² log L(gMLE)],

and it can be shown that, subject to E(X′ε) = 0,

gMLE ∼ AN(γ, I(gMLE)⁻¹),   (115.27)

where I(gMLE)⁻¹ is the inverse of the information matrix, evaluated using the estimated coefficients.

Properties of MLE in the Linear Case: The two primary properties of MLE are

p lim(gMLE) = γ,  a var(gMLE) = I⁻¹.

Since we can partition the γ vector and the information matrix, we get

a var(bMLE) = s²MLE(X′X)⁻¹.

The results are just as for OLS. If the GM failure is only in conditions 5(a) and 5(b)-i, MLE turns out to be OLS, which means that OLS has all the properties of MLE given above (but note that they apply to large samples only).
For models with regressors that are contemporaneously uncorrelated with the errors, the MLE is
• consistent;
• asymptotically normally distributed;
• asymptotically efficient.
In other words, no other consistent and asymptotically normal estimator has a smaller asymptotic covariance matrix. Under GM conditions (1, 2, 3, 4 and up to 5(b)-ii), MLE and OLS are identical, so OLS has the above properties. However, the fact that MLE is consistent does not mean that it is unbiased: in small samples it can produce biased results. Also, if the GM conditions do hold, OLS is BLUE, and so is MLE.

115.7 Summary

To summarize, the assumptions underlying the finite-sample properties of the OLS estimator and its associated test statistics are very often violated. Finite-sample theory relies heavily on the exogeneity of the regressors, the normality of the error term, and the linearity of the regression equation. Large-sample theory offers an alternative approach when these assumptions are violated: it derives an approximation to the distribution of the estimator and its associated statistics on the assumption that the sample size is sufficiently large. Instead of making assumptions about the sample size, large-sample theory makes assumptions about the stochastic process that generates the sample.
July 6, 2020
16:6
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch116
Chapter 116

Impacts of Measurement Errors on Simultaneous Equation Estimation of Dividend and Investment Decisions

Cheng Few Lee and Fu-Lai Lin

Cheng Few Lee, Rutgers University, e-mail: cfl[email protected]
Fu-Lai Lin, Da-Yeh University, e-mail: fl[email protected]

Contents
116.1 Introduction . . . 4002
116.2 Measurement Error and Simultaneous Equation Estimation . . . 4003
  116.2.1 Just-identified equation case . . . 4003
  116.2.2 Over-identified equation case . . . 4007
116.3 Measurement Error Problem in the Dividend and Investment Decisions . . . 4010
  116.3.1 Asymptotic bias problem . . . 4010
  116.3.2 Bias correction in the simultaneous equations model . . . 4017
116.4 Conclusions . . . 4022
Bibliography . . . 4023

Abstract
This chapter analyzes the errors-in-variables problems in a simultaneous equation estimation in dividend and investment decisions. We first investigate the effects of measurement errors in exogenous variables on the estimation of a just-identified or an over-identified simultaneous equations system. The impacts of measurement errors on the estimation of structural parameters are discussed. Moreover, we use a simultaneous system in terms of dividend and investment policies to illustrate how, theoretically, the unknown variance
of measurement errors can be identified by the over-identifying information. Finally, we summarize the findings.

Keywords: Errors-in-variables • Simultaneous equations system • Estimation • Identification problem • Investment decision • Dividend policy • Two-stage least squares method.

116.1 Introduction

The errors-in-variables problems in finance research arise from using incorrectly measured variables or proxy variables in regression models. The presence of measurement errors causes biased and inconsistent parameter estimates and leads to erroneous conclusions, to various degrees, in finance analysis. For example, it is well known that the measurement of accounting earnings is almost invariably subject to error. In regression analysis, when a variable is subject to measurement error, its coefficient is biased downward; moreover, the measurement error also imparts a bias to the coefficient estimates of variables measured without error. As such, the problem of measurement errors is one of the most fundamental issues in applying econometric models to finance research.

Either in a single equation or in a simultaneous equations system, the errors-in-variables problem can be regarded as an identification problem. In the single-equation case, the parameters of interest cannot be identified unless some additional information on the variances of the measurement errors is available. Traditionally, this additional information takes the form of a known measurement error variance associated with an explanatory variable, or a known ratio of measurement error variances. Similarly, for a simultaneous equations system, a just-identified system will become under-identified when some of the exogenous variables are subject to error; the rank and order conditions of a simultaneous equations system then cannot be employed to test identifiability unless the variances of the measurement errors are known. However, in an over-identified system, Goldberger (1972) demonstrated that errors of measurement need not destroy identifiability. Moreover, Goldberger (1972), Lee (1973), and Hsiao (1976) have shown that the unknown measurement error variance can be estimated by using the over-identifying restrictions on the reduced-form coefficients. Therefore, we are interested in exploring how to combine the prior restrictions required to identify the structural parameters with the prior information required to identify the measurement error variance in order to deal with the problems associated with measurement error.
To sum up, this chapter aims to study the impact of measurement error on the parameter estimates of variables measured with error. The bias problems caused by explanatory variables measured with error are clearly clarified. Here, a single equation as well as a simultaneous equations system is investigated. Moreover, we demonstrate how the unknown measurement error variance can be identified by using the relationship between the structural-form and reduced-form coefficients. In addition, we also discuss the effects of measurement errors on the estimation of the simultaneous equations system for the investment and dividend decisions.

116.2 Measurement Error and Simultaneous Equation Estimation

Here, we will investigate the effects of measurement errors in exogenous variables on the estimation of structural parameters in a simultaneous equations system. The effects of measurement errors on the estimation of a just-identified or an over-identified simultaneous equations system are both investigated in this section.

116.2.1 Just-identified equation case

First, we consider the following just-identified simultaneous equations system:

y1 = α1y2 + β1ξ1 + ν1,   (116.1a)
y1 = α2y2 + β2ξ2 + ν2.   (116.1b)

The structural-form model (116.1a)–(116.1b) may be written as the reduced-form model

y1 = π11ξ1 + π12ξ2 + u1,   (116.2a)
y2 = π21ξ1 + π22ξ2 + u2.   (116.2b)

The corresponding reduced-form coefficients are as follows:

π11 = −β1α2/(α1 − α2),  π12 = β2α1/(α1 − α2),  u1 = (α1ν2 − α2ν1)/(α1 − α2),
π21 = −β1/(α1 − α2),  π22 = β2/(α1 − α2),  u2 = (ν2 − ν1)/(α1 − α2).   (116.3)
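The mapping in (116.3) can be confirmed numerically: solve the two structural equations for y1 and y2 and compare the implied coefficients on ξ1 and ξ2 with the formulas (a quick sketch; the parameter values are arbitrary and the structural errors are set to zero).

```python
import numpy as np

a1, a2, b1, b2 = 0.6, -0.4, 1.2, 0.9   # alpha_1, alpha_2, beta_1, beta_2 (arbitrary)

# Structural system: y1 - a1*y2 = b1*xi1 and y1 - a2*y2 = b2*xi2.
# Solve for (y1, y2) at unit values of each exogenous variable in turn.
A = np.array([[1.0, -a1],
              [1.0, -a2]])
pi_xi1 = np.linalg.solve(A, np.array([b1, 0.0]))   # (pi_11, pi_21)
pi_xi2 = np.linalg.solve(A, np.array([0.0, b2]))   # (pi_12, pi_22)

d = a1 - a2
expected = {"pi11": -b1 * a2 / d, "pi21": -b1 / d,
            "pi12":  b2 * a1 / d, "pi22":  b2 / d}
print(np.allclose(pi_xi1, [expected["pi11"], expected["pi21"]]),
      np.allclose(pi_xi2, [expected["pi12"], expected["pi22"]]))   # → True True
```

The same solve-and-compare check works for any parameter values with α1 ≠ α2; when α1 = α2 the system matrix A is singular, which mirrors the vanishing denominator in (116.3).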
Here, we will investigate the effects of measurement errors in an exogenous variable on the structural parameters of a just-identified equation
system. Suppose that the covariance matrix of the true exogenous variables is

Cov(ξ1, ξ2) = [ σ1²  σ12 ; σ12  σ2² ].

Assume that there is a random error of measurement in ξ1, x1 = ξ1 + η1, but not in ξ2. Here, η1 is a random variable with zero mean and constant variance ση². For simplicity, we assume that measurement errors are uncorrelated with the true values. In this section, we will investigate how the estimation of the structural parameters is affected by the error of measurement in ξ1. There exist many methods that can be used to estimate a just-identified simultaneous equations system; here, we restrict our focus to the indirect least squares and two-stage least squares methods.

116.2.1.1 Indirect least square method

First, the effects of errors of measurement on indirect least squares estimation are shown in the following. From Cochran (1968) and Lee (1973), applying the erroneous data in estimating (116.2a) and (116.2b) leads to the following probability limits of the regression parameters:

plim π̂11 = π11κ(1 − ρ²)/(1 − κρ²),   (116.4)

plim π̂12 = π12 + π11π1·2(1 − κ)/(1 − κρ²),   (116.5)

plim π̂21 = π21κ(1 − ρ²)/(1 − κρ²),   (116.6)

plim π̂22 = π22 + π21π1·2(1 − κ)/(1 − κρ²).   (116.7)

Here, κ = σ1²/(ση² + σ1²) is the ratio called the reliability of x1, ρ is the correlation coefficient between ξ1 and ξ2, and π1·2 = ρσ1/σ2. From (116.4)–(116.7), we can see that all the estimators of the reduced-form parameters are affected by the measurement error in ξ1. From (116.3), we can identify the structural parameters as follows:

α1 = π12/π22,  α2 = π11/π21,  β1 = −π21(α1 − α2),  β2 = π22(α1 − α2).   (116.8)
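A simulation reproduces the attenuation pattern in (116.4)–(116.5) (our own sketch; the variances, reliability, and coefficients below are arbitrary): regress on the error-ridden x1 and the clean ξ2, and the large-sample OLS coefficients land on the stated probability limits rather than on the true parameters.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400_000
s1, s2, rho = 1.0, 1.0, 0.5
kappa = 0.8                                   # reliability of x1
s_eta = np.sqrt(s1**2 * (1 - kappa) / kappa)  # Var(eta) implied by kappa

# correlated true regressors; measurement error only in xi1
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
xi = rng.multivariate_normal([0, 0], cov, size=n)
x1 = xi[:, 0] + s_eta * rng.normal(size=n)

pi11, pi12 = 0.7, -0.3
y = pi11 * xi[:, 0] + pi12 * xi[:, 1] + 0.5 * rng.normal(size=n)

X = np.column_stack([x1, xi[:, 1]])
b = np.linalg.solve(X.T @ X, X.T @ y)         # large-n OLS ~ plims (116.4), (116.5)

pi_1dot2 = rho * s1 / s2                      # auxiliary slope pi_{1.2}
plim11 = pi11 * kappa * (1 - rho**2) / (1 - kappa * rho**2)
plim12 = pi12 + pi11 * pi_1dot2 * (1 - kappa) / (1 - kappa * rho**2)
print(np.round(b, 3), round(plim11, 3), round(plim12, 3))
```

Note that the coefficient on the correctly measured ξ2 is contaminated too, exactly as (116.5) says.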
From (116.4)–(116.7), we can obtain the probability limits of the estimators of the structural parameters α1, α2, β1, and β2 as follows:

plim α̂1 = [π12 + π11π1·2(1 − κ)/(1 − κρ²)] / [π22 + π21π1·2(1 − κ)/(1 − κρ²)],   (116.9)

plim α̂2 = π̂11/π̂21 = π11/π21,   (116.10)

plim β̂1 = −π̂21(α̂1 − α̂2) = −[π21κ(1 − ρ²)/(1 − κρ²)](α̂1 − α̂2),   (116.11)

plim β̂2 = π̂22(α̂1 − α̂2) = [π22 + π21π1·2(1 − κ)/(1 − κρ²)](α̂1 − α̂2).   (116.12)

From (116.9), (116.11), and (116.12), we can see that the estimators of α1, β1, and β2 are affected by the measurement error in ξ1; only the estimate of α2 is free of it. Moreover, from (116.12), the estimate of β2 is affected by the measurement error in ξ1 even though ξ2 is not measured with error. We can conclude that, in a simultaneous equations system, an equation of the system cannot be estimated consistently even if all the variables in that equation are measured correctly.

116.2.1.2 Two-stage least square method

The two-stage least squares estimation uses all the exogenous variables in the system as instruments to obtain predictions of the endogenous variables. In the first stage, we regress each endogenous variable on all the exogenous variables in the system to obtain its prediction. In the second stage, we regress the dependent variable of each equation on the predictions of the other endogenous variables and on that equation’s explanatory variables. From equations (116.2b), (116.6), and (116.7), ŷ2 in the first stage can be written as

ŷ2 = [π21κ(1 − ρ²)/(1 − κρ²)](ξ1 + η1) + [π22 + π21π1·2(1 − κ)/(1 − κρ²)]ξ2
   = π21ξ1 + π22ξ2 + π21η1 + [π21(1 − κ)/(1 − κρ²)](π1·2ξ2 − ξ1 − η1)
   = y2 + θ,

where y2 here denotes its systematic part (ignoring the reduced-form disturbance u2) and θ = π21η1 + [π21(1 − κ)/(1 − κρ²)](π1·2ξ2 − ξ1 − η1).
It is obvious that ŷ2 contains a measurement error component. The second stage requires that we regress y1 on ŷ2, x1, and ξ2 as follows:

y1 = α1ŷ2 + β1x1 + ν1,   (116.13a)
y1 = α2ŷ2 + β2ξ2 + ν2.   (116.13b)

Here ŷ2 = y2 + θ, with

Var(θ) = π21²{[(κ − κρ²)/(1 − κρ²)]²ση² + [(1 − κ)/(1 − κρ²)]²[σ1² + π1·2²σ2² − 2π1·2Cov(ξ1, ξ2)]},

Cov(y2, θ) = π21[(1 − κ)/(1 − κρ²)][π1·2π22σ2² − π21σ1² + (π1·2π21 − π22)Cov(ξ1, ξ2)].
Following (116.4) and (116.5), the probability limits of the second-stage structural parameter estimates are as follows:

plim α̂1 = [α1(κ4 − ρ′²κ3κ) + β1b2′·1κ3(1 − κ)] / (1 − ρ′²κκ3),   (116.14)

plim β̂1 = [α1b1·2′κ(1 − κ3) + β1b1·2′κ3(1 − ρ′²κ4)] / (1 − ρ′²κκ3),   (116.15)

where ρ′ is the correlation coefficient between ξ1 and ŷ2, b1·2′ = ρ′σ1/σŷ2, b2′·1 = ρ′σŷ2/σ1, σy2² is the variance of y2, and

κ3 = σy2² / [σy2² + Var(θ) + 2Cov(y2, θ)],
κ4 = [σy2² + Cov(y2, θ)] / [σy2² + Var(θ) + 2Cov(y2, θ)].

Similarly, the probability limits of α̂2 and β̂2 can be written as

plim α̂2 = α2(κ4 − ρ″²κ3)/(1 − ρ″²κ3),   (116.16)

plim β̂2 = β2 + α2b2·2(1 − κ3)/(1 − ρ″²κ3),   (116.17)

where ρ″ is the correlation coefficient between ξ2 and ŷ2 and b2·2 = ρ″σ2/σŷ2. From (116.14)–(116.17), we can see that the second-stage estimates of the structural parameters α1, β1, α2, and β2 are affected by the measurement error in ξ1. Therefore, we can conclude that the biases in the first stage carry over to the second stage, and that the estimators of the second-stage parameters are affected by the measurement error in ξ1.
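The carry-over of the first-stage bias can be seen directly by simulation (our own sketch; all parameter values are invented). We generate the just-identified system (116.1) with measurement error in ξ1, run the two stages by hand, and find that the second-stage estimates of α1 and β1 in (116.13a) settle away from their true values:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300_000
a1, a2, b1, b2 = 0.5, -0.5, 1.0, 0.8   # true structural parameters

xi = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=n)
v1, v2 = 0.4 * rng.normal(size=(2, n))
y2 = (-b1 * xi[:, 0] + b2 * xi[:, 1] + v2 - v1) / (a1 - a2)   # reduced form
y1 = a1 * y2 + b1 * xi[:, 0] + v1
x1 = xi[:, 0] + 0.5 * rng.normal(size=n)   # measurement error, reliability 0.8

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

# first stage: project y2 on the observed exogenous variables (x1, xi2)
Z = np.column_stack([x1, xi[:, 1]])
y2_hat = Z @ ols(Z, y2)

# second stage for (116.13a): y1 on (y2_hat, x1); truth is (0.5, 1.0)
a1_hat, b1_hat = ols(np.column_stack([y2_hat, x1]), y1)
print(round(a1_hat, 2), round(b1_hat, 2))   # roughly 0.69 and 0.89: both biased
```

With no measurement error (x1 replaced by ξ1) the same two-stage code recovers (0.5, 1.0), so the distortion is entirely due to the error in x1.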
116.2.2 Over-identified equation case

Consider the following over-identified simultaneous equations system:

y1 = α1y2 + β1ξ1 + ν1,   (116.18a)
y1 = α2y2 + β2ξ2 + β3ξ3 + ν2.   (116.18b)

Suppose that the covariance matrix of the true exogenous variables is

Cov(ξi, ξj) = [ σ11 σ12 σ13 ; σ21 σ22 σ23 ; σ31 σ32 σ33 ].

The reduced forms of model (116.18a)–(116.18b) are

y1 = π11ξ1 + π21ξ2 + π31ξ3 + u1,   (116.19a)
y2 = π12ξ1 + π22ξ2 + π32ξ3 + u2.   (116.19b)

The corresponding reduced-form coefficients are as follows:

π11 = −β1α2/(α1 − α2),  π21 = β2α1/(α1 − α2),  π31 = β3α1/(α1 − α2),  u1 = (α1ν2 − α2ν1)/(α1 − α2),
π12 = −β1/(α1 − α2),  π22 = β2/(α1 − α2),  π32 = β3/(α1 − α2),  u2 = (ν2 − ν1)/(α1 − α2).   (116.20)

From the preceding reduced-form coefficients, we obtain

α1 = π21/π22 = π31/π32.   (116.21)

From this, we know that α1 is over-identified. By employing the relationship between the structural parameters and the reduced-form parameters, we have the restriction

| π21  π22 ; π31  π32 | = 0,  i.e., π21π32 − π22π31 = 0.   (116.22)

Assume that there is a random error of measurement in ξ1, x1 = ξ1 + η1, but not in ξ2 or ξ3. Here, η1 is a random variable with zero mean and constant variance ση².
116.2.2.1 Indirect least square method

Goldberger (1972) uses only the standard over-identifying information to identify the variance of the measurement error. If the erroneous data are employed, the estimated reduced-form model will be

y1 = π′11 x1 + π′21 ξ2 + π′31 ξ3 + u′1,   (116.23a)
y2 = π′12 x1 + π′22 ξ2 + π′32 ξ3 + u′2.   (116.23b)
From Goldberger (1973), we know that the parameters in models (116.19a)–(116.19b) and models (116.23a)–(116.23b) satisfy the following condition:

[ π11  π12 ]       1        [    1          0           0      ] [ π′11  π′12 ]
[ π21  π22 ] = ----------- [ η1φ21   1 − η1φ11         0      ] [ π′21  π′22 ],   (116.24)
[ π31  π32 ]   1 − η1φ11   [ η1φ31        0       1 − η1φ11   ] [ π′31  π′32 ]

where φij is the (i, j)th element of the inverse of Φ, and Φ = E(X′X) is the covariance matrix of the observed regressors X = [x1 ξ2 ξ3]. Recall that we have the restriction on the reduced-form coefficients in the over-identified system:

| π21  π22 |        1         | η1φ21π′11 + (1 − η1φ11)π′21   η1φ21π′12 + (1 − η1φ11)π′22 |
| π31  π32 | = ------------- | η1φ31π′11 + (1 − η1φ11)π′31   η1φ31π′12 + (1 − η1φ11)π′32 | = 0.   (116.25)
               (1 − η1φ11)²

Then we can have

η1 = (π′22π′31 − π′21π′32) / [φ11(π′22π′31 − π′21π′32) + φ21(π′11π′32 − π′31π′12) + φ31(π′21π′12 − π′11π′22)].
(116.26)
It is obvious that the condition (116.26) suffices to identify η1, because the parameters, the φ's and π′'s, are estimable from the observed data y1, y2, x1, ξ2, and ξ3. That is, we can use the a priori restrictions on the reduced-form coefficients to identify the variance of the measurement error. Then we can apply the classical method to identify both the reduced-form and the structural parameters: the reduced-form parameters can be identified by the condition (116.24), whence the condition (116.20) is used to identify the structural parameters.
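Goldberger's identification result can be demonstrated numerically: the formula in (116.26) recovers the measurement-error variance from quantities estimable from observed data alone. The sketch below uses illustrative parameter values (assumptions, not from the text), and deliberately makes the true exogenous variables correlated:

```python
import numpy as np

# Identify the measurement-error variance eta_1 (= Var(eta_1)) via (116.26).
# Parameter values are illustrative assumptions.
rng = np.random.default_rng(1)
n = 400_000
a1, a2 = 0.5, -0.8
b1, b2, b3 = 1.0, 0.7, -0.4
eta1 = 0.36                                    # true Var(eta_1)

S = np.array([[1.0, 0.5, 0.3],                 # Cov(xi_i, xi_j), correlated
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.0]])
xi = rng.normal(size=(n, 3)) @ np.linalg.cholesky(S).T
nu1, nu2 = rng.normal(size=(2, n))
y2 = (-b1 * xi[:, 0] + b2 * xi[:, 1] + b3 * xi[:, 2] + nu2 - nu1) / (a1 - a2)
y1 = a1 * y2 + b1 * xi[:, 0] + nu1

X = xi.copy()
X[:, 0] += rng.normal(scale=np.sqrt(eta1), size=n)   # x1 = xi1 + eta_1

phi = np.linalg.inv(X.T @ X / n)               # inverse of Phi = E(X'X)
p1 = np.linalg.lstsq(X, y1, rcond=None)[0]     # (pi'11, pi'21, pi'31)
p2 = np.linalg.lstsq(X, y2, rcond=None)[0]     # (pi'12, pi'22, pi'32)

num = p2[1] * p1[2] - p1[1] * p2[2]            # pi'22 pi'31 - pi'21 pi'32
den = (phi[0, 0] * num
       + phi[1, 0] * (p1[0] * p2[2] - p1[2] * p2[0])
       + phi[2, 0] * (p1[1] * p2[0] - p1[0] * p2[1]))
print(num / den)                               # estimate of Var(eta_1)
```

Note that the restriction only has identifying power because ξ1 is correlated with ξ2 and ξ3; with mutually uncorrelated regressors, both the numerator and the denominator of (116.26) converge to zero and η1 is not identified.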
116.2.2.2 Two-stage least square method

The two-stage least squares estimation uses all the exogenous variables in the system as instruments to obtain the predictions of the endogenous variables. For example, y2 in the first stage can be estimated by

y2 = α12 ξ1 + α22 ξ2 + α32 ξ3 + v̄2.   (116.27a)

Since ξ1 is unobserved, we replace it by x1 to estimate α12, α22, and α32. The observed relationship corresponding to (116.27a) can be written as

y2 = α′12 x1 + α′22 ξ2 + α′32 ξ3 + v′2,   (116.27b)
where x1 = ξ1 + η1, and η1 has zero mean and constant variance ση². From Eqs. (116.27a) and (116.27b), we can obtain two equivalent expressions for Cov(y2, x1). From (116.27a), we have

Cov(y2, x1) = Cov[(α12ξ1 + α22ξ2 + α32ξ3 + v̄2)(ξ1 + η1)] = Σi αi2 σi1.   (116.28a)

From (116.27b), we have

Cov(y2, x1) = Cov[(α′12(ξ1 + η1) + α′22ξ2 + α′32ξ3 + v′2)(ξ1 + η1)] = Σi α′i2 σi1 + α′12 ση².   (116.28b)

Similarly, based on two equivalent expressions for Cov(y2, ξj), j = 2, 3, we can also obtain the following condition:

Σi αi2 σij = Σi α′i2 σij   (i = 1, 2, 3;  j = 2, 3).   (116.29)
From (116.28a), (116.28b) and (116.29), we can solve α12 , α22 , and α32 in terms of α12 , α22 , and α32 as follows: α12 κ(1 + c1·2 c1·3 c2·3 + c1·3 c3·2 c2·1 − c2·3 c3·2 − c1·2 c2·1 − c3·1 c1·3 ) , 1 + κ(c1·2 c1·3 c2·3 + c1·3 c3·2 c2·1 − c1·2 c2·1 − c3·1 c1·3 ) − c2·3 c3·2 (116.30a) α12 (1 − κ)(c1·2 − c1·3 c3·2 ) = α22 + , 1 + κ(c1·2 c1·3 c2·3 + c1·3 c3·2 c2·1 − c1·2 c2·1 − c3·1 c1·3 ) − c2·3 c3·2 (116.30b) α12 (1 − κ)(c1·3 − c1·2 c2·3 ) = α32 + . 1 + κ(c1·2 c1·3 c2·3 + c1·3 c3·2 c2·1 − c1·2 c2·1 − c3·1 c1·3 ) − c2·3 c3·2 (116.30c)
α12 = α22 α32
Here, κ = σ1²/(ση² + σ1²) and ci·j = σij/σj². From (116.30a)–(116.30c), we can conclude that the biases in the first stage carry over to the second stage and that the estimates of the structural parameters are affected by the measurement error in ξ1.

116.3 Measurement Error Problem in the Dividend and Investment Decisions

This section aims to study the impact of measurement error on the parameter estimates of variables measured without error. Here, a single-equation model as well as a simultaneous-equations model are investigated, and the direction of the bias due to measurement error is examined.

116.3.1 Asymptotic bias problem

116.3.1.1 Asymptotic bias of OLS estimator in the single equation model

Suppose that dividends are the result of a partial adjustment process towards a target ratio. The target pay-out level is assumed to be a fixed proportion of expected earnings as follows:

DIV∗it = γ P∗it,
(116.31)
where DIV∗it and P∗it are the target dividend and expected earnings of firm i at year t. Here, we assume that changes in dividends are determined by the difference between this year's target dividend and last year's dividend:

ΔDIVit = α0 + α(DIV∗it − DIVit−1) + εit = α0 + α(γP∗it − DIVit−1) + εit,   (116.32)

where ΔDIVit is the change in dividends of firm i at year t and DIVit−1 is the dividend of firm i at year t − 1. We can rewrite equation (116.32) as follows:

ΔDIVit = β0 + β1 P∗it + β2 DIVit−1 + εit.
(116.33)
Here, α = −β2 denotes the speed of adjustment of the dividend decision. Since expected earnings are unobservable, using reported accounting earnings as a proxy may result in a measurement error problem (Hsu, Wang and Wu, 1998). We suppose that there is an error of measurement in expected earnings as follows:

Pit = P∗it + νit.
(116.34)
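A small Monte Carlo sketch (with illustrative parameter values, not from the text) shows what happens when the noisy proxy (116.34) is used to estimate (116.33): the earnings coefficient is attenuated, and the coefficient of the correctly measured DIVt−1 is biased as well, consistent with (116.35)–(116.36):

```python
import numpy as np

# Attenuation bias from using a noisy earnings proxy in (116.33).
# Parameter values are illustrative assumptions.
rng = np.random.default_rng(2)
n = 500_000
b0, b1, b2 = 0.1, 0.6, -0.3

div_lag = rng.normal(size=n)
p_star = 0.8 * div_lag + rng.normal(size=n)      # Var(P*|DIV_{t-1}) = 1
ddiv = b0 + b1 * p_star + b2 * div_lag + rng.normal(scale=0.5, size=n)

p_obs = p_star + rng.normal(size=n)              # P = P* + nu, Var(nu) = 1

W = np.column_stack([np.ones(n), p_obs, div_lag])
bhat = np.linalg.lstsq(W, ddiv, rcond=None)[0]

# plim b1_hat = b1 * 1/(1 + 1) = 0.30            (downward biased)
# plim b2_hat = b2 + b1 * 0.8 * 1/(1 + 1) = -0.06 (biased although DIV_{t-1}
#                                                  itself is error-free)
print(bhat[1], bhat[2])
```

The coefficient of DIVt−1 is contaminated precisely because DIVt−1 is correlated with P∗t; with independent regressors only the earnings coefficient would be attenuated.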
Assume that the measurement error νit is independent of P∗it, DIVit−1, and εit. Carroll et al. (2006) and Chen, Lee and Lee (2015) show that the asymptotic biases of the traditional OLS estimated coefficients are as follows:

plim{β̂1 − β1} = [−σν² / (σ²P∗t|DIVt−1 + σν²)] β1,   (116.35)

plim{β̂2 − β2} = [bP∗t|DIVt−1 σν² / (σ²P∗t|DIVt−1 + σν²)] β1,   (116.36)
where σν² is the variance of the measurement error, σ²P∗t|DIVt−1 is the variance of P∗t conditional on DIVt−1, and bP∗t|DIVt−1 is the coefficient of DIVt−1 in the auxiliary regression of P∗t on DIVt−1. Equation (116.35) shows that the OLS estimate β̂1 is downward biased. In addition, equation (116.36) shows that the estimated coefficient of DIVt−1, which is measured without error, is also biased; the direction of this bias depends on the dependence between P∗t and DIVt−1. We can conclude that the estimated coefficient of a variable measured without error is also biased in general, unless that variable is independent of the variable measured with error. Following McCallum (1972) and Gleser, Carroll and Gallo (1987), the results in eqs. (116.35) and (116.36) can be generalized to the matrix case as follows:

ΔDIV = 1β0 + P∗β1 + Zβ2 + ε,
(116.37)
where ΔDIV denotes the T × 1 vector of observations for changes in dividends, 1 is a T × 1 unity column vector, P∗ is a T × 1 vector of unobserved expected earnings, and Z is a T × k matrix of observations for additional explanatory variables measured without error, k > 1. Suppose that accounting earnings P are unbiased for expected earnings P∗, and that the measurement error ν is independent of P∗, Z and ε. McCallum (1972) and Gleser, Carroll and Gallo (1987) show that the asymptotic biases of the traditional OLS estimator are as follows:

plim{β̂1 − β1} = [−σν² / (σ²P∗|Z + σν²)] β1,   (116.38)

plim{β̂2 − β2} = bP∗|Z [σν² / (σ²P∗|Z + σν²)] β1,   (116.39)

where σ²P∗|Z is the variance of P∗ conditional on Z, and bP∗|Z is a k × 1 vector of the coefficients on Z in the auxiliary regression of
P∗ on Z. Comparing equation (116.35) and equation (116.38) shows that σ²P∗t|DIVt−1 is equal to σ²P∗|Z only when P∗ and Z are uncorrelated. Otherwise, σ²P∗|Z < σ²P∗t|DIVt−1, implying that collinearity increases the bias in the OLS estimate of earnings. Moreover, as shown in equation (116.39), the parameter estimate of Z measured without error is biased in general unless Z is independent of P∗.

116.3.1.2 Asymptotic bias of 2SLS estimator in the simultaneous equations model

Dhrymes and Kurz (1967), Fama (1974), Smirlock and Marshall (1983), and Lee et al. (2016) have studied empirical relationships between the investment and dividend decisions of firms. In this study, a simultaneous equations system is constructed as follows:

ΔDIVt = γ1 ΔKt + α0 + α1 DIVt−1 + α2 P∗t + ε1t,
ΔKt = γ2 ΔDIVt + β0 + β1 Kt−1 + β2 Q∗t + ε2t,   (116.40)
where ΔDIVt and ΔKt are the changes in the dividend and capital stock of firm i from year t − 1 to year t, respectively. DIVt−1 and Kt−1 are the dividend and capital stock of firm i at year t − 1. P∗t is the expected earnings of firm i at year t. Q∗t is the marginal q of firm i at year t. The system (116.40) may be rewritten in matrix form as follows:

YΓ = X∗β + ε,
(116.41)
where Y = (ΔDIV, ΔK) is a T × 2 matrix of the two endogenous variables, and X∗ = (1 DIV−1 K−1 P∗ Q∗) is a T × 5 matrix of all exogenous variables in the system, which includes a constant term as the first column of X∗. Here, DIV−1 and K−1 are the one-period-lagged data vectors of DIV and K, respectively. ε = (ε1, ε2) is a T × 2 matrix of the two disturbances with mean vector zero and covariance matrix Σ. The parameter matrices Γ and β are defined as

Γ = [  1   −γ1 ]
    [ −γ2   1  ]

and

β = [ α0  α1  0   α2  0  ].   (116.42)
    [ β0  0   β1  0   β2 ]

Post-multiplying the structural form model (116.41) by Γ−1, we have the following reduced-form model:

Y = X∗βΓ−1 + εΓ−1 = X∗Π∗ + e,
(116.43)
where Π∗ = βΓ−1 is the parameter matrix of the reduced-form model. The disturbance vector e has mean vector zero and covariance matrix Ω = Γ−1ΣΓ−1.
Here, we rewrite the jth equation of our simultaneous equations model (116.41) in terms of the full set of T observations:

yj = Yjγj + X∗jβj + εj = Z∗jδj + εj,   j = 1, 2,   (116.44)
where yj denotes the T × 1 vector of observations for the endogenous variable on the left-hand side of the jth equation, and Yj denotes the T × 1 data matrix for the endogenous variable on the right-hand side of this equation. X∗j is a data matrix for all exogenous variables in this equation. For example, consider the dividend equation in our simultaneous equations model: yj and Yj are the data matrices ΔDIV and ΔK, respectively. X∗j = [1 ΔDIV− P∗] is a T × 3 data matrix for the explanatory variables, where ΔDIV− denotes the one-period-lagged data matrix of ΔDIV. Z∗j = [ΔK 1 ΔDIV− P∗] is a T × 4 data matrix for all variables on the right-hand side of this equation. Since the jointly determined variables yj and Yj (ΔDIV and ΔK) are determined within the system, they are correlated with the disturbance terms. This correlation usually creates estimation difficulties because the ordinary least squares estimator would be biased and inconsistent (e.g., Johnston and DiNardo, 1997; Greene, 2018). The two-stage least squares (2SLS) approach is the most common method used to deal with the endogeneity problem resulting from the correlation of Z∗j and εj. However, in reality, the expected earnings (P∗) and marginal q (Q∗) are not observable. Now we assume that there is an error of measurement in P∗ and Q∗ as follows:

P = P∗ + vp   and   Q = Q∗ + vq,   (116.45)
where P and Q denote the observed data matrices of accounting earnings and Tobin's q, respectively. Assume that the measurement errors vp and vq are independent of all explanatory variables and of the random disturbances ε. The observed counterpart of (116.44) is then

yj = Yjγj + Xjβj + εj = Zjδj + εj,   j = 1, 2,   (116.46)
where Zj = [Yj Xj]. The 2SLS estimation uses all the exogenous variables in this system as instruments to obtain the predictions of Yj. In the first stage, we regress Yj on all exogenous variables X = [1 ΔDIV− K−1 P Q] in the system to obtain the predictions of the endogenous variables on the right-hand side of equation (116.46), Ŷj = X(X′X)−1X′Yj. In the second stage, we regress yj on Ŷj and Xj to obtain the estimator of δj in equation (116.46). Using Ŷj in place of Yj in Zj yields Wj = [Ŷj Xj]. Note that
W′jWj = W′jZj, the 2SLS estimator for δj in equation (116.46) is then

δ̂j^2SLS = (W′jWj)−1 W′jyj
        = [ Ŷ′jŶj  Ŷ′jXj ]−1 [ Ŷ′jyj ]  =  [ Ŷ′jYj  Ŷ′jXj ]−1 [ Ŷ′jyj ].   (116.47)
          [ X′jŶj  X′jXj ]    [ X′jyj ]     [ X′jYj  X′jXj ]    [ X′jyj ]

Replacing Ŷj by X(X′X)−1X′Yj in (116.47), this turns out to be equivalent to

δ̂j^2SLS = (Z′jX(X′X)−1X′Zj)−1 Z′jX(X′X)−1X′yj,
(116.48)
where yj = ΔDIV, Zj = [ΔK 1 ΔDIV− P], and X = [1 ΔDIV− K−1 P Q] are observed data matrices. Here, the effect of measurement error on the 2SLS estimates of the structural parameters is investigated. To investigate the asymptotic bias of the 2SLS estimator, we rewrite the 2SLS estimator in equation (116.48) as follows:

δ̂j^2SLS = (Z′jX(X′X)−1X′Zj)−1 Z′jX(X′X)−1X′(Z∗jδj + εj)
        = δj + (Z′jX(X′X)−1X′Zj)−1 Z′jX(X′X)−1X′([0 νj]δj + εj).   (116.49)

Here, we define Zj = [Yj Xj] = [Yj X∗j + νj] = Z∗j + [0 νj], where νj is the submatrix of measurement errors corresponding to the included exogenous variables on the right-hand side of the jth equation. Thus, we can find that the sampling error of the 2SLS estimator depends on three parts: T−1(X′X), T−1(X′Zj), and T−1(X′([0 νj]δj + εj)). We start with the first term.

(i) T−1(X′X):

plim T−1(X′X) = plim T−1((X∗ + ν)′(X∗ + ν)) = plim T−1(X∗′X∗) + θ ≡ Σ + θ,
(116.50)
where θ is the covariance matrix of the measurement error ν. For simplicity, we assume that the measurement errors are mutually independent, i.e., the matrix θ is diagonal. By equation (116.50), we predict that the biases of Ŷj = X(X′X)−1X′Yj in the first stage will carry over to the second stage, and hence the estimates of the structural parameters will be affected by the measurement error.
(ii) T−1(X′Zj): To solve for the variable of interest, we divide the Y matrix into two parts corresponding to the dependent endogenous variable yj and the explanatory endogenous variables Yj in equation (116.46). The X∗ matrix can be similarly partitioned into two parts corresponding to the included exogenous variables X∗j and the excluded exogenous variables X∗−j. The jth equation of the reduced-form model (116.43) can be rewritten as

(yj | Yj) = [X∗j  X∗−j] [ Π∗1  Π∗j  ] + (ej | e−j).
                        [ Π∗2  Π∗−j ]
(116.51)
Then, we rewrite the part of the reduced-form model (116.51) corresponding to the explanatory endogenous variables Yj in the jth equation as follows:

Yj = X∗jΠ∗j + X∗−jΠ∗−j + e−j,
(116.52)
where X∗j contains the explanatory exogenous variables appearing in equation (116.44), and Π∗j denotes the submatrix of Π∗ corresponding to X∗j. X∗−j and Π∗−j are the submatrices of X∗ and Π∗, respectively, corresponding to the explanatory variables that do not appear in the jth equation. We rearrange the columns of X such that the first part is Xj, that is,

X = [Xj  X−j],
(116.53)
where X−j contains the observed exogenous variables that do not appear in the jth equation. From eqs. (116.45), (116.52), and (116.53), we have

Yj = XjΠ∗j + X−jΠ∗−j + e−j − νjΠ∗j − ν−jΠ∗−j,
(116.54)
where νj and ν−j are measurement errors corresponding to X∗j and X∗−j , respectively.
X′Zj = [ X′j  ] [Yj  Xj] = [ X′jYj   X′jXj  ]
       [ X′−j ]            [ X′−jYj  X′−jXj ]

     = [ X′jXj   X′jX−j  ] [ Π∗j   I ] − [ X′jνjΠ∗j + X′jν−jΠ∗−j    0 ] + [ X′je−j   0 ].   (116.55)
       [ X′−jXj  X′−jX−j ] [ Π∗−j  0 ]   [ X′−jνjΠ∗j + X′−jν−jΠ∗−j  0 ]   [ X′−je−j  0 ]
Then,

plim{T−1(X′Zj)} = plim{T−1(X′X)} [ Π∗j   I ] − [ plim{T−1(X∗j + νj)′νj}Π∗j      0 ] + 0
                                 [ Π∗−j  0 ]   [ plim{T−1(X∗−j + ν−j)′ν−j}Π∗−j  0 ]

                = (Σ + θ) [ Π∗j   I ] − [ θjΠ∗j    0 ] ≡ H,
                          [ Π∗−j  0 ]   [ θ−jΠ∗−j  0 ]
(116.56)
where θj is the submatrix of θ corresponding to the erroneously measured exogenous variables included in the jth equation, and θ−j is the submatrix of θ corresponding to the erroneously measured exogenous variables that do not appear in the jth equation. As shown in equation (116.56), the reduced-form coefficients of the explanatory endogenous variables, Π∗j and Π∗−j, play a role in the asymptotic bias of the 2SLS estimator.
νj ]δj + εj ))} = plim{T −1 (X [0 νj ]δj + X εj )} ν 0 X j j δj = plim{T −1 X 0 νj δj } = plim T −1 0 X−j νj ∗ + ν )ν 0 (X 0 θ j j j j δj . (116.57) δj = = plim T −1 0 0 0 (X∗−j + ν−j )νj
As shown in equation (116.57), the asymptotic bias of 2SLS estimator depends on the structural coefficients (βj ) and its covariance matrix of measurement error (θj ), corresponding to erroneously measured exogenous variables included in the jth equation. Combining (116.50), (116.56), and (116.57), the asymptotic bias of 2SLS estimator is given by 2SLS 0 θ j δj . − δj = −(H (Σ + θ)−1 H)−1 H (Σ + θ)−1 × plim δˆj 0 0 (116.58)
The asymptotic bias of the 2SLS estimated coefficients can be analyzed as follows: (i) The bias term depends on the structural coefficients corresponding to the exogenous variables (βj) and the reduced-form coefficients of the explanatory endogenous variables (Yj), Π∗j and Π∗−j. (ii) If an equation contains no exogenous variables (βj = 0), the structural parameters are always estimated consistently.

116.3.2 Bias correction in the simultaneous equations model

The structural form model (116.40) can be written in the following reduced form:

ΔDIVt = π∗11 + π∗21 DIVt−1 + π∗31 Kt−1 + π∗41 P∗t + π∗51 Q∗t + e1,t,
ΔKt   = π∗12 + π∗22 DIVt−1 + π∗32 Kt−1 + π∗42 P∗t + π∗52 Q∗t + e2,t,

where

π∗11 = (α0 + γ1β0)/(1 − γ1γ2),   π∗12 = (β0 + α0γ2)/(1 − γ1γ2),
π∗21 = α1/(1 − γ1γ2),            π∗22 = α1γ2/(1 − γ1γ2),
π∗31 = γ1β1/(1 − γ1γ2),          π∗32 = β1/(1 − γ1γ2),
π∗41 = α2/(1 − γ1γ2),            π∗42 = α2γ2/(1 − γ1γ2),
π∗51 = γ1β2/(1 − γ1γ2),          π∗52 = β2/(1 − γ1γ2),
e1,t = (ε1,t + γ1ε2,t)/(1 − γ1γ2),   e2,t = (ε2,t + γ2ε1,t)/(1 − γ1γ2).   (116.59)

From equation (116.59), we can find that π∗31/π∗32 and π∗51/π∗52 are both equal to γ1, and that π∗22/π∗21 and π∗42/π∗41 are both equal to γ2. It follows that the coefficients of this reduced-form model are constrained by

π∗21π∗42 − π∗22π∗41 = 0   and   π∗31π∗52 − π∗32π∗51 = 0.   (116.60)
From (116.59), we know that γ1 and γ2 are over-identified. Next, we identify the unknown variances of the measurement errors by using the over-identifying restrictions (116.60) on the reduced-form coefficients.
Here, we assume that there is an error of measurement in expected earnings (P∗ ) and marginal q (Q∗ ) as follows: X = X∗ + ν,
(116.61)
where X = (1 DIVt−1 Kt−1 P Q) and X∗ = (1 DIVt−1 Kt−1 P∗ Q∗). Assume that the measurement error ν is independent of X∗ and of the random disturbances ε in equation (116.41). Here, the covariance matrix of the measurement error ν is given by

θ = [ 0  0  0  0   0  ]
    [ 0  0  0  0   0  ]
    [ 0  0  0  0   0  ]   (116.62)
    [ 0  0  0  θp  0  ]
    [ 0  0  0  0   θq ].
Here, θ is not only diagonal but also has some zeros on the diagonal: the measurement errors of different variables are uncorrelated, and the exogenous variables corresponding to the zero diagonal entries are measured without error. Then the covariance matrix of all observed exogenous variables, Φ, is
Φ ≡ E(X′X) = E{(X∗ + ν)′(X∗ + ν)} = E(X∗′X∗) + θ
(116.63)
and assume that Π is the coefficient matrix in the linear regression of Y on the observed exogenous variables X. Then

ΦΠ ≡ E(X′Y) = E{(X∗ + ν)′(X∗Π∗ + ε)} = E(X∗′X∗)Π∗ = (Φ − θ)Π∗.   (116.64)

Thus we have

Π = (I5 − θΦ−1)Π∗
(116.65)
⇒ Π∗ = (I5 − θΦ−1 )−1 Π, where
Π∗ = [ π∗11  π∗12 ]
     [ π∗21  π∗22 ]
     [ π∗31  π∗32 ]
     [ π∗41  π∗42 ]
     [ π∗51  π∗52 ]

and

Φ−1 = [ φ11  φ12  φ13  φ14  φ15 ]
      [ φ21  φ22  φ23  φ24  φ25 ]
      [ φ31  φ32  φ33  φ34  φ35 ]
      [ φ41  φ42  φ43  φ44  φ45 ]
      [ φ51  φ52  φ53  φ54  φ55 ],
where φij denotes the (i, j)th element of Φ−1, and

Π∗ = (I5 − θΦ−1)−1Π = [adj(I5 − θΦ−1)/det(I5 − θΦ−1)] Π = {[Aij]T/det(I5 − θΦ−1)} Π,   (116.66)
where

θΦ−1 = [ 0      0      0      0      0     ]
       [ 0      0      0      0      0     ]
       [ 0      0      0      0      0     ]
       [ θpφ41  θpφ42  θpφ43  θpφ44  θpφ45 ]
       [ θqφ51  θqφ52  θqφ53  θqφ54  θqφ55 ]

and

I5 − θΦ−1 = [ 1        0        0        0          0        ]
            [ 0        1        0        0          0        ]
            [ 0        0        1        0          0        ]
            [ −θpφ41   −θpφ42   −θpφ43   1 − θpφ44  −θpφ45   ]
            [ −θqφ51   −θqφ52   −θqφ53   −θqφ54     1 − θqφ55 ].
Then, we have

[ π∗11  π∗12 ]                       [ A11  A21  A31  A41  A51 ] [ π11  π12 ]
[ π∗21  π∗22 ]           1           [ A12  A22  A32  A42  A52 ] [ π21  π22 ]
[ π∗31  π∗32 ] = ---------------- ·  [ A13  A23  A33  A43  A53 ] [ π31  π32 ]
[ π∗41  π∗42 ]   det(I5 − θΦ−1)      [ A14  A24  A34  A44  A54 ] [ π41  π42 ]
[ π∗51  π∗52 ]                       [ A15  A25  A35  A45  A55 ] [ π51  π52 ]

                                     [ A11  0    0    0    0   ] [ π11  π12 ]
                         1           [ 0    A22  0    0    0   ] [ π21  π22 ]
               = ---------------- ·  [ 0    0    A33  0    0   ] [ π31  π32 ],
                 det(I5 − θΦ−1)      [ A14  A24  A34  A44  A54 ] [ π41  π42 ]
                                     [ A15  A25  A35  A45  A55 ] [ π51  π52 ]
where

det(I5 − θΦ−1) = (1 − θpφ44)(1 − θqφ55) − θpφ45θqφ54,

[Aij] = [ A11  A12  A13  A14  A15 ]
        [ A21  A22  A23  A24  A25 ]
        [ A31  A32  A33  A34  A35 ]
        [ A41  A42  A43  A44  A45 ]
        [ A51  A52  A53  A54  A55 ],

A11 = A22 = A33 = (1 − θpφ44)(1 − θqφ55) − θpφ45θqφ54,
A14 = (1 − θqφ55)θpφ41 + θpφ45θqφ51,
A15 = (1 − θpφ44)θqφ51 + θpφ41θqφ54,
A24 = (1 − θqφ55)θpφ42 + θpφ45θqφ52,
A25 = (1 − θpφ44)θqφ52 + θpφ42θqφ54,
A34 = (1 − θqφ55)θpφ43 + θpφ45θqφ53,
A35 = (1 − θpφ44)θqφ53 + θpφ43θqφ54,
A44 = 1 − θqφ55,   A45 = θqφ54,   A54 = θpφ45,   A55 = 1 − θpφ44,
A12 = A13 = A21 = A23 = A31 = A32 = A41 = A42 = A43 = A51 = A52 = A53 = 0,

adj(I5 − θΦ−1) = [Aij]T,   Π∗ = (I5 − θΦ−1)−1Π = {[Aij]T/det(I5 − θΦ−1)} Π.

Combining the a priori restrictions on the reduced-form coefficients in equation (116.60) with the regression relationship in equation (116.66), we have

| π∗21  π∗22 |          1          | A12π11 + A22π21 + A32π31 + A42π41 + A52π51   A12π12 + A22π22 + A32π32 + A42π42 + A52π52 |
| π∗41  π∗42 | = ---------------- | A14π11 + A24π21 + A34π31 + A44π41 + A54π51   A14π12 + A24π22 + A34π32 + A44π42 + A54π52 | = 0
                [det(I5 − θΦ−1)]²

⟹ | A22π21   A14π11 + A24π21 + A34π31 + A44π41 + A54π51 |
   | A22π22   A14π12 + A24π22 + A34π32 + A44π42 + A54π52 | = 0.
That is,

{(1 − θqφ55)θpφ41 + θpφ45θqφ51}(π11π22 − π12π21)
+ {(1 − θqφ55)θpφ43 + θpφ45θqφ53}(π31π22 − π32π21)
+ (1 − θqφ55)(π41π22 − π42π21) + θpφ45(π51π22 − π52π21) = 0,   (116.67)
and

| π∗31  π∗32 |          1          | A13π11 + A23π21 + A33π31 + A43π41 + A53π51   A13π12 + A23π22 + A33π32 + A43π42 + A53π52 |
| π∗51  π∗52 | = ---------------- | A15π11 + A25π21 + A35π31 + A45π41 + A55π51   A15π12 + A25π22 + A35π32 + A45π42 + A55π52 | = 0
                [det(I5 − θΦ−1)]²

⟹ | A33π31   A15π11 + A25π21 + A35π31 + A45π41 + A55π51 |
   | A33π32   A15π12 + A25π22 + A35π32 + A45π42 + A55π52 | = 0.

That is,

{(1 − θpφ44)θqφ51 + θpφ41θqφ54}(π11π32 − π12π31)
+ {(1 − θpφ44)θqφ52 + θpφ42θqφ54}(π21π32 − π22π31)
+ θqφ54(π41π32 − π42π31) + (1 − θpφ44)(π51π32 − π52π31) = 0.
(116.68)
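Given Π and Φ from a regression of Y on X, the restrictions in (116.60), combined with the relation Π∗ = (I5 − θΦ−1)−1Π from (116.65), can be solved numerically for θp and θq. The sketch below works with population moments and illustrative parameter values (assumptions, not from the text), using a simple Newton iteration rather than a canned solver:

```python
import numpy as np

# Recover (theta_p, theta_q) from the over-identifying restrictions (116.60)
# via Pi* = (I5 - theta Phi^{-1})^{-1} Pi.  Illustrative parameter values.
g1, g2 = 0.4, 0.3
a0, a1, a2 = 0.1, -0.5, 0.8
b0, b1, b2 = 0.2, -0.2, 0.6

# True reduced-form matrix Pi* from (116.59); rows: 1, DIV, K, P*, Q*.
d = 1.0 - g1 * g2
Pi_star = np.array([[a0 + g1 * b0, b0 + a0 * g2],
                    [a1,           a1 * g2],
                    [g1 * b1,      b1],
                    [a2,           a2 * g2],
                    [g1 * b2,      b2]]) / d

# E(X*'X*): a constant plus a correlated block for (DIV, K, P*, Q*).
C = np.array([[1.0, 0.2, 0.5, 0.1],
              [0.2, 1.0, 0.1, 0.5],
              [0.5, 0.1, 1.0, 0.2],
              [0.1, 0.5, 0.2, 1.0]])
S = np.block([[np.ones((1, 1)), np.zeros((1, 4))],
              [np.zeros((4, 1)), C]])

theta_true = (0.3, 0.5)                        # (theta_p, theta_q)

def observed_Pi(tp, tq):
    theta = np.diag([0, 0, 0, tp, tq])
    Phi = S + theta
    return (np.eye(5) - theta @ np.linalg.inv(Phi)) @ Pi_star, Phi

Pi, Phi = observed_Pi(*theta_true)             # what regressing Y on X yields

def restrictions(t):
    theta = np.diag([0, 0, 0, t[0], t[1]])
    P = np.linalg.solve(np.eye(5) - theta @ np.linalg.inv(Phi), Pi)
    return np.array([P[1, 0] * P[3, 1] - P[1, 1] * P[3, 0],   # rows DIV, P*
                     P[2, 0] * P[4, 1] - P[2, 1] * P[4, 0]])  # rows K, Q*

# Newton iteration with a finite-difference Jacobian.
t = np.array([0.1, 0.1])
for _ in range(50):
    f = restrictions(t)
    J = np.empty((2, 2))
    for j in range(2):
        e = np.zeros(2); e[j] = 1e-6
        J[:, j] = (restrictions(t + e) - f) / 1e-6
    t = t - np.linalg.solve(J, f)

print(t)   # recovers (theta_p, theta_q)
```

With sample moments in place of the population Π and Φ, the same two-equation solve yields consistent estimates of θp and θq; as with (116.26), the mismeasured variables must be correlated with the error-free ones for the restrictions to have identifying power.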
Equations (116.67) and (116.68) state that the two unknown measurement error variances, θp and θq, can be solved for from the parameters φij and πij, which can be obtained from the observations by regressing Y on X. Obviously, we can solve the two equations in two unknowns for θp and θq, given that the φij and πij are estimable. As a consequence, we can conclude that in a simultaneous equations model, measurement error variances need not destroy identifiability, provided that the model is otherwise over-identified. In effect, the over-identifying restrictions have been traded off against the under-identifiability introduced by the measurement error (Hsiao, 1976).

116.4 Conclusions

In this chapter, we analyze the errors-in-variables problems in a simultaneous equation system. We first investigate the effects of measurement errors in exogenous variables on the estimation of structural parameters in a simultaneous equations system. We show that the estimated coefficient of a variable measured without error is also biased in general, unless that variable is independent of the variable measured with error. In an errors-in-variables simultaneous equation system, we also show that over-identification can compensate for the under-identification due to the errors of measurement. We further discuss the effects of measurement errors on the estimation of a simultaneous equation system for investment policy and dividend policy. Here,
the direction of the bias due to measurement error is investigated. We also illustrate how the unknown variance–covariance matrix of the measurement errors can be identified by the restrictions on the coefficients of the reduced-form model. Therefore, in a simultaneous equation system, the measurement error variance need not destroy identifiability provided that the model is over-identified. If the variance of measurement error can be consistently estimated by Goldberger's approach, then a two-stage least squares method, instead of an indirect least squares method, can be applied to obtain consistent estimators of the structural parameters in future research.

Bibliography

Carroll, R.J., Ruppert, D., Crainiceanu, C.M. and Stefanski, L.A. (2006). Measurement Error in Nonlinear Models: A Modern Perspective, 2nd edn. Chapman & Hall/CRC.
Chen, H.Y., Lee, A.C. and Lee, C.F. (2015). Alternative Errors-in-Variables Models and Their Applications in Finance Research. Quarterly Review of Economics and Finance 58, 213–227.
Cochran, W.G. (1970). Some Effects of Errors of Measurement on Multiple Correlation. Journal of the American Statistical Association 65, 22–34.
Dhrymes, P.J. and Kurz, M. (1967). Investment, Dividends and External Finance Behavior of Firms. In Ferber, R. (ed.), Determinants of Investment Behavior. New York.
Fama, E.F. (1974). The Empirical Relationships Between the Dividend and Investment Decisions of Firms. The American Economic Review 64(3), 304–318.
Gleser, L.J., Carroll, R.J. and Gallo, P.P. (1987). The Limiting Distribution of Least Squares in an Errors-in-Variables Regression Model. Annals of Statistics 15(1), 220–233.
Goldberger, A.S. (1972). Structural Equation Methods in the Social Sciences. Econometrica 40, 979–1001.
Goldberger, A.S. (1973). Efficient Estimation in Overidentified Models: An Interpretive Analysis. In Goldberger, A.S. and Duncan, O.D. (eds.), Structural Equation Modeling in the Social Sciences. Seminar Press, New York.
Greene, W.H. (2018). Econometric Analysis, 8th edn. Prentice-Hall.
Hsiao, C. (1976). Identification and Estimation of Simultaneous Equation Models with Measurement Error. International Economic Review 17(2), 319–339.
Hsu, J., Wang, X.M. and Wu, C. (1998). The Role of Earnings Information in Corporate Dividend Decisions. Management Science 44(12-part-2), 173–191.
Johnston, J. and DiNardo, J. (1997). Econometric Methods, 4th edn. McGraw-Hill, New York, NY.
Lee, C.F. (1973). Errors-in-Variables Estimation Procedures with Applications to a Capital Asset Pricing Model. Ph.D. Dissertation, State University of New York at Buffalo.
Lee, C.F., Liang, W.L., Lin, F.L. and Yang, Y. (2016). Applications of Simultaneous Equations in Finance Research: Methods and Empirical Results. Review of Quantitative Finance and Accounting 47(4), 943–971.
McCallum, B.T. (1972). Relative Asymptotic Bias from Errors of Omission and Measurement. Econometrica 40(4), 757–758.
Smirlock, M. and Marshall, W. (1983). An Examination of the Empirical Relationship between the Dividend and Investment Decisions: A Note. Journal of Finance 38(5), 1659–1667.
Chapter 117
Big Data and Artificial Intelligence in the Banking Industry

T. Robert Yu and Xuehu Song

Contents

117.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 4026
117.2 Definition and Background . . . . . . . . . . . . . . . . . . . . 4028
      117.2.1 Big data . . . . . . . . . . . . . . . . . . . . . . . . 4028
      117.2.2 Artificial intelligence . . . . . . . . . . . . . . . . . 4029
117.3 Implications of Big Data and AI for Banks . . . . . . . . . . . . 4030
117.4 Implications for Regulatory Compliance and Supervision . . . . . 4034
117.5 Challenges and Limitations . . . . . . . . . . . . . . . . . . . 4036
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4039
T. Robert Yu, University of Wisconsin — Whitewater, e-mail: [email protected]
Xuehu Song, California State University, e-mail: [email protected]

Abstract

Big data and artificial intelligence (AI) assist businesses with decision-making. They help companies create new products and processes or improve existing ones. As the amount of data grows exponentially and the costs of data storage and computing power drop, AI is predicted to have great potential for banks. This chapter discusses the implications of big data and AI for the banking industry. First, we provide background on big data and AI. Second, we identify areas in which banks can benefit from big data and AI, and evaluate their
applications for the banking industry. Third, we discuss the implications of big data and AI for regulatory compliance and supervision. Last, we conclude with the limitations and challenges facing the use of big-data-based AI.

Keywords: Big data • Artificial intelligence • Machine learning • Bank • Robo-advisor • Bank regulatory compliance • Algorithmic bias.
117.1 Introduction

The banking business is essentially a data business. In their role as financial intermediaries that resolve incentive problems between lenders and borrowers, banks collect private information and minimize the cost of monitoring (Diamond, 1984). Their value chains are supported by data collected from borrowers and customer transactions, and insights gained from these data give banks a competitive information advantage.

With fast growth in the number of Web users, the prevalence of the Internet of Things (IoT), and the popularity of mobile devices, large volumes of data are generated at a record rate ("big data"). At the same time, the wide use of mobile devices and changing customer preferences suggest that customers and businesses expect real-time and cross-channel experiences more than ever before (Miklos et al., 2016). To maintain their competitive advantage, banks are racing to catch the bandwagon of the big data trend. It is estimated that investment in big data and analytics will grow from $130.1 billion in 2016 to over $203 billion in 2020, with banking being the industry with the largest investment in 2016 and the fastest spending growth.1

The exponential growth of data has also facilitated the reemergence of artificial intelligence, since large volumes of data enable companies to develop successful algorithms that extract insights from the data. With advances in technology, banks are competing not only with other banks but also with nonfinancial institutions in customer acquisition and servicing, consumer finance, payment services, and wealth management. The disruption brought by new entrants poses challenges to the banking industry. In addition to technology giants such as Google and Apple, disruption also comes from many FinTech startups. It is estimated that new technology startups
1. International Data Corporation, 2017, https://www.idc.com/getdoc.jsp?containerId=prUS41826116.
page 4026
July 6, 2020
16:6
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
Big Data and Artificial Intelligence in the Banking Industry
b3568-v4-ch117
4027
could capture $4.7 trillion in annual revenue.2 Competition with the new entrants will become highly data and technology driven. Advanced analytics and machine learning offered by FinTech companies account for nearly half of new technology solutions (WFE and McKinsey, 2018). A recent survey of 3,000 executives across industries shows that 65 percent of executives in the finance and banking industry expect artificial intelligence to play an important role in the next five years (Ransbotham et al., 2017). While half of the executives regard it as an opportunity, only about one in five companies has incorporated artificial intelligence into its products, services, or processes.

New competition has also brought opportunities for banks to innovate and grow. Some large banks have already been using big data technologies and machine learning techniques to gain competitive advantage through personalized sales and advertising, improved customer service, higher operational efficiency, reduced risk, and better regulatory compliance.3 It is therefore imperative for small- to medium-sized banks to equip themselves with the technologies and skills to analyze and discover the potential of big data. These banks already hold valuable data on customer transaction histories and can capitalize on such data by combining them with external data. Machine-learning processes can then produce meaningful insights into their customers, processes, products, and employees and improve process efficiency, reduce operating costs, customize advertising, and create value. In fact, for both investment and retail banks of all sizes, big data and artificial intelligence provide opportunities to broaden revenue streams, create value for customers, improve productivity and efficiency, and better manage regulatory compliance.

The chapter is divided into five sections. In Section 117.2, we define and describe the background of big data and artificial intelligence.
Section 117.3 analyzes applications of big data and artificial intelligence for the banking industry and highlights important practical implications. Section 117.4 discusses regulatory compliance implications, and Section 117.5 concludes with challenges and limitations banks face related to big data and artificial intelligence.
2. Deloitte, 2018. Fintech trends: Five insights for now and the future, https://www2.deloitte.com/us/en/pages/risk/articles/fintech-trends-insights.html#.
3. Based on a survey by the Boston Consulting Group, 80 percent of the efforts to commercialize existing data came from financial and telecommunications industries (Platt et al., 2014).
T. R. Yu & X. Song
117.2 Definition and Background

117.2.1 Big data

The term "big data" generally refers to large-volume, unstructured data sets that are difficult to process using traditional technology. Although its importance has been widely accepted by academics, governments, and business organizations, there is no consensus on the definition of big data. Nevertheless, the unique properties of big data are widely recognized. These properties are often referred to as the four Vs: volume, variety, velocity, and value.

Volume, the most common trait, refers to database size. While traditional data analytics typically uses data comprising no more than tens of terabytes, data sets in big data analytics often exceed 100 terabytes. The digital universe doubles in size every 12 months (EMC, 2011) and is expected to reach 44 zettabytes in 2020 and 180 zettabytes in 2025 (International Data Corporation, 2014).4 Walmart alone is estimated to have collected over 2.5 petabytes of customer transaction data so far.

Variety refers to the formats of data. Traditional data analytics typically analyzes structured data using relational databases. In contrast, a large portion of big data is semi-structured or unstructured. The formats may include text messages, blogs, images, video, audio, sensor readings, smartphone signals, and others. In data generated by Web users and IoT devices, the amount of semi-structured and unstructured data is much larger than that of structured data.

Velocity means that the data and its tools must have fast generation, transfer, and processing speeds to create a business advantage. Big data is constantly updated, and gaining a competitive edge from this real-time information requires a company to be agile. The variety in types and the constant flow of big data are very difficult for traditional databases to handle but can be managed using big data technologies such as Hadoop/MapReduce.
Value is the process of discovering new insights about customers, processes, products, and services by leveraging the multi-format, rapidly generated information in big data (Chen et al., 2014). Among all the properties, value is considered "the most important aspect of big data" (Hashem et al., 2015). Based on these four characteristics of big data, big data technologies can be defined as "a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data,

4. One zettabyte equals one million petabytes or one billion terabytes. One terabyte equals one thousand gigabytes or one million megabytes.
by enabling high velocity capture, discovery, and/or analysis" (Gantz and Reinsel, 2011). These characteristics make big data distinct from traditional structured data. Because of its large size, varied types, and dynamic nature, big data is often analyzed using AI engines such as machine learning. Analysis based on big data is often referred to as big data analytics (Ward and Barker, 2013). Big data analytics involves multiple data types, employs technologies and AI algorithms for fast processing, and operates in hybrid technology environments such as relational databases, Hadoop, and cloud-based computing.

117.2.2 Artificial intelligence

Artificial intelligence (AI) is defined by some scholars as "(t)he branch of computer science that is concerned with the automation of intelligent behavior" (Luger and Stubblefield, 1993). It can also be viewed as "(t)he art of creating machines that perform functions that require intelligence when performed by people" (Kurzweil, 1990). The concept of AI was first developed in the 1950s, and its goal at that time was "using computers to simulate human intelligence" (Crevier, 1993). After several setbacks, it was not until after the 2000s that development in AI began to accelerate. In recent years, cognitive computing capabilities have moved toward the goal of simulating human thought processes. This vigorous resurgence of AI relies on exponential data growth, lower computing costs, and fast processor speeds. For example, AlphaGo, developed by Google's DeepMind, defeated top-ranked Go player Lee Sedol in 2016. DeepMind used neural networks in designing AlphaGo, meaning AlphaGo learned the game largely from tens of millions of past Go matches and matches with itself, rather than only receiving a set of defined rules. The same AI engine was later applied to manage power usage in data centers and improved power efficiency by 15% (Mearian, 2016).
AI can be classified into three broad categories: artificial narrow intelligence (ANI), artificial general intelligence (AGI), and artificial super intelligence (ASI). ANI refers to algorithms that are as competent as or better than human beings at one specific task. AGI is a program that is as intelligent as a human in every capacity, while an ASI algorithm is smarter than human beings. The range of AI technologies and applications covers robotics, machine learning, autonomous vehicles, virtual agents, deep learning, and more. In this chapter, artificial intelligence refers primarily to machine learning. There are several types of machine learning, distinguished by the degree of human involvement required in labeling the data: supervised learning,
semi-supervised learning, unsupervised learning, and deep learning. In deep learning, a machine uses artificial neural networks (ANNs), algorithms inspired by the human central nervous system that can actively learn about the environment to solve complex problems. The algorithms used in deep learning can be applied to supervised, semi-supervised, or unsupervised learning.

Before the resurgence of AI, big data and AI had no direct relation. With the exponential growth of data from the Internet and mobile devices and recent technological developments, these two areas have converged to create new opportunities for companies to improve efficiency and boost revenue. On one hand, many types of large unstructured data such as video, audio, and images cannot be analyzed by traditional techniques and require AI. On the other hand, machine learning requires a large volume of data to develop and refine its algorithms. The more data on which a machine is trained, the "smarter" it becomes at discovering relationships and patterns. As Google's Chief Scientist Peter Norvig put it, Google's success is attributable to more data rather than better algorithms (Cleland, 2011).
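As an illustration of the supervised-learning idea described above, the sketch below trains the simplest possible "neuron" (a single logistic unit) by gradient descent on synthetic labeled data. The task, data, and hyperparameters are invented purely for illustration and do not come from any system discussed in this chapter; the point is only that a model fitted to more examples tends to recover the underlying rule more reliably.

```python
import math
import random

random.seed(0)

def make_data(n):
    """Synthetic labeled data: label is 1 exactly when x1 + x2 > 1."""
    data = []
    for _ in range(n):
        x1, x2 = random.random(), random.random()
        data.append(((x1, x2), 1 if x1 + x2 > 1 else 0))
    return data

def train_logistic(data, epochs=200, lr=0.5):
    """Fit a single logistic unit (the simplest 'neuron') by gradient descent."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            p = 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))
            err = p - y  # gradient of the log loss with respect to the logit
            w1 -= lr * err * x1
            w2 -= lr * err * x2
            b -= lr * err
    return w1, w2, b

def accuracy(model, data):
    """Share of examples on which the fitted decision boundary is correct."""
    w1, w2, b = model
    hits = sum((1 if w1 * x1 + w2 * x2 + b > 0 else 0) == y
               for (x1, x2), y in data)
    return hits / len(data)

test_set = make_data(500)
small = train_logistic(make_data(20))     # trained on little data
large = train_logistic(make_data(2000))   # trained on much more data
print(accuracy(small, test_set), accuracy(large, test_set))
```

A deep network stacks many such units in layers, but the training loop — predict, measure error, adjust weights — is the same in spirit.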
117.3 Implications of Big Data and AI for Banks

Big data and AI have several important implications for the banking industry. First, big data and AI can help banks make better trading and investment decisions by reducing the cost of information collection and processing. Artificial intelligence algorithms can gather a vast amount of public data (e.g., corporate filings and disclosures, news, tweets) and process it using text mining, sentiment analysis, or social network analysis. These analyses generate informed trades based on machine learning techniques including probabilistic logic, Bayesian networks, and deep learning. This increases productivity by improving efficiency and reducing cost relative to human or traditional computer processing. Furthermore, big-data-based machine learning can also make trading and investment decisions without human intervention, tailored to investors' investment horizons. Whereas the algorithms underlying high-frequency trading rely on speed to trade, big-data- and AI-based trading strategies search for the best trades over longer horizons, which may be days, weeks, or months into the future (Metz, 2016). A recent study of the US stock market examines an AI-based trading strategy and reports positive findings on profitable investment strategies (Krauss et al., 2017). Hedge funds and banks
are developing or have developed similar AI-based strategies.5 The value of assets under management using AI and machine learning was estimated at over $10 billion in 2017 and is growing steadily (Financial Stability Board, 2017).

Second, when equipped with refined and customized information generated by big data and AI, banks can learn how to attract new customers and earn a higher margin from existing customers. They may offer customers better products and services and customize offerings using cross-selling, upselling, and real-time personalized advertising. Banks already have rich data on customer transactions and creditworthiness. Big data technologies give banks the opportunity to link internal customer transaction data to external data and create contextualized, personalized experiences. External consumer data may come from retailers, telecommunications companies, and other data providers such as Nielsen and Acxiom. By combining internal and external data, a bank can gain a more nuanced understanding of its customers' spending habits and preferences. The bank can then use AI to build predictive models and design personalized advertising, marketing, and services to suit customer interests and improve customer retention and loyalty. Some large banks have already become early adopters of big data technologies. For example, a large global bank applied predictive models to savings-related product offerings and employed cross-selling; the outcome was a tenfold boost in branch sales and a 200% increase in conversion rates within only two months (Chintamaneni, 2016). Bank of America uses big data technologies to better understand customers across various interactions and to personalize offers to well-defined customer segments (Davenport, 2016).
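The cross-selling idea above can be sketched minimally as a propensity score that combines internal features (deposit and card activity) with a hypothetical external feature (retail spend from a data partner) to rank customers for an offer. The customer fields, weights, and caps below are illustrative placeholders, not a description of any bank's actual model, which would be fitted on historical response data.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    name: str
    avg_balance: float     # internal: deposit history
    card_txns_month: int   # internal: card transaction activity
    retail_spend: float    # external: spend reported by a retail partner (hypothetical)

def propensity_score(c: Customer) -> float:
    """Toy propensity score for a savings-product offer.

    Each feature is capped at 1.0 and combined with illustrative weights.
    """
    return (
        0.5 * min(c.avg_balance / 10_000, 1.0)
        + 0.3 * min(c.card_txns_month / 50, 1.0)
        + 0.2 * min(c.retail_spend / 2_000, 1.0)
    )

customers = [
    Customer("A", avg_balance=12_000, card_txns_month=60, retail_spend=900),
    Customer("B", avg_balance=800, card_txns_month=5, retail_spend=150),
    Customer("C", avg_balance=6_000, card_txns_month=40, retail_spend=2_500),
]

# Target the offer at the highest-propensity customers first.
ranked = sorted(customers, key=propensity_score, reverse=True)
print([c.name for c in ranked])
```

A production system would replace the hand-set weights with a trained model, but the output is used the same way: a ranking that decides who sees which offer.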
Further, AI-based Robo-advisors and chatbots can help banks enhance efficiency, strengthen customer relationships, and improve customer satisfaction.6 It is estimated that the assets managed by Robo-advisors will reach between $2.2 trillion and $3.7 trillion by 2020 and $16 trillion by 2025.7 The current generation of Robo-advisors can identify and understand complex customer needs, customize advice, and implement solutions according to client preferences. In day-to-day customer interactions, one application of AI is voice- or text-activated search technology with an application programming interface (API) that presents the user with an organized, detailed response to questions based on natural language processing (NLP). For example, Erica, a chatbot used by Bank of America, can perform easy-to-use transaction-search functions and give financial advice to users on their mobile devices. Erica is expected to perform a variety of functions, such as protecting consumer privacy, and the tool will improve its ability to understand speech over time. This AI-based virtual assistant attracted over 1 million users in the first three months after its March 2018 launch.8

5. For example, Aidyia, a hedge fund founded by AI expert Benjamin Goertzel, uses multiple AI techniques to analyze big data from various sources and to generate and choose the best trading decisions. JP Morgan Chase has been working with Sentient Technologies to develop an AI-based trading algorithm (Metz, 2016).
6. A Robo-advisor is a digital platform that offers automated financial planning services based on algorithms with little to no human supervision. A Robo-advisor collects information about the client and uses the data to offer investment advice or automatically invest on behalf of the client.
7. https://money.usnews.com/investing/investing-101/articles/2018-06-27/6-of-the-newest-trends-in-robo-advisors.

Third, big data technologies can increase the internal efficiency and effectiveness of processes such as loan reviews, fraud detection, and regulatory compliance. During the credit application process, a bank can rely on AI techniques, such as advanced pattern recognition and audio and video analytics, to analyze potential borrowers' social media data and video images of their posture, facial expressions, and gestures to gauge the applicant's truthfulness. Insights gained from such analyses are then evaluated together with the credit rating of the borrower to determine the level of risk. Big data and AI can also help banks assess and monitor risk throughout the personal and commercial lending process. Khandani et al. (2010) apply machine-learning algorithms to consumer credit risk modeling using customer transaction and credit bureau data for a sample of a major commercial bank's customers. The study finds that the AI-based risk models could save the bank 6% to 25% of its total loan losses. AI can also enhance a bank's internal control and efficiently prevent and detect fraud.
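The consumer credit risk modeling described above can be sketched as a toy scorecard: a handful of transaction and bureau features mapped to a probability of default, with a threshold deciding the review path. The coefficients and features below are invented for illustration and are not the model estimated by Khandani et al. (2010); a production model would be fitted to labeled data.

```python
import math

def default_probability(utilization: float, late_payments: int,
                        monthly_inflows: float) -> float:
    """Toy probability-of-default model in logistic-scorecard form.

    Coefficients are illustrative placeholders, not estimated values.
    """
    z = -3.0 + 2.5 * utilization + 0.8 * late_payments - 0.0002 * monthly_inflows
    return 1.0 / (1.0 + math.exp(-z))

def review_decision(pd_value: float, threshold: float = 0.2) -> str:
    """Route high-PD applications to a human reviewer."""
    return "refer to manual review" if pd_value >= threshold else "auto-approve"

low_risk = default_probability(utilization=0.1, late_payments=0, monthly_inflows=5_000)
high_risk = default_probability(utilization=0.9, late_payments=3, monthly_inflows=1_500)
print(round(low_risk, 3), review_decision(low_risk))
print(round(high_risk, 3), review_decision(high_risk))
```

The loan-loss savings reported in the study come from exactly this kind of triage: scoring many accounts cheaply and concentrating human attention where the model signals risk.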
Big data analytics can build predictive models to categorize and score the risk of transactions and customers. For example, a bank can use optical character recognition (OCR) and AI technology to verify signatures on checks against those on previously scanned checks stored in its database.
8. Crosman, P. 2018. Mad about Erica: Why a million people use Bank of America's chatbot, American Banker, June 13, 2018, https://www.americanbanker.com/news/mad-about-erica-why-a-million-people-use-bank-of-americas-chatbot.
As a result, the bank can identify fraudulent checks in real time.9 It can also reduce operating costs by cutting the number of labor hours required for manual validation and achieve a reduction in fraudulent transactions. HSBC uses machine learning to monitor millions of card transactions in the United States, which has effectively enhanced fraud detection and prevention by reducing false-positive rates and improving fraud case handling.10

Big-data-based AI can also be applied to managing regulatory compliance tasks such as customer identification programs, or "Know Your Customer" (KYC), and anti-money laundering and combating the financing of terrorism (AML/CFT).11 The KYC process is often costly, extensively manual, and highly duplicative. Big-data-based machine learning is increasingly applied to KYC tasks to check customer identity and background (Financial Stability Board, 2017). This AI-based approach can improve cost efficiency, enhance customer service, and increase the standardization of KYC quality and compliance. In the anti-money laundering function, AI technologies enable banks to automate the financial crimes risk management process, including collection of third-party data, client risk rating, behavioral monitoring, segmentation, scoring, and identification. For example, big-data-enabled AI can identify non-linear relationships among different client attributes and rate the likelihood of suspicious transactions. Further, it can detect complicated behavior patterns that are not easily observable through analyzing suspicious transactions alone.

Fourth, banks can more proactively manage risk portfolios based on big data analytics. For example, machine learning can enhance risk modeling to provide a more accurate basis on which to generate warning signals for intervention when the positions of a trading account exceed the optimal risk threshold.
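One elementary form of the behavioral monitoring mentioned above is flagging a transaction that deviates sharply from the customer's own history. The sketch below uses a plain z-score rule with an invented cutoff; real AML and fraud systems combine many such behavioral features with learned models rather than a single statistic.

```python
import statistics

def flag_suspicious(history, new_amount, z_cutoff=3.0):
    """Flag a transaction whose amount deviates strongly from the
    customer's own history (a minimal behavioral-monitoring rule)."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return new_amount != mean  # flat history: any change is unusual
    z = abs(new_amount - mean) / stdev
    return z > z_cutoff

history = [120, 95, 140, 110, 105, 130, 90, 125]  # typical monthly card spend
print(flag_suspicious(history, 115))    # in line with past behavior
print(flag_suspicious(history, 9_500))  # large outlier relative to history
```

The advantage of per-customer baselines is that "suspicious" is defined relative to each client's own behavior, which is what lets the same rule serve both a student and a corporate treasurer.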
9. Using the machine learning approach, a global bank can scan and validate up to 1,200 checks per second, substantially reducing processing time compared with manual processing. Its predictive model is forecast to save $20 million by preventing fraudulent transactions (Cognizant, 2018).
10. Clark, L. 2018. https://www.computerweekly.com/feature/How-banks-are-detecting-credit-fraud.
11. Banks use a central repository (KYC utility) to warehouse customer data and documents required to satisfy regulatory requirements and to support a financial institution's KYC procedures. Once a customer's data has been stored in the repository, member financial institutions can use the information for their KYC needs.
Fifth, banks can also use big data and AI algorithms to increase productivity and enhance internal controls through human resources (HR) management. HR managers can use big data and AI to analyze how hiring criteria relate to employee performance. Additionally, HR can analyze social media and use AI techniques such as sentiment analysis and facial recognition to monitor the mood of employees and analyze informal networks among coworkers.

117.4 Implications for Regulatory Compliance and Supervision

After the financial crisis, authorities gradually tightened their supervision of financial institutions, and the cost of complying with regulatory decrees increased. To meet regulatory requirements, financial institutions invested more staff and resources in compliance-related jobs, including legal support, regulatory reporting, business impact analysis, process improvement and implementation, and others. Globally, the banking industry's cost of meeting compliance requirements has reached $100 billion (Zabelina et al., 2018). Some large international banks, which must meet regulatory requirements for multinational operations, invest more than $1 billion annually in compliance and internal control.12 At the same time, compliance regulation continues to change. The text of the annually updated terms related to financial compliance and supervision totals 300 million pages, illustrating the enormous regulatory compliance pressures financial institutions face.13

Furthermore, in recent years, governments have imposed heavier fines on banks for not meeting regulatory requirements. Consequently, banks have incurred significant operating expenses due to failure to comply with laws and regulations. For example, Deutsche Bank paid $7.2 billion to settle a lawsuit in 2016 related to mortgage-backed securities.14 It was also fined $787 million
12. For example, JP Morgan Chase pointed out that from 2012 to 2014, 16,000 employees were added to respond to government regulations, amounting to 6% of the total workforce; the annual cost increased by $2 billion, about 10% of annual operating profit. See Douglas J. Elliott, "$2 Billion Later: Policy Implications of JP Morgan Chase's Trading Loss," Brookings, May 14, 2012, https://www.brookings.edu/research/2-billion-later-policy-implications-of-jp-morgan-chases-trading-loss/.
13. Elena Mesropyan, "Application of AI in RegTech," Medici, November 11, 2016, https://gomedici.com/application-of-ai-in-regtech/.
14. Karen Freifeld, Arno Schuetze, and Kathrin Jones, "Deutsche Bank agrees to $7.2 billion mortgage settlement with US," Reuters, December 22, 2016, https://www.reuters.com/article/us-deutsche-bank-mortgages-settlement-idUSKBN14C041.
by regulators for ineffective supervision of foreign exchange transactions and ineffective AML measures.15 In response to increasing regulatory compliance challenges, most financial institutions are expanding their human and financial compliance resources, which in turn creates a heavy operating burden that will eventually become unsustainable as regulatory complexity increases further. Current application areas for regulatory compliance include regulatory compliance management, customer identification and insight, AML, and conduct surveillance (Chadha and Kaur, 2018). It is therefore imperative for banks to use technologies such as big data analytics to solve complex, time-consuming, and costly regulatory compliance challenges. Some innovative financial institutions, in partnership with technology companies, are beginning to use these new technologies to solve compliance issues and have achieved initial results, providing new ideas and methods.

Regulatory supervisors will find several advantages in using AI. First, it helps solve the agency problem of regulators, namely that regulators are not well motivated to supervise. This regulatory incentive issue can be mitigated when regulators use AI to identify abnormal activities. At the same time, companies are more likely to comply with regulation when they are aware of regulators' use of AI. Second, AI can enable the supervisor to move toward global optimization. The recently concluded game between AlphaGo and Ke Jie shows that AI learns autonomously and quickly and has compelling computational and global optimization capabilities (Byford, 2017). Humans struggle to integrate probability calculations across multiple courses of action and over extended time horizons, which is the essence of calculating financial and other risks in a complex system. This shortcoming can now be largely overcome by AI.
Third, AI has unique advantages in dealing with systemic financial risk, which arises when individual financial risks cause significant damage to the real economy and the operation of the financial system. We do not yet have a good understanding of the relation between financial market volatility and systemic financial risk. For example, before the 2007 global financial crisis, few people predicted that the US subprime mortgage crisis would lead to a global financial crisis. Even after the US subprime mortgage crisis, many
15. Karen Freifeld and Arno Schuetze, "Deutsche fined $630 million for failures over Russian money-laundering," Reuters, January 30, 2017, https://www.reuters.com/article/us-deutsche-mirrortrade-probe/deutsche-fined-630-million-for-failures-over-russian-money-laundering-idUSKBN15E2SP.
experts across financial fields failed to foresee that this would escalate into a global crisis. After the crisis, international researchers sought to develop early-warning mechanisms, such as the Credit-to-GDP gap indicator of the Bank for International Settlements. However, several studies have identified shortcomings of this indicator (e.g., see Giese et al., 2014; Drehmann et al., 2014). Thus, it may be difficult to identify systemic financial risk through simple indicators. The features of AI make it well positioned to identify systemic financial risk in the complexity of the global economy.

AI employs two types of reasoning. The first is rule-based reasoning, in which the AI stores rules and applies them to new situations. In the current application, the shortcoming of reasoning by rules is that regulatory rules may be frequently adjusted. Therefore, rule-based reasoning is not enough, and the second type, case-based reasoning, is also needed. An AI can quickly access and analyze many cases. Presently, case-based reasoning in AI is very useful in the medical field (Hernandez et al., 2017); heart disease diagnosis performed by AI can be more accurate than that of doctors. Similarly, in regulatory supervision, AI can accumulate all historical cases and suggest correct decisions, at least as an extra tool for the regulator. In this sense, AI may replace manual supervision in the future.

Furthermore, regulators need to know these technologies because they are relevant to the regulatory function. Because financial institutions use these tools and can leverage big data, information asymmetry will arise if regulators do not master the technology. Moreover, financial institutions may use advanced technology for regulatory arbitrage (Jakšič and Marinč, 2018). To avoid regulatory arbitrage, regulators also need to understand big data and AI technology.
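Case-based reasoning, as described above, can be illustrated in miniature as nearest-neighbor retrieval over past cases: a new case is assigned the outcome of the most similar historical one. The case features, values, and outcomes below are entirely hypothetical, chosen only to show the mechanism.

```python
import math

# Hypothetical past supervisory cases: (feature vector, outcome).
# Features: (unreported-transaction ratio, cross-border share, prior violations)
past_cases = [
    ((0.02, 0.10, 0), "no action"),
    ((0.01, 0.05, 0), "no action"),
    ((0.30, 0.70, 2), "investigate"),
    ((0.25, 0.60, 1), "investigate"),
]

def nearest_case_outcome(features):
    """Case-based reasoning in miniature: recommend the outcome of the
    most similar historical case (1-nearest-neighbor by Euclidean distance)."""
    _, outcome = min(past_cases,
                     key=lambda case: math.dist(case[0], features))
    return outcome

print(nearest_case_outcome((0.28, 0.65, 2)))   # resembles past enforcement cases
print(nearest_case_outcome((0.015, 0.08, 0)))  # resembles past clean cases
```

Unlike a fixed rule book, this approach adapts automatically as new cases are added to the repository, which is why it complements rule-based reasoning when regulations change frequently.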
When regulatory agencies first started using big data technologies, the first problem they addressed was regulators' incentive and restraint mechanisms. The US financial regulatory authorities are using AI to match modes of financial supervision to types of investment, and the US Securities and Exchange Commission is starting to use machine learning to predict future investor behavior.
117.5 Challenges and Limitations

First, the emergence of big data and AI makes financial risks more sensitive and contagious (Helbing, 2013). In the absence of financial technology, financial risk may be contained. With the development of financial technology built on big data and AI, and with the Internet's ability to spread information widely and quickly, risks that were formerly isolated may become
systemic. Financial technology helps solve the problem of information asymmetry in some individual markets, but from a global perspective it may cause systemic information asymmetry. As information becomes more complex, regulatory agencies' understanding may decrease, increasing the hidden nature of financial risks. As more banks and other institutions adopt the same technology, financial contagion and fragility increase among financial institutions and markets.

The second challenge relates to privacy concerns (Eastin et al., 2016). In the process of developing the value of data resources, privacy issues may become the most prominent ethical challenge. In the information age, websites, apps, appliances, and digital home assistants record users' every move, and data mining technology finds valuable patterns in a mass of seemingly unrelated data. This data-rich existence is a double-edged sword. Users expect their smart devices to use their personal information to provide "personalized customization" services; however, a breach in the security of such a comprehensive database could be devastating to personal security, financial and otherwise. The widespread use of cloud computing and data sharing technologies exacerbates the risk of such a privacy breach. If hacked, leaks of customers' personal information and large-scale retail banking business information will inevitably lead to a series of security problems extending beyond the individual, causing unpredictable losses to banks and customers.16

AI also poses a security risk (Scherer, 2016). The decision-making process of AI is like a black box. As a neural network acquires information, it does not create a record of the structure of that information or of the basis on which it makes decisions. The iterative nature of AI information acquisition makes it impossible for users to provide, post hoc, a justification for AI output.
This lack of transparency makes it difficult for regulators and market investors to correct potential problems in the decision-making process. If AI causes financial loss, it may not be possible to tell how the algorithm caused that outcome; it may also be difficult to decide who is responsible. The overall risk of using AI may be especially underestimated when there is uncertainty in the governance structure of AI applications within financial
16. For example, Equifax reported a massive leak in its database in 2017, potentially exposing 148 million people's personal information. See Brian Fung, "Equifax's massive 2017 data breach keeps getting worse," The Washington Post, March 1, 2018, https://www.washingtonpost.com/news/the-switch/wp/2018/03/01/equifax-keeps-finding-millions-more-people-who-were-affected-by-its-massive-data-breach/?noredirect=on&utm_term=.602b0e198ce5.
institutions. In addition, if AI is highly dependent on a small number of third-party technology providers, such dependencies may pose risks to financial institutions.

Further, given the opportunities from big data and AI, banks will have to compete for talent with critical skills in big data analytics. In the near future, a certain level of proficiency with big data analytics will become the baseline for competition, and banks will have to invest early and act proactively to maintain their advantage through talent acquisition.

Finally, financial market stability risks may arise when many financial market participants simultaneously apply AI technology. For example, if machine-learning-based traders outperform other traders, more traders may adopt similar machine learning strategies, which could amplify financial volatility. In addition, predictable patterns in machine learning trading strategies may present an opportunity for criminals to manipulate market prices.17

Artificial intelligence also has several practical limitations (Mittelstadt et al., 2016). First, unlike human intelligence, all the classification, recognition, and prediction functions of AI are algorithm-driven statistical data analysis systems. Artificial intelligence finds relationships among various attributes, but that does not establish causation. Algorithms that describe correlation but cannot determine causation present an incomplete picture of reality; as a result, decisions based on the algorithms' findings may be suboptimal.

Another limitation relates to data quality (Hazen et al., 2014). The success of AI algorithms largely depends on the quality of the data used to create the algorithm. Because data are updated constantly and current conditions are always changing, historical data may not speak well for future events. Thus, there is still a large gap between simulating reality by mapping data onto the abstract structure of the real world and the real world itself.
Mathematical tools based on symbols and logic can only simulate the "quantity feature sampling" of the real world and cannot achieve continuous "complete refactoring." The real world is a continuous, interdependent universe of things, and data are only a partial, abstract expression of the whole. Third, AI decision making may be associated with algorithmic discrimination (Hacker, 2018). On the surface, the algorithm, as a mathematical

17 On May 6, 2010, a hedge fund trader submitted a $4.1 billion stock index futures sell order to hedge stock risk. It triggered a chain of programmatic selling and resulted in the Dow Jones index falling about 9%, and trillions of dollars of wealth disappeared instantly. See CNBC, "The lasting impact of the 2010 flash crash," CNBC, May 6, 2014, https://www.cnbc.com/2014/05/06/the-lasting-impact-of-the-2010-flash-crash.html.
page 4038
July 6, 2020
16:6
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
Big Data and Artificial Intelligence in the Banking Industry
b3568-v4-ch117
4039
structure, has objective and deterministic features. Therefore, algorithmic decisions should be less likely to be influenced by people's emotions and values. However, the actual situation is exactly the opposite. Artificial intelligence systems' training data may be disturbed or contaminated in subtle ways that are very difficult to detect. Humans are notably poor at detecting their own biases; therefore, intentional, explicit efforts are required to avoid bias. This is especially important because the veneer of objectivity of a mathematical system can cause complacency among those creating it. For example, instances of racial or gender discrimination in a specific group may be reflected in the machine-learning process and become part of the algorithm's output. If a region has a large number of low-income groups or ethnic minorities, all personal loan applicants from that region could be rejected by the algorithm automatically, exacerbating geographic discrimination. In this way, the algorithm magnifies relatively small-scale discrimination hidden in the original data into larger social justice problems. It is technically difficult to retroactively identify the parts of the original data set that caused this outcome; instead, engineers may create a new algorithm with different data, hoping for a more favorable outcome.
Bibliography

Baker, R.S. and Inventado, P.S. (2014). Educational Data Mining and Learning Analytics, in J.A. Larusson and B. White (eds.), Learning Analytics: From Research to Practice. Springer, Berlin.
Byford, S. (2017). AlphaGo Retires from Competitive Go After Defeating World Number One 3–0. The Verge 27.
Chadha, A. and Kaur, P. (2018). Handling Smurfing Through Big Data, in V.B. Aggarwal, V. Bhatnagar and D.K. Mishra (eds.), Big Data Analytics. Springer, Singapore, pp. 459–470.
Chen, M., Mao, S. and Liu, Y. (2014). Big Data: A Survey. Mobile Networks and Applications 19(2), 171–208.
Chen, C.P. and Zhang, C.Y. (2014). Data-Intensive Applications, Challenges, Techniques and Technologies: A Survey on Big Data. Information Sciences 275, 314–347.
Chintamaneni, P. (2016). How Banks are Capitalizing on a New Wave of Big Data and Analytics. Harvard Business Review. Retrieved from https://hbr.org/sponsored/2016/11/how-banks-are-capitalizing-on-a-new-wave-of-big-data-and-analytics.
Cleland, S. (2011). Google's "Infringenovation" Secrets. Forbes. Retrieved from https://www.forbes.com/sites/scottcleland/2011/10/03/googles-infringenovation-secrets/#1e9c536730a6.
Cognizant (2018). Advanced AI Machine Learning Solution Detects Check Fraud for a Large Global Bank. Retrieved from https://www.cognizant.com/case-studies/ai-machine-learning-fraud-detection.
Crevier, D. (1993). AI: The Tumultuous History of the Search for Artificial Intelligence. Basic Books, New York, NY.
Davenport, T. (2014). Big Data at Work: Dispelling the Myths, Uncovering the Opportunities. Harvard Business Review Press, Cambridge, MA.
Diamond, D.W. (1984). Financial Intermediation and Delegated Monitoring. Review of Economic Studies 51(3), 393–414.
Drehmann, M. and Juselius, M. (2014). Evaluating Early Warning Indicators of Banking Crises: Satisfying Policy Requirements. International Journal of Forecasting 30(3), 759–780.
Eastin, M.S., Brinson, N.H., Doorey, A. and Wilcox, G. (2016). Living in a Big Data World: Predicting Mobile Commerce Activity Through Privacy Concerns. Computers in Human Behavior 58, 214–220.
EMC Corporation (2011). World's Data More Than Doubling Every Two Years — Driving Big Data Opportunity, New IT Roles. Retrieved from https://www.emc.com/about/news/press/2011/20110628-01.htm.
Financial Stability Board (2017). Artificial Intelligence and Machine Learning in Financial Services, pp. 1–41.
Financial Stability Board (2017). Financial Stability Implications from FinTech: Supervisory and Regulatory Issues that Merit Authorities' Attention, pp. 1–65.
Gantz, J. and Reinsel, D. (2011). Extracting Value from Chaos. IDC iView, 1–12.
Gershman, S.J., Horvitz, E.J. and Tenenbaum, J.B. (2015). Computational Rationality: A Converging Paradigm for Intelligence in Brains, Minds, and Machines. Science 349(6245), 273–278.
Giese, J., Andersen, H., Bush, O., Castro, C., Farag, M. and Kapadia, S. (2014). The Credit-to-GDP Gap and Complementary Indicators for Macroprudential Policy: Evidence from the UK. International Journal of Finance & Economics 19(1), 25–47.
Gurkaynak, G., Yilmaz, I. and Haksever, G. (2016). Stifling Artificial Intelligence: Human Perils. Computer Law & Security Review 32(5), 749–758.
Hacker, P. (2018). Teaching Fairness to Artificial Intelligence: Existing and Novel Strategies Against Algorithmic Discrimination Under EU Law. Common Market Law Review 55(4), 1143–1185.
Hashem, I., Yaqoob, I., Anuar, N.B., Mokhtar, S., Gani, A. and Khan, S. (2015). The Rise of "Big Data" on Cloud Computing: Review and Open Research Issues. Information Systems 47, 98–115.
Hazen, B.T., Boone, C.A., Ezell, J.D. and Jones-Farmer, L.A. (2014). Data Quality for Data Science, Predictive Analytics, and Big Data in Supply Chain Management: An Introduction to the Problem and Suggestions for Research and Applications. International Journal of Production Economics 154, 72–80.
Helbing, D. (2013). Globally Networked Risks and How to Respond. Nature 497, 51–59.
Hernandez, B., Herrero, P., Rawson, T.M., Moore, L.S., Charani, E., Holmes, A.H. and Georgiou, P. (2017). Data-Driven Web-Based Intelligent Decision Support System for Infection Management at Point-of-Care: Case-Based Reasoning Benefits and Limitations. HEALTHINF, 119–127.
International Data Corporation (2014). The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things. Retrieved from https://www.emc.com/leadership/digital-universe/2014iview/index.htm.
Jakšič, M. and Marinč, M. (2018). Relationship Banking and Information Technology: The Role of Artificial Intelligence and FinTech. Risk Management, 1–18.
Khandani, A.E., Kim, A.J. and Lo, A.W. (2010). Consumer Credit Risk Models Via Machine-Learning Algorithms. Journal of Banking and Finance 34(11), 2767–2787.
Krauss, C., Do, X.A. and Huck, N. (2017). Deep Neural Networks, Gradient-Boosted Trees, Random Forests: Statistical Arbitrage on the S&P 500. European Journal of Operational Research 259(2), 689–702.
Kurzweil, R. (1990). The Age of Intelligent Machines. The MIT Press, Cambridge, MA.
Luger, G.F. and Stubblefield, W.A. (1993). Artificial Intelligence: Structures and Strategies for Complex Problem Solving. Benjamin-Cummings Publishing, Redwood City, CA.
Mearian, L. (2016). Google's DeepMind A.I. Can Slash Data Center Power Use 40%. Computerworld, July 20, 2016. Retrieved from https://www.computerworld.com/article/3098325/data-center/googles-deepmind-ai-can-slash-data-center-power-use-40.html.
Metz, C. (2016). The Rise of the Artificially Intelligent Hedge Fund. Wired. Retrieved from https://www.wired.com/2016/01/the-rise-of-the-artificially-intelligent-hedge-fund/.
Miklo, D., Khanna, S., Olanrewaju, T. and Rajgopal, K. (2016). Cutting Through the Noise Around Financial Technology. McKinsey & Company, February 2016. Retrieved from https://www.mckinsey.com/industries/financial-services/our-insights/cutting-through-the-noise-around-financial-technology.
Mittelstadt, B.D., Allo, P., Taddeo, M., Wachter, S. and Floridi, L. (2016). The Ethics of Algorithms: Mapping the Debate. Big Data & Society 3(2), 1–21.
Philippon, T. (2016). The Fintech Opportunity. Working Paper, National Bureau of Economic Research.
Platt, J., Souza, R., Checa, E. and Chabaldash, R. (2014). Seven Ways to Profit from Big Data as a Business. Boston Consulting Group. Retrieved from https://www.bcg.com/publications/2014/technology-digital-seven-ways-profit-big-data-business.aspx.
Ransbotham, S., Kiron, D., Gerbert, P. and Reeve, M. (2017). Reshaping Business with Artificial Intelligence. MIT Sloan Management Review Research Report, 1–24.
Scherer, M.U. (2015). Regulating Artificial Intelligence Systems: Risks, Challenges, Competencies, and Strategies. Harvard Journal of Law & Technology 29, 353.
Wall, L.D. (2018). Some Financial Regulatory Implications of Artificial Intelligence. Journal of Economics and Business, Forthcoming.
Ward, J.S. and Barker, A. (2013). Undefined by Data: A Survey of Big Data Definitions. arXiv:1309.5821.
World Federation of Exchanges and McKinsey & Company (2018). Fintech Decoded: Capturing the Opportunity in Capital Markets Infrastructure. Joint report, February 2018. Retrieved from https://www.mckinsey.com/industries/financial-services/our-insights/fintech-decoded-the-capital-markets-infrastructure-opportunity.
Zabelina, O.A., Vasiliev, A.A. and Galushkin, S.V. (2018). Regulatory Technologies in the AML/CFT. KnE Social Sciences 3(2), 394–401.
Zhu, Y., Mottaghi, R., Kolve, E., Lim, J.J., Gupta, A., Fei-Fei, L. and Farhadi, A. (2017). Target-Driven Visual Navigation in Indoor Scenes Using Deep Reinforcement Learning. In 2017 IEEE International Conference on Robotics and Automation, pp. 3357–3364.
Chapter 118
A Non-Parametric Examination of Emerging Equity Markets Financial Integration

Ke Yang, Susan Wahab, Bharat Kolluri and Mahmoud Wahab

Ke Yang, University of Hartford, e-mail: [email protected]
Susan Wahab, University of Hartford, e-mail: [email protected]
Bharat Kolluri, University of Hartford, e-mail: [email protected]
Mahmoud Wahab, University of Hartford, e-mail: [email protected]

Contents
118.1 Introduction . . . 4045
118.2 Motivation . . . 4049
118.3 Data . . . 4051
118.4 Models . . . 4052
  118.4.1 Parametric principal components model . . . 4052
  118.4.2 Non-parametric principal components model . . . 4052
  118.4.3 Local polynomial regression . . . 4054
  118.4.4 Kernel regression . . . 4056
  118.4.5 Improving efficiency by using panel data structure . . . 4057
  118.4.6 Non-parametric additive model . . . 4057
  118.4.7 Test of differences of adjusted R2 . . . 4058
118.5 Results . . . 4059
118.6 Summary and Conclusions . . . 4062
Bibliography . . . 4064
Appendix 118A Tables and Graphs . . . 4066
  118A.1 Bivariate simple correlations (SC) computed with respect to US equity market . . . 4066
  118A.2 Bivariate simple correlations (SC) computed with respect to Japan's equity market . . . 4069
  118A.3 Plots of average simple correlations (SC) of alternative emerging equity markets groups versus US and Japan, along with averages of parametric and non-parametric adjusted R2 measures . . . 4071
  118A.4 Vuong's (1989) test results comparing parametric and non-parametric adjusted R2 measures and cross-validation bandwidths results . . . 4073
Abstract

Prior studies on financial markets integration use parametric estimators whose underlying assumptions of linearity and normality are, at best, questionable, particularly when using high frequency data. We re-examine the evidence regarding financial integration trends using data for 14 emerging equity markets from Southeast Asia, Latin America, and the Middle East, along with the US and Japan. We employ non-parametric estimators of Pukthuanthong and Roll's (2009) adjusted R2 measure of financial integration. Results from non-parametric estimators are contrasted with results from parametric estimators of the adjusted R2 financial integration measure using bi-daily returns for contiguous yearly sub-periods from 1993 to 2016. We find two key results. First, we confirm prior evidence in Pukthuanthong and Roll (2009) that simple correlation (SC) understates financial integration trends compared to parametric adjusted R2. Second, parametric adjusted R2 understates financial integration trends relative to non-parametric adjusted R2. Hence, emerging equity markets may be more financially integrated, and offer fewer diversification benefits to global investors, than previously thought. The results underscore the need to exercise caution when drawing inferences regarding financial markets integration using parametric estimators.

Keywords: Financial integration • Non-parametric regression • Locally-weighted regression • Principal components regression • Simple correlation.
118.1 Introduction

An extensive literature documents evidence of global diversification benefits from investing in emerging equity markets (e.g., Harvey, 1995; DeSantis and Gerard, 1997; DeRoon et al., 2001; Li et al., 2002; Chiou, 2007). However, recent economic and financial markets reforms in emerging economies have resulted in these markets becoming increasingly financially integrated with global capital markets, reducing their potential diversification benefits to foreign investors (e.g., Longin and Solnik, 1995; Errunza et al., 1999; Goetzmann et al., 2002; Dumas et al., 2003; Carrieri et al., 2007; Christoffersen et al., 2012; Billio et al., 2017). Financial integration is a complex and dynamic process, and measuring it can be a challenging task. With several measures available, it is difficult to state whether one measure is universally better than another because the true pattern of how global risk factors permeate a country's economy and capital markets changes over time, and is never known with certainty (Billio et al., 2017). Because financial integration has direct consequences for expected diversification benefits, and for the extent to which markets are insulated from global financial and economic shocks, it is of interest to academics, practitioners, and policy makers to continue examining alternative measures of financial integration and their robustness under a variety of statistical estimators. The extant literature on financial integration uses a variety of parametric estimators of financial integration trends, and reports varying results. The disparity of findings can be attributed in part to the questionable validity of assumptions underlying parametric estimators, particularly when using high frequency data.
Specifically, the assumptions of linearity and normality of returns may be untenable when using high frequency daily, bi-daily, or even weekly returns data, resulting in biased estimates and faulty inferences. In this study, we re-examine the evidence regarding financial integration of 14 emerging equity markets from Southeast Asia, Latin America, and the Middle East, along with US and Japan, which are taken as the primary global and regional equity markets for these emerging markets, using data from 1993 to 2016. We employ non-parametric estimators, which do not require any restrictive assumptions about the functional form of the model, or normality of the distribution of stock returns, and contrast these results with those obtained from using standard parametric estimators. We include the US in our sample of markets because it is the most prominent equity
market in the world, with developments therein impacting all equity markets around the globe. We include Japan in our sample of markets because of its prominent status as the largest Asian equity market for the group of Southeast Asian equity markets studied. Recent empirical studies suggest emerging equity markets exhibit increasing financial integration during periods of turmoil, which are precisely the periods when global diversification benefits from investing in these markets are most urgently needed. When uncertainty subsides, however, emerging markets' financial integration reverts back to normal levels (e.g., Bekaert et al., 2005, 2009; Pukthuanthong and Roll, 2009). Stated alternatively, during periods of crises all financial markets exhibit a tendency to move down together, leaving far fewer diversification benefits available to global investors. While financial integration levels of developed equity markets have peaked and stabilized at high levels over the last two decades, reflecting their maturity, depth, and breadth, emerging equity markets' financial integration levels exhibit substantial variability over time. This suggests international diversification benefits from combining emerging and developed markets equities remain, but are time-varying, prompting a need to understand the drivers behind this variability. Traditionally, simple correlation has been used to gauge international diversification benefits, and it has been shown in the extant literature on international diversification that correlations of dissimilarly-financially integrated markets (i.e., developed and emerging equity markets) are lower than correlations of similarly-financially integrated markets (i.e., developed or emerging equity markets).
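The traditional SC measure over contiguous yearly sub-periods can be sketched as follows. This is an illustrative sketch only: the simulated returns, factor loadings, noise levels, and the assumed 130 bi-daily observations per year are hypothetical, not the chapter's data.

```python
import numpy as np

def yearly_correlations(r_em, r_us, periods_per_year=130):
    """Simple correlation (SC) of an emerging market's returns with the US,
    computed over contiguous yearly sub-periods of bi-daily data."""
    n = min(len(r_em), len(r_us)) // periods_per_year
    out = []
    for k in range(n):
        s = slice(k * periods_per_year, (k + 1) * periods_per_year)
        out.append(np.corrcoef(r_em[s], r_us[s])[0, 1])
    return np.array(out)

# Illustration with simulated returns sharing one common global factor
rng = np.random.default_rng(0)
f = rng.normal(0, 0.01, 1300)                # common global factor
r_us = f + rng.normal(0, 0.005, 1300)        # US loads fully on the factor
r_em = 0.8 * f + rng.normal(0, 0.012, 1300)  # emerging market: lower loading, more noise
sc = yearly_correlations(r_em, r_us)
print(sc.round(2))                           # one SC estimate per "year"
```

Even with a constant true loading, the per-year SC estimates fluctuate noticeably, which is one reason SC alone can be a misleading gauge of integration trends.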
It has also been shown in a number of studies that high levels of financial integration do not always correspond with high cross-market correlations, and that the simple correlation (SC) is an inadequate measure of financial integration (Pukthuanthong and Roll, 2009). Therefore, a variety of financial integration measures have been introduced and examined in prior studies. An understanding of financial integration is important because financial integration trends have direct consequences for global diversification benefits expected from combining global equities in one portfolio. One measure which has received attention in recent studies (e.g., Pukthuanthong and Roll, 2009) is adjusted R2 obtained from regressing equity returns on a variety of principal components risk factors, and it has also been shown in recent studies that simple correlation (SC) will understate financial integration compared to adjusted R2 . As is well known, adjusted R2 is determined by three components: exposure coefficients (factor loadings) with respect to systematic global risk factors, factor
volatilities, and idiosyncratic volatilities. Even if exposure coefficients or factor volatilities are rising (suggesting rising financial integration), risk reduction benefits from global diversification with emerging markets equities may still remain if idiosyncratic volatilities are also rising, possibly dwarfing rising factor volatilities and/or rising exposures to systematic risk factors. In sum, international diversification benefits depend on more than simple correlations (SC). Global equities diversification benefits can be quantified by either: (1) marginal risk-reduction benefits from combining domestic and foreign equities, or (2) enhancement of risk-adjusted returns when investing globally rather than only domestically. Rising financial integration has been shown to reduce (but not totally eliminate) global diversification benefits (e.g., Li et al., 2002). Further, financial integration of capital markets affects their risk levels, raising or reducing the cost of capital for sovereign and private sector entities in these markets seeking to raise capital globally. On one hand, rising financial integration can increase emerging capital markets' systematic risks through their increased exposures to global financial and economic events, raising their required returns and their cost of capital (DeJong and DeRoon, 2005). On the other hand, adverse cost of capital effects resulting from rising financial integration may be partly offset by increased global risk-sharing opportunities as emerging capital markets increasingly attract foreign capital flows (e.g., Bekaert et al., 2005). The net effects of rising financial integration levels on required returns, risk, and cost of capital for emerging markets are not easily determined beforehand.
Therefore, it is not surprising that interest in developing alternative robust measures of financial markets integration continues among academics, investors, and policy makers due to its wide-ranging implications for benefits expected from diversifying globally, along with an understanding of the extent to which financial markets and economies are exposed or insulated from global financial and economic shocks. In this study, we extend the literature on financial integration by employing non-parametric estimation methods of financial integration of a sample of emerging equity markets from Southeast Asia, Latin American, and the Middle East. We demonstrate how non-parametric estimators yield different financial integration estimates than parametric estimators when measuring financial integration using Pukthuanthong and Roll (2009) adjusted R2 measure, then we contrast both sets of adjusted R2 estimates with the traditional simple correlation (SC) measure of financial integration. Prior studies used a variety of alternative parametric estimators of financial integration, including: simple correlation (SC), volatility-adjusted correlation (e.g., Forbes and Rigobon, 2002), constant conditional correlation
(e.g., Baillie and Bollerslev, 1987; Bollerslev, 1990; Karolyi, 1995), dynamic conditional correlation (Engle, 2002), and single and multi-factor model adjusted R2 measures (e.g., Schotman and Zalewska, 2006; Bekaert et al., 2009; Eiling and Gerard, 2007; Pukthuanthong and Roll, 2009; Volosovych, 2011). In a recent study, Billio et al. (2017) compare several of these alternative parametric measures of financial integration (and their resulting international diversification benefits estimates), and, surprisingly, find simple correlation (SC) does just as well as more sophisticated measures of financial integration. Pukthuanthong and Roll (2009) call into question the validity of simple correlation (SC) as a measure of financial integration, and demonstrate that SC will, in most cases, understate financial integration, thereby overstating international diversification benefits. This arises when multiple common risk factors affect securities returns, and factor exposures are not exactly proportional across the multiple risk factors. They propose, instead, an alternative (more robust) measure of financial integration: adjusted R2 estimated by regressing equity markets' returns on a set of orthogonal common global risk factors derived from principal components analysis of returns of the 21 most developed equity markets in the world. They demonstrate adjusted R2 and SC will perfectly correspond in only two polar cases: (1) when residual variances (and covariances) are zero, in which case SC and adjusted R2 converge to +1, and (2) when factor loadings (or factor volatilities) are zero, in which case SC and adjusted R2 converge to zero. In between these two extremes, SC will not correspond to adjusted R2, and the two measures can yield different conclusions for financial integration.
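The Pukthuanthong and Roll (2009) measure can be sketched in a few lines: extract principal-component factors from a panel of developed-market returns, regress a target market's returns on the first few factors, and report the adjusted R2. The two-factor simulated returns below are hypothetical and chosen only to illustrate why a market can be substantially integrated (high adjusted R2) yet weakly correlated with any single foreign market, so that SC understates integration.

```python
import numpy as np

def adjusted_r2_on_pcs(r_target, R_developed, k=3):
    """Regress a market's returns on the first k principal components of a
    developed-market return panel; return the adjusted R2 (the integration
    measure of Pukthuanthong and Roll, 2009)."""
    T = len(r_target)
    X = R_developed - R_developed.mean(axis=0)
    # principal component factor scores via SVD of the demeaned return matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    F = U[:, :k] * s[:k]                      # T x k factor scores
    Z = np.column_stack([np.ones(T), F])      # add an intercept
    beta, *_ = np.linalg.lstsq(Z, r_target, rcond=None)
    resid = r_target - Z @ beta
    r2 = 1 - resid.var() / r_target.var()
    return 1 - (1 - r2) * (T - 1) / (T - k - 1)

rng = np.random.default_rng(1)
T, n_dev = 260, 21
F_true = rng.normal(0, 0.01, (T, 2))                    # two latent global factors
load = rng.uniform(0.5, 1.5, (n_dev, 2))                # developed-market loadings
R_dev = F_true @ load.T + rng.normal(0, 0.004, (T, n_dev))
r_em = F_true @ np.array([1.0, -0.8]) + rng.normal(0, 0.01, T)
adj_r2 = adjusted_r2_on_pcs(r_em, R_dev, k=3)
sc = np.corrcoef(r_em, R_dev[:, 0])[0, 1]               # SC with one developed market
print(round(adj_r2, 2), round(sc, 2))
```

Because the emerging market loads with opposite signs on the two factors, its pairwise correlation with any one developed market can be modest even though the common factors explain most of its variance.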
They illustrate that differences in results from these two alternative financial integration metrics (adjusted R2 and SC) can be attributed to: (i) variability of factor loadings (exposure coefficients), (ii) variability of factor volatilities, and (iii) idiosyncratic volatility. In sum, SC estimated from unconditional returns or constant (or dynamic) conditional correlations (estimated from conditional residual returns) are generally flawed measures of financial integration as neither can reveal the full extent of financial integration (and, hence, international diversification benefits). Parametric financial integration estimators assume stock returns adhere to the assumptions of linearity of functional form, normality, and independent and identical distributions (i.i.d.). These assumptions are, however, rarely met, particularly in high frequency returns (e.g., daily, bi-daily, or even weekly), resulting in biased and potentially misleading inferences for financial integration trends. Further, the estimated models’ coefficients and their standard errors will be biased and inefficient. While violations of
normality may not be critical for drawing valid inferences (because of the central limit theorem), violations of the linearity and i.i.d. assumptions can result in statistically significant differences in financial integration levels estimated parametrically. We contribute to the debate regarding financial integration by introducing a class of non-parametric estimators of financial integration, and contrast these results with those from parametric estimators.

118.2 Motivation

Measuring financial integration has been of interest to researchers over the last two decades. A variety of parametric approaches have been used, yielding conflicting results with no consensus as to whether one measure is better than another. A possible contributing factor to this lack of consensus is the questionable validity of the assumptions underlying parametric estimators. Therefore, it is of interest to continue the search for robust measures of financial integration which do not require restrictive assumptions regarding the underlying data-generating process. We re-examine a class of estimators of financial integration using non-parametric, rather than parametric, estimators, particularly since the former are known to be robust to violations of the assumptions of linearity of functional form and normality of the underlying returns distributions. In this study, we re-estimate Pukthuanthong and Roll's (2009) adjusted R2 measure of financial integration non-parametrically, and contrast these results with parametric estimates. We use adjusted R2 as a measure of financial integration as it does not require a specification of any particular asset pricing theory, and, more importantly, it does not require any restrictive assumptions of functional form or of the data-generating process.
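The chapter's non-parametric estimators (local polynomial and kernel regression with cross-validated bandwidths) are developed in the Models section. As a minimal single-regressor preview, under assumed illustrative data, a Nadaraya-Watson kernel regression can capture a nonlinear exposure to a global factor that a linear (parametric) regression misses, raising the R2-type fit measure; the tanh exposure and the bandwidth of 0.3 below are hypothetical choices, not the chapter's specification.

```python
import numpy as np

def nw_fit(x, y, bandwidth):
    """Nadaraya-Watson kernel regression: each fitted value m(x_i) is a
    Gaussian-kernel weighted average of the y observations."""
    d = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)                 # Gaussian kernel weights
    return (w @ y) / w.sum(axis=1)

def nonparametric_r2(x, y, bandwidth):
    """Share of variance explained by the kernel fit (an R2 analogue)."""
    fitted = nw_fit(x, y, bandwidth)
    return 1 - np.var(y - fitted) / np.var(y)

rng = np.random.default_rng(2)
f = rng.normal(0, 1.0, 500)                   # a global factor (e.g., first PC)
y = np.tanh(f) + rng.normal(0, 0.3, 500)      # nonlinear exposure to the factor
r2_np = nonparametric_r2(f, y, bandwidth=0.3)
# straight-line (parametric) fit for comparison
b = np.polyfit(f, y, 1)
r2_lin = 1 - np.var(y - np.polyval(b, f)) / np.var(y)
print(round(r2_np, 2), round(r2_lin, 2))
```

Note this in-sample fit includes each point's own kernel weight, which slightly inflates the fit; in practice a cross-validated bandwidth, as used in the chapter, guards against such overfitting.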
We compare parametric and non-parametric estimates of adjusted R2 with simple correlation SC as a competing measure of financial integration because a recent study (Billio et al., 2017) finds SC to be as good a measure of financial integration as parametric adjusted R2 . Since non-parametric measures of financial integration have not been used in the existing financial integration literature, our study adds to this literature by demonstrating an alternative methodological approach to the study of financial integration. In principle, non-parametric estimators may yield different and more robust measures of financial integration than parametric estimators for several reasons. First, functional form-flexibility is a common feature of all non-parametric estimators. Second, non-parametric estimators are readily adaptable to special requirements of panel data sets, allowing the marginal
effects of global factors on individual markets returns to vary across observations. Lastly, functional form flexibility helps reduce spatial auto-correlation without imposing arbitrary contiguity matrices or distributional assumptions on the data. Surprisingly, despite these advantages, non-parametric estimators have been of limited use in analyzing financial data sets (with the exception of a few studies; e.g., Fiori and Simonetta, 2007; Min and Lee, 2008; Bartoloni and Maurizio, 2014). Several factors may account for this apparent unpopularity. First, there may be a perception that non-parametric estimation is difficult to implement, its results are difficult to interpret, and it wastes degrees of freedom. Second, non-parametric estimation may be more appropriate for prediction than for hypothesis testing. For example, Fiori and Simonetta (2007) examine interest rate risk exposure of the 18 largest banks in Italy using both parametric and nonparametric methods because the distribution of interest rate changes, and underlying risk factors, are skewed and non-normal. Using daily data, they estimate principal components model parametrically and non-parametrically (the latter is based on kernel densities of principal components distributions), and find parametric estimators are better at capturing interest rate volatility, particularly when interest rates are declining, but non-parametric estimators perform better when interest rates are rising. Min and Lee (2008) use non-parametric data-envelopment analysis (DEA) to predict credit scores utilizing a large sample of 1061 manufacturing firms in South Korea. The authors indicate traditional parametric multiple discriminant analysis, logistic regression, and neural networks are restrictive in the context of predicting bankruptcy, and credit ratings changes, as they require additional prior information, whereas non-parametric DEA requires only ex post data. 
They find non-parametric DEA is an equal (if not a superior) estimator when predicting credit scores compared to traditional parametric measures such as regression and discriminant analysis (used by credit rating agencies and financial institutions). Bartoloni and Maurizio (2014) use non-parametric DEA to examine credit risk and financial performance of a sample of surviving and bankrupt Italian manufacturing firms for the period between 2003 and 2009, and compare its results to those of parametric logistic discriminant analysis. Since the ratio of bankrupt to non-bankrupt firms is always less than 1, estimated default probabilities tend to be negatively-skewed (underestimated), violating the normality assumption of parametric estimators. They argue parametric discriminant analysis has often been criticized on issues of endogeneity and sample selection bias (as a result of including a disproportionate number
page 4050
July 6, 2020
16:6
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch118
A Non-Parametric Examination of Emerging Equity Markets Financial Integration 4051
of surviving rather than failing firms). Accordingly, use of non-parametric DEA may be desirable as it does not require ex ante information on bankrupt firms, and it can also rank firms according to their efficiency using a number of mathematical programming models and financial optimization procedures. They find non-parametric DEA performs better than its parametric equivalent. In conclusion, non-parametric methods have been used to a limited extent in financial research, predominantly in the area of financial services, and have been shown to be better estimators than parametric methods. Because we use high-frequency bi-daily stock returns data, likely to be non-normal (and highly non-linear over time), non-parametric estimators may provide a better fit to the data, yielding more robust results than those obtained from traditional parametric estimators.
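The DEA technique discussed above can be made concrete with a small sketch. The block below solves the input-oriented CCR envelopment program (minimize theta subject to X·lam ≤ theta·x0 and Y·lam ≥ y0, with lam ≥ 0) for each decision-making unit using `scipy.optimize.linprog`. The data set and the function name `dea_efficiency` are hypothetical illustrations; the studies cited above use far richer input/output specifications.

```python
# Illustrative input-oriented CCR DEA model. Efficiency of unit j0 is the
# optimal theta in: min theta  s.t.  X @ lam <= theta * x_j0,  Y @ lam >= y_j0,
# lam >= 0. Data and names here are hypothetical.
import numpy as np
from scipy.optimize import linprog

def dea_efficiency(X, Y, j0):
    """CCR input-oriented efficiency of unit j0; X is (m, n) inputs,
    Y is (s, n) outputs for n decision-making units."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.r_[1.0, np.zeros(n)]                 # decision vector [theta, lam]
    A_in = np.hstack([-X[:, [j0]], X])          # X @ lam - theta * x_j0 <= 0
    A_out = np.hstack([np.zeros((s, 1)), -Y])   # -(Y @ lam) <= -y_j0
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(m), -Y[:, j0]],
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.x[0]

# Four hypothetical firms, one input and one output each
X = np.array([[2.0, 4.0, 3.0, 5.0]])
Y = np.array([[2.0, 4.0, 2.0, 3.0]])
scores = [dea_efficiency(X, Y, j) for j in range(4)]
# Firms 0 and 1 lie on the efficient frontier; firms 2 and 3 do not.
```

Efficiency here is relative: firms 2 and 3 could produce their observed outputs with 2/3 and 3/5 of their observed inputs, respectively, by mimicking a convex combination of the efficient firms.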
118.3 Data

We use bi-daily US dollar returns of national stock indices of 10 emerging Southeast Asian equity markets, 3 emerging Latin American equity markets, and 1 emerging Middle Eastern equity market. We also use equity indices of the US and Japan, as the former represents the most prominent global equity market whose developments impact all equity markets around the world, while the latter is the largest regional equity market in Asia whose developments are expected to impact Southeast Asian equity markets. All stock indices data are obtained from Global Financial Data (globalfinancialdata.com). The Asian equity indices studied are: Singapore's Straits Times Index, Taiwan's Stock Exchange Index, Thailand's Bangkok Stock Exchange Index, Hong Kong's Hang Seng Index, India's Bombay Stock Exchange Sensitive Index, Indonesia's Jakarta Composite Stock Index, South Korea's Stock Price Index, Malaysia's Kuala Lumpur Stock Exchange Index, China's Shanghai Stock Exchange Index, and the Manila Stock Exchange Index (Philippines). The three Latin American equity indices are those of Brazil, Argentina, and Mexico, and the Middle Eastern equity index used is that of Turkey. The US and Japan are represented by the S&P 500 Index and the Tokyo Price Index (TOPIX), respectively. Principal components estimation is employed to obtain estimates of global risk factors using returns on the 21 most developed equity markets of the world: Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Iceland, Ireland, Italy, Japan, Luxembourg, Netherlands, New Zealand, Norway, Spain, Sweden, Switzerland, United Kingdom, and United States. Our sample period extends from
K. Yang et al.
January 5, 1993 to February 22, 2016, yielding a total of 2,833 bi-daily returns (126 bi-daily return observations per year).

118.4 Models

118.4.1 Parametric principal components model

We begin our discussion with the traditional parametric approach to estimating adjusted R2 following Pukthuanthong and Roll's (2009) multi-factor principal components model. We construct principal components (representing global risk factors) by decomposing the variance–covariance matrix of stock returns of the 21 most developed equity markets. We select only the first three principal components as proxies for the common global risk factors as they explain over 75% of the variability in the variance–covariance matrix of returns. The parametric principal components model is specified as:

$$R_{i,t} = \alpha_i + \sum_{k=1}^{d} \beta_{ik} PC^{k}_{i,t} + e_{i,t}, \quad \forall i = \{1, \ldots, 14\} \text{ and } \forall k = \{1, \ldots, d\}. \quad (118.1)$$

To estimate the principal components risk factors, we take products of one-period lagged eigenvectors and concurrent equity returns of the 21 most developed equity markets after sorting the eigenvectors according to their eigenvalues (from highest to lowest). We regress returns on each of the 14 emerging equity markets onto the first three principal components with the highest eigenvalues. In this regression, $R_i$ is the return on each of the 14 emerging equity markets, and $\beta_{ik}$ are the factor loadings (exposure coefficients) of each market $i$ with respect to the $k$th principal component, $PC^k_t$ ($k = 1, \ldots, d$ for the first $d$ principal components; $d = 3$). In total, we run 16 regressions corresponding to the 14 emerging equity markets, along with the US and Japan. If US (or Japan) equity returns are the dependent variable, they are excluded from the variance–covariance matrix of returns of the 21 developed equity markets used for estimating principal components (resulting in use of returns on only 20 developed equity markets at a time, following Pukthuanthong and Roll, 2009). This avoids use of US (or Japan's) equity returns as both dependent and independent variables. Parametric adjusted R2 from equation (118.1) serve as our parametric financial integration measures.

118.4.2 Non-parametric principal components model

Classical statistical assumptions of a known fully-parametric model specification are always violated in practice, so that factor models are likely
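As a minimal sketch of the parametric benchmark in equation (118.1), the block below builds principal components from simulated developed-market returns and computes the adjusted R2 from regressing a simulated emerging market's returns on the first $d$ components. For simplicity it uses full-sample eigenvectors rather than the one-period-lagged eigenvectors described above, and all data are simulated placeholders.

```python
# Sketch of the parametric benchmark (118.1): principal components of
# developed-market returns, then OLS of an emerging market's returns on the
# first d components. All data below are simulated placeholders.
import numpy as np

rng = np.random.default_rng(0)
T, n_dev, d = 500, 21, 3
# Simulated developed-market returns sharing one strong common factor
common = rng.standard_normal((T, 1))
R_dev = 0.8 * common + 0.3 * rng.standard_normal((T, n_dev))

# Principal components from the covariance matrix, sorted by eigenvalue
cov = np.cov(R_dev, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)
order = np.argsort(eigval)[::-1]                 # highest eigenvalue first
PC = (R_dev - R_dev.mean(0)) @ eigvec[:, order[:d]]

# Simulated emerging-market return: exposed to the common factor plus noise
R_em = 0.7 * common[:, 0] + 0.5 * rng.standard_normal(T)

# OLS on the d components; adjusted R^2 as the integration measure
Z = np.column_stack([np.ones(T), PC])
beta, *_ = np.linalg.lstsq(Z, R_em, rcond=None)
resid = R_em - Z @ beta
r2 = 1 - resid.var() / R_em.var()
adj_r2 = 1 - (1 - r2) * (T - 1) / (T - d - 1)
```

Because the simulated emerging market loads on the same common factor as the developed markets, the adjusted R2 is well above zero; a fully segmented market would produce an adjusted R2 near zero.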
to be non-linear in functional form. Likewise, there is no reason to expect returns to be linear in continuous measures of the global risk factors (e.g., Pukthuanthong and Roll, 2009 suggested salt and water as two factors in their 2-factor model). Further, there may be interactions between variables in a multi-factor model specification, suggesting a standard parametric model is just a convenient and simplified form of a more general model such as the one shown below. For convenience, we start with the following notation. Let $Y_{it} = R_{it}$ and $X_{it} = \{PC^1_{it}, \ldots, PC^d_{it}\}$, so that:

$$Y_{it} = \alpha_i + f(X_{it}) + \epsilon_{it} \quad \text{for } i = 1, \ldots, n;\ t = 1, \ldots, J, \quad (118.2)$$

where $\alpha_i$ and $\epsilon_{it}$ are unobservable random variables with mean zero and variances $\sigma^2_\alpha$ and $\sigma^2_\epsilon$, respectively, and $f(\cdot)$ is an unknown smooth function which dictates how global factors contribute to returns.1 Previously, polynomial terms or spline functions were commonly used to approximate the unknown function within a parametric equation. Recently, non-parametric procedures have been gaining popularity in estimating regressions. Some commonly employed non-parametric procedures are: (1) local polynomial regression, (2) local linear regression, (3) kernel regression, and (4) non-parametric additive regression. Each of these procedures fits individual regressions targeted to specific points, with more weight placed on observations that are closer to the target. "Closer" can be defined narrowly in terms of geographic distance, or in terms of more general measures of distance among the full set of explanatory variables. In the case of a multi-factor model, "close" may be thought of more constructively in terms of similarity — returns observed in the same time period, countries with similar economic systems, or countries with similar economic resources, geographic location, culture, or languages. It is within these subgroups that markets behave similarly, resulting in returns possessing common attributes. Locally weighted regression is the most general of these procedures, with the others being special cases. In the following subsections, we survey some commonly used estimation algorithms for model (118.2), including bandwidth selection, as well as a statistical test of differences between parametric and non-parametric estimates of adjusted R2.

1
Model (118.2) is also known as the random-effects model and the individual-specific effects, αi , are assumed to be randomly drawn from an underlying population and independent from the regressors, whereas in a fixed-effects model one makes inference conditional on the individual units.
118.4.3 Local polynomial regression

Define $Y_i = \{Y_{i1}, Y_{i2}, \ldots, Y_{iJ}\}'$ and $Y = \{Y_1', \ldots, Y_n'\}'$, and, analogously, define $X_i$, $X$ and $u_i$, $u$. In addition, define $f_i = \{f(X_{11}), \ldots, f(X_{nJ})\}'$. In this section, we focus on a univariate case in order to keep notation concise, but the estimators discussed here can be easily extended to higher-dimensional models. The covariance structure of the observations on individual $i$ is $\mathrm{var}(Y_i|X_i) = \Sigma = \{\sigma_{jk}\}_{j,k=1}^{J}$, where $\sigma_{jj} = \sigma^2_\epsilon + \sigma^2_\alpha$ and $\sigma_{jk} = \sigma^2_\alpha$ for $j \neq k$. The standard local polynomial regression aims at the $p$th polynomial expansion of the regression function at a local point, $f(x) = G_p(X - x)'\beta$, where $G_p(\upsilon) = \{1, \upsilon, \ldots, \upsilon^p\}'$ and $\beta = \{\beta_0, \ldots, \beta_p\}'$ is a vector of Taylor expansion parameters. Let $h$ denote the bandwidth and $K(\cdot)$ a symmetric kernel density function. Define $K_h(\upsilon) = h^{-1}K(\upsilon/h)$ and $G_{ip}(x) = \{G_p(X_{i1}-x), G_p(X_{i2}-x), \ldots, G_p(X_{iJ}-x)\}$. Martins-Filho and Yao (2009) suggest that, in local polynomial regression, one can ignore the covariance structure without losing efficiency, i.e., $m(x)$ can be estimated by $\tilde\beta_0$, where $\tilde\beta = \{\tilde\beta_0, \tilde\beta_1, \ldots, \tilde\beta_p\}'$ minimizes the weighted sum of squared residuals,

$$\min_{\tilde\beta} \sum_{i=1}^{n} (Y_i - [G_{ip}(x)]'\tilde\beta)'\, K_{ih}(x)\, (Y_i - [G_{ip}(x)]'\tilde\beta), \quad (118.3)$$

where $K_{ih}(x) = \mathrm{diag}\{K_h(x_{ij}-x)\}_{j=1}^{J}$ is the kernel weight matrix. The kernel function $K(\cdot)$ determines the weight that observation $i$ receives in estimating the value of $y$ at target point $x$. There are many choices for the kernel function, including a Gaussian density function and a tri-cube kernel function. These functions generally share the common feature of declining weights as distance increases. In general, the choice of kernel weight function has little effect on the results. By limiting the Taylor expansion to degree one, the local polynomial regression estimator described in equation (118.3) can be simplified to local linear regression, also known as standard weighted least squares estimation with one regression for each target point. Denote $G_{i1} = (1, X_i)'$; then one can define the weighted least squares estimator as

$$\hat\beta(x) = e_1' \left(\sum_{i=1}^{n} G_{i1} K_{ih} [G_{i1}]'\right)^{-1} \sum_{i=1}^{n} G_{i1} K_{ih} Y_i, \quad (118.4)$$

which can be calculated as the vector of coefficients from a regression of $w_{it}Y_{it}$ on $w_{it}$ and $w_{it}X_{it}$, where $w_{it} = [K((X_{it}-x)/h)]^{1/2}$. The predicted value of $Y$ at the target point is simply $\hat\beta' G_{i1}(x)$, i.e., the standard prediction evaluated at the target point, $x$. The coefficients on the explanatory variables, $\hat\beta(x)$,
represent the estimated marginal effects at the target point. Standard errors are also easy to evaluate for $\hat\beta(x)$ (see Pagan and Ullah, 1999 for details). There are some general assumptions required for the local polynomial estimator to be well behaved. We discuss them below, focusing on their implications in empirical applications. First, the target function, $f(\cdot)$, needs to be continuous to some degree, depending on the order of the Taylor expansion used to approximate it. For example, to use a local linear estimator, it is generally assumed that the second derivative of the target function, $f(\cdot)$, exists. The kernel function, $K(\cdot)$, is a symmetric and bounded kernel density function defined on a compact support, and has unit variance. Some commonly used kernel functions include the Gaussian density function and the tri-cube kernel function.

118.4.3.1 Bandwidth selection

An important consideration in local polynomial regression estimation is the choice of bandwidth, $h$. The bandwidth determines how many observations receive positive weight when constructing the estimate, and how rapidly the weights decline with distance. Larger bandwidths, by placing more weight on more distant observations, produce smoother curves than smaller bandwidths. The bandwidth may be a fixed value for all data points, in which case the number of observations receiving some weight in estimation varies depending on how many observations are in the vicinity of the target point. Alternatively, the bandwidth may vary by target point such that a fixed number of observations receives positive weight for each target. The second approach is known as "nearest neighbor" estimation in the data mining literature, and the varying bandwidth is generally referred to as the "window size" because it determines the size of the opening for observations to be included in estimation.
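The local linear estimator of equation (118.4), and the smoothing role of the bandwidth $h$, can be sketched at a single target point as follows. The data are simulated and the bandwidth is chosen by hand, so this only illustrates the mechanics, not the chapter's actual estimation.

```python
# Local linear regression at one target point: weighted least squares with
# Gaussian kernel weights, as in equation (118.4). Simulated data.
import numpy as np

def local_linear(x_target, X, Y, h):
    """Return (f_hat, slope_hat): local level and marginal effect at x_target."""
    w = np.exp(-0.5 * ((X - x_target) / h) ** 2)    # kernel weights
    G = np.column_stack([np.ones_like(X), X - x_target])
    # Solve the 2x2 weighted normal equations (G'WG) beta = G'WY
    beta = np.linalg.solve(G.T @ (w[:, None] * G), G.T @ (w * Y))
    return beta[0], beta[1]

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, 400)
Y = np.sin(X) + 0.1 * rng.standard_normal(400)
f_hat, slope_hat = local_linear(0.5, X, Y, h=0.3)   # targets sin(0.5), cos(0.5)
```

A larger $h$ averages over more of the curve (smoother but more biased), while a smaller $h$ tracks the curve closely at the cost of noisier estimates; the "window size" variant discussed above fixes the number of neighbors rather than $h$ itself.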
In the case that observations are distributed uniformly over space, there is little difference between a fixed bandwidth and a fixed window size. However, most panel data sets combine countries with many close-by observations, while observations around other countries may be sparse. A fixed bandwidth leads to over-smoothing in areas where observations are densely populated near a target point, and to under-smoothed (spiky) results in areas with sparse data. Thus, the "nearest neighbor" approach of a common window size for all target points is generally preferable to a fixed bandwidth for analyzing panel data. There is a large and growing literature on bandwidth selection. A common method for choosing the bandwidth is cross-validation, whereby each observation is used as the target point, and the estimated value of $y$ for observation $it$ is
constructed after omitting the $it$th observation from the model:

$$CV = \frac{1}{nJ}\sum_{i=1}^{n}\sum_{t=1}^{J}\left(Y_{it} - \hat{Y}_{it}\right)^2, \quad (118.5)$$

where $\hat{Y}_{it}$ is the leave-one-out predicted value of $Y_{it}$. The cross-validation bandwidth provides a useful guide for choosing the window size when the goal is to accurately predict the dependent variable. The optimal window size is much larger — perhaps double or triple — when the goal is to estimate the marginal effect of the explanatory variables (Pagan and Ullah, 1999). The cross-validation bandwidth for each country, based on a subsample of the data, is presented in Table 1 in Appendix 118A. However, for convenience of comparison, a set of "rule-of-thumb" bandwidths — roughly 1.5 standard deviations of the respective independent variable (principal components) — is used in calculating the non-parametric regression function and the corresponding adjusted R2. Non-parametric adjusted R2 results are presented in plots in Appendices (118A.1)–(118A.3).

118.4.4 Kernel regression

An even simpler version of the local polynomial regression arises when the local function approximation is reduced to degree zero (also known as the Nadaraya–Watson estimator). In this case the objective function is

$$\sum_{i=1}^{n}\sum_{t=1}^{J} K((X_{it}-x)/h)\cdot(Y_{it}-\alpha)^2,$$

and the estimated value of $Y$ at the target point $x$ is

$$\hat{Y}(x) = \frac{\sum_{i=1}^{n}\sum_{t=1}^{J} K((X_{it}-x)/h)\, Y_{it}}{\sum_{i=1}^{n}\sum_{t=1}^{J} K((X_{it}-x)/h)}.$$

The marginal effects are estimated by taking the derivative of equation (118.2) with respect to $X$. The estimate can be constructed by regressing $w_{it}Y_{it}$ on $w_{it}$, where $w_{it} = [K((X_{it}-x)/h)]^{1/2}$. Thus, kernel regression is identical to the local polynomial estimator, but with only a (weighted) constant term included in the weighted regression. The advantage of local polynomial regression over kernel regression is that a higher-order Taylor expansion can accommodate more curvature in the estimated function, leading to more accurate estimates in regions with sparse data. Local polynomial estimators are typically less variable than kernel regression estimates, allowing larger bandwidths to be used in estimation.
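The Nadaraya–Watson estimator above and the leave-one-out criterion of equation (118.5) can be sketched together as follows. The data are simulated and the bandwidth grid is arbitrary; this illustrates the mechanics only.

```python
# Nadaraya-Watson kernel regression with leave-one-out cross-validation for
# the bandwidth, in the spirit of equation (118.5). Simulated data.
import numpy as np

def nw_fit(x, X, Y, h):
    """Kernel-weighted average of Y at target point x (Gaussian kernel)."""
    w = np.exp(-0.5 * ((X - x) / h) ** 2)
    return (w * Y).sum() / w.sum()

def loo_cv(X, Y, h):
    """CV(h) = mean squared leave-one-out prediction error."""
    n = len(X)
    err = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i               # omit observation i
        err[i] = Y[i] - nw_fit(X[i], X[mask], Y[mask], h)
    return np.mean(err ** 2)

rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, 200)
Y = np.sin(X) + 0.2 * rng.standard_normal(200)
grid = [0.05, 0.1, 0.2, 0.4, 0.8]
h_star = min(grid, key=lambda h: loo_cv(X, Y, h))  # CV-selected bandwidth
```

Too small a bandwidth chases the noise and too large a bandwidth flattens the curve; the CV criterion balances the two, which is the trade-off the bandwidth-selection discussion above formalizes.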
118.4.5 Improving efficiency by using the panel data structure

A panel data structure provides a unique opportunity to improve the efficiency of the local polynomial estimator (see Lin and Carroll, 2000; Martins-Filho and Yao, 2009; Yang, 2016 for some recently proposed methods). Yang (2016) proposes a computationally simple and intuitive method which can improve the estimation accuracy of the LP estimator by including the covariance matrix, $\Sigma$, of the panel data in the weighting scheme (118.3). More specifically, this method estimates $f(x)$ by $\hat f(x) = \hat\beta_0$, where $\hat\beta = \{\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_p\}'$ solves the following minimization problem:

$$\min_{\tilde\beta} \sum_{i=1}^{n} (Y_i - [G_{ip}(x)]'\tilde\beta)'\, 1_{ih}(x)\, \Sigma^{-1/2} K_{ih}(x) \Sigma^{-1/2}\, 1_{ih}(x)\, (Y_i - [G_{ip}(x)]'\tilde\beta),$$

where $1_{ih}(x) = \mathrm{diag}\{1((x_{ij}-x)/h)\}_{j=1}^{J}$, and $1(z) = 1(|z| < c)$ is an indicator function defined on $(-c, c)$, the same support as the kernel function $K(\cdot)$. In the following, we skip the fixed-point index $x$ in the definitions of $G_{ip}(x)$, $K_{ih}(x)$ and $1_{ih}(x)$ and write the solution to the above problem as

$$\hat m(x) = e_1' \left(\sum_{i=1}^{n} G_{ip}\, 1_{ih}\, \Sigma^{-1/2} K_{ih} \Sigma^{-1/2}\, 1_{ih}\, [G_{ip}]'\right)^{-1} \sum_{i=1}^{n} G_{ip}\, 1_{ih}\, \Sigma^{-1/2} K_{ih} \Sigma^{-1/2}\, 1_{ih}\, Y_i, \quad (118.6)$$

where $e_1 = \{1, 0, \ldots, 0\}'$, a unit vector of size $(p+1) \times 1$.

118.4.6 Non-parametric additive model

The model in equation (118.2) is general enough to accommodate any smooth function, but its performance with empirical data can be significantly affected by the "curse of dimensionality", i.e., the accuracy of the estimator of $f(X)$ depends inversely on the dimension of $X$. Stone (1980) shows the best rate obtainable in the estimation of $f(x)$ is $n^{-s/(2s+d)}$, where $s$ is the degree of smoothness of the function $f(\cdot)$. The solution to the "curse of dimensionality" is to impose some structure on the regression function. One method which has recently been well studied is the additive model; as shown by Stone (1985), if $m(x)$ has an additive structure:

$$Y_{it} = \alpha_i + \sum_{\delta=1}^{d} f_\delta(X_{it\delta}) + u_{it}, \quad (118.7)$$
with $E(f_\delta(\cdot)) = 0$, each of the component functions $f_\delta(\cdot)$ can be estimated at an optimal rate $n^{-s/(2s+1)}$, which does not depend on $d$. This circumvention of the curse of dimensionality, as well as the ease of interpreting the impacts of different regressors on the regressand, have led to the popularity of additive non-parametric regression models in both the theoretical and applied literature.2 This model can be estimated using a "smooth back-fitting" algorithm, which is computationally efficient among available algorithms but achieves the same optimal asymptotic properties.

118.4.7 Test of differences of adjusted R2

To test for statistically significant differences between parametric and non-parametric adjusted R2 coefficient estimates from principal components regressions corresponding to the 16 different equity markets used, we conduct Vuong's (1989) test of model selection, predicated on a likelihood ratio test. In Vuong's (1989) test, the null hypothesis is that the parametric and non-parametric adjusted R2 coefficients are equal, while the alternative hypothesis is that these two measures are not equal. Vuong (1989) derives a likelihood ratio test statistic of the following form:

$$LR = \log\frac{L(R_{NP})}{L(R_P)} = \ln L(R_{NP}) - \ln L(R_P) = \frac{n}{2}\left(\ln(\sigma^2_P) - \ln(\sigma^2_{NP})\right) + \frac{1}{2}\sum_{t=1}^{n}\left(\frac{\mu^2_P}{\sigma^2_P} - \frac{\mu^2_{NP}}{\sigma^2_{NP}}\right).$$

The variance of the likelihood ratio is given as

$$\vartheta^2 = \frac{1}{n}\sum_{t=1}^{n}\left[\frac{1}{2}\ln(\sigma^2_P) - \frac{1}{2}\ln(\sigma^2_{NP}) + \frac{1}{2}\frac{\mu^2_P}{\sigma^2_P} - \frac{1}{2}\frac{\mu^2_{NP}}{\sigma^2_{NP}}\right]^2 - (LR/n)^2.$$

The test statistic is asymptotically normal, and is given as $Z = \frac{1}{\sqrt{n}}\frac{LR}{\vartheta}$. If the test statistic's sign is positive and statistically significant, it indicates the non-parametric adjusted R2 estimator is the correct choice, while a negative and statistically significant test statistic indicates the parametric adjusted R2 estimator is the correct financial integration measure which best describes the data.
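A sketch of the Vuong-type comparison described above, assuming Gaussian errors and taking the residual series of the two competing fits as given. The residuals here are simulated so that the first ("non-parametric") model genuinely fits better; the function name and data are illustrative only.

```python
# Sketch of a Vuong (1989)-style statistic comparing two fitted models via
# their per-observation Gaussian log-likelihood contributions.
import numpy as np

def vuong_z(resid_np, resid_p):
    """Positive Z favors the first (non-parametric) fit, negative the second."""
    n = len(resid_np)
    s2_np = np.mean(resid_np ** 2)
    s2_p = np.mean(resid_p ** 2)
    # Per-observation log-likelihood difference under Gaussian errors
    m = (0.5 * np.log(s2_p) - 0.5 * np.log(s2_np)
         + 0.5 * resid_p ** 2 / s2_p - 0.5 * resid_np ** 2 / s2_np)
    lr = m.sum()                                  # likelihood ratio
    omega = np.sqrt(np.mean(m ** 2) - (lr / n) ** 2)
    return lr / (np.sqrt(n) * omega)              # asymptotically N(0, 1)

rng = np.random.default_rng(3)
e_good = 0.5 * rng.standard_normal(1000)          # smaller residuals
e_bad = e_good + 0.4 * rng.standard_normal(1000)  # noisier competing model
z = vuong_z(e_good, e_bad)                        # large positive Z
```

Swapping the arguments flips the sign of the statistic, which is the directional property the chapter relies on when reading the sign of the Z-statistics in Appendix (118A.4).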
Since (unconditional) simple correlation, SC, is a popular indicator of financial integration (and also diversification benefits), we contrast 2
See, inter alia, Hastie and Tibshirani (1990) and Pagan and Ullah (1999).
parametric and non-parametric adjusted R2 with SC. Together, these three competing measures allow us to shed light on the sensitivity of results to violations of the assumptions underlying traditional parametric estimators.

118.5 Results

As stated earlier, we estimate Pukthuanthong and Roll's (2009) adjusted R2 measure of financial integration non-parametrically (for each of the 14 emerging equity markets), and compare these results to both parametric adjusted R2 and bivariate (unconditional) simple correlations (SC) of each of the 14 emerging equity markets with respect to each of the US and Japan's equity returns. To examine differences between these three financial integration measures, we present graphs of each emerging market's SC with respect to the US and Japan, along with each market's parametric adjusted R2 and non-parametric adjusted R2, for the study period from 1993 to 2016.3 Appendix (118A.1) presents time series plots of each country's simple correlation, SC, estimated with respect to US equity returns, along with its corresponding parametric and non-parametric adjusted R2. As seen, in general, for all 14 emerging equity markets, SC falls below parametric adjusted R2. Consistent with results reported in Pukthuanthong and Roll (2009), this suggests SC understates the degree of financial integration relative to parametric adjusted R2. More importantly, non-parametric adjusted R2 coefficients are consistently above parametric adjusted R2 across all 16 countries, which is new evidence of potential differences between parametric and non-parametric adjusted R2 measures of financial integration. These results conflict with results reported in Billio et al. (2017), where SC is found to do just as well as other financial integration measures (including parametric adjusted R2).
In sum, our first set of results indicate emerging equity markets may be more financially integrated than previously thought, thereby offering potentially fewer diversification benefits to global investors. 3
The non-parametric R2 results presented in the plots in Appendices 118A.1–118A.3 are calculated based on a set of "rule-of-thumb" bandwidths — roughly 1.5 standard deviations of the respective independent variable (principal components). We also calculated an alternative set of bandwidths using the cross-validation method as defined in (118.5) (see Table 118A.1 in Appendix 118A.4). Note that the CV bandwidths are smaller than the ROT bandwidths for each country. We expect non-parametric R2 using CV bandwidths would be even larger than those with ROT bandwidths, but they follow the same pattern. Also, in models (118.1) and (118.2), we chose $d = 10$, i.e., in both the parametric and non-parametric regressions, the first ten principal components are used to calculate the respective adjusted R2.
Appendix (118A.2) presents similar graphs of financial integration trends using Pukthuanthong and Roll's (2009) adjusted R2 measure estimated both parametrically and non-parametrically for each of the 14 emerging equity markets, along with their respective bivariate SC estimated with respect to Japan's equity returns. Note the parametric and non-parametric adjusted R2 presented in Appendix (118A.2) are identical to those presented in Appendix (118A.1), but are presented again in an overlay plot along with SC estimated with respect to Japan's stock returns, for ease of contrast to the plots in Appendix (118A.1) with SC estimated with respect to US equity returns. Once more, the SC estimates of each of the 14 emerging markets' returns with respect to Japan's stocks fall below both their respective parametric and non-parametric adjusted R2 when Japan is used as the reference market instead of the US. This evidence supports Pukthuanthong and Roll's (2009) general findings that SC understates financial integration relative to parametric adjusted R2. As in Appendix (118A.1), non-parametric adjusted R2 coefficients are consistently above parametric adjusted R2 for all 16 markets, suggesting notable differences between parametric and non-parametric adjusted R2 measures of financial integration. The last set of graphs, shown in Appendix (118A.3), contains four time-series plots, allowing us to contrast the three alternative measures of financial integration (parametric adjusted R2, non-parametric adjusted R2, and SC) across two sub-groups of emerging equity markets: the first sub-group is that of Southeast Asian equity markets, while the second sub-group includes all four non-Asian equity markets (Brazil, Argentina, Mexico, and Turkey).
This distinction between Asian and non-Asian equity markets is particularly meaningful when calculating cross-sectional averages of parametric adjusted R2, non-parametric adjusted R2, and correlations of each sub-group with respect to the US and Japan's equity markets. Since one expects Southeast Asian equity markets to behave more homogeneously, as a sub-group, with respect to either the US or Japan, segregating the 14 emerging equity markets accordingly when examining financial integration trends using the three financial integration measures seems intuitively appealing. The first time-series plot in Appendix (118A.3) presents cross-sectional averages of all 10 emerging Southeast Asian equity returns correlations estimated with respect to US equity returns, pitted against their respective cross-sectional average parametric adjusted R2 and non-parametric adjusted R2. The second time-series plot in Appendix (118A.3) presents cross-sectional averages of the 10 emerging Southeast Asian equity returns SC estimated with respect to Japan's equity returns, pitted against their respective cross-sectional average parametric adjusted R2 and
non-parametric adjusted R2. The third and fourth time-series plots present, respectively, cross-sectional averages of SC of the four non-Asian equity markets calculated with respect to US equity returns and Japan's equity returns, along with cross-sectional averages of their respective parametric adjusted R2 and non-parametric adjusted R2. As seen, regardless of whether the US or Japan is used as the reference global equity market in calculating bivariate correlations, the results using cross-sectional averages of the financial integration measures for each of the two sub-groups (Asian and non-Asian emerging equity markets) are consistent with the results shown in Appendices (118A.1) and (118A.2). Specifically, cross-sectional average SC are uniformly below cross-sectional average parametric and non-parametric adjusted R2 coefficients for both sub-groups of emerging equity markets. Further, financial integration measured by parametric adjusted R2 is slightly higher than when measured by SC. More importantly, financial integration measured by non-parametric adjusted R2 appears much higher than when measured by parametric adjusted R2. These results indicate, once more, non-parametric adjusted R2 are much higher than parametric adjusted R2, and both adjusted R2 measures are higher than SC measured with respect to the US or Japan. Therefore, caution should be used when drawing inferences regarding the degree and trend of financial integration using parametric estimation, as it will be understated compared to estimates based on non-parametric adjusted R2. The consistency of these results over time and across markets, individually and in sub-groups, is highly reassuring, and provides strong support for the robustness of our results, particularly in regard to important differences between adjusted R2 measures of financial integration estimated parametrically versus non-parametrically.
In summary, non-parametric estimation indicates higher levels of financial integration than parametric estimation, so that emerging equity markets may be more financially integrated than previously thought based on parametric estimation methods. Furthermore, for the sub-group of 10 emerging Southeast Asian equity markets, non-parametric adjusted R2 appear less variable over time relative to parametric adjusted R2, but this observation is less stark for the sub-group of 4 emerging non-Asian equity markets. In general, the lower variability of the cross-sectional averages of the three financial integration metrics for the sub-group of 10 emerging Southeast Asian equity markets relative to those for the sub-group of 4 non-Asian equity markets is not surprising, since cross-sectional averaging for the second sub-group is calculated with respect to only four equity markets, but is estimated with respect to ten equity
markets for the first sub-group (averaging over a larger number of markets destroys more of the variability than averaging over a smaller number of markets). Because a visual inspection of plots of parametric adjusted R2 and non-parametric adjusted R2 coefficients is not a definitive test of whether these two alternative financial integration measures exhibit statistically significant differences, we conduct a statistical test of differences between parametric and non-parametric adjusted R2 estimates using Vuong's (1989) likelihood ratio test statistic, which discriminates between two models/estimators without presuming either model is true under the null. Vuong's test statistic is normally distributed, and its sign provides a directional indication of which of the two competing (parametric versus non-parametric) estimators is better at explaining the data. Appendix (118A.4) reports the Vuong test results. As seen, all Vuong Z-statistics are positive and statistically significant at very stringent levels ($p < 0.0001$) for all 16 (emerging and developed) equity markets, indicating non-parametric adjusted R2 are uniformly statistically greater than their parametric equivalents. Based on these results, we conclude financial integration will be understated if measured by parametric adjusted R2 rather than non-parametric adjusted R2 for our sample of markets and our sample study period. Given these results suggesting differences between parametric and non-parametric adjusted R2 as competing measures of financial integration, future research examining international diversification benefits from investing globally (rather than only domestically) should consider non-parametric estimators of financial integration and their associated international diversification benefits. Equally important, future research in finance using parametric estimators may need to consider non-parametric estimators in addition to (if not as a substitute for) parametric examination methods.
118.6 Summary and Conclusions

Financial integration is an important concept in international finance and investments theory and practice. Measures of financial integration have wide-ranging implications for the functioning of financial markets, for expected diversification benefits from investing globally, and for the degree to which global markets are open to (or insulated from) global financial/economic shocks. Therefore, measuring financial integration has economic/financial policy implications, aiming to ensure some minimum level of domestic financial/economic insularity to foreign shocks, and to encourage foreign capital flows that supplement domestic savings for funding growth in domestic
page 4062
July 6, 2020
16:6
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch118
A Non-Parametric Examination of Emerging Equity Markets Financial Integration 4063
investment spending, and economic growth. Temporal variations in financial integration complicate global optimal portfolio selection and management, and render domestic economic and financial policies vulnerable to factors beyond the control of domestic policy makers. Conversely, predictable financial integration trends facilitate global investment decision making as well as domestic and global financial/economic policies. Yet there is no consensus on the best model or measure for quantifying financial integration. Developing alternative robust measures of financial integration should therefore be a topic of prime importance not only to academics, but also to practitioners and policy-makers.

In this study, we introduce a non-parametric statistical approach to measuring financial integration trends. We apply this approach using a recently introduced financial integration measure: the adjusted R2 coefficient of Pukthuanthong and Roll (2009). We study a group of emerging equity markets ranging from Southeast Asia to Latin America and the Middle East. We contrast results from non-parametric adjusted R2 with parametric adjusted R2, and contrast both with simple (unconditional) correlation (SC).

We find several key results. First, we confirm prior evidence in the financial integration literature that SC understates financial integration relative to parametric adjusted R2 (see, for example, Pukthuanthong and Roll, 2009). This result is robust across individual markets, as well as across two sub-groups of emerging equity markets: Southeast Asian equities and non-Asian equities. More importantly, we find parametric adjusted R2 uniformly understates financial integration relative to non-parametric adjusted R2, again on an individual equity market basis as well as for the two sub-groups of Asian and non-Asian emerging equity markets.
We conduct Vuong's (1989) test of the significance of differences between parametric and non-parametric adjusted R2 for all 16 equity markets (14 emerging markets, along with the US and Japan), and find the differences are statistically significant at very stringent significance levels. This underscores the possibility that violations of the assumptions underlying parametric estimators are responsible for the differences between parametric and non-parametric adjusted R2, and for the degree to which parametric adjusted R2 understates financial integration trends. Future research may, at a minimum, wish to employ non-parametric (along with parametric) approaches to examine the sensitivity of statistical inferences and empirical results to possible violations of the classical statistical assumptions underlying parametric estimators. In addition, a structural model of financial integration may help explain not only differences in financial integration levels across markets (and over time), but also why financial integration levels are time-varying.
K. Yang et al.
Bibliography

Baillie, R. and Bollerslev, T. (1987). A Multivariate Generalized ARCH Approach to Modeling Risk Premiums in Forward Foreign Exchange Rate Markets. Journal of International Money and Finance 9, 309–324.
Bartoloni, E. and Maurizio, T. (2014). Financial Performance in Manufacturing Firms: A Comparison between Parametric and Non-Parametric Approaches. Business Economics 49, 32–45.
Bekaert, G., Harvey, C. and Ng, A. (2005). Market Integration and Contagion. Journal of Business 78, 39–69.
Bekaert, G., Hodrick, R. and Zhang, X. (2009). International Stock Return Co-movements. Journal of Finance 64, 2591–2626.
Billio, M., Donadelli, M., Paradiso, A. and Reidel, M. (2017). Which Market Integration Measure? Journal of Banking and Finance 76, 150–174.
Bollerslev, T. (1990). Modeling the Coherence in Short-run Nominal Exchange Rates: A Multivariate Generalized Approach. Review of Economics and Statistics 72, 498–505.
Carrieri, F., Errunza, V. and Hogan, K. (2007). Characterizing World Market Integration through Time. Journal of Financial and Quantitative Analysis 42, 915–940.
Christoffersen, P., Errunza, V., Jacobs, K. and Langlois, H. (2012). Is the Potential for International Diversification Disappearing? A Dynamic Copula Approach. Review of Financial Studies 25, 3711–3751.
Chiou, W. (2007). Who Benefits from International Diversification? Journal of International Financial Markets, Institutions, and Money 18, 466–482.
DeJong, F. and DeRoon, F. (2005). Time-Varying Market Integration and Expected Returns in Emerging Markets. Journal of Financial Economics 78, 583–613.
De Roon, F., Nijman, T. and Werker, J. (2001). Testing for Mean–Variance Spanning with Short Sales Constraints and Transaction Costs: The Case of Emerging Markets. Journal of Finance 56, 721–742.
De Santis, G. and Gerard, B. (1997). International Asset Pricing and Portfolio Diversification with Time-Varying Risk. Journal of Finance 52, 1881–1912.
Dumas, B., Harvey, C. and Ruiz, P. (2003). Are Correlations of Stock Returns Justified by Subsequent Changes in National Output? Journal of International Money and Finance 22, 777–811.
Eiling, E. and Gerard, B. (2007). Dispersion, Equity Returns Correlations, and Market Integration. Unpublished Working Paper, University of Toronto.
Engle, R. (2002). Dynamic Conditional Correlation: A Simple Class of Multivariate Generalized Autoregressive Conditional Heteroskedasticity Models. Journal of Business and Economic Statistics 20, 339–350.
Errunza, V., Hogan, K. and Hung, M. (1999). Can the Gains from International Diversification be Achieved without Trading Abroad? Journal of Finance 54, 2075–2107.
Fiori, R. and Simonetta, I. (2007). Scenario-based Principal Component Value-at-Risk when the Underlying Risk Factors are Skewed and Heavy-Tailed: An Application to Italian Banks' Interest Rate Risk Exposure. Journal of Risk 9, 63–99.
Forbes, K. and Rigobon, R. (2002). No Contagion, Only Interdependence: Measuring Stock Market Comovements. Journal of Finance 57(5), 2223–2261.
Goetzmann, W., Li, L. and Rouwenhorst, K. (2002). Long-Term Global Market Correlations. Working Paper, National Bureau of Economic Research.
Harvey, C. (1995). Predictable Risk and Returns in Emerging Markets. Review of Financial Studies 8(3), 773–816.
Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models. Chapman & Hall, New York.
Karolyi, A. (1995). A Multivariate GARCH Model of International Transmission of Stock Returns and Volatility: The Case of the United States and Canada. Journal of Business and Economic Statistics 13, 11–25.
Li, K., Sarkar, A. and Wang, Z. (2002). Diversification Benefits of Emerging Markets Subject to Portfolio Constraints. Journal of Empirical Finance 10, 57–80.
Lin, X. and Carroll, R. (2000). Non-Parametric Function Estimation for Clustered Data when the Predictor is Measured Without/With Error. Journal of the American Statistical Association 95, 520–534.
Longin, F. and Solnik, B. (1995). Is the Correlation in International Equity Returns Constant? Journal of International Money and Finance 14, 3–26.
Martins-Filho, C. and Yao, F. (2009). Non-Parametric Regression Estimation with General Parametric Error Covariance. Journal of Multivariate Analysis 100(3), 309–333.
McMillen, D. and Redfearn, C. (2010). Estimation and Hypothesis Testing for Nonparametric Hedonic House Price Functions. Journal of Regional Science 50(3), 712–733.
Min, J. and Lee, Y. (2008). A Practical Approach to Credit Scoring. Expert Systems with Applications: An International Journal 35, 1762–1770.
Pagan, A. and Ullah, A. (1999). Non-parametric Econometrics. Cambridge University Press.
Pukthuanthong, K. and Roll, R. (2009). Global Market Integration: An Alternative Measure and Its Application. Journal of Financial Economics 94, 214–232.
Schotman, P. and Zalewska, A. (2006). Non-synchronous Trading and Testing for Market Integration in Central European Emerging Markets. Journal of Empirical Finance 13, 452–494.
Stone, C. (1985). Additive Regression and Other Non-Parametric Models. Annals of Statistics 6, 689–705.
Volosovych, V. (2011). Measuring Financial Market Integration over the Long Run: Is there a U-Shape? Journal of International Money and Finance 30, 1535–1561.
Yang, K. (2016). More Efficient Local Polynomial Regression with Random-Effects Panel Data Models. Econometric Reviews 37(7), 760–776.
Appendix 118A Tables and Graphs

118A.1 Bivariate simple correlation (SC) computed with respect to US equity market4
4 In all plots that follow, the curve with diamonds shows simple correlations of each country with the US (except in the US plot, where the curve with diamonds shows simple correlations of the US and Japan), the curve with squares shows adjusted R2 from the parametric model, and the curve with triangles shows adjusted R2 from the non-parametric model.
118A.2 Bivariate simple correlations (SC) computed with respect to Japan’s equity market5
5 In all plots that follow, the curve with diamonds shows simple correlations of each emerging market with Japan's equity market (except in the US plot, where it shows simple correlations of the US and Japan), the curve with squares shows adjusted R2 from the parametric model, and the curve with triangles shows adjusted R2 from the non-parametric model.
118A.3 Plots of average simple correlations (SC) of alternative emerging equity market groups versus the US and Japan, along with averages of parametric and non-parametric adjusted R2 measures
118A.4 Vuong's (1989) test results comparing parametric and non-parametric adjusted R2 measures and cross-validation bandwidth results

Table 118A.1: Vuong's (1989) likelihood ratio test results for model selection.

Country        Vuong's (1989) z-test statistic    p-value
Hong Kong      64.514                             p < 0.0001
Malaysia       64.465                             p < 0.0001
Singapore      66.901                             p < 0.0001
Thailand       61.534                             p < 0.0001
Philippines    58.078                             p < 0.0001
India          63.033                             p < 0.0001
Indonesia      64.370                             p < 0.0001
S. Korea       67.698                             p < 0.0001
Taiwan         62.826                             p < 0.0001
China          58.705                             p < 0.0001
Argentina      62.816                             p < 0.0001
Brazil         63.651                             p < 0.0001
Mexico         63.192                             p < 0.0001
Turkey         61.895                             p < 0.0001
USA            71.204                             p < 0.0001
Japan          63.209                             p < 0.0001

Note: A statistically significant and positive Z-statistic indicates the parametric R2 measure is rejected in favor of the non-parametric R2 measure.
Table 118A.2: Cross-validation bandwidths for the non-parametric multi-factor model for each emerging market.

Country        Cross-validation bandwidth
Hong Kong      0.5
Malaysia       0.4
Singapore      0.5
Thailand       0.5
Philippines    0.5
India          0.5
Indonesia      0.5
S. Korea       0.5
Taiwan         0.5
China          0.3
Argentina      0.4
(Continued)
Table 118A.2: (Continued)

Country    Cross-validation bandwidth
Brazil     0.5
Mexico     0.5
Turkey     0.2
USA        0.4
Japan      0.4
Notes:
• Bandwidths are expressed as multiples of the corresponding sample standard deviations.
• Bandwidths based on the cross-validation algorithm are smaller than those used in producing the adjusted R2 in Figures 118A.1–118A.3.
• For ease of comparison, we use a larger bandwidth to obtain more conservative results. If cross-validation (CV) bandwidths were used, we expect the difference between non-parametric and parametric adjusted R2 measures would be even larger. Results with CV bandwidths are available upon request.
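As an illustration of how cross-validation bandwidths of this kind can be chosen, the following is a generic leave-one-out procedure for a Nadaraya–Watson kernel estimator, searching over bandwidths expressed as multiples of the sample standard deviation. This is a textbook sketch, not the chapter's exact algorithm, and the candidate grid is an assumption:

```python
import numpy as np

def loocv_score(x, y, h):
    """Leave-one-out CV error of a Nadaraya-Watson (Gaussian-kernel) fit."""
    n = len(x)
    err = 0.0
    for i in range(n):
        mask = np.arange(n) != i                       # drop observation i
        w = np.exp(-0.5 * ((x[mask] - x[i]) / h) ** 2) # kernel weights
        err += (y[i] - np.sum(w * y[mask]) / np.sum(w)) ** 2
    return err / n

def cv_bandwidth(x, y, multiples=(0.2, 0.3, 0.4, 0.5)):
    """Pick the bandwidth, as a multiple of std(x), minimizing LOOCV error."""
    s = np.std(x, ddof=1)
    return min(multiples, key=lambda m: loocv_score(x, y, m * s))
```

Reporting the chosen multiple (0.2–0.5 in Table 118A.2) rather than the raw bandwidth makes the selection comparable across markets with different return volatilities.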
Chapter 119
Algorithmic Analyst (ALAN) — An Application for Artificial Intelligence Content as a Service

Ted Hong, Daniel Lee and Wenching Wang

Contents
119.1 Introduction ........................... 4076
119.2 Analysis Gap and Language Barrier ...... 4077
119.3 Robo-Adviser and Robo-Journalism ....... 4078
119.4 ALAN: Algorithmic Analyst .............. 4078
119.5 Demystifying ALAN ...................... 4078
119.6 Multi-Factor Risk Model ................ 4079
119.7 Hybrid Natural Language Generation ..... 4082
119.8 Empirical Study ........................ 4082
119.9 Conclusions ............................ 4083
119.10 Next Steps ............................ 4084

Ted Hong, Beyondbond, Inc. e-mail: [email protected]
Daniel Lee, Beyondbond, Inc. e-mail: [email protected]
Wenching Wang, Beyondbond, Inc. e-mail: [email protected]
Bibliography .................................. 4085
Appendix 119A ALAN Mobile App Sample Screenshots ... 4086
Appendix 119B ALAN Confidence Score Sample Website Screenshot ... 4086
Abstract
This chapter presents Algorithmic Analyst (ALAN), an application that implements statistics and artificial intelligence methods with natural language generation to publish multimedia financial reports in Chinese and English. ALAN is part of a long-term project to develop an Artificial Intelligence Content as a Service (AICaaS) platform. ALAN gathers global capital market data, performs algorithm-driven big data analysis, and makes market forecasts. ALAN uses a multi-factor risk model to identify equity risk factors and ranks stocks based on a set of over 150 financial market variables. For each instrument analyzed, ALAN computes and produces narrative metadata describing its historical trends, forecast results, and any causal relationships with global macroeconomic variables. ALAN generates English and Chinese text commentaries in html and pdf formats, audio in mp3 format, and video in mp4 format for the US and Taiwanese equity markets on a daily basis.

Keywords: Natural language generation (NLG) • Multi-factor risk (MFR) model • AI content as a service (AICaaS) • ARIMA • Generalized autoregressive conditional heteroscedasticity (GARCH) • Machine learning • Deep learning • RNN • LSTM • Factor attributes • Scoring system • Global investing.
119.1 Introduction

This chapter describes Algorithmic Analyst (ALAN), the automated technology behind a commercial mobile product that analyzes a universe of global investment instruments using dynamically changing mathematical models and communicates the analysis results and forecasts to recipients via machine-generated narratives in their native languages. ALAN is a component of an artificial intelligence content as a service (AICaaS) platform in which artificial intelligence (AI) algorithms create and deliver customized content to a targeted audience.

As a financial application, ALAN gathers global macroeconomic and microeconomic data, performs data analysis, produces narratives in multiple languages using natural language generation, and distributes them digitally. ALAN combines machine learning with a multi-factor risk (MFR) model to identify a number of risk factors and their associated subfactor attributes from a rich set of financial market variables. Four major factors are employed in the model: valuation, quality, technical, and macro factors. The factors are
used to forecast and to rank stocks. Self-adjusting to the most recent data, ALAN computes near-term price movement forecasts and generates narrative texts, along with associated audio and video, to describe past trends and explain the forecast results. ALAN-generated multimedia outputs are delivered in the audience's native language. The entire process is automated.
119.2 Analysis Gap and Language Barrier

Recent years have seen two major trends in the investment world: greater participation of do-it-yourself (DIY) investors, and global investing. While plenty of investors continue to rely on and delegate investment management to professional financial institutions, an increasing number of investors are choosing to manage their funds themselves and make their own investment decisions. As the pace of globalization accelerates, more investors are venturing outside of their own countries to search for investment opportunities around the world. These two trends, in turn, create two corresponding obstacles: an analysis gap and a language barrier.

In the Information Age, there is an overabundance of news and financial information supplied by numerous exchanges and financial portal sites such as IEX, Yahoo, and Google. With the greater availability of data, it is becoming increasingly difficult for DIY investors to extract knowledge from data and to distinguish useful information from noise. Too much information unaccompanied by proper analysis can be an obstacle to sound investment decisions. Because knowledge and speed contribute to a large portion of trading profits, financial institutions devote resources to developing and profiting from all forms of time-sensitive financial analysis. Consequently, DIY investors without the resources to conduct such analysis are at a trading disadvantage.

Furthermore, as investors expand their horizons and search for international investment opportunities, they frequently find that the corresponding financial data and analysis are available only in the language of the issuer's country. Even when English is used, as is the case for many large issues, this creates problems for non-English-speaking investors. To overcome this language barrier, the first remedial action is usually to employ a machine translation tool such as Google Translate.
Unfortunately, the current capability of such translation technology is still lacking, and it does not work well for complex topics such as finance and investment analysis. Bad, incomplete, or incorrect translations contribute to errors in decision making and potentially severe negative financial consequences.
119.3 Robo-Adviser and Robo-Journalism

For the benefit of DIY investors, robo-advisers such as Wealthfront and Betterment provide professional financial advice online with moderate to minimal human interaction and intervention. These platforms have been gaining popularity among small retail investors thanks to their perceived ease of use and generally lower fees. Robo-journalism attempts to overcome the language barrier by producing narratives directly in the reader's native language without the need for translation. Robo-journalism is an application from the AI subfield of natural language generation (NLG), which investigates how to create narratives using algorithms. Real-world examples of NLG include baseball game coverage articles from the Associated Press and quarterly corporate earnings reports from Automated Insights.

119.4 ALAN: Algorithmic Analyst

ALAN contributes to the DIY investment community by providing professional research and analysis in order to narrow the analysis gap, and to the global investor community by communicating directly with international investors in their native languages in order to overcome the language barrier. ALAN expands the NLG field by using a novel hybrid approach to generate narratives in multiple languages simultaneously.

119.5 Demystifying ALAN

ALAN is an application that integrates and automates the processes of collecting global financial information, performing statistical analysis, executing machine-learning algorithms, and producing reports in text, audio, and video formats in multiple languages on demand. ALAN incorporates techniques from multiple disciplines including finance, statistics, econometrics, machine learning, and natural language generation. The three main components of ALAN are the Historical Data Collector (HDC), the Algorithmic Robo-Advisor (ARA), and the Robo Market Commentator (RMC) engines.
The Historical Data Collector (HDC) engine continuously collects and updates a wide range of raw financial time series data. It includes global macroeconomic variables, market indices, and individual company level information such as stock quotes, trading volumes, periodic financial reports, and news, along with corresponding industry sector indices data. The engine
Figure 119.1: ALAN architecture workflow. The Historical Data Collector (HDC) saves raw data to a database; the Algorithmic Robo-Advisor (ARA) applies algorithms to produce analysis results; and the Robo Market Commentator (RMC) turns those results into NLG outputs distributed as text, audio, and video.
scrubs the collected raw data for completeness, a step that is crucial and frequently challenging.

The Algorithmic Robo-Advisor (ARA) engine processes the collected data and forecasts equity prices, financial ratios, and global macroeconomic variables to be used to rank stocks. The ARA engine uses an extension of a multi-factor risk (MFR) model as the foundation for its stock ranking algorithm. It combines traditional econometric techniques and time-series analysis with AI machine learning methods to calculate four risk factors associated with stock performance and to generate relative rankings for stock trading and portfolio optimization.

The Robo Market Commentator (RMC) engine uses a scripted, template-based hybrid NLG system to generate narrative text reports in html and pdf formats first, and then produces corresponding multimedia files in mp3 and mp4 formats. Figure 119.1 divides ALAN's architecture into these three analytical engines.

119.6 Multi-Factor Risk Model

Multi-factor risk models have been used in finance since the 1970s as a variation of the Arbitrage Pricing Theory (APT) model that attempts to explain the historical returns on stocks and bonds. APT seeks to measure various market and macroeconomic factors without identifying them specifically. In 1993, Fama and French published a model containing three factors: an overall market factor, a factor related to firm size, and a factor related to book-to-market equity. Then in 1997, Carhart added an additional momentum factor
to make the four-factor model a starting point for many modern models. Quality and low volatility are two other factors frequently added to expand or enhance such models. Multi-factor risk models are now commonly used in the industry: mutual fund institutions such as Fidelity and Vanguard use factor-based models to market their ETF products, and asset management firms use factor-based risk models to allocate portfolio holdings and perform return attributions.

ALAN uses a multi-factor risk model as the analytical foundation to measure and rank stock performance. ALAN identifies fundamental quality, fundamental valuation, technical momentum, and global-macro as the four major risk factors. A company's fundamental information derived from financial statements, earnings reports, and corporate news is used to gauge its long-term performance, which ALAN defines as the fundamental valuation risk factor, and its sustainability, which ALAN defines as the fundamental quality risk factor. Historical price performance and trading liquidity data are used to measure short-term market momentum and sentiment, which ALAN defines as the technical momentum risk factor. The global-macro risk factor is derived from market data such as interest rates, currencies, commodities, and other macroeconomic variables that affect financial markets.

For each of the four risk factors in ALAN's MFR model, lower-level variables are identified as its subfactor attributes. Table 119.1 lists some of the subfactor attributes for each of the four factors. Many subfactor attributes are directly observed, while others must be calculated from the collected and scrubbed financial data. They are forecast using a variety of algorithms from econometrics, time-series analysis, machine learning, and other AI techniques; for technical sentiment analysis, for example, ALAN applies text mining algorithms. The resulting subfactor attribute forecasts are used by the MFR model.
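Combining the four factor scores into a relative ranking can be sketched as follows. This is a hypothetical illustration: the equal weights, the [0, 1] score scale, and the function names are assumptions, since the chapter does not disclose ALAN's actual scoring scheme:

```python
def confidence_score(valuation, quality, technical, macro,
                     weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted average of four factor scores, each assumed scaled to [0, 1]."""
    factors = (valuation, quality, technical, macro)
    return sum(w * f for w, f in zip(weights, factors))

def rank_stocks(scores):
    """Sort tickers by confidence score, best first."""
    return sorted(scores, key=scores.get, reverse=True)
```

A system like ALAN would additionally re-estimate the weights per stock as new data arrive, rather than hold them fixed.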
ALAN's flexibility allows different sets of subfactor attributes to determine the risk factors for different stocks.

Table 119.1: Factors and their subfactor attributes.

Factor                  Subfactor attributes
Fundamental Valuation   Earnings per share (EPS)
Fundamental Quality     Return on equity (ROE), debt equity ratio (D/E)
Technical Momentum      Moving average convergence divergence (MACD), relative strength index (RSI)
Global-Macro            Yield curve, inflation outlook, volatility
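Two of the technical momentum subfactor attributes in Table 119.1 have conventional textbook definitions, sketched below; the chapter does not specify ALAN's exact parameterization, so the standard 12/26-day MACD and 14-period RSI settings are assumptions:

```python
import numpy as np

def ema(prices, span):
    """Exponential moving average with smoothing factor 2/(span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return np.array(out)

def macd(prices, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA of the price series."""
    return ema(prices, fast) - ema(prices, slow)

def rsi(prices, period=14):
    """Relative strength index over the last `period` price changes."""
    deltas = np.diff(prices)[-period:]
    gains = deltas[deltas > 0].sum()
    losses = -deltas[deltas < 0].sum()
    if losses == 0:
        return 100.0  # all moves were gains
    return 100.0 - 100.0 / (1.0 + gains / losses)
```

In a pipeline like ALAN's, such per-stock attribute series would presumably be standardized before entering the factor model.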
On top of this multi-factor risk model, ALAN adds a layer of machine learning methods. The combination is intended to capture market trends and patterns better than traditional technical indicators and ARIMA-GARCH-type time-series analysis alone. Self-adjusting to the most recently collected data, an automated process runs periodically or on demand to identify the most predictive set of subfactor attributes and factors for each stock. ALAN then makes stock price forecasts and ranks the stocks based on their corresponding confidence scores. The ranking and the individual stock risk model calculation results are subsequently used by the RMC engine to compose the narratives. ALAN feeds some of the analysis results back into the model to create an improvement feedback loop. Additionally, several market timing trading strategies are incorporated to generate back-test reports and to enhance the risk model. While this entire process is currently performed on a daily basis, ALAN is capable of running it more frequently as long as an appropriate and timely set of data is provided. Figure 119.2 displays ALAN's four risk factors: valuation, quality, technical, and macro. For each stock, the four risk factors are used to measure
Figure 119.2: ALAN stock ranking analytical workflow. For each ticker, a stock-picking loop draws on fundamentals, momentum, and global macro inputs; analysis, scoring, and forecast results are saved to a database and feed report and commentary generation in English and Chinese, market sentiment data and reports, global macro research, industry analysis, and ranking and recommendation.
and forecast its profitability, long-term sustainability, price momentum, correlation with its related industry sector, and causal relationship with global macroeconomic variables.

119.7 Hybrid Natural Language Generation

Natural language generation (NLG) is an AI subfield that studies how to use algorithms to generate sentences and paragraphs. Generally, NLG follows either a knowledge-based approach, which composes text from grammatical and linguistic information embedded within the system, or a template-based approach, which composes text from predefined output sentences and paragraph structures.

A high-quality knowledge-based NLG approach requires a theoretically driven linguistic model for each language and is difficult to develop and expensive to implement. Given the current state of technology, a knowledge-based NLG approach cannot consistently create human-readable reports on a complex topic such as investment analysis. A template-based NLG approach lacks a theoretical linguistic model and uses a predefined template engine to incorporate grammar rules. Generally, it is easier and faster to develop a usable template-based NLG system because it has a lower initial infrastructure requirement.

ALAN utilizes an enhanced template-based NLG approach. The templates in ALAN are loosely structured and are programmable and modifiable on the fly. Consequently, the expressive narrative range is wider, and ALAN can generate multi-lingual outputs simultaneously using multiple templates. Unlike most other template-based systems, access to ALAN's template engine is based on an open standard, so a new user may more easily modify or enhance the templates to fit his or her needs. With this flexibility and lower prior NLG knowledge requirement, ALAN makes it user-friendly to experiment with a wider range of narrative variations in multiple languages.
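A minimal template-based generator in this spirit might look like the following. This is a hypothetical illustration, not ALAN's template engine: the template strings, field names, and direction lexicon are all invented for the example:

```python
# One template per language; the same data dictionary fills both.
TEMPLATES = {
    "en": "{name} closed at {price:.2f}, {direction} {change:.1f}% on the day.",
    "zh": "{name}收盘价为{price:.2f}，当日{direction}{change:.1f}%。",
}

# Per-language lexicon for the up/down direction word.
DIRECTION = {
    "en": {True: "up", False: "down"},
    "zh": {True: "上涨", False: "下跌"},
}

def render(lang, name, price, change):
    """Fill one template; a real system adds grammar rules and variation."""
    return TEMPLATES[lang].format(
        name=name,
        price=price,
        direction=DIRECTION[lang][change >= 0],
        change=abs(change),
    )
```

Because every language shares one structured data record, adding a language means adding a template and a lexicon, not re-deriving the analysis, which is the practical appeal of the template-based route described above.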
119.8 Empirical Study

Currently, ALAN generates English and Chinese narrative text in html and pdf formats, audio in mp3 format, and video in mp4 format for US and Taiwanese equities on a daily basis. The number of companies covered in the US exceeds 7000, and each company record contains the historical price performance, financial statements,
Figure 119.3: Back-test performance report for 2017 stock picking.

Date        Win/Loss Ratio
03/31/17    53%
04/30/17    51%
05/31/17    66%
06/30/17    71%
07/31/17    61%
08/31/17    50%
09/30/17    49%
10/31/17    55%
11/30/17    59%
12/31/17    60%
01/31/18    65%
and relevant industry sector and global macroeconomic information for at least 10 years. For Taiwan, the number of stocks covered is almost 1600.

ALAN uses a deep learning technique, recurrent neural networks (RNN) with long short-term memory (LSTM), for technical analysis and compares the forecast results with those obtained from ARIMA-GARCH forecasting models. The RNN+LSTM results show that better performance is obtained from datasets with higher frequency or greater liquidity. While ALAN's deep learning algorithm has at times produced better results than ARIMA-GARCH models, those results are not always consistent. Therefore, ALAN presently uses a hybrid forecasting model that combines both statistical and deep learning algorithms.

In order to test the effectiveness of ALAN's stock ranking algorithm, a backtest was performed for 2017. On a daily basis, ALAN analyzed and ranked the S&P 500 stocks based on their corresponding factor confidence scores. A long portfolio was composed of those stocks whose factors obtained 4 or 5 stars out of 5, with 5 being the best, and its performance was tracked. Figure 119.3 summarizes the back-test results. The Win/Loss ratio is 65% over the test period, which is better than the 50% expected from a random walk.

119.9 Conclusions

ALAN is an automated financial application designed to assist global retail investors. ALAN begins with a Historical Data Collector engine that gathers and scrubs global financial data including equity prices, financial statements,
T. Hong, D. Lee & W. Wang
currencies, interest rates, commodities, market indices, and macroeconomic indicators from a variety of data sources and stores them in a master database. ALAN then runs its Algorithmic Robo-Advisor engine, which digests the collected time series data to forecast equity prices, financial ratios, and select global macroeconomic variables, and subsequently utilizes MFR and AI to calculate four risk factors associated with stock performance and generate their relative rankings for stock trading and portfolio optimization. Finally, ALAN runs the Robo Market Commentator engine to create individual stock reports in multiple languages describing past historical performance, associated correlations and causal relationships with global macroeconomic variables, and future forecasts from the ALAN ranking model. ALAN then creates additional multimedia outputs and distributes them online. ALAN contributes to the DIY and global investing trends and expands the NLG field by generating narratives in multiple languages with a simplified hybrid approach.

119.10 Next Steps

A future paper will explore various trading strategies based on the ALAN stock ranking algorithm and include a more comprehensive description of back-test results with detailed profit and loss calculations. As Figure 119.4 shows, ALAN will expand its product range to include futures, ETFs, and fixed income securities and broaden its geographical coverage to include China and Japan.
Figure 119.4: Coverage expansion. The roadmap spans two dimensions: geography (US*, Taiwan*, China, Japan, HK) and product range (Equity*, ETF, Bond, Futures), where * marks completed coverage.
Appendix 119A ALAN Mobile App Sample Screenshots
Appendix 119B ALAN Confidence Score Sample Website Screenshot
Chapter 120
Survival Analysis: Theory and Application in Finance

Feng Gao
Rutgers University
email: [email protected]

Xiaomin He
Taiho Oncology, Inc.
email: [email protected]

Contents

120.1 Introduction
120.2 Basic Survival Functions
120.3 Non-Parametric Methods
  120.3.1 Non-parametric estimators
  120.3.2 Non-parametric testing
120.4 Parametric Methods
  120.4.1 Parametric regression models
  120.4.2 Maximum likelihood estimation
  120.4.3 Parameter estimation and testing
120.5 Semi-Parametric Methods
  120.5.1 Cox proportional hazards model
  120.5.2 Partial likelihood estimation
  120.5.3 Tied data
  120.5.4 Time-dependent covariates
120.6 Other Applications
  120.6.1 Discrete time data
  120.6.2 Competing risks
  120.6.3 Bayesian estimation and testing
Bibliography
Abstract

This chapter outlines some commonly used statistical methods for studying the occurrence and timing of events, i.e., survival analysis, which is also called duration analysis or transition analysis in econometrics. Statistical methods for survival data fall into non-parametric, parametric, and semi-parametric methods. While some non-parametric estimators (e.g., the Kaplan–Meier estimator and the life-table estimator) estimate the survivor function, others (e.g., the Nelson–Aalen estimator) estimate the cumulative hazard function. The most commonly used non-parametric test for comparing survivor functions is the logrank test. Parametric models, such as the exponential model, the Weibull model, and the generalized gamma model, rest on different distributional assumptions about survival time. The semi-parametric regression model, the Cox proportional hazards (PH) model, is estimated by the method of partial likelihood and requires no distributional assumption about survival time. Applications to discrete time data and the competing risks model are also introduced.

Keywords: Survival analysis • Non-parametric methods • Parametric methods • Semi-parametric methods • Discrete time data • Competing risks.
120.1 Introduction

Survival analysis is a class of statistical methods for studying the occurrence and timing of events. The terminology originated in medical research, in which an event often refers to death, but the event can also be another endpoint in clinical trials, such as heart failure or progression-free survival. The methods of survival analysis have been applied to other research areas, usually under adapted names; for example, survival analysis is also called duration analysis or transition analysis in finance. Survival data are longitudinal in nature because the variable of interest is the length of time until the occurrence of an event. The occurrence of events is typically described with a binary variable. The event can be a one-time event, such as the death of a patient or a firm's bankruptcy. Events can also happen multiple times, such as the relapse of a disease (e.g., leukemia or heart attack), repeated arrests, or multiple spells of unemployment. A significant change in a quantitative value can also be defined as an event, such as a dramatic drop in stock price. A distinct characteristic of survival data is the existence of censoring. For example, due to ethical concerns and/or budget considerations, a clinical trial will be conducted over a finite period of time. Individuals enter
the clinical trial voluntarily at different times, and the length of their stay varies. An adverse event for an individual, such as death, may or may not be observed during the trial. In addition, some patients might fail to follow up before the trial ends, and thus it is impossible to ascertain their time to event. When subjects drop out of the sample before an event happens, such cases are called right censoring. Such censoring can also be present in data such as employment histories. Take, for example, a study using employment registration data to determine the length of unemployment. If no more records can be found for an individual, it is hard to tell whether the person is "alive" (still unemployed), or has simply decided to stop updating his/her employment status. Assume that there are n independent individuals. For each individual i, the data consist of two parts: the observed time, denoted by Y_i = min(T_i, C_i), and an indicator variable for the occurrence of an event, denoted by

δ_i = 1 if T_i ≤ C_i, and δ_i = 0 otherwise.

Thus, the observed time is either the survival time T_i or the censored time C_i. The censored time C_i can be either non-random or random, but is independent of T_i. For example, when all the C_i's are equal to a constant, the data have Type I censoring. Type II censoring occurs when all subjects remaining after a predetermined number of events are censored. In both of these cases, C_i is non-random. However, random censoring might occur when observations are terminated for reasons that are not under the control of the investigator. In addition to right censoring, left censoring occurs if the upper bound of the survival time of the censored observation is known, and interval censoring occurs if both the upper and lower bounds of the survival time of the censored observation are known.
Kalbfleisch and Prentice (1980) and Allison (2010) provide more details on left censoring and interval censoring; our discussion in this chapter focuses on right censoring only. Another important characteristic of survival data is that the time of events cannot be negative. For simplicity, we limit survival time to be strictly positive, which makes it possible to take the log transformation of time, and all individuals (100%) are at risk of an event at the beginning.
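The censoring notation can be made concrete with a small simulation. The setup below is our own illustrative example, not from the chapter: latent exponential survival times T_i are subject to Type I censoring at a fixed time c, yielding the observed pairs (Y_i, δ_i):

```python
import random

random.seed(0)

c = 2.0            # fixed Type I censoring time C_i = c for everyone
n = 1000
obs = []
for _ in range(n):
    t = random.expovariate(1.0)   # latent survival time T_i, rate 1
    y = min(t, c)                 # observed time Y_i = min(T_i, C_i)
    delta = 1 if t <= c else 0    # event indicator delta_i
    obs.append((y, delta))

# Every observed time is bounded by c, and censored records sit exactly at c.
assert all(y <= c for y, _ in obs)
assert all(y == c for y, d in obs if d == 0)
# With rate 1 and c = 2, roughly exp(-2) ~ 13.5% of subjects are censored.
frac_censored = sum(1 for _, d in obs if d == 0) / n
assert 0.05 < frac_censored < 0.25
```

Replacing the constant c with subject-specific random draws would give random censoring instead of Type I censoring.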
Applications of survival analysis in finance are mostly focused on predicting the risk of firm failure. For example, Shumway (2001) uses a discrete-time hazard model to forecast bankruptcy. Lee and Kim (2014) use a mixed Cox proportional hazards model to examine the dynamic changes in the probability of survival for hedge funds. Iwasaki (2014) also uses the Cox proportional hazards model to examine factors affecting the survival of Russian industrial firms around the global financial crisis. Bhattacharjee, Higson, Holly and Kattuman (2009) use both the Cox proportional hazards model and the logit model to evaluate the impact of macroeconomic instability on two forms of exit for UK firms, i.e., bankruptcies or acquisitions. Agarwal et al. (2005) use the Cox proportional hazards model to investigate the impact of bankruptcy exemption levels on the decision of small business owners to file for bankruptcy. Chen, Guo and Lin (2010) use a Bayesian inference procedure to analyze the motivation behind a firm's decision to cancel its initial public offering (IPO). Researchers have also used survival analysis in predicting negative events associated with bonds and personal debt. Moeller and Molina (2003) use the Cox proportional hazards model to study the changes in the probability of default of high-yield bonds. Banasik, Crook and Thomas (1999) and Stepanova and Thomas (2002) both use the Cox proportional hazards model and apply it to personal loan data for credit-scoring models. Banasik et al. (1999) also apply the competing risks approach to explain two potential outcomes for personal loans, i.e., early payoff or default. Bellotti and Crook (2008) use the Cox proportional hazards model to model default on credit card accounts. In addition to survival models, logistic models have also been used to predict negative events such as bank failure (e.g., Audrino, Kostrov and Ortega, 2019).
120.2 Basic Survival Functions

The standard approaches to survival analysis are probabilistic or stochastic. Denote by T the survival time, and assume that it is a continuous random variable (the case when survival time is treated as discrete is discussed later). The cumulative distribution function of T is

F(t) = Pr(T ≤ t) = ∫₀ᵗ f(u) du.

Denote the survivor function by S(t), i.e.,

S(t) = Pr(T > t) = 1 − F(t).
By definition, S(0) = 1, S(∞) = 0, and S(t) does not increase in t (it usually decreases in t). The survivor function measures the probability of surviving to time t for a subject alive at the time origin (t = 0), and this probability decreases in time. The most commonly used methods to estimate the survivor function are the Kaplan–Meier estimator and the life-table estimator, which we discuss later in the chapter. The probability density function (pdf) of T is defined as

f(t) = dF(t)/dt = −dS(t)/dt.

Accordingly, the hazard function is defined as

λ(t) = lim_{Δt→0} Pr(t ≤ T < t + Δt | T ≥ t)/Δt.

Note that the hazard function is not a probability; instead, it is the instantaneous risk that an event will happen at time t. The hazard function is often interpreted as the expected number of events in a one-unit interval of time, so it is also called the hazard rate. The reciprocal of the hazard rate is sometimes called the Mills ratio (Mills, 1926), which gives the expected length of time until an event happens. The relationship between the survivor function, the pdf, and the hazard function is

λ(t) = f(t)/S(t).

The formula can also be expressed as

λ(t) = −(d/dt) log S(t),

or

S(t) = exp(−∫₀ᵗ λ(u) du).

Let the cumulative hazard function be denoted by Λ(t) = ∫₀ᵗ λ(u) du; the survivor function and the pdf can then be rewritten as

S(t) = exp[−Λ(t)], and f(t) = λ(t) exp[−Λ(t)].

The most commonly used estimator of the cumulative hazard function is the Nelson–Aalen estimator, which is an alternative to the Kaplan–Meier estimator.
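These identities can be checked numerically. The sketch below uses a Weibull survivor function S(t) = exp(−t^k) with shape k; this is an illustrative choice of distribution, not a model estimated in the chapter:

```python
import math

# Weibull example with shape k and scale 1: S(t) = exp(-t^k).
k = 1.5

def S(t):        # survivor function
    return math.exp(-t ** k)

def f(t):        # pdf: f(t) = -dS/dt = k t^(k-1) exp(-t^k)
    return k * t ** (k - 1) * math.exp(-t ** k)

def hazard(t):   # lambda(t) = f(t) / S(t) = k t^(k-1)
    return f(t) / S(t)

def Lambda(t):   # cumulative hazard: integral of lambda over (0, t] = t^k
    return t ** k

t = 0.8
assert math.isclose(hazard(t), k * t ** (k - 1))             # lambda = f/S
assert math.isclose(S(t), math.exp(-Lambda(t)))              # S = exp(-Lambda)
assert math.isclose(f(t), hazard(t) * math.exp(-Lambda(t)))  # f = lambda exp(-Lambda)
```

With k = 1 this reduces to the exponential distribution, whose hazard rate is constant.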
120.3 Non-Parametric Methods

120.3.1 Non-parametric estimators

Some non-parametric estimators (i.e., the Kaplan–Meier estimator and the life-table estimator) estimate the survivor function, while others (i.e., the Nelson–Aalen estimator) estimate the cumulative hazard function.

120.3.1.1 Kaplan–Meier estimator

The Kaplan–Meier estimator, also known as the product-limit estimator, was independently developed and jointly published by Edward L. Kaplan and Paul Meier (Kaplan and Meier, 1958). For a sample of n individuals, k events (e.g., deaths) are observed (k ≤ n). Assume these k events occur at distinct times, 0 < t_1 < t_2 < · · · < t_k. Let d_i be the number of events and n_i the number of subjects at risk (i.e., those who have not yet had an event and have not been censored) at time t_i (1 ≤ i ≤ k). The Kaplan–Meier estimator for the survivor function is defined as

Ŝ(t) = ∏_{i: t_i ≤ t} (n_i − d_i)/n_i.
For t < t_1, Ŝ(t) is defined as 1. For t > t_k, Ŝ(t) = 0 if there are no censored times greater than t_k, and Ŝ(t) is undefined if the censored times are greater than t_k. The Kaplan–Meier estimator changes value only at the observed survival times. The standard error of the Kaplan–Meier estimator is obtained by Greenwood's formula,
var[Ŝ(t)] = [Ŝ(t)]² Σ_{i: t_i ≤ t} d_i / [n_i(n_i − d_i)].
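The product-limit recursion and Greenwood's formula can be sketched in a few lines of pure Python. The data below are a hypothetical toy sample, not the chapter's CEO data:

```python
import math

# Toy right-censored sample: (time, delta) pairs with delta = 1 for an
# observed event and delta = 0 for a censored observation.
data = [(1, 1), (2, 1), (2, 0), (3, 1), (5, 0), (6, 1), (7, 0)]

def kaplan_meier(data):
    """Return [(t_i, S_hat(t_i), Greenwood variance)] at each event time."""
    out, s, gw = [], 1.0, 0.0
    for t in sorted({t for t, d in data if d == 1}):
        n_i = sum(1 for y, _ in data if y >= t)             # at risk at t_i
        d_i = sum(1 for y, d in data if y == t and d == 1)  # events at t_i
        s *= (n_i - d_i) / n_i                              # product-limit step
        gw += d_i / (n_i * (n_i - d_i))                     # Greenwood sum
        out.append((t, s, s * s * gw))
    return out

est = kaplan_meier(data)
assert math.isclose(est[0][1], 6 / 7)     # t=1: n=7, d=1
assert math.isclose(est[1][1], 5 / 7)     # t=2: n=6, d=1 (the tie at 2 is censored)
assert math.isclose(est[-1][1], 15 / 56)  # t=6: (6/7)(5/6)(3/4)(1/2)
```

Note that both the survival product and the Greenwood sum can be updated in a single pass over the ordered event times.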
The (1 − α)100% confidence limits of Ŝ(t) can accordingly be constructed as Ŝ(t) ± z_{α/2} · √(var[Ŝ(t)]). Note that this confidence interval is only a pointwise limit at the specified survival time, not a band over the entire lifetime process. To illustrate the application of the Kaplan–Meier estimator, we use data tracking the employment outcomes of Chief Executive Officers (CEOs) based on that used in Dai, Gao, Lisic and Zhang (2018), who examine the future employment outcomes of CEOs after leaving their previous employer. We focus on factors affecting the career outcomes of CEOs within 5 years of leaving the previous employer. The data is thus right censored (Type I
censoring) and the maximum of newyear is 2016, which is when the data collection stopped. The variables are defined as follows:

turnyear: the year a CEO departs from the previous employer
newyear: the year a CEO finds an executive position in a new employer within five years of leaving the previous employer, or the censored value that equals turnyear + 5 or 2016, whichever comes earlier
findjob5: 1 if a CEO finds a job within 5 years, and 0 otherwise
gap5: newyear − turnyear + 0.5
The following Table 120.1 shows the first 10 observations in the example. Note that a constant, 0.5, is added to obtain the variable gap5, which deals with the problem of the data being available on an annual basis only. For executives who find a new job in the same year as the year of departure from the previous employer (i.e., newyear = turnyear), taking the difference of newyear and turnyear without making adjustments would result in a value of zero in survival time. This is usually not a problem when the exact dates of employment are available, as an executive does not usually hold two jobs on the same date. We effectively solve the issue of zero survival time by defining the gap between employment as gap5 = newyear − turnyear + 0.5. Thus, the variable gap5 takes the value 0.5 for a CEO finding a job in the same year as the year of departing from the previous employer, and the censored value of 5.5 if a CEO does not find an executive job within 5 years (i.e., findjob5 = 0).

Table 120.1: First 10 observations in the CEO career path sample.

Obs   turnyear   newyear   findjob5   gap5
1     1996       1996      1          0.5
2     2005       2009      1          4.5
3     2010       2015      0          5.5
4     2002       2007      0          5.5
5     1998       1999      1          1.5
6     2005       2006      1          1.5
7     1999       2001      1          2.5
8     2003       2004      1          1.5
9     2010       2015      0          5.5
10    2007       2012      1          5.5

We use the following SAS code to obtain the Kaplan–Meier estimator and plot the product-limit survival estimates of the career path of departing CEOs in Figure 120.1. The plot of the survival distribution function estimates shows a step function because the employment outcomes are measured in annual frequencies.

ODS GRAPHICS ON;
PROC LIFETEST DATA=career PLOTS=S(CL);
  TIME gap5*findjob5(0);
RUN;
ODS GRAPHICS OFF;

Figure 120.1: Kaplan–Meier estimate of unemployment duration.

We also report an excerpt of the SAS output in Table 120.2. The survival rate when gap5 = 0.5 is 95%, which means that 5% of CEOs found a new executive position in the same year that they left their previous employer. By the end of the first year after their departure, 68% survived, which means another 26% of executives found a new position in the first year after leaving their previous employer. There is no censoring observed in the career data until the end of the fifth year, when about 39% of the sample is censored. The censored data includes CEOs who do not find a job by the end of the fifth year or by 2016, whichever comes sooner.
Table 120.2: SAS output excerpt of the Kaplan–Meier estimator.

Duration of          Failed   Number   SDF        SDF std.   SDF 95%    SDF 95%
unemployment (gap5)           left     estimate   error      lower CL   upper CL
0                    0        1915     1          0          1          1
0.5                  97       1818     0.9493     0.0050     0.9385     0.9583
1.5                  507      1311     0.6846     0.0106     0.6633     0.7049
2.5                  305      1006     0.5253     0.0114     0.5027     0.5474
3.5                  149      857      0.4475     0.0114     0.4251     0.4696
4.5                  64       793      0.4141     0.0113     0.3920     0.4361
5.5                  40       753      0.3932     0.0112     0.3713     0.4150
120.3.1.2 Life-table estimator

The life-table estimator, also known as the actuarial-table estimator, can handle large samples of survival data in which survival times are grouped into intervals. The life-table method counts the actual number of events from the longitudinal data, so its algorithm is similar to the Kaplan–Meier estimator. Assume the survival times are grouped into k + 1 intervals (0, t_1), (t_1, t_2), . . . , (t_{k−1}, t_k), (t_k, ∞). Let d_i be the number of events, c_i the number of censored cases, and n_i the number of subjects at risk at the beginning of the time interval (t_i, t_{i+1}) (0 ≤ i ≤ k). Here, t_0 = 0 and t_{k+1} = ∞. For any j = 1, . . . , k, the life-table estimator for the survivor function is defined as

Ŝ(t_j) = ∏_{i=0}^{j−1} (n_i − c_i/2 − d_i)/(n_i − c_i/2).

The estimation of the variance of the life-table estimator similarly applies Greenwood's formula for the Kaplan–Meier estimator,

var[Ŝ(t_j)] = [Ŝ(t_j)]² Σ_{i=0}^{j−1} d_i / [(n_i − c_i/2 − d_i)(n_i − c_i/2)].

Although there is some loss of information from grouping time intervals, the life-table estimator provides an intuitive interpretation of the survival data when the sample size is large. We apply this method to the same data on the career paths of CEOs used in the previous section. The life-table method is potentially a better approach for estimating this data because the occurrence of events is already grouped into intervals measured in years. We use the following SAS code for this test, report the SAS output excerpt in Table 120.3, and plot the life-table survival curve in Figure 120.2.
Table 120.3: SAS output excerpt for the life-table estimator.

Lower   Upper   Midpoint   Failed   Censored   Effective   SDF        SDF std.   SDF 95%    SDF 95%
time    time    of                             size        estimate   error      lower CL   upper CL
0       1       0.5        97       0          1915        1          0          .          .
1       2       1.5        507      0          1818        0.9493     0.0050     0.9385     0.9583
2       3       2.5        305      0          1311        0.6846     0.0106     0.6633     0.7049
3       4       3.5        149      0          1006        0.5253     0.0114     0.5027     0.5474
4       5       4.5        64       0          857         0.4475     0.0114     0.4251     0.4696
5       6       5.5        40       753        416.5       0.4141     0.0113     0.3920     0.4361
6       .       .          0        0          0           0.3743     0.0118     0.3512     0.3974
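The survivor-function column of Table 120.3 can be reproduced from the failed and censored counts using the actuarial adjustment n_i' = n_i − c_i/2. The sketch below transcribes the counts from the table:

```python
import math

# Per-interval counts (d_i = failed, c_i = censored) from Table 120.3.
failed   = [97, 507, 305, 149, 64, 40, 0]
censored = [0, 0, 0, 0, 0, 753, 0]

n = 1915.0           # subjects at risk entering the first interval
s = 1.0
estimates = [s]      # S at the start of interval 0 is 1
for d, c in zip(failed, censored):
    n_eff = n - c / 2.0              # effective sample size n_i' = n_i - c_i/2
    if n_eff > 0:
        s *= (n_eff - d) / n_eff     # conditional survival through the interval
    estimates.append(s)              # S at the start of the next interval
    n -= d + c                       # at risk entering the next interval

# Matches the SDF column: 1, 0.9493, 0.6846, 0.5253, 0.4475, 0.4141, 0.3743.
expected = [1, 0.9493, 0.6846, 0.5253, 0.4475, 0.4141, 0.3743]
for est, exp in zip(estimates, expected):
    assert abs(est - exp) < 5e-4
```

The only difference from the Kaplan–Meier computation is the halving of the censored count, which treats censored subjects as at risk for half of their final interval.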
Figure 120.2: Life-table survival curve.

ODS GRAPHICS ON;
PROC LIFETEST DATA=career METHOD=LIFE PLOTS=S(CL);
  TIME gap5*findjob5(0);
RUN;
ODS GRAPHICS OFF;
Similar to the results from Table 120.2 using the Kaplan–Meier estimator, the number of failed observations is the highest in the second row. This suggests that CEOs are more likely to find another executive job in the year after leaving the previous employer than in any other year. Although the survival distribution function estimates are similar between the two estimators, the curve in Figure 120.2 appears smoother than the step function in Figure 120.1.

120.3.1.3 Nelson–Aalen estimator

The Nelson–Aalen estimator estimates the cumulative hazard function in the same fashion as the Kaplan–Meier estimator. It was initially proposed by Nelson (1972) and later mathematically formalized and justified by Aalen (1978).
Since S(t) = exp[−Λ(t)], the estimator for the cumulative hazard function is

Λ̂(t) = − log Ŝ(t) = − log ∏_{i: t_i ≤ t} (n_i − d_i)/n_i = − Σ_{i: t_i ≤ t} log[(n_i − d_i)/n_i].

Because log(1 + x) ≈ x for small x, the above equation can be approximated as

Λ̂(t) = Σ_{i: t_i ≤ t} d_i/n_i.

The variance of the Nelson–Aalen estimator is estimated by

var[Λ̂(t)] = Σ_{i: t_i ≤ t} d_i/n_i².
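A minimal sketch comparing the Nelson–Aalen sum with −log of the Kaplan–Meier estimate, on hypothetical toy data:

```python
import math

# Toy right-censored sample (illustrative values): (time, delta) pairs.
data = [(1, 1), (2, 1), (2, 0), (3, 1), (5, 0), (6, 1), (7, 0)]

na, km = 0.0, 1.0
for t in sorted({t for t, d in data if d == 1}):
    n_i = sum(1 for y, _ in data if y >= t)             # at risk at t
    d_i = sum(1 for y, d in data if y == t and d == 1)  # events at t
    na += d_i / n_i               # Nelson-Aalen increment d_i/n_i
    km *= (n_i - d_i) / n_i       # Kaplan-Meier step, for comparison

assert math.isclose(na, 1/7 + 1/6 + 1/4 + 1/2)
# Since log(1 - x) <= -x, the Nelson-Aalen sum never exceeds -log(KM).
assert na < -math.log(km)
```

The approximation is close only when each d_i/n_i is small; with heavy per-time event fractions, as in this tiny sample, the two estimates visibly diverge.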
The plot of the Nelson–Aalen estimator illustrates the hazard rate, and a comparison between multiple Nelson–Aalen estimators provides a visual examination of differences between hazard types. We will discuss this in a later section.

120.3.2 Non-parametric testing

120.3.2.1 Logrank test

The logrank test, also called the Mantel–Cox test, is the best known and most widely used test for differences in survivor functions. Assume there are two groups, e.g., a treatment group versus a placebo group, with different risk exposures. The question of interest is whether one group has a better survival rate than the other. The underlying null hypothesis is written as

H_0: S_1(t) = S_2(t) for all t.

We first plot the Kaplan–Meier estimators of the survivor function for each group. If the survivor function for one group is always higher than the survivor function for the other group, then we can infer intuitively that the first group might have a higher probability of survival than the second group. Assume there are k events observed in both groups. These events occur at either distinct times or the same time, t_1 ≤ t_2 ≤ · · · ≤ t_k. Let n_1i and n_2i be the number of subjects at risk at time t_i (1 ≤ i ≤ k) in each group, and
n_i = n_1i + n_2i. Assume k ≤ n, and let d_1i and d_2i be the number of observed events at time t_i (1 ≤ i ≤ k) in each group, where d_i = d_1i + d_2i. Under the null hypothesis, given the parameters n_i, n_1i and d_i, the number of observed events in the first group, D_1i, is a random variable following a hypergeometric distribution. It can be written as

Pr(D_1i = d_1i | n_i, n_1i, d_i) = C(d_i, d_1i) C(n_i − d_i, n_1i − d_1i) / C(n_i, n_1i),

where C(·, ·) denotes the binomial coefficient. Denoting the expected value by e_1i = E(D_1i | n_i, n_1i, d_i) and the variance by v_1i = var(D_1i | n_i, n_1i, d_i), we get

e_1i = d_i n_1i / n_i and v_1i = d_i (n_i − d_i) n_1i (n_i − n_1i) / [n_i² (n_i − 1)].

For group 1, Mantel and Haenszel (1959) proposed to sum the differences between d_1i and e_1i over all observed survival times,

Σ_{i=1}^k (d_1i − e_1i).

Its standardized form,

Σ_{i=1}^k (d_1i − e_1i) / √(Σ_{i=1}^k v_1i),

follows the standard normal distribution asymptotically. The logrank statistic developed by Peto and Peto (1972) is its square,

χ²_logrank = [Σ_{i=1}^k (d_1i − d_i n_1i/n_i)]² / Σ_{i=1}^k d_i(n_i − d_i)n_1i(n_i − n_1i)/[n_i²(n_i − 1)] ∼ χ²_1.

To illustrate the application of the logrank test, we use a sample of 232 Nasdaq firms whose stock price fell below $1 for 30 consecutive trading days in 2010. We track these firms to June 29, 2018 to check how many of them are delisted or merged by the end of the sample period.
The variables are defined as follows:

start: the last date of the 30 consecutive days when a Nasdaq firm's stock price remains below $1 (t = 0)
end: the last date of observation, either the date of delisting, merger, or June 29, 2018 for firms still trading on Nasdaq
dur: end − start + 1
status: 1 if the firm is delisted from Nasdaq or merged, and 0 otherwise
young: 1 if a firm's age when its stock price drops below $1 for 30 consecutive days (t = 0) is less than 10 years, and 0 otherwise
Similar to the career data in the previous section, a constant is added to the calculated date difference between start and end to avoid a time to event of zero. We use the number 1 here, but any reasonable positive number works. It is widely believed that younger firms are more fragile than established firms (Nicolò, 2015). Many do not last due to the lack of a history establishing their ability to meet the expectations of stakeholders, such as debtholders, customers, and investors. We thus use the categorical variable young to partition the sample and examine whether there is any difference in the survival functions between young and established firms. Because the variable young is derived at t = 0, it is a time-invariant variable. Table 120.4 tabulates 10 observations of the sample. Out of the 232 at-risk firms, 54% are delisted, 14% are merged, and the remaining 32% are still listed on Nasdaq as of the end of the sample period.

Table 120.4: First 10 observations in the firm data.

Obs   start       end         dur    status   young
1     25-Feb-10   31-Dec-10   309    0        1
2     15-Sep-10   31-Jan-14   1234   0        0
3     26-Feb-10   29-Jun-18   3045   0        0
4     16-Feb-10   29-Jun-18   3055   0        0
5     16-Feb-10   30-Nov-10   287    0        0
6     10-Mar-10   30-Jul-10   142    1        1
7     16-Feb-10   29-Jun-18   3055   0        0
8     25-Jun-10   30-Dec-11   553    1        1
9     20-Aug-10   29-Jun-18   2870   0        0
10    11-Mar-10   30-Apr-10   50     1        1

We use the following SAS code to conduct the logrank test and compare the survivor functions for young versus established firms on Nasdaq. The product-limit survival function estimates, along with the confidence intervals, are plotted in Figure 120.3.

ODS GRAPHICS ON;
PROC LIFETEST DATA=firms PLOTS=S(CL);
  TIME dur*status(0);
  STRATA young;
RUN;
ODS GRAPHICS OFF;

Figure 120.3: Product-limit survival estimates for Nasdaq firms in 2010.

The variable young is used to partition the data so that the survival estimates for young versus established firms are plotted separately. The survival estimate for the younger firms is significantly lower than that for the older firms. The logrank test shows a chi-square statistic of 81.1762, with a p-value < 0.0001. Since all sample firms have experienced a persistently low stock price at t = 0 (i.e., dropping below $1 for 30 consecutive days), this finding confirms the conclusion of prior literature about the higher probability for established firms to survive through difficult times.
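The logrank statistic itself is straightforward to compute by hand. A pure-Python sketch on hypothetical two-group data (illustrative values, not the 232-firm Nasdaq sample):

```python
# (time, delta, group) triples with delta = 1 for an observed event.
data = [(1, 1, 1), (3, 1, 1), (4, 0, 1), (6, 1, 1), (8, 0, 1),
        (2, 1, 2), (5, 1, 2), (7, 1, 2), (9, 0, 2), (10, 1, 2)]

num = den = 0.0
for t in sorted({t for t, d, _ in data if d == 1}):
    n_i  = sum(1 for y, _, _ in data if y >= t)               # at risk, total
    n_1i = sum(1 for y, _, g in data if y >= t and g == 1)    # at risk, group 1
    d_i  = sum(1 for y, d, _ in data if y == t and d == 1)    # events, total
    d_1i = sum(1 for y, d, g in data if y == t and d == 1 and g == 1)
    num += d_1i - d_i * n_1i / n_i                            # observed - expected
    if n_i > 1:                                               # hypergeometric variance
        den += d_i * (n_i - d_i) * n_1i * (n_i - n_1i) / (n_i**2 * (n_i - 1))

chi2_logrank = num**2 / den
# Compare with the chi-square(1) critical value 3.84 at the 5% level;
# this small toy sample shows no significant group difference.
assert chi2_logrank < 3.84
```

The guard on n_i > 1 skips event times at which only one subject remains, where the hypergeometric variance term is zero by convention.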
120.3.2.2 Weighted logrank test The weighted logrank test is a more general formula for the logrank test. The test statistic is 2 k di n1i w (d − ) i=1 i 1i ni 2 χ2weighted = 2 d (n −d )n (n −n ) ∼ χ1 . w k i i 1i i 1i i i i=1
n2i (ni −1)
There are multiple choices of weights (wi ) for the specific test. Below is a summary of common weights and the names of the corresponding tests. For example, when wi =1, the weighted logrank test is the same as the logrank test.
Test                  Weight w_i
Logrank               w_i = 1
Wilcoxon              w_i = n_i
Peto–Peto             w_i = f(Ŝ(t_i))
Fleming–Harrington    w_i = Ŝ(t_i)^p [1 − Ŝ(t_i)]^q, where p, q ≥ 0
Tarone–Ware           w_i = √n_i
Which test should be used for an analysis? All the tests have the correct Type I error for testing the null hypothesis of equal survival, so the choice depends on the alternative hypothesis. Specifically, the weight you choose determines the power of the test.
• The logrank test is powerful for detecting differences in the survivor functions if they fit the proportional hazards (PH) model, S1(t) = [S2(t)]^γ (γ > 0).
• The Wilcoxon test gives more weight to earlier times than to later times, so it is more sensitive in detecting early differences between survival curves. For example, the Wilcoxon test is suitable for the accelerated failure time (AFT) model when S1(t) = S2(ψt). Further, the Wilcoxon test has higher power when the survival time follows the log-normal distribution.
• A special case of the Peto–Peto or Fleming–Harrington test is w_i = Ŝ(t_i); it is most powerful under the alternative hypothesis of the log-logistic model.
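The mechanics are easy to see in code. The following illustrative Python computes the unweighted logrank statistic for the ten-observation toy data shown earlier (not the chapter's full Nasdaq sample, whose statistic is 81.1762); replacing the weight function with other choices from the table above turns the same loop into the Wilcoxon or Tarone–Ware test.

```python
# Two-group (young vs. established) logrank statistic with w_i = 1.
# Toy data: the ten observations listed earlier, not the full sample.
data = [  # (duration, status, young)
    (309, 0, 1), (1234, 0, 0), (3045, 0, 0), (3055, 0, 0), (287, 0, 0),
    (142, 1, 1), (3055, 0, 0), (553, 1, 1), (2870, 0, 0), (50, 1, 1),
]

def logrank_chi2(data, w=lambda n_at_risk: 1.0):
    """Weighted logrank chi-square; w maps n_i to the weight w_i."""
    u, v = 0.0, 0.0
    for t in sorted({d for d, s, _ in data if s == 1}):
        n = sum(1 for d, s, g in data if d >= t)               # n_i at risk
        n1 = sum(1 for d, s, g in data if d >= t and g == 1)   # n_1i at risk
        d_i = sum(1 for d, s, g in data if d == t and s == 1)  # events d_i
        d1 = sum(1 for d, s, g in data if d == t and s == 1 and g == 1)
        wi = w(n)
        u += wi * (d1 - n1 * d_i / n)                          # observed - expected
        if n > 1:
            v += wi**2 * d_i * (n - d_i) * n1 * (n - n1) / (n**2 * (n - 1))
    return u * u / v

chi2 = logrank_chi2(data)
print(round(chi2, 3))  # ≈ 7.336 for this toy data
```

Even in this tiny sample the statistic exceeds the 5% critical value of χ²_1 (3.84), pointing in the same direction as the full-sample result.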
120.3.2.3 Likelihood-ratio test
The likelihood-ratio test is less popular than the (weighted) logrank test because it requires the unnecessary assumption that the hazard function is constant in each group. However, the likelihood-ratio test is powerful if the survival time follows the exponential distribution. All of the non-parametric estimators and tests above can be conducted in SAS via the PROC LIFETEST procedure.
120.4 Parametric Methods
Researchers are often interested in how the survival time depends on explanatory variables. Assume a vector x_i = (x_{i1}, x_{i2}, . . . , x_{ip})^T of explanatory variables for an individual i, which can include both continuous variables and indicator variables for qualitative classifications. The vector β = (β_1, β_2, . . . , β_p)^T represents the coefficients to be estimated. We assume that the linear predictor β^T x_i = Σ_{j=1}^{p} β_j x_{ij} affects the survival rate through some function ψ(β^T x). The most common models are the accelerated failure time (AFT) model and the Cox proportional hazards (PH) model. Other models include the proportional odds model, the additive hazards model, etc. All parametric regression models discussed here are AFT models. That is, we assume S_i(t) = S_0(ψt) and λ_i(t) = ψλ_0(ψt) for all t, where S_0 and λ_0 denote the baseline survival function and the baseline hazard function. The survival time is modeled as

log T_i = Σ_{j=1}^{p} β_j x_{ij} + σε_i = β^T x_i + σε_i,

where β represents unknown regression parameters and σ is an unknown scale parameter. The baseline distribution of the error term ε_i can be specified as one of several possible distributions, including the extreme value and log-gamma distributions. However, AFT models are named after the distribution of the survival time T rather than the distribution of ε or log T.

120.4.1 Parametric regression models
In this section, we introduce the five most commonly used parametric regression models, all of which can be estimated in SAS via the PROC LIFEREG procedure.
120.4.1.1 Exponential model
If T is assumed to follow an exponential distribution, then the three basic survival functions are defined as follows:

λ(t) = λ,  λ > 0,
f(t) = λ exp(−λt),
S(t) = exp(−λt),

where λ = exp(−β^T x). The exponential model has the simplest form among parametric regression models, with the assumption of a constant hazard function. In addition, the exponential model assumes the scale parameter σ = 1. Under the exponential model, for an individual i, the hazard function λ_i(t) = exp(−β^T x_i) is independent of the survival time t, and the survivor function is S_i(t) = exp[−t exp(−β^T x_i)]. In most cases, these assumptions are not realistic, so the exponential model is often not the most suitable choice, but it serves as a good starting point for model selection.

120.4.1.2 Weibull model
If T follows a Weibull distribution, then the three basic survival functions are as follows:

λ(t) = λγt^{γ−1},  λ > 0, γ > 0,
f(t) = λγt^{γ−1} exp[−λt^γ],
S(t) = exp[−λt^γ],

where γ = 1/σ and λ = exp(−β^T x/σ); σ is called the Weibull shape parameter. The exponential model can be treated as a special case of the Weibull model with σ = 1; in this case, the exponential model is said to be nested within the Weibull model. Under the Weibull model, the hazard rate and survivor function for an individual i are λ_i(t) = γt^{γ−1} exp(−β^T x_i/σ) and S_i(t) = exp[−t^γ exp(−β^T x_i/σ)]. One advantage of the Weibull model is that it belongs to both the AFT class and the Cox PH class, but the Cox PH model is often more powerful when the survival time follows the Weibull distribution.
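The three functions are tied together by S(t) = exp(−∫λ) and λ(t) = f(t)/S(t). A quick numerical check of the Weibull case (a sketch with arbitrary illustrative parameter values, not taken from the chapter) confirms the algebra:

```python
import math

# Weibull building blocks: lam = exp(-b'x/sigma), gamma = 1/sigma.
lam, gamma = 0.5, 1.3   # illustrative values only

def S(t): return math.exp(-lam * t**gamma)
def f(t): return lam * gamma * t**(gamma - 1.0) * math.exp(-lam * t**gamma)
def hazard(t): return lam * gamma * t**(gamma - 1.0)

t, eps = 2.0, 1e-6
# f(t) should equal -dS/dt (central difference), and lambda(t) = f(t)/S(t).
num_deriv = -(S(t + eps) - S(t - eps)) / (2 * eps)
assert abs(num_deriv - f(t)) < 1e-6
assert abs(hazard(t) - f(t) / S(t)) < 1e-9
```

The same two identities hold for every model in this section, which is why stating any one of λ, f, or S pins down the other two.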
120.4.1.3 Generalized gamma model
If T follows a three-parameter generalized gamma distribution, then its pdf is

f(t) = γτ^θ t^{θ−1} exp[−(τt)^γ] / Γ(θ/γ),  θ > 0, τ > 0, γ > 0,

where Γ is the complete gamma function. Both the survivor and hazard functions are omitted here due to their complexity. When γ = 1, the generalized gamma distribution reduces to the standard two-parameter gamma distribution. When θ = γ and λ = τ^γ, the generalized gamma distribution reduces to the Weibull distribution. Note that the exponential model is nested within the Weibull model, and both the Weibull model and the log-normal model (discussed below) are nested within the generalized gamma model. Because the generalized gamma model is more general than the others, it is often utilized for goodness-of-fit tests in survival analysis. Interested readers can refer to the more detailed discussions in Lawless (2002) and Klein and Moeschberger (1997).

120.4.1.4 Log-normal model
If log T follows a normal distribution, then T follows a log-normal distribution, in which the pdf and the survivor function are

f(t) = (1/(√(2π) σt)) exp[−(1/2)((log t − μ)/σ)²],  μ > 0, σ > 0,
S(t) = 1 − Φ((log t − μ)/σ),

where μ = β^T x and Φ is the cumulative distribution function of the standard normal distribution. The hazard function can be expressed as the ratio of the pdf to the survivor function. The log-normal hazard function is bell-shaped but generally asymmetric, with λ(0) = 0. Often, the log-normal model fits the common situation in which the risk facing a subject first increases with time but then gradually declines.

120.4.1.5 Log-logistic model
Similar to the definition of the log-normal model, if log T follows a logistic distribution, then T follows a log-logistic distribution, in which the three basic survival
functions are

λ(t) = λγt^{γ−1} / (1 + λt^γ),  λ > 0, γ > 0,
f(t) = λγt^{γ−1} / (1 + λt^γ)²,
S(t) = 1 / (1 + λt^γ),

where γ = 1/σ and λ = exp(−β^T x/σ). The advantage of the log-logistic model is that it can be easily transformed into a traditional logistic regression model. Unlike the other four models introduced above, the log-logistic model has no nesting relationship with the other models.

120.4.2 Maximum likelihood estimation
In this section, we extend the assumption of n independent individuals in Section 120.1. For each individual i, the data consist of three parts: the observed time Y_i, the event indicator δ_i, and the vector of explanatory variables x_i. The likelihood function can be written as

L(β) = Π_{i=1}^{n} [f_i(t_i)]^{δ_i} [S_i(t_i)]^{1−δ_i}.
Taking the logarithm of the likelihood, we get the log-likelihood function

l(β) = Σ_{i=1}^{n} [δ_i log f_i(t_i) + (1 − δ_i) log S_i(t_i)]
     = Σ_{i=1}^{n} [δ_i log λ_i(t_i) + log S_i(t_i)].
Using the exponential model as an example, we have

l(β) = − Σ_{i=1}^{n} [δ_i β^T x_i + t_i exp(−β^T x_i)].

The score function (the first derivative of the log-likelihood) and the Fisher information matrix (the negative second derivative of the log-likelihood) are

U(β) = ∂l(β)/∂β = − Σ_{i=1}^{n} [δ_i x_i − t_i x_i exp(−β^T x_i)],

I(β) = − ∂²l(β)/∂β∂β^T = Σ_{i=1}^{n} t_i x_i x_i^T exp(−β^T x_i).
The maximum likelihood estimate (MLE) β̂ solves the score equation U(β) = 0. The solution is usually obtained through the Newton–Raphson algorithm.
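For the intercept-only case (x_i = 1), the score equation has the closed-form solution β̂ = log(Σt_i / Σδ_i), which makes it easy to verify a Newton–Raphson sketch. The following Python is illustrative only (the chapter itself uses SAS), applied to the ten firm observations from Section 120.3:

```python
import math

# Firm data from Section 120.3: durations in days and event indicators.
t = [309, 1234, 3045, 3055, 287, 142, 3055, 553, 2870, 50]
d = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1]

def newton_exponential(t, d, beta=0.0, tol=1e-10, max_iter=100):
    """Maximize l(b) = -sum(d_i*b + t_i*exp(-b)) for the intercept-only
    exponential model via Newton-Raphson: b <- b + U(b)/I(b)."""
    T, D = sum(t), sum(d)
    for _ in range(max_iter):
        U = T * math.exp(-beta) - D          # score
        I = T * math.exp(-beta)              # Fisher information
        step = U / I
        beta += step
        if abs(step) < tol:
            return beta
    raise RuntimeError("did not converge")

beta_hat = newton_exponential(t, d)
assert abs(beta_hat - math.log(sum(t) / sum(d))) < 1e-8
print(beta_hat)  # equals log(14600/3) = log(sum(t)/sum(d))
```

With covariates, U and I become a vector and a matrix, and the update step becomes I(β)⁻¹U(β), but the iteration is otherwise the same.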
120.4.3 Parameter estimation and testing
We use the same data as for the logrank test to illustrate parameter estimation and hypothesis testing, with a few changes. First, instead of measuring duration in days (dur), we now measure the duration of a firm in years (dury). This is not necessary for illustration purposes, but annual observation frequencies are the most common in economics and finance. Second, we use the continuous variable age rather than the categorical variable young in the model. Third, we are able to add covariates to the regression model, especially continuous covariates. We add three variables: advertising expense (ad), R&D expense (rd), and leverage ratio (debt). Note that all variables in the model are measured at t = 0 and are thus time-invariant. The variables are defined as follows:

start    the last date of the 30 consecutive trading days when a Nasdaq firm's stock prices fell below $1 (t = 0).
end      the last date of observation: the date of delisting or merger, or June 29, 2018 for firms still trading on Nasdaq.
dury     ceiling integer value of (end − start + 1)/365.
status   1 if a firm is delisted or merged, and 0 otherwise.
age      firm age in years at t = 0.
ad       advertising expense per year in millions of dollars at t = 0.
rd       research and development expense per year in millions of dollars at t = 0.
debt     total debt deflated by total assets at t = 0.
The PROC LIFEREG procedure is used below to estimate the exponential model. By replacing the distribution D=exponential with weibull, gamma, lnormal and llogistic, we can get results for other models.
proc lifereg data=firms;
  model dury*status(0) = age ad rd debt / D=exponential;
run;
The parameter estimates for four of the five models are displayed below. The output of the generalized gamma model was excluded because SAS warned that its convergence and model fit were questionable.
Exponential model

Parameter    Estimate    P-value
age
ad
rd
debt

[. . .]

ht = ω + (α11 + α12)ε²t−1 + β1 ht−1,  ω > 0, α11 + α12 ≥ 0, β1 ≥ 0  if εt−1 < 0,
(123.9)
and

ht = ω + α11 ε²t−1 + β1 ht−1,  ω > 0, α11 ≥ 0, β1 ≥ 0  if εt−1 ≥ 0.
(123.10)
Hence, the ht in equation (123.9) is greater than that in equation (123.10); this property is referred to as the leverage effect. Suppose we have a symmetric distribution of zt . Then, the process of {ht } is stationary if all solutions
Y. Tsukuda, J. Shimada & T. Miyakoshi
of the characteristic equation

1 − Σ_{i=1}^{q} (αi1 + (1/2)αi2) λ^i − Σ_{j=1}^{p} βj λ^j = 0   (123.11)
are greater than unity in absolute value.

123.2.2.2 Exponential GARCH (EGARCH)(p, q) model
Nelson (1991) proposed another model to capture the leverage effects:

log(ht) = ω + Σ_{i=1}^{q} αi g(zt−i) + Σ_{j=1}^{p} βj log(ht−j),   (123.12)
where g(zt−i) = κzt−i + ς(|zt−i| − E|zt−i|), and ω, αi, βj, κ, and ς are real numbers. This specification allows asymmetry in the volatility and also avoids positivity restrictions on the parameters ω, αi, and βj by specifying the log-volatility. The process {ht} is stationary if all solutions of the characteristic equation

1 − Σ_{j=1}^{p} βj λ^j = 0   (123.13)
are greater than unity in absolute value.

123.2.3 Estimation of the GARCH(p, q) model
The parameter vector in the GARCH(p, q) model of equations (123.1) and (123.4) is θ = (θ1, . . . , θk+p+q+2)′ = (μ, γ1, . . . , γk; ω, α1, . . . , αq; β1, . . . , βp)′. The true parameter value is unknown and denoted θ0 = (θ01, . . . , θ0,k+p+q+2)′ = (μ0, γ01, . . . , γ0k; ω0, α01, . . . , α0q; β01, . . . , β0p)′. The orders of the GARCH(p, q) model (namely k, p, and q) are assumed known when estimating the parameters. Though we do not assume any specific distribution for the standardized random variable zt, we work with the normal distribution. The quasi-likelihood is then
LT(θ) = LT(θ; y1, . . . , yT) = Π_{t=1}^{T} (1/√(2π h̃t)) exp(−ε̃²t / (2h̃t)),   (123.14)
Multivariate GARCH Model
where

ε̃t = ε̃t(θ) = yt − μ − Σ_{i=1}^{k} γi yt−i,  and

h̃t = h̃t(θ) = ω + Σ_{i=1}^{q} αi ε̃²t−i + Σ_{j=1}^{p} βj h̃t−j.   (123.15)
If the initial values y0, . . . , y1−k−p, ε̃0, . . . , ε̃−q, h0, . . . , h1−p are given, then, for any θ, the values of h̃t and ε̃t for t = 1, . . . , T can be computed from equation (123.15). The quasi-maximum likelihood estimator (QMLE) is defined as a solution of θ̂T = arg maxθ LT(θ). Taking the logarithm of LT(θ), this is equivalent to

θ̂T = θ̂(y1, . . . , yT) = arg minθ lT(θ),   (123.16)

where

lT(θ) = l(θ; y1, . . . , yT) = (1/T) Σ_{t=1}^{T} lt,  and  lt = log(h̃t) + ε̃²t / h̃t.   (123.17)
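For the GARCH(1, 1) case with no AR terms (k = 0), equations (123.15)–(123.17) translate almost line by line into code. The sketch below is illustrative Python with an assumed initialization h̃1 = ω (any reasonable starting value works); a numerical optimizer minimizing this function over (μ, ω, α1, β1) yields the QMLE.

```python
import math

def garch11_neg_quasi_loglik(params, y):
    """l_T(theta) = (1/T) * sum_t [log(h_t) + eps_t^2 / h_t] for a
    GARCH(1,1) with constant mean: eps_t = y_t - mu,
    h_t = omega + alpha*eps_{t-1}^2 + beta*h_{t-1}   (h_1 = omega assumed)."""
    mu, omega, alpha, beta = params
    eps = [v - mu for v in y]
    h, total = omega, 0.0
    for t in range(len(y)):
        if t > 0:
            h = omega + alpha * eps[t - 1] ** 2 + beta * h
        total += math.log(h) + eps[t] ** 2 / h
    return total / len(y)

# Sanity check: with alpha = beta = 0, h_t = omega for all t, so
# l_T = log(omega) + mean(eps^2)/omega in closed form.
y = [0.1, -0.2, 0.05, 0.3, -0.15]
omega = 0.04
lt = garch11_neg_quasi_loglik((0.0, omega, 0.0, 0.0), y)
m2 = sum(v * v for v in y) / len(y)
assert abs(lt - (math.log(omega) + m2 / omega)) < 1e-12
```

The closed-form check at α1 = β1 = 0 is a useful guard when wiring this objective into an optimizer, since recursion bugs in h̃t are otherwise easy to miss.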
This method of estimation is applicable to the other models, GJR and EGARCH, although we do not explicitly discuss these in this paper.

123.2.4 Asymptotic properties of the QMLE for the GARCH(p, q)
Under suitable regularity conditions, the QMLE is strongly consistent, θ̂T → θ0 (almost surely) as T → ∞, and asymptotically normally distributed⁵:

√T(θ̂T − θ0) → N(0, Σ)  as T → ∞,  where Σ = J⁻¹IJ⁻¹,   (123.18)

I = Eθ0[(∂lt(θ0)/∂θ)(∂lt(θ0)/∂θ′)],   (123.19)

J = Eθ0[∂²lt(θ0)/∂θ∂θ′].   (123.20)

⁵ See Theorems 7.4 and 7.5 in Chapter 7 of Francq and Zakoian (2010) for rigorous regularity conditions and proofs. The proofs of consistency and asymptotic normality of the QMLE are mathematically quite involved.
Although we do not explicitly state the regularity conditions, the basic requirements are the stationarity of the processes {yt} and {εt}, as stated in equations (123.2) and (123.4). The consistency and asymptotic normality of the estimator are very important from the statistical viewpoint. Consistency implies that the estimates approach the true parameter values as the sample size increases to infinity; hence, the estimates are reliable when the sample is large. Asymptotic normality guarantees a standard procedure for testing the significance of parameter estimates when the sample size is large.
123.3 Multivariate GARCH Models
There is no unanimously agreed direct extension of the univariate GARCH model to a multivariate framework, unlike the progression from the univariate ARMA model to the multivariate (vector) ARMA model. Two main difficulties arise in a multivariate extension from the statistical viewpoint: (i) the conditional variance–covariance matrix (say Ht: n × n) must satisfy a positive definiteness condition, which imposes rather complicated restrictions on the parameter space in general; (ii) the specification of Ht generally requires quite a large number of parameters if the dimension of the model is large, since the matrix Ht has n × (n + 1)/2 distinct elements. In practical applications of the multivariate GARCH model, we seek a model that simultaneously satisfies two contrary requirements: the model must be sufficiently general to fit the actual behavior of the multivariate variables, yet parsimonious enough to be statistically tractable in practice. A wide variety of multivariate extensions of the GARCH model have been proposed. The multivariate GARCH (MGARCH) models proposed in the literature may be classified into two types: conditional covariance models and conditional correlation models. The most general model of the first type is the VECH model proposed by Bollerslev, Engle, and Wooldridge (1988). This model is quite flexible, allowing all volatilities and conditional covariances to be interrelated. However, the empirical implementation of VECH models is limited by their extremely large number of parameters, even in systems of moderate dimension (see Table 123.1). Another difficulty of VECH models is that of determining the conditions required to guarantee positive definite conditional covariance matrices and stationarity. These problems are overcome by imposing restrictions on the MGARCH models. Some popular restricted VECH models are the
Table 123.1: Number of parameters in the conditional covariance (Ht).

                        n = 2    n = 4    n = 10
VEC-GARCH                 21      210      6105
BEKK-GARCH(K = 1)         11       42       255
BEKK-GARCH(K = 2)         19       74       455
CCC-GARCH                 11       42       255
CCC-GARCH(diagonal)        7       18        75
DCC-GARCH                 13       44       257
DCC-GARCH(diagonal)        9       20        77
Note: The order of GARCH is p = q = 1 and n is the number of dimensions of the variables. The number of parameters of the VAR(k) part is common to all models and excluded from the table. The DCC-GARCH will be explained in Section 123.4.
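The entries of Table 123.1 can be reproduced from the closed-form parameter counts s given for each model in Sections 123.3.1 and 123.4.1; the following Python is an illustrative check:

```python
# Parameter counts for H_t with p = q = 1 (Table 123.1).
def vec(n, p=1, q=1):
    m = n * (n + 1) // 2                     # distinct elements of H_t
    return m + m * m * (p + q)

def bekk(n, K, p=1, q=1):
    return n * (n + 1) // 2 + n * n * (p + q) * K

def ccc(n, diagonal=False, p=1, q=1):
    arch = n * (p + q) if diagonal else n * n * (p + q)
    return n + arch + n * (n - 1) // 2       # variances + dynamics + R

def dcc(n, diagonal=False, p=1, q=1):
    return ccc(n, diagonal) + 2              # DCC adds only (a, b)

for n, row in [(2, (21, 11, 19, 11, 7, 13, 9)),
               (4, (210, 42, 74, 42, 18, 44, 20)),
               (10, (6105, 255, 455, 255, 75, 257, 77))]:
    got = (vec(n), bekk(n, 1), bekk(n, 2), ccc(n), ccc(n, True),
           dcc(n), dcc(n, True))
    assert got == row, (n, got)
print("Table 123.1 reproduced")
```

The quartic growth of the VEC count (through the m² term) against the roughly linear growth of the diagonal CCC and DCC counts is the whole argument for the restricted models that follow.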
diagonal VECH (DVECH) model, suggested by Bollerslev et al. (1988), and the BEKK (Baba–Engle–Kraft–Kroner) model of Engle and Kroner (1995). The variations of the BEKK model include the diagonal BEKK model of Ding and Engle (2001), the rotated BEKK model of Noureldin et al. (2014), the diagonal rotated BEKK of Noureldin et al. (2014), the scalar BEKK of Ding and Engle (2001), and the scalar rotated BEKK of Noureldin et al. (2014). Applications of DVECH models are found in Bauwens et al. (2007) and Ledoit et al. (2003), and applications of the BEKK models include Beirne et al. (2013) and Hecq et al. (2016). The second type of MGARCH model is based on the decomposition of the covariance matrix into the product of conditional variances and correlations. The Constant Conditional Correlation (CCC) model of Bollerslev (1990) and the Dynamic Conditional Correlation (DCC) model of Engle (2002) belong to the second type. In particular, the DCC model has been modified in various ways, such as the corrected DCC (cDCC) model of Aielli (2013), the Almon-shuffle DCC model of Bauwens et al. (2016), the generalized DCC model of Hafner and Franses (2009), and the rotated DCC model of Noureldin et al. (2014). The CCC and DCC models have been the most extensively used in empirical studies. See Amado and Terasvirta (2014) and Laurent et al. (2013) for some recent empirical applications of the CCC model, and Tsukuda et al. (2017), Aielli and Caporin (2014), and Audrino (2014) for the DCC model, among others. Bauwens et al. (2006), Silvennoinen and Terasvirta (2009), and Francq and Zakoian (2010, Chapter 11) provide useful reviews of multivariate extensions of the GARCH models. We do not review the development of all the multivariate GARCH models in detail. Instead, we restrict our attention
to some of the most important models from the application viewpoint, and consider how these models resolve the above contradictory requirements.

123.3.1 Representative multivariate GARCH models
We commence with a multivariate (vector) autoregressive model of order k, or VAR(k),

Yt = μ + Σ_{i=1}^{k} Γi Yt−i + εt,  t = 1, . . . , T,   (123.21)

where Yt = (Y1t, . . . , Ynt)′ is an n-dimensional vector and μ, Γi, εt are conformably defined. We assume that all solutions of the characteristic equation

det(I − Σ_{i=1}^{k} Γi λ^i) = 0   (123.22)

are greater than unity in absolute value. Under this condition, the process {Yt} is stationary if {εt} is stationary. A process {εt} in equation (123.21) is a multivariate GARCH model if its first two conditional moments exist and satisfy:

(i) E(εt | It−1) = 0,  (ii) Ht = Var(εt | It−1),  Ht > 0 (positive definite),   (123.23)

where εt = Ht^{1/2} zt, zt ∼ i.i.d., E(zt) = 0, Var(zt) = I, and It−1 in (123.23) denotes the information available up to time t − 1. The conditional expectation of Yt is

E(Yt | It−1) = μ + Σ_{i=1}^{k} Γi Yt−i.   (123.24)
The n × n conditional variance–covariance matrix Ht is positive definite and has n × (n + 1)/2 distinct elements. Many different specifications of the process of Ht have been proposed in the literature.

123.3.1.1 VEC-GARCH(p, q) model
Denote by vech(·) the operator that stacks the lower triangular part of its argument square matrix (if A = (aij), then vech(A) = (a11, . . . , an1, a22, . . . , an2, . . . , ann)′). The VEC-GARCH(p, q) model proposed by Bollerslev, Engle and Wooldridge (1988) is defined by

vech(Ht) = ω + Σ_{i=1}^{q} A^(i) vech(εt−i ε′t−i) + Σ_{j=1}^{p} B^(j) vech(Ht−j),   (123.25)

where ω is a vector of size {n(n + 1)/2} × 1, and A^(i) and B^(j) are matrices of size {n(n + 1)/2} × {n(n + 1)/2}. The parameters of the model in (123.21) and
(123.25) consist of θ = (μ, Γ1, . . . , Γk; ω, A^(1), . . . , A^(q); B^(1), . . . , B^(p)), and the number of parameters in equation (123.25) is s = n(n + 1)/2 + {n(n + 1)/2}²(p + q). The positive definiteness condition on the matrix Ht is satisfied if A^(i), B^(j), and Ω are positive definite, where the n × n matrix Ω is defined such that vech(Ω) = ω. The VEC-GARCH(p, q) model is a natural and general extension of the GARCH(p, q), and the coefficient parameters A^(i) and B^(j) are clearly interpretable. However, this model is of little practical use for large-dimensional variables because the number of parameters increases rapidly, in proportion to n⁴.

123.3.1.2 BEKK-GARCH(p, q) model
The BEKK-GARCH(p, q) model proposed by Engle and Kroner (1995) is defined by

Ht = Ω + Σ_{i=1}^{q} Σ_{k=1}^{K} Aik εt−i ε′t−i A′ik + Σ_{j=1}^{p} Σ_{k=1}^{K} Bjk Ht−j B′jk,   (123.26)
where the n × n matrix Ω is assumed to be positive definite; the matrix Ht in equation (123.26) is then positive definite. This model is theoretically appealing, but again requires too many parameters for modeling large-dimensional variables, and the parameters Aik, Bjk have no direct interpretation. The parameters of the model of (123.21) and (123.26) are θ = (μ, Γ1, . . . , Γk; Ω, A11, . . . , AqK; B11, . . . , BpK), and the number of parameters in equation (123.26) is s = (1/2)n(n + 1) + n²(p + q)K. To illustrate the relationship between the BEKK-GARCH and VEC-GARCH models, consider the simple case of p = q = K = 1 and n = 2. The BEKK-GARCH(1, 1) is written as

Ht = Ω + A11 εt−1 ε′t−1 A′11 + B11 Ht−1 B′11,   (123.27)

where Ω = (ωij), A11 = (aij), B11 = (bij) (i, j = 1, 2). After some calculation, the elementwise expression of equation (123.27) is

( h11,t )   ( ω11 )   ( a11²      2a11a12            a12²   ) ( ε²1,t−1       )
( h21,t ) = ( ω21 ) + ( a11a21    a11a22 + a12a21    a12a22 ) ( ε1,t−1 ε2,t−1 )
( h22,t )   ( ω22 )   ( a21²      2a21a22            a22²   ) ( ε²2,t−1       )

                      ( b11²      2b11b12            b12²   ) ( h11,t−1 )
                    + ( b11b21    b11b22 + b12b21    b12b22 ) ( h21,t−1 ).   (123.28)
                      ( b21²      2b21b22            b22²   ) ( h22,t−1 )
h22,t−1
ε22,t−1 (123.28)
page 4221
July 6, 2020
16:7
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch123
Y. Tsukuda, J. Shimada & T. Miyakoshi
4222
By contrast, the VEC-GARCH(1,1) model is written as vech(Ht ) = ω + A(1) vech(εt−i εt−i ) + B (1) vech(Ht−1 ),
(123.29)
where vech(Ht ) = (h11,t , h21,t , h22,t ) , vech(εt−1 εt−1 ) = (ε21,t−1 , ε1,t−1 ε2,t−1 , ε22,t−1 ) , ω = (ωj ), A(1) = (αij ) and B (1) = (βij ) (i, j = 1, 2, 3). Comparing equation (123.28) with equation (123.29), we obtain ω = vech(Ω), ⎛ ⎞ ⎛ 2 ⎞ α11 α12 α13 2a11 a12 a212 a11 ⎜ ⎟ ⎜ ⎟ A(1) = ⎝α21 α22 α23 ⎠ = ⎝a11 a21 a11 a22 + a12 a21 a12 a22 ⎠, α31 α32 α33 a221 2a21 a22 a222 (123.30) ⎛ ⎞ ⎛ 2 ⎞ β11 β12 β13 2b11 b12 b212 b11 ⎜ ⎟ ⎜ ⎟ B (1) = ⎝β21 β22 β23 ⎠ = ⎝b11 b21 b11 b22 + b12 b21 b12 b22 ⎠. β31
β32
b221
β33
b222
2b21 b22
(123.31) We can see from (123.30) and (123.31) that the VEC-GARCH(1, 1) model include the BEKK-GARCH(1, 1) as a special case, and the parameters of the BEKK-GARCH(1, 1) implicitly impose very special restrictions on those of the VEC-GARCH(1, 1). 123.3.1.3 CCC-GARCH(p, q) model Introduced by Bollerslev (1990) and extended by Jeantheau (1998), in the CCC-GARCH(p, q) model, the conditional variance-covariance matrix is decomposed as Ht = Dt RDt ,
1/2
1/2
Dt = diag(h11,t , . . . , hnn,t ),
(123.32)
where Dt is the diagonal matrix of the square roots of the conditional variances, and correlation matrix (R = (ρkl )) is assumed to be constant over time, i.e., ρkl,t =
hkl,t 1/2
1/2
hkk,thll,t
= ρkl .
(123.33)
The diagonal elements of Ht are formulated as ¯ ht = ω +
q i=1
Ai ε¯t−i +
p j=1
Bj ¯ht−j ,
(123.34)
page 4222
July 6, 2020
16:7
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
Multivariate GARCH Model
b3568-v4-ch123
4223
¯ t = (h11,t , . . . , hnn,t ) , and ε¯t = (ε2 , . . . , ε2 ) are n × 1 vecwhere h n,t 1,t tors. If all elements of parameters ω, Ai , Bj are positive, and a correlation matrix R is chosen positive definite, then the conditional covariance matrix is guaranteed positive definiteness. The parameters in this model are θ = (μ, Γ1 , . . . , Γk ; ω, A1 , . . . , Aq ; B1 , . . . , Bp ; R). The number of parameters in equation (123.34) are s = n + n2 (p + q) + 12 n(n − 1). As a special case, if all Ai , Bj are diagonal matrices, equation (123.34) reduces to hii,t = ωi +
q
αij ε2i,t−j
+
j=1
p
βij hii,t−j ;
i = 1, . . . , n.
(123.35)
j=1
In equation (123.35), hii,t depends on only the past values of itself and those of ε2i,t , but not the past values of the other jth variables (j = i). The CCCGARCH(p, q) model is simple and easy to handle. However, it is limited in its practical application since the assumption of constant correlation is unrealistic for most purposes. Table 123.1 illustrates the number of parameters required for formulating the covariance (Ht ) in the various models. As shown, the VEC-GARCH model quickly increases as n becomes large, while the diagonal CCC-GARCH remains relatively small even if the size of model (n) is large. 123.3.2 Estimation of the CCC-GARCH(p, q) model We can obtain the QMLE of the CCC-GARCH(p, q) model in an analogous method to that in Section 123.2.3. The (negative) quasi log-likelihood function is apart from constant terms T
1 lt , lT (θ) = l(θ; Y1 , . . . , YT ) = 2
(123.36)
t=1
˜ t ) + ε˜ H −1 ε˜t , where lt = log det(H t t ε˜t = ε˜t (θ) = Yt − μ −
k
Γi Yt−i ,
(123.37)
i=1
˜ 1/2 , . . . , h ˜ 1/2 ), (123.38) ˜ t (θ) = D ˜ t RD ˜ t where D ˜ t = diag(h ˜t = H H nn,t 11,t ˜ ˜ ¯ t (θ) = ω + ¯t = h h
q i=1
Ai ε˜¯t−i +
p j=1
˜¯ , Bj h t−j
(123.39)
page 4223
July 6, 2020
16:7
4224
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch123
Y. Tsukuda, J. Shimada & T. Miyakoshi
and the parameter vector is θ = (μ, Γ1 , . . . , Γk ; ω, A1 , . . . , Aq ; B1 , . . . , Bp ; R). Minimizing the (negative) quasi-log-likelihood with respect to θ, we obtain the QMLE: ˆ 1 , . . . , YT ) = arg min lT (θ). θˆT = θ(Y θ
(123.40)
This method of estimation is theoretically applicable to the earlier VECGARCH and BEKK-GARCH models. However, the numerical calculations for the quasi-log-likelihood function of those models are often not stable in practice when the dimension of the variables (n) is large. 123.3.3 Asymptotic properties of the QMLE for the CCC-GARCH We can derive the asymptotic properties of the QMLE of the CCC-GARCH in a similar manner to those of the univariate GARCH model as stated in Section 123.2.4. Under certain suitable regularity conditions, the QMLE is strongly consistent6 : θˆT → θ0 (almost surely), as T → ∞ and asymptotically normally distributed √ T (θˆT − θ0 ) → N (0, Σ) where Σ = J −1 IJ −1 , ∂lt (θ0 ) ∂lt (θ0 ) , I = Eθ0 ∂θ ∂θ
as
T → ∞,
and J = Eθ0
∂ 2 lt (θ0 ) . ∂θ∂θ
(123.41)
(123.42)
(123.43)
While the mathematical formula of (123.41)–(123.43) is identical to (123.18)– (123.20), the number of elements in θˆT is much larger in the former than in the latter. 123.4 DCC-GARCH Model 123.4.1 Statistical properties of the DCC-GARCH model The DCC-GARCH model proposed by Engle (2002) is an extension of the CCC-GARCH. The conditional variance–covariance matrix (Ht ) is factorized 6
See Theorems 11.7 and 11.8 in Chapter 11 in Francq and Zakoian (2010) for rigorous regularity conditions and proofs. Proofs of consistency and asymptotic normality of the QMLE are mathematically even much involved.
page 4224
July 6, 2020
16:7
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch123
Multivariate GARCH Model
4225
into the product of the conditional variance and correlation matrices Ht = Dt Rt Dt ,
(123.44)
where the conditional variance Dt is defined in equation (123.32). The n × n conditional correlation matrix Rt , is not constant unlike that of the CCCGARCH. The diagonal elements of Ht are formulated as ¯ ht = ω +
q
Ai ε¯t−i +
i=1
p
Bj ¯ht−j ,
(123.45)
j=1
¯ t = (h11,t , . . . , hnn,t ) , and ε¯t = (ε2 , . . . , ε2 ) are n × 1 vectors. This where h n,t 1,t formulation is identical to the CCC-GARCH. Defining the normalized error term vector as −1/2
ut = Dt
εt ,
(123.46)
the conditional correlation matrix of εt is given by Rt = E(ut u′t | It−1). We specify the innovation of the conditional correlation matrix as

Qt = (1 − a − b)Q̄ + a ut−1 u′t−1 + b Qt−1,   (123.47)

where Q̄ is a matrix of location parameters. If a ≥ 0, b ≥ 0, a + b < 1, and Q̄ is positive definite, then Rt is also positive definite. Hence, Rt can be defined in terms of Qt as

Rt = diag(q11,t, . . . , qnn,t)^{−1/2} Qt diag(q11,t, . . . , qnn,t)^{−1/2}.   (123.48)

The (i, j)th element of Rt can be written as

ρij,t = qij,t {qii,t qjj,t}^{−1/2}
      = {(1 − a − b)q̄ij + a ui,t−1 uj,t−1 + b qij,t−1}
        × {((1 − a − b)q̄ii + a u²i,t−1 + b qii,t−1)((1 − a − b)q̄jj + a u²j,t−1 + b qjj,t−1)}^{−1/2},   (123.49)

which implies that the conditional correlations are dynamically driven by the process of Qt. The conditional covariances are obtained accordingly as hij,t = ρij,t{hii,t hjj,t}^{1/2}. We note that the correlation coefficients are nonlinear functions of the two unknown parameters a and b. The full set of parameters for the DCC-GARCH(p, q) model is θ = (μ, Γ1, . . . , Γk; ω, A1, . . . , Aq; B1, . . . , Bp; a, b, Q̄). Engle (2002) specifies the process of the conditional correlation matrix (Rt) through equations (123.47) and (123.48), while most multivariate GARCH models before Engle (2002) directly specify the process of the n(n + 1)/2 distinct elements of Ht. The construction of Rt using (123.48)
automatically satisfies the positive definiteness condition as long as Q_t is positive definite. The number of parameters required for specifying H_t in this model is s = n + n²(p + q) + n(n − 1)/2 + 2, which exceeds that of the CCC-GARCH by only two. This model is also flexible in allowing the conditional correlations to change over time. The DCC-GARCH model ingeniously compromises between two contrary requirements for constructing a model: it is sufficiently flexible to capture the behavior of actually observed data processes, and sufficiently parsimonious for statistical analysis in practice.

123.4.2 Estimation of the DCC-GARCH(p, q) model

We can employ an estimation procedure analogous to that of the CCC-GARCH model. The (negative) quasi log-likelihood function is given, apart from constant terms, as

l_T(θ) = l(θ; Y_1, . . . , Y_T) = (1/2) Σ_{t=1}^{T} l_t,
(123.50)
where l_t = log det(H̃_t) + ε̃_t' H̃_t^{-1} ε̃_t. We obtain the QMLE by minimizing the (negative) quasi log-likelihood (123.50) with respect to θ. However, for the empirical studies in Section 123.5, we employ the "one-step maximum likelihood" method proposed by Bauwens and Laurent (2005). They replace Q̄ by its empirical counterpart, as in Engle (2002), before minimizing the (negative) log-likelihood function (123.50). Once an estimate θ̂ is obtained, we are able to compute Ĥ_t, R̂_t, and Φ̂_t as functions of θ̂. Engle (2002) claims that the DCC model can be inefficiently but consistently estimated by a "two-step approach", in which Q̄ is estimated by the sample second moment of the standardized returns. The two-step estimator relies on the conjecture that Q̄ is the second moment of u_t, i.e., Q̄ = E[u_t u_t'].

The DCC model has been controversial among researchers. Caporin and McAleer (2013) present several caveats about applications of the DCC model. Aielli (2013) provides a thorough investigation of the properties and estimation methods of the DCC model. Pointing out that Engle's conjecture about Q̄ is not correct, i.e., Q̄ ≠ E[u_t u_t'], Aielli (2013) proves the inconsistency of the two-step estimator of Engle (2002).7 He also suggests a correction of the DCC model to a more tractable one called the cDCC model. This model admits a strictly stationary and ergodic solution under certain explicitly stated

7
This criticism of Aielli (2013) also applies to the one-step ML method of Bauwens and Laurent (2005); the one-step ML estimator may not be consistent.
regularity conditions. He proposes the cDCC estimator and proves its consistency. The discussion of Aielli (2013) highlights, both theoretically and numerically, the contributions and limitations of Engle (2002). Despite some theoretical disadvantages pointed out by Aielli (2013) and Caporin and McAleer (2013), the DCC model of Engle (2002) and its extensions are now among the most popular approaches to the modeling of multivariate volatility. The numerical behaviors of the DCC model are almost identical to those of the cDCC model over the typical parameter range of a and b (a + b ≤ 0.990 and a ≤ 0.04) indicated by Aielli (2013).8 Comparing the performances of various popular MGARCH models by way of Monte Carlo experiments, de Almeida et al. (2018) report that the specification of the models is more important than the choice of estimators, and that DCC models generally perform well.

123.4.3 Dynamic conditional variance decomposition

We assume that the volatility spillover effects are unidirectional along the order of variables Y_{1t} through Y_{nt} (i.e., from Y_{1t} to Y_{2t} and from Y_{2t} to Y_{3t} until Y_{nt}) and that there is no reverse spillover. We can uniquely decompose the conditional covariance matrix (H_t) by the Cholesky method9 as

H_t = Φ_t Σ_t Φ_t',
(123.51)

where Φ_t is a lower triangular matrix with diagonal elements of one and Σ_t = diag(σ²_{1,t}, . . . , σ²_{n,t}) is a diagonal matrix. We note that Φ_t in (123.51) changes over time. In order to decompose the unexpected returns into idiosyncratic shocks, we transform the error term vector ε_t into a new random vector as follows10:

ε̃_t = (ε̃_{1,t}, . . . , ε̃_{n,t})' = Φ_t^{-1} ε_t, or equivalently, ε_t = Φ_t ε̃_t,
(123.52)

8 Indeed, our study on the East Asian bond markets reveals â + b̂ ≤ 0.990 and â ≤ 0.04 for the seven countries of Hong Kong, Singapore, South Korea, Malaysia, China, the Philippines, and Indonesia, and â + b̂ = 0.992 and â = 0.006 only for Thailand, as shown in Table 123.2 of Section 123.5.1.2. Even for the exceptional case of Thailand, the deviation from the typical range is very small. Moreover, for all the stock markets the estimated parameters are within the range â + b̂ ≤ 0.990 and â ≤ 0.04. These observations may partially justify the use of the DCC model.
9 See Hamilton (1994, pp. 87–102).
10 The definition of ε̃_t in equation (123.52) is different from that in equation (123.37). There should be no confusion.
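The Cholesky construction in (123.51)-(123.52) can be sketched numerically as follows. The covariance matrix is an illustrative value, not an estimated H_t, and Python with NumPy is our own choice of tool; NumPy's standard Cholesky factor is rescaled to obtain the unit-diagonal Φ_t.

```python
import numpy as np

# Illustrative conditional covariance matrix H_t (positive definite).
H_t = np.array([[1.0, 0.4, 0.3],
                [0.4, 1.5, 0.6],
                [0.3, 0.6, 2.0]])

L = np.linalg.cholesky(H_t)     # H_t = L L'
d = np.diag(L)
Phi_t = L / d                   # divide column j by L[j, j] -> unit diagonal
Sigma_t = np.diag(d ** 2)       # Sigma_t = diag(sigma_1,t^2, ..., sigma_n,t^2)

# Equation (123.51): H_t = Phi_t Sigma_t Phi_t'.
assert np.allclose(Phi_t @ Sigma_t @ Phi_t.T, H_t)
assert np.allclose(np.diag(Phi_t), 1.0)

# Equation (123.52): idiosyncratic shocks eps_tilde_t = Phi_t^{-1} eps_t.
eps = np.array([0.8, -0.5, 0.2])
eps_tilde = np.linalg.solve(Phi_t, eps)
assert np.allclose(Phi_t @ eps_tilde, eps)
```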
Table 123.2: Estimates of parameters for the bond markets.

              γ41       γ42       γ43       γ44       α41       α42       β41       a         b
Hong Kong     0.260*    −0.028    0.068     −0.064    0.038*    0.123*    0.892*    0.009     0.977*
              (0.05)    (0.02)    (0.08)    (0.04)    (0.02)    (0.02)    (0.02)    (0.01)    (0.01)
Singapore     0.119*    −0.034*   0.190*    −0.035    0.140*    0.098*    0.738*    0.011*    0.976*
              (0.03)    (0.02)    (0.06)    (0.04)    (0.01)    (0.02)    (0.01)    (0.00)    (0.01)
Korea         0.118*    −0.008    0.045     −0.025    0.073*    0.303*    0.771*    0.022*    0.947*
              (0.02)    (0.01)    (0.05)    (0.04)    (0.03)    (0.07)    (0.04)    (0.01)    (0.02)
Thailand      0.070*    0.012     0.218*    0.122*    0.086*    0.116*    0.841*    0.006     0.986*
              (0.03)    (0.02)    (0.09)    (0.04)    (0.02)    (0.04)    (0.02)    (0.00)    (0.01)
Malaysia      0.029*    −0.002    0.023     0.160*    0.197*    0.078     0.804*    0.017     0.957*
              (0.01)    (0.01)    (0.05)    (0.05)    (0.03)    (0.05)    (0.02)    (0.02)    (0.07)
China         0.006     0.002     −0.058*   0.304*    0.202*    0.154*    0.681*    0.012     0.972*
              (0.01)    (0.01)    (0.03)    (0.05)    (0.03)    (0.07)    (0.01)    (0.01)    (0.03)
Philippines   0.038*    0.006     0.076     0.071*    0.341*    −0.010    0.703*    0.020*    0.950*
              (0.02)    (0.01)    (0.06)    (0.04)    (0.12)    (0.11)    (0.07)    (0.01)    (0.02)
Indonesia     −0.003    0.010     0.102     0.186*    0.496*    −0.217*   0.548*    0.018     0.925*
              (0.02)    (0.01)    (0.06)    (0.04)    (0.10)    (0.10)    (0.06)    (0.01)    (0.08)
Note: The parameters of γ41 , γ42 , γ43 and γ44 respectively indicate the coefficients for the emerging East Asian local market in equation (123.57). Standard errors are in parentheses. The asterisks denote significance at the 5% level. Almost all estimates of α41 , α42 , β41 , a and/or b are highly significant.
where ε̃_t|I_{t−1} ∼ N(0, Σ_t). The elements of ε̃_t can be interpreted as idiosyncratic shocks, which are independent of each other. The unexpected return of the ith market at time t is expressed as a linear combination of idiosyncratic shocks:

ε_{i,t} = φ_{i1,t} ε̃_{1,t} + · · · + φ_{ii−1,t} ε̃_{i−1,t} + ε̃_{i,t};  i = 2, . . . , n,
(123.53)
where the coefficient φ_{ij,t} (j = 1, . . . , i − 1) indicates the sensitivity of the unexpected return in the ith market to the idiosyncratic shock in the jth market. The expectation of ε_{i,t} conditional on ε_{1,t}, . . . , ε_{i−1,t} is given as

E(ε_{i,t}|ε_{1,t}, . . . , ε_{i−1,t}; I_{t−1}) = φ_{i1,t} ε̃_{1,t} + · · · + φ_{ii−1,t} ε̃_{i−1,t}.
(123.54)
This quantity indicates the contemporaneous spillover effects on the unexpected returns from the external markets. Most previous researchers have not paid much attention to this quantity. However, equation (123.54) provides a useful measure of contemporaneous spillover effects on the unexpected return of the ith market from the jth market (j = 1, . . . , i − 1).
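A minimal numerical illustration of (123.53)-(123.54), using hypothetical values of Φ_t and the idiosyncratic shocks (Python with NumPy is our own choice; none of these numbers are estimates from the chapter):

```python
import numpy as np

# Hypothetical unit-lower-triangular Phi_t and independent idiosyncratic shocks.
Phi_t = np.array([[1.0, 0.0, 0.0],
                  [0.7, 1.0, 0.0],
                  [0.4, 0.3, 1.0]])
eps_tilde = np.array([0.5, -0.2, 0.1])

# Equation (123.53): unexpected returns are linear combinations of the shocks.
eps = Phi_t @ eps_tilde

# Equation (123.54): contemporaneous spillover onto market i from markets 1..i-1.
i = 2                                        # third market (0-based index)
spillover_i = Phi_t[i, :i] @ eps_tilde[:i]
assert np.isclose(eps[i], spillover_i + eps_tilde[i])
```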
The conditional variance of the unexpected return for the ith market is

h_{ii,t} = E(ε²_{i,t}|I_{t−1}) = φ²_{i1,t} σ²_{1,t} + · · · + φ²_{ii−1,t} σ²_{i−1,t} + σ²_{i,t}.
(123.55)

The dynamic conditional variance ratios are defined as

ξ_{ij,t} = φ²_{ij,t} σ²_{j,t} / h_{ii,t} for j = 1, . . . , i − 1, and ξ_{ii,t} = 1 − ξ_{i1,t} − · · · − ξ_{ii−1,t},
(123.56)

which indicate the relative contribution of the jth market to the conditional variance of the ith market at time t, with 0 ≤ ξ_{ij,t} ≤ 1. The quantities in (123.56) are called the dynamic conditional variance decompositions. The decomposition of variance introduced by Sims (1980) is now standard practice; the decomposition in this study extends Sims (1980) to the decomposition of the conditional variance at each time of observation. This quantity is regarded as a volatility spillover effect from the jth market to the ith market at time t. The volatility spillover effects averaged over time are given by ξ̄_{ij} = T^{-1} Σ_{t=1}^{T} ξ_{ij,t}. The dynamic conditional variance decomposition provides a useful measure for evaluating the degree of integration among financial markets. In Section 123.5, we apply this concept to the emerging East Asian financial markets.

123.5 Application of the DCC-GARCH Model to East Asian Financial Markets

This section applies the DCC-GARCH model to consider whether the comovements in the bond yields and stock prices in the emerging East Asian financial markets have increased over time. The main purpose is to illustrate how the DCC-GARCH model works in practice, although we investigate the East Asian financial markets in relative detail. We empirically clarify the process of integration during the last few decades and discuss the present integration status of the eight emerging East Asian bond and stock markets of Hong Kong, Singapore, Korea, Thailand, Malaysia, the Philippines, Indonesia, and China.11 We explain some fundamental facts about the emerging East Asian financial markets in Appendix 123A, which provide basic background knowledge for understanding the importance of the

11 China may not be an "emerging" country any more at the present time.
However, this paper categorizes ASEAN5 (Indonesia, Malaysia, the Philippines, Singapore, Thailand), China, Hong Kong and Korea as emerging East Asian countries since we study the historical development of this region in the past decades.
financial markets in this region. Detailed data construction and properties of the data are described in Appendix 123B.

123.5.1 Application to the bond markets

123.5.1.1 Construction of the model

We analyze the differences of the logarithmic bond yields using the data described in Appendix 123B. We employ the DCC-GARCH model explained in Section 123.4, in which the four variables for the global, the Japanese, the aggregate regional, and the τth local markets are denoted by Y_t = (Δ log R_t^G, Δ log R_t^J, Δ log R_t^{EA(τ)}, Δ log R_t^τ)'.12 We specify the model as13

Y_t = μ + Γ_1 Y_{t−1} + ε_t.
(123.57)

The error term ε_t follows a DCC-GARCH(1, 1) model with ε_t|I_{t−1} ∼ N(0, H_t). The conditional covariance matrix (H_t) is factorized into the product of the variance and correlation matrices as H_t = D_t R_t D_t, where D_t is a diagonal matrix of the square roots of the variances and R_t is an n × n correlation matrix. The conditional variance of the ith element follows the univariate GJR(1, 1) model

h_{ii,t} = ω_i + α_{i1} ε²_{i,t−1} + α_{i2} J⁻_{i,t−1} ε²_{i,t−1} + β_i h_{ii,t−1} for i = 1, . . . , n,
(123.58)

where J⁻_{i,t−1} are dummy variables such that J⁻_{i,t−1} = 1 if ε_{i,t−1} < 0 and J⁻_{i,t−1} = 0 otherwise. Defining the normalized error term vector as u_t = D_t^{−1/2} ε_t, the conditional correlation matrix of ε_t is given by R_t = E(u_t u_t'|I_{t−1}). We specify

12 The variables of Y_t depend on the individual local market (τ), although we do not express this explicitly for simplicity of exposition.
13 The empirical results in this section are taken from Tsukuda et al. (2017), who studied the emerging East Asian bond markets and carefully analyzed the generating process of the data set described in Appendix 123B. First, they examined the possibility of nonstationarity of each data series by the Dickey–Fuller test. The testing results indicate that the yields follow an integrated process of order one (I(1)) for almost all individual data series. Hence they employed the error correction model (ECM) as a possible data generating process. The ECM includes, as special cases, the difference VAR model if no co-integration exists, and the stationary VAR model if the order of ranks equals the number of variables. Johansen's tests for co-integration ranks suggest that there is no co-integration for most of the countries they examined. This result indicates a difference VAR(k) model. Finally, they chose the order k = 2 judging from the Schwarz Bayesian information criterion. In this paper, we begin with the VAR(1) model in equation (123.57) as the data generating process for simplicity of exposition.
an innovation of the conditional correlation matrix as

Q_t = (1 − a − b)Q̄ + a u_{t−1} u_{t−1}' + b Q_{t−1},
(123.59)

where Q̄ is a matrix of location parameters. If a ≥ 0, b ≥ 0, a + b < 1 and Q̄ is positive definite, then R_t is also positive definite. Hence, R_t can be expressed in terms of Q_t as R_t = (ρ_{ij,t}) = diag(q_{11,t}, . . . , q_{nn,t})^{−1/2} Q_t diag(q_{11,t}, . . . , q_{nn,t})^{−1/2}, where ρ_{ij,t} is given by (123.49). The conditional correlations are dynamically driven by the process of Q_t. The conditional covariances are obtained accordingly as h_{ij,t} = ρ_{ij,t} (h_{ii,t} h_{jj,t})^{1/2}.

This model enables us to investigate three aspects of the interdependence of different financial markets within a single unified model: (i) mean spillover effects, (ii) DCCs, and (iii) volatility spillover effects. Each of the three quantities measures some aspect of the degree of integration in the East Asian financial markets. We employ the one-step maximum likelihood method proposed by Bauwens and Laurent (2005), as explained in Section 123.4.

123.5.1.2 Estimates of parameters

We focus on the mean spillover effects from the global (US), Japanese, and aggregate regional markets and from the individual local market itself to the individual local market, which correspond respectively to the parameters γ41 (Global to Local), γ42 (Japan to Local), γ43 (Regional to Local) and γ44 (Local to Local) in the last row of Γ_1. These coefficients represent the intertemporal dependency across the markets. The estimated results are shown in Table 123.2 and reveal the following facts. (i) The effects of the global market (γ41) on the conditional mean of the local market are significant for all the emerging East Asian countries except Indonesia and China. The global market yields in the previous period affect the local market yields positively in the present period, except for Indonesia and China. In particular, given the small absolute value of its t-statistic, China is not affected by the global market. (ii) The Japanese effect (γ42) is significant only for Singapore.
Movements in the Japanese bond market barely affect the emerging East Asian markets. (iii) The aggregated regional effects (γ43) are significant for the three markets of Singapore, Thailand, and China, but the regional effect on China is negative and significant. One of the most striking findings is that China differs substantially from the other emerging markets. Chinese bond yields depend strongly on their own past values but not on the global markets, even though China accounts for half of the outstanding bond
values of all emerging East Asian countries. This may reflect the fact that China imposes strict controls on capital flows.

All estimated GARCH parameters (α41 and β41) in equation (123.58) and DCC parameters (a and/or b) in equation (123.59) are highly significant, as seen in Table 123.2. This implies that the DCC-GARCH specification is valid for modeling bond yields in the emerging East Asian markets.

123.5.1.3 Dynamic conditional correlations

The dynamic conditional correlations (DCCs) measure the contemporaneous dependency between two markets but, in contrast to the mean spillover effects, do not necessarily imply causal relations. If the DCCs of the kth local market with the global market (say ρ^{(k)}_{41,t}) are consistently higher over time than those of the jth local market (ρ^{(j)}_{41,t}), we would say that the kth local market is more integrated with the global market than the jth local market is. In the same context, an upward trend in the DCCs of a local market over the sample period (say ρ_{41,t}) shows an increase in integration with the global market. Johansson (2008) and Park and Lee (2011) adopt this approach for analyzing Asian markets, while Connor and Suurlaht (2013) and Syllignakis and Kouretas (2013) apply it to European markets.

We concentrate our analysis on the DCCs between the East Asian local markets and the global, Japanese, and aggregate regional markets, expressed by ρ_{4j,t} for j = 1, 2, and 3, respectively. Figure 123.1 illustrates the behaviors of ρ_{4j,t} for the eight East Asian local markets. Those markets are classified into the same groups as in Figure 123B.1. (i) The Hong Kong and Singapore markets are highly correlated with all external markets over all periods. Their conditional correlations with the global market are the strongest among the three groups and exceed 0.4. The second strongest conditional correlation is with the Japanese market. The regional market has the lowest conditional correlation, but it still exceeds 0.2 for most periods. (ii) For the mid-range-yield markets of Korea, China, Malaysia, and Thailand, the conditional correlations are generally weaker than those of the low-yield markets. The correlations with the regional market are comparable to those with the global market. The correlations with the Japanese market are the weakest and close to zero for most periods. (iii) For the high-yield markets of Indonesia and the Philippines, the conditional correlations are below 0.2, although those with the regional market are the strongest of the three. As a whole, in most East Asian countries the DCCs with the aggregate region are low and do not exhibit any clear-cut upward trend over the sample period. The extent of bond market integration within the region is limited in terms of the DCCs.
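The DCC paths ρ_{4j,t} discussed here are generated by the recursion (123.59). A minimal simulation sketch follows; the parameter values a = 0.02 and b = 0.95 are illustrative choices inside the typical range a + b ≤ 0.990, a ≤ 0.04 noted earlier, the standardized residuals are simulated rather than estimated, and Python with NumPy is our own choice of tool.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative DCC parameters and unconditional location matrix Qbar.
a, b = 0.02, 0.95
Qbar = np.array([[1.0, 0.3],
                 [0.3, 1.0]])

T = 500
u = rng.standard_normal((T, 2))   # stand-in for standardized residuals u_t
Q = Qbar.copy()
rho = np.empty(T)                 # path of the conditional correlation rho_12,t

for t in range(T):
    d = np.sqrt(np.diag(Q))
    rho[t] = Q[0, 1] / (d[0] * d[1])          # normalize Q_t into R_t, eq. (123.48)
    # Equation (123.59): Q_t+1 = (1-a-b) Qbar + a u_t u_t' + b Q_t.
    Q = (1 - a - b) * Qbar + a * np.outer(u[t], u[t]) + b * Q

# Every R_t along the path is a valid correlation matrix.
assert np.all(np.abs(rho) < 1)
```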
Figure 123.1: Bond conditional correlations of the local market with the global, Japanese and regional markets (ρ4j,t). [Eight panels: Hong Kong, Singapore, Korea, Thailand, Malaysia, China, Philippines, Indonesia; lines show the global (ρ41,t), Japanese (ρ42,t) and regional (ρ43,t) correlations over the sample period.]
123.5.1.4 Volatility spillover effects

The volatility spillover effects (ξ_{4j,t}), called the dynamic conditional variance decompositions, measure the contemporaneous causal relations of volatility from the jth market (j = global, Japanese, and aggregate regional markets) to the local market. Larger values of ξ_{4j,t} imply that the local market is more integrated with the jth external market. The conditional variance of the error term for the emerging East Asian local market is decomposed into the weighted sum of the conditional variances of the independent idiosyncratic shocks as a special case of equation (123.55):

h_{44,t} = E(ε²_{4,t}|I_{t−1}) = φ²_{41,t} σ²_{1,t} + φ²_{42,t} σ²_{2,t} + φ²_{43,t} σ²_{3,t} + σ²_{4,t}.
(123.60)

The volatility spillover effects from the jth market to the conditional variance of the local market at time t are given by

ξ_{4j,t} = φ²_{4j,t} σ²_{j,t} / h_{44,t} for j = 1, 2, 3, and ξ_{44,t} = 1 − ξ_{41,t} − ξ_{42,t} − ξ_{43,t},
(123.61)

with 0 ≤ ξ_{4j,t} ≤ 1 for j = 1, . . . , 4. The quantities ξ_{4j,t} indicate the relative contribution of volatility from the jth market to the local market; larger values of ξ_{4j,t} imply a higher level of integration of the local market with the jth market.

Figure 123.2 illustrates the relative contribution of each factor to the volatility of the individual local markets. The results reveal the following facts. (i) The local market's intrinsic factor (ξ_{44,t}) is dominant and exceeds 60% for all emerging East Asian markets over all sample periods. (ii) The spillover from the global factor is relatively large in the low-yield markets (Hong Kong and Singapore) but is less than 10% for the mid- and high-yield markets for all periods. (iii) The Japanese market's contributions are negligible for all local markets. (iv) The regional factor's contribution is about 5% in the mid-range-yield markets. These observations imply that intraregional integration remains low, but that markets such as Hong Kong and Singapore are more integrated with the global market than with intraregional markets. Furthermore, there is no clear upward trend in bond market integration in terms of volatility spillover effects.

Table 123.3 provides an overview of integration for the emerging East Asian bond markets by examining the volatility spillover effects from the external markets to the individual local markets, averaged over the sample period. The bond markets in Hong Kong and Singapore are highly integrated with the global market, whereas those in China, Indonesia, and the
Figure 123.2: Bond volatility spillover effects from the external markets (ξ4j,t). [Eight panels: Hong Kong, Singapore, Korea, Thailand, Malaysia, China, Philippines, Indonesia; stacked percentage contributions of the global (ξ41,t), Japanese (ξ42,t) and regional (ξ43,t) factors over the sample period.]
Table 123.3: Bond averaged volatility spillover effects from the external markets to the emerging Asian individual local markets.

              Global    Japan    Regional    Local
Hong Kong     36.50     3.51     1.55        58.44
Singapore     24.24     2.58     1.55        71.63
Korea          8.29     1.02     4.92        85.78
Thailand       8.93     0.36     7.19        83.52
Malaysia       2.03     0.20     4.78        92.99
China          1.16     0.18     0.58        98.08
Philippines    0.40     0.41     0.56        98.63
Indonesia      0.45     0.47     0.88        98.20

Note: Numerical values indicate the volatility spillover effects averaged over the sample period: ξ̄_j = (1/T) Σ_{t=1}^{T} ξ_{j,t} for j = global, Japan, regional and local markets.
Philippines are not integrated with any external markets. Integration within the region is still limited in terms of volatility spillover effects.

123.5.1.5 Implications

The value of LCY bonds outstanding in the emerging East Asian markets has increased rapidly since the Asian financial crisis of 1997–98. By 2012, emerging East Asia's share of world LCY bonds even surpassed that of advanced European economies such as France, Germany, and the UK. Emerging East Asian LCY bonds are now an indispensable asset class for global investors. However, despite the historical facts discussed in Appendix 123A, our investigation based on the DCC-GARCH model clarifies that regional integration remains limited in terms of both the DCCs and the dynamic conditional variance decomposition (volatility spillover) of the bond yields. Neither the conditional correlations nor the volatility spillovers exhibit upward trends over the sample period; they remain roughly at the same level. However, Hong Kong and Singapore are highly integrated with the global and Japanese markets, as depicted in Figures 123.1 and 123.2. Nevertheless, East Asian countries other than Hong Kong and Singapore still have low cross-border bond holdings. Spiegel (2012), Bhattacharyay (2013) and Lee and Takagi (2014) assess the bond markets of the ASEAN economic community and indicate the necessity of reforms to create more efficient and stable financial systems.

123.5.2 Application to the stock markets

This section applies the DCC-GARCH model to emerging East Asian stock market integration in a manner analogous to the bond markets.
Importantly, we do not intend to provide a full investigation of stock market integration in this region; rather, we provide a sketch of how the DCC-GARCH model works in analyzing stock market integration.14 We analyze the differences of the logarithmic stock price indices using the data described in Appendix 123B. The four variables for the global, the Japanese, the aggregate regional, and the τth local markets are denoted by Y_t = (Δ log P_t^G, Δ log P_t^J, Δ log P_t^{EA(τ)}, Δ log P_t^τ)'. We start with the data generating process of Y_t in equation (123.57), the same model as that used for the bond yields.

123.5.2.1 Estimates of parameters

Table 123.4 provides the estimates of parameters for the stock markets. We examine the results in comparison with those of the bond markets shown in Table 123.2. (i) The effects of the global market (γ41) on the conditional mean of the local market are positively significant for all the emerging East Asian countries. Unlike the bond markets, the global market affects the Chinese stock market. (ii) The Japanese market (γ42) negatively affects all emerging East Asian markets, and the estimated coefficients for four countries are significant. (iii) The aggregated regional effects (γ43) are significant for the three markets of Hong Kong, Malaysia and Indonesia. All estimated GARCH parameters (α41 and β41) in equation (123.58) and DCC parameters (a and b) in equation (123.59) are highly significant, as shown in Table 123.4. This implies that the DCC-GARCH specification is also valid for modeling stock returns in the emerging East Asian markets, as it is for the region's bond markets.

123.5.2.2 Dynamic conditional correlations

We analyze the dynamic conditional correlations (DCCs) between the East Asian local markets and the global, Japanese, and aggregate regional markets, expressed by ρ^{stock}_{4j,t} for j = 1, 2, and 3, respectively, in a manner analogous to the bond markets. Figure 123.3 illustrates the behaviors of ρ^{stock}_{4j,t} for the eight East Asian local markets. First, in comparison to Figure 123.1, we can visually observe that ρ^{stock}_{4j,t} > ρ^{bond}_{4j,t} for j = 1, 2, 3 and for almost all t from 27 September 2003 to 31 December 2012 (the interval of common sample periods). This observation is the most striking fact for the emerging East Asian

14 The same authors are preparing a study of East Asian stock market integration in a different research paper.
Table 123.4: Estimates of parameters for the stock markets.

              γ41       γ42       γ43       γ44       α41       α42       β41       a         b
Hong Kong     0.297*    −0.076*   0.048*    −0.078*   0.100*    0.020     0.831*    0.024*    0.944*
              (0.04)    (0.03)    (0.02)    (0.03)    (0.02)    (0.02)    (0.03)    (0.01)    (0.02)
Singapore     0.252*    −0.031    0.004     0.011     0.039     0.058*    0.895*    0.018*    0.943*
              (0.04)    (0.02)    (0.03)    (0.04)    (0.02)    (0.02)    (0.02)    (0.01)    (0.05)
Korea         0.253*    −0.011    −0.041    −0.077*   0.113*    0.052     0.812*    0.030*    0.881*
              (0.05)    (0.03)    (0.03)    (0.03)    (0.03)    (0.04)    (0.03)    (0.01)    (0.03)
Thailand      0.202*    −0.023    0.011     −0.029    0.132*    0.085*    0.789*    0.028*    0.875*
              (0.05)    (0.03)    (0.03)    (0.04)    (0.03)    (0.03)    (0.05)    (0.01)    (0.03)
Malaysia      0.087*    −0.035*   0.068*    0.011     0.124*    0.058     0.807*    0.013*    0.935*
              (0.03)    (0.02)    (0.02)    (0.04)    (0.03)    (0.04)    (0.03)    (0.00)    (0.03)
China         0.310*    −0.097*   −0.084    0.035     0.152*    0.028     0.772*    0.021*    0.950*
              (0.07)    (0.04)    (0.07)    (0.05)    (0.03)    (0.04)    (0.05)    (0.01)    (0.02)
Philippines   0.215*    −0.049    0.030     −0.077*   0.065*    0.001     0.918*    0.022*    0.921*
              (0.04)    (0.04)    (0.04)    (0.04)    (0.02)    (0.02)    (0.02)    (0.01)    (0.04)
Indonesia     0.115*    −0.010    0.108*    −0.110*   0.062*    0.075*    0.854*    0.035*    0.882*
              (0.05)    (0.04)    (0.05)    (0.04)    (0.02)    (0.03)    (0.03)    (0.01)    (0.03)
Note: The parameters of γ41 , γ42 , γ43 and γ44 respectively indicate the coefficients for the emerging East Asian local market in equation (123.57). Standard errors are in parentheses. The asterisks denote significance at the 5% level. Almost all estimates of α41 , α42 , β41 , a and b are highly significant.
financial markets. The individual stock markets in this region are highly integrated with the intra-regional, Japanese and global markets, compared with the bond markets, in terms of the DCCs. In particular, China shows ρ^{bond}_{4j,t} < 0.2 for j = 1, 2, 3 for almost all t, but ρ^{stock}_{4j,t} > 0.4 for j = 1, 2, 3 for all t. While China's bond markets have been severely segregated from the external markets, as seen in Figure 123.1, its stock markets are integrated with the external markets. Second, the DCCs with the intra-regional stock markets (ρ^{stock}_{43,t}: dotted lines in Figure 123.3) are the highest of the three and exceed 0.4 for all countries. The individual stock markets correlate most strongly with the intra-regional markets in terms of the DCCs. As a whole, in most East Asian countries the DCCs with the aggregate region are high over all sample periods. However, the stock markets in this region do not exhibit any clear-cut upward trend.

123.5.2.3 Volatility spillover effects

Figure 123.4 illustrates the relative contribution of each factor to the volatility of the individual local markets. In contrast to the bond markets, the stock markets reveal the following facts: (i) The local market's intrinsic factor (ξ_{44,t}) is no longer overwhelmingly dominant and is less than 50% for
Figure 123.3: Stock conditional correlations of the local market with the global, Japanese and regional markets (ρ4j,t). [Eight panels: Hong Kong, Singapore, Korea, Thailand, Malaysia, China, Philippines, Indonesia; lines show the global (ρ41,t), Japanese (ρ42,t) and regional (ρ43,t) correlations over the sample period.]
Figure 123.4: Stock volatility spillover effects from the external markets (ξ4j,t). [Eight panels: Hong Kong, Singapore, Korea, Thailand, Malaysia, China, Philippines, Indonesia; stacked percentage contributions of the global (ξ41,t), Japanese (ξ42,t) and regional (ξ43,t) factors over the sample period.]
Table 123.5: Stock averaged volatility spillover effects from the external markets to the emerging Asian individual local markets.
              Global   Japan   Regional   Local
Hong Kong      36.27   10.56     33.94    19.24
Singapore      39.04   11.42     20.90    28.64
Korea          26.07   14.24     16.91    42.79
Thailand       19.75    9.38     16.96    53.92
Malaysia       18.99    7.88     18.38    54.75
China          28.65    8.46     28.21    34.69
Philippines    16.99    8.03     12.03    62.96
Indonesia      19.31    7.89     20.89    51.91
Note: Numerical values indicate the averaged volatility spillover effects over the sample period: ξ̄j = (1/T) Σ_{t=1}^{T} ξj,t for j = global, Japan, regional and local markets.
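Because the spillover effects are a dynamic variance decomposition, the Global, Japan, Regional, and Local shares in Table 123.5 should sum to 100% for every local market. A quick check of the printed values (small discrepancies are due to rounding):

```python
# Rows of Table 123.5: averaged volatility spillover shares (%) per local
# stock market, in the order Global, Japan, Regional, Local.
shares = {
    "Hong Kong":   [36.27, 10.56, 33.94, 19.24],
    "Singapore":   [39.04, 11.42, 20.90, 28.64],
    "Korea":       [26.07, 14.24, 16.91, 42.79],
    "Thailand":    [19.75,  9.38, 16.96, 53.92],
    "Malaysia":    [18.99,  7.88, 18.38, 54.75],
    "China":       [28.65,  8.46, 28.21, 34.69],
    "Philippines": [16.99,  8.03, 12.03, 62.96],
    "Indonesia":   [19.31,  7.89, 20.89, 51.91],
}

# Each row is a variance decomposition, so the four shares partition 100%.
for market, row in shares.items():
    assert abs(sum(row) - 100.0) < 0.05, (market, sum(row))
```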
Hong Kong, Singapore, Korea and China for all sample periods. (ii) The spillover from the regional factor is much larger for all individual local markets. (iii) The Japanese market's contributions are small but not negligible for all local markets.

Table 123.5 provides an overview of integration for the emerging East Asian stock markets. The individual local factor in the stock markets is less dominant, accounting for less than 60% in all markets except the Philippines. In contrast, the individual local factor in the bond markets exceeds 60% for all markets except Hong Kong and, in fact, exceeds 90% for four countries (China, Indonesia, Malaysia and the Philippines). Both the global and regional markets have relatively large effects. These observations reveal that the degree of stock market integration in this region is greater than that of the region's bond markets, although the stock markets, like the bond markets, do not exhibit any clear-cut upward trend in integration.

123.5.2.4 Implications

Many studies find that equity markets in Asian emerging economies have increased in terms of both intraregional and global integration since the Asian financial crisis of 1997–98, in contrast to the region's bond markets. See, for example, Park and Lee (2011), Guimarães-Filho and Hee Hong (2016), and Glick and Hutchinson (2013), among others. Park and Lee's (2011) finding, for instance, suggests that emerging Asian equity markets are integrated both regionally and globally, whereas the region's local currency bond markets remain largely segmented from each other as well as from global markets. For the most part, our findings concur with Park and Lee (2011).
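The conditional correlations ρ4j,t and spillovers ξ4j,t reported above are generated by a DCC-GARCH model. As an illustrative sketch only (not the authors' estimation code; the parameters a and b below are assumed values, not estimates), Engle's (2002) correlation recursion Qt = (1 − a − b)Q̄ + a εt−1ε′t−1 + b Qt−1, with Rt obtained by rescaling Qt to a correlation matrix, can be written as:

```python
import numpy as np

def dcc_correlations(eps, a=0.02, b=0.97):
    """Dynamic conditional correlations R_t from standardized residuals.

    eps  : (T, N) array of GARCH-standardized residuals.
    a, b : DCC parameters with a + b < 1 (illustrative values, not estimates).
    """
    T, N = eps.shape
    Q_bar = np.cov(eps, rowvar=False)   # unconditional covariance of eps
    Q = Q_bar.copy()
    R = np.empty((T, N, N))
    for t in range(T):
        if t > 0:
            e = eps[t - 1][:, None]
            Q = (1 - a - b) * Q_bar + a * (e @ e.T) + b * Q
        d = 1.0 / np.sqrt(np.diag(Q))
        R[t] = Q * np.outer(d, d)       # rescale Q_t to a correlation matrix
    return R

rng = np.random.default_rng(0)
R = dcc_correlations(rng.standard_normal((500, 3)))
assert np.allclose(np.diagonal(R, axis1=1, axis2=2), 1.0)  # unit diagonals
```

In a full two-stage estimation, the standardized residuals would come from fitted univariate GARCH models and (a, b) would be estimated by quasi-maximum likelihood; this sketch shows only the correlation recursion itself.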
123.6 Concluding Remarks

This paper briefly reviews multivariate GARCH models in contrast with univariate GARCH models, and illustrates the practical usefulness of the DCC-GARCH model introduced by Engle (2002) through its application to the bond and stock markets of emerging East Asian countries. There is no direct extension of the univariate GARCH model to a multivariate framework because (i) the conditional covariance matrix must satisfy a positive definiteness condition, and (ii) the specification of the conditional covariance matrix generally requires quite a large number of parameters when the number of variables is large, which often makes the estimation procedure numerically intractable. The DCC-GARCH model ingeniously compromises between two contrary requirements for constructing a model: being sufficiently flexible to capture the behavior of the actually observed data process, and sufficiently parsimonious for statistical analysis in practice. The DCC-GARCH model can evaluate the comovements of different markets by way of dynamic variance decomposition (volatility spillovers) in addition to the DCCs. The empirical investigation in this paper clarifies that bond market integration in the region remains limited in terms of both the DCCs and volatility spillovers, while there is a high degree of integration in the stock markets both regionally and globally. The DCC-GARCH model is promising for a wide range of financial time series applications.

Acknowledgments

The authors gratefully acknowledge the financial support of Grants-in-Aid 18H00851 and 26380403 from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Bibliography

Aielli, G.P. (2013). Dynamic Conditional Correlation: On Properties and Estimation. Journal of Business and Economic Statistics 31, 282–299.
Aielli, G.P. and Caporin, M. (2014). Variance Clustering Improved Dynamic Conditional Correlation MGARCH Estimators. Computational Statistics & Data Analysis 76, 556–576.
Amado, C. and Terasvirta, T. (2014). Conditional Correlation Models of Autoregressive Conditional Heteroskedasticity with Nonstationary GARCH Equations. Journal of Business & Economic Statistics 32, 69–87.
Asian Development Bank (2017). Financial Integration. Asian Economic Integration Report 2017 (Chapter 4), Manila, Philippines, pp. 39–56.
Audrino, F. (2014). Forecasting Correlations During the Late-2000s Financial Crisis: Short-run Component, Long-run Component, and Structural Breaks. Computational Statistics & Data Analysis 76, 43–60.
Bauwens, L., Grigoryeva, L. and Ortega, J.P. (2016). Estimation and Empirical Performance of Non-scalar Dynamic Conditional Correlation Models. Computational Statistics & Data Analysis 100, 17–36.
Bauwens, L., Hafner, C.M. and Rombouts, J.V.K. (2007). Multivariate Mixed Normal Conditional Heteroskedasticity. Computational Statistics & Data Analysis 51, 3551–3566.
Bhattacharyay, B.N. (2013). Determinants of Bond Market Development in Asia. Journal of Asian Economics 24, 124–137.
Bauwens, L. and Laurent, S. (2005). A New Class of Multivariate Skew Densities, with Application to Generalized Autoregressive Conditional Heteroscedasticity Models. Journal of Business and Economic Statistics 23, 346–354.
Bauwens, L., Laurent, S. and Rombouts, J. (2006). Multivariate GARCH Models: A Survey. Journal of Applied Econometrics 21, 79–109.
Bauwens, L., Hafner, C.M. and Pierret, D. (2013). Multivariate Volatility Modeling of Electricity Futures. Journal of Applied Econometrics 28, 743–761.
Beirne, J., Caporale, G.M., Schulze-Ghattas, M. and Spagnolo, N. (2013). Volatility Spillovers and Contagion from Mature to Emerging Stock Markets. Review of International Economics 21, 1060–1075.
Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics 31, 307–327.
Bollerslev, T. (1990). Modeling the Coherence in Short-Run Nominal Exchange Rates: A Multivariate Generalized ARCH Model. The Review of Economics and Statistics 72, 498–505.
Bollerslev, T., Engle, R.F. and Wooldridge, J.M. (1988). A Capital Asset Pricing Model with Time Varying Covariances. Journal of Political Economy 96, 116–131.
Bollerslev, T., Engle, R.F. and Nelson, D.B. (1994). ARCH Models. In R.F. Engle and D. McFadden (eds.), Handbook of Econometrics, Vol. 4, North-Holland, Amsterdam, pp. 2959–3038.
Caporin, M. and McAleer, M. (2013). Ten Things You Should Know About the Dynamic Conditional Correlation Representation. Econometrics 1, 115–126.
Connor, G. and Suurlaht, A. (2013). Dynamic Stock Market Covariances in the Eurozone. Journal of International Money and Finance 37, 353–370.
Ding, Z. and Engle, R.F. (2001). Large Scale Conditional Covariance Matrix Modeling, Estimation and Testing. Academia Economic Papers 29, 157–184.
de Almeida, D., Hotta, L.K. and Ruiz, E. (2018). MGARCH Models: Trade-off between Feasibility and Flexibility. International Journal of Forecasting 34, 45–63.
Engle, R.F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica 50, 987–1007.
Engle, R.F. (2002). Dynamic Conditional Correlation: A Simple Class of Multivariate GARCH Models. Journal of Business and Economic Statistics 20, 339–350.
Engle, R.F. (2009). Anticipating Correlations: A New Paradigm for Risk Management, Princeton University Press, Princeton, NJ.
Engle, R.F. and Kroner, K.F. (1995). Multivariate Simultaneous Generalized ARCH. Econometric Theory 11, 122–150.
Francq, C. and Zakoian, J.M. (2010). GARCH Models: Structure, Statistical Inference and Financial Applications, John Wiley & Sons, Hoboken, NJ.
Glick, R. and Hutchinson, M. (2013). China's Financial Linkages with Asia and the Global Financial Crisis. Federal Reserve Bank of San Francisco Working Paper Series, 2013-12.
Glosten, L., Jagannathan, R. and Runkle, D. (1993). On the Relation between the Expected Value and Volatility of the Nominal Excess Return on Stocks. Journal of Finance 48, 1779–1801.
Grier, K.B. and Smallwood, A.D. (2013). Exchange Rate Shocks and Trade: A Multivariate GARCH-M Approach. Journal of International Money and Finance 37, 282–305.
Gourieroux, C. (1997). ARCH Models and Financial Applications, Springer, New York.
Guimarães-Filho, R. and Hee Hong, G. (2016). Dynamic Connectedness of Asian Equity Markets. IMF Working Paper, WP/16/57.
Hafner, C.M. and Franses, P.H. (2009). A Generalized Dynamic Conditional Correlation Model: Simulation and Application to Many Assets. Econometric Reviews 28, 612–631.
Hamilton, J.D. (1994). Time Series Analysis, Princeton University Press, Princeton, NJ.
Hecq, A., Laurent, S. and Palm, F.C. (2016). On the Univariate Representation of BEKK Models with Common Factors. Journal of Time Series Econometrics 8, 91–113.
Jeantheau, T. (1998). Strong Consistency of Estimators of Multivariate ARCH Models. Econometric Theory 14, 70–86.
Johansson, A.C. (2008). Interdependencies among Asian Bond Markets. Journal of Asian Economics 19, 101–116.
Kim, S. and Lee, J.W. (2012). Real and Financial Integration in East Asia. Review of International Economics 20, 332–349.
Laurent, S., Rombouts, J.V.K. and Violante, F. (2013). On Loss Functions and Ranking Forecasting Performances of Multivariate Volatility Models. Journal of Econometrics 173, 1–10.
Lee, C.L. and Takagi, S. (2014). Assessing the Financial Landscape for the Association of Southeast Asian Nations Economic Community, 2015. Asia and the Pacific Policy Studies 2, 116–129.
Ledoit, O., Santa-Clara, P. and Wolf, M. (2003). Flexible Multivariate GARCH Modeling with an Application to International Stock Markets. Review of Economics and Statistics 85, 735–747.
Nelson, D.B. (1991). Conditional Heteroscedasticity in Asset Returns: A New Approach. Econometrica 59, 347–370.
Noureldin, D., Shephard, N. and Sheppard, K. (2014). Multivariate Rotated ARCH Models. Journal of Econometrics 179, 16–30.
Palm, F. (1996). GARCH Models of Volatility. In C.R. Rao and G.S. Maddala (eds.), Handbook of Statistics, Vol. 14, North-Holland, Amsterdam, pp. 209–240.
Park, C.Y. and Lee, J.W. (2011). Financial Integration in Emerging Asia: Challenges and Prospects. Asian Economic Policy Review 6, 176–198.
Silvennoinen, A. and Terasvirta, T. (2009). Multivariate GARCH Models. In T.G. Andersen, R.A. Davis, J.P. Kreiss and T. Mikosch (eds.), Handbook of Financial Time Series, Springer-Verlag, Berlin, pp. 201–229.
Sims, C.A. (1980). Comparison of Interwar and Postwar Business Cycles. American Economic Review 70, 250–257.
Skintzi, V.D. and Refenes, A.N. (2006). Volatility Spillovers and Dynamic Correlation in European Bond Markets. Journal of International Financial Markets, Institutions & Money 16, 23–40.
Spiegel, M.M. (2012). Developing Asian Local Currency Bond Markets: Why and How. In Kawai et al. (eds.), Implications of the Global Financial Crisis for Financial Reform and Regulation in Asia, Edward Elgar, Cheltenham, pp. 221–247.
Syllignakis, M.N. and Kouretas, G.P. (2013). Dynamic Correlation Analysis of Financial Contagion: Evidence from the Central and Eastern European Markets. International Review of Economics and Finance 20, 717–732.
Tsukuda, Y., Shimada, J. and Miyakoshi, T. (2017). Bond Market Integration in East Asia: Multivariate GARCH with Dynamic Conditional Correlations Approach. International Review of Economics and Finance 51, 193–213.
Appendix 123A Some Basic Facts on the Emerging East Asian Financial Markets

Appendix 123A explains some fundamental facts about the emerging East Asian financial markets. These facts provide basic background knowledge for understanding the importance of the financial markets in this region.

East Asia has grown rapidly during the past 25 years and is currently recognized as the growth center of the world economy, despite the setbacks of the Asian financial crisis in 1997–98 and the global financial crisis in 2007–08. Table 123A.1 indicates the share of GDP of each country and region in the world GDP. The GDP share of emerging East Asia increased dramatically during the past nearly three decades. In particular, China, starting from a relatively insignificant 1.2% in 1990, became the second largest economy (15.2%) next to the US (24.0%) in 2017. The GDP of China averaged 10% annual growth over the past 30 years. The ASEAN5's share is small but steadily increasing. By contrast, the share of the US peaked at 30.6% in 2000 and thereafter decreased to 24% in 2017, although the US still retains the top position in the world. Japan has lost GDP share during this period.
Table 123A.1: Share of GDP in the world (%).

                          1990    2000    2010    2017
US                        26.5    30.6    22.7    24.0
Europe & Central Asia     39.1    29.8    31.7    26.6
Japan                     13.9    14.6     8.6     6.0
Emerging East Asia         4.2     7.5    14.0    20.5
  of which China           1.2     3.6     9.2    15.2
  of which Korea           1.2     1.7     1.7     1.9
  of which ASEAN5          1.4     1.7     2.7     3.0

Note: ASEAN5 refers to the five largest economies of the ASEAN: Indonesia, Malaysia, the Philippines, Singapore, and Thailand. GDP values are measured in US dollars at nominal exchange rates. Source: World Bank.
123A.1 Bond markets

Emerging East Asia local currency (LCY) bonds have become an indispensable asset class for global investors. We focus on: (i) the value of LCY bonds outstanding in the region compared with the world bond markets, and (ii) the growth in size, particularly the relative size of the bond markets to GDP.

First, the value of LCY bonds outstanding in the region is the third largest in the world behind the US and Japan. Table 123A.2 shows the value of LCY bonds outstanding in emerging East Asia as a share of the world total. The share for emerging East Asia reached 8.8% in March 2012, which is higher than that of France (5.2%), Germany (3.8%), and the UK (2.7%). China and Korea continued to be the largest bond markets in the region apart from Japan, accounting for 5.1% and 1.9% of the global total, respectively.

Second, the bond markets have grown rapidly given regional efforts and the commitments of individual countries in the region. Most countries in the region have increased their ratios of market size to GDP in addition to their overall size. Figure 123A.1(a) illustrates the value of LCY bonds outstanding in the eight emerging East Asian markets since the end of December 2000. China's market is the largest and accounts for more than half the total. Korea and Hong Kong are the second and third largest, respectively, and the ASEAN5 markets are the smallest. The LCY bond markets provide an alternative channel for financing in the region in addition to the banking system. Figure 123A.1(b) illustrates the value of LCY bonds outstanding relative to GDP. The relative size as measured by the ratio of bonds outstanding to GDP exhibits different properties. Korea and Malaysia have the highest shares, and Indonesia the lowest.

Table 123A.2: LCY bonds outstanding in the major markets.

                      LCY bonds outstanding (US$ billions)   % of world total
US                                26,391                           38.7
Japan                             11,897                           17.4
France                             3,574                            5.2
Germany                            2,621                            3.8
UK                                 1,823                            2.7
Emerging East Asia                 5,886                            8.8
  of which China                   3,448                            5.1
  of which Korea                   1,290                            1.9
  of which ASEAN5                    957                            1.4

Source: Asia Bond Monitor, November 2012.
[Figure: (a) LCY bonds outstanding (US$ billions) by market (China, Korea, Hong Kong, ASEAN), 2000–2012; (b) ratio of LCY bonds outstanding to GDP (%), 2000–2012, for China, Korea, Hong Kong, Singapore, Thailand, Malaysia, the Philippines, and Indonesia.]

Figure 123A.1: LCY bonds outstanding in emerging East Asian countries.
Note: ASEAN5 refers to Singapore, Thailand, Malaysia, the Philippines, and Indonesia.
Sources: Bond outstandings are taken from Asian Bonds Online, and GDPs from World Bank.
123A.2 Stock markets

We now look at an overview of the stock markets in the emerging East Asia region. Table 123A.3 shows the market capitalization in 2017 for emerging East Asia as a share of the world total. The share of emerging East Asia is 22.0%, the second highest next to the US (40.5%). China alone (11.0%) exceeds Japan (7.9%). Within emerging East Asia, China is the largest, followed by Hong Kong. The stock market share of the region in the world total is more than double the corresponding share of its bond markets.
Table 123A.3: Market capitalization of listed domestic companies.

                          Market cap (US$ billions)   % of world total
US                                32,121                    40.5
Europe and Central Asia           11,065                    14.0
Japan                              6,223                     7.9
Emerging East Asia                17,436                    22.0
  of which China                   8,711                    11.0
  of which Hong Kong               4,351                     5.5
  of which Korea                   1,772                     2.2
  of which ASEAN5                  2,603                     3.3

Note: ASEAN5 refers to the five largest economies of the ASEAN: Indonesia, Malaysia, the Philippines, Singapore, and Thailand. China includes neither Hong Kong nor Macau. Data are end-of-year values converted to US dollars using corresponding year-end foreign exchange rates. Source: Datastream, World Bank WDI.
Figure 123A.2(a) illustrates the market capitalization for emerging East Asia from the end of December 2003 to December 2017. These markets grew more than eightfold during this period, from 2 trillion US dollars in 2003 to some 17 trillion US dollars in 2017, in spite of the world financial crisis triggered by the bankruptcy of Lehman Brothers in 2008, in which more than half the market capitalization of the region was lost. The Chinese market is the largest and accounts for about half of the region's total. Hong Kong and ASEAN are the second and third largest. Figure 123A.2(b) plots the market capitalization relative to GDP, and this reveals a somewhat different picture from the overall market capitalization. The relative market capitalization of Hong Kong (shown on the right-hand axis) is much higher than that of the others, mainly because many mainland Chinese companies list on the Stock Exchange of Hong Kong.15 China exhibits the lowest market capitalization relative to GDP throughout most of the sample period.

15 About half of the companies listed on the Stock Exchange of Hong Kong (SEHK) are mainland China based. In fact, as of 2017, the SEHK had 2118 listed companies, 1051 of which are from mainland China (Red chip, H share and P chip). The market capitalization of mainland enterprises reaches 22.5 trillion Hong Kong dollars and accounts for 66.24% of the total capitalization of the SEHK. (Source: Hong Kong Exchanges and Clearing Limited (2017), HKEX Fact Book 2017.)
[Figure: (a) market capitalization (US$ billions) by market (China, Hong Kong, Korea, ASEAN5), 2003–2017; (b) market capitalization relative to GDP (%), 2003–2017, for China, Korea, Singapore, Thailand, Malaysia, the Philippines, Indonesia, and Hong Kong (right axis).]

Figure 123A.2: Market capitalization of listed domestic companies.
Note: ASEAN5 refers to Singapore, Thailand, Malaysia, the Philippines, and Indonesia.
Sources: Datastream, World Bank WDI.
Appendix 123B Construction of Data and Descriptive Statistics

Appendix 123B describes in detail the data sets used in Section 123.5. The data sets comprise the bond yields and stock returns for the global (US) market, the Japanese market, the aggregate regional market, and the individual local markets of eight emerging East Asian countries consisting of the ASEAN5 (Indonesia, Malaysia, the Philippines, Singapore, and Thailand), China, South Korea, and Hong Kong. We use weekly bond yield (stock return) indices in LCY for our analysis. Our sample covers the weeks from January 1, 2001 to December 31, 2012 for the bond yields and from September 27, 2003 to June 27, 2018 for the stock markets.
We use the following notations: R_t^G, R_t^J, R_t^{EA(τ)}, and R_t^{(τ)} for the yields (or P_t^G, P_t^J, P_t^{EA(τ)}, and P_t^{(τ)} for the stock price indices), respectively, for the global (US), the Japanese, the aggregate regional, and the individual local markets at time t.16 We follow the idea of Skintzi and Refenes (2006) in constructing the yield (stock price) on the aggregate regional market against the τth local market. It is a weighted average of the yields (stock prices) on the intraregional cross-border markets at time t:

R_t^{EA(τ)} = Σ_{j=1, j≠τ}^{8} w_{j,t}^{τ} R_t^{(j)},  where w_{j,t}^{τ} = MCap_{j,t} / Σ_{j=1, j≠τ}^{8} MCap_{j,t}

and MCap_{j,t} is the market capitalization of the jth bond (stock) market measured in US dollars.

First, we look at the behaviors of the bond yields. Figures 123B.1(a)–(c) illustrate the behaviors of the bond yields over the sample periods. The term "Asia" in the figures denotes the market capitalization weighted average yields over the eight emerging East Asian markets, shown in each panel as a benchmark for convenience of comparison. We observe the following characteristics. (i) Bond yields in all markets fluctuate over time. (ii) Emerging East Asian local markets are classified into three groups according to the level of yield in the period immediately following the Asian financial crisis of 1997–98: low-yield markets (Hong Kong and Singapore), mid-range-yield markets (Korea, Malaysia, Thailand, and China), and high-yield markets (Indonesia and the Philippines). (iii) Bond yields for most markets tend to converge to a range of 2–6 percentage points, but in many markets, yields fluctuate more wildly during the global financial crisis of 2007–08.

Table 123B.1 reports descriptive statistics for the log-difference yields. Table 123B.1 confirms the stylized facts on asset yields, in the form of weakly significant skewness, high kurtosis, and strongly significant autocorrelations in squared yields.
Table 123B.2 shows the contemporaneous unconditional correlations between the log-difference yields of different markets. The eight emerging East Asian local markets are ordered according to the degree of correlation with the global market. Hong Kong and Singapore have the highest correlations, the Philippines and Indonesia have the lowest, and the remaining countries fall in the middle of the range. The grouping of countries based on the unconditional correlation with the global market coincides with that
16 The data sources are mainly from "Annual yields" in the iBoxx ABF Index Family, Asia Bonds Online published by the Asian Development Bank, and FTSE All Cap Indexes in local currencies. See Tsukuda et al. (2017) for the sources in more detail.
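The market-cap-weighted construction of the aggregate regional yield against the τth local market can be sketched as follows (the capitalizations and yields below are illustrative, not the actual data):

```python
import numpy as np

def regional_aggregate(R_t, mcap_t, tau):
    """Market-cap-weighted regional yield R_t^{EA(tau)}, excluding market tau.

    R_t    : length-8 array of local-market yields at time t.
    mcap_t : length-8 array of bond-market capitalizations (US$).
    tau    : index of the local market to exclude.
    """
    mask = np.arange(len(R_t)) != tau
    w = mcap_t[mask] / mcap_t[mask].sum()   # weights over the other markets
    return float(w @ R_t[mask])

yields = np.array([1.5, 2.0, 3.5, 3.0, 3.2, 3.8, 7.5, 8.2])  # illustrative
mcap = np.array([5.0, 3.0, 8.0, 2.0, 4.0, 10.0, 1.0, 6.0])   # illustrative
r_ea = regional_aggregate(yields, mcap, tau=0)
# The aggregate lies between the smallest and largest included yields.
assert min(yields[1:]) <= r_ea <= max(yields[1:])
```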
[Figure: bond yields (%) over 01/01–01/11 in three panels: (a) low-yield markets (Hong Kong, Singapore) together with the Global and Japanese yields; (b) mid-range-yield markets (Korea, China, Malaysia, Thailand); (c) high-yield markets (Philippines, Indonesia); each panel also plots the "Asia" benchmark.]

Figure 123B.1: Bond yields.
Note: "Asia" denotes the market capitalization weighted average yields over the eight emerging East Asian markets considered in this paper. The yields of "Asia" are shown in each panel as a benchmark for ease of comparison.
Sources: Bond outstandings are taken from Asian Bonds Online, and GDPs from World Bank.
Table 123B.1: Descriptive statistics for the log-difference bond yields.

              Mean    Std. dev   Skew    Kurt     Min     Max    Q(4)   Q(4)-2
Global       −0.002    0.041     0.05    2.92    −0.20    0.18    10.9   108.8
Japan        −0.002    0.065     1.57    8.39    −0.16    0.50    19.0    45.1
Hong Kong    −0.003    0.049     0.20    2.04    −0.21    0.22     7.5   113.8
Singapore    −0.001    0.036     0.52    4.07    −0.14    0.20     5.7    78.4
Korea        −0.001    0.026     0.37    4.00    −0.12    0.14     7.4    20.1
Thailand      0.000    0.032     0.63    6.41    −0.19    0.19    37.1    76.5
Malaysia      0.000    0.019     1.11   10.59    −0.08    0.15    38.8    40.1
Philippines  −0.002    0.027     1.88   20.09    −0.11    0.26    12.5    18.9
Indonesia    −0.002    0.028     0.64   14.12    −0.20    0.23     7.1   171.8
China         0.000    0.017     0.16    9.11    −0.12    0.12   106.7    58.1

Note: Q(4) denotes the Ljung–Box statistic with four lags for the log-difference variable, and Q(4)-2 denotes the corresponding statistic for the squares of those variables. The 5% critical value of the Q(4)-statistic is 9.4.

Table 123B.2: Contemporaneous unconditional correlations between bond markets.
              GLO    JPN    HOK    SG     KOR    THA    MAL    PHI    IND    PRC
Global        1.00
Japan         0.33   1.00
Hong Kong     0.60   0.38   1.00
Singapore     0.49   0.31   0.56   1.00
Korea         0.27   0.18   0.25   0.24   1.00
Thailand      0.28   0.14   0.29   0.29   0.32   1.00
Malaysia      0.13   0.06   0.22   0.21   0.22   0.29   1.00
Philippines   0.00   0.02   0.09   0.08   0.05   0.08   0.06   1.00
Indonesia    −0.04  −0.07  −0.01   0.08   0.05   0.19   0.14   0.30   1.00
China         0.10   0.03   0.10   0.01   0.10   0.12   0.09   0.02   0.00   1.00

Note: The eight emerging East Asian local markets are ordered according to the magnitudes of the correlation with the global market.
based on yield levels in Figure 123B.1. Note that China has a small correlation with the global market despite its high level of bonds outstanding among the emerging East Asian countries, as shown by Figure 123A.1(a).

Next, we look at the behaviors of the stock price indices. Figures 123B.2(a) and 123B.2(b) illustrate the stock price indices over the sample period. The indices of all countries are set to 200 on December 31, 2002 as a base value. We observe the following characteristics. (i) Stock prices in all markets exhibit upward trends during these 15 years, although they fluctuate widely over time, with all decreasing temporarily in 2008 at
[Figure: stock price indices (December 31, 2002 = 200) over 9/03–9/17 in two panels: (a) middle and low growth countries (USA, Japan, Hong Kong, Singapore, Korea, Malaysia, Thailand); (b) high growth countries (China, Philippines, Indonesia).]

Figure 123B.2: Stock price indices.
Note: The indices on December 31, 2002 are set to 200 as a base value for all countries.
Source: FTSE All Cap Indices are taken from Datastream, FTSE Russell.
the time of the global financial crisis. (ii) Emerging East Asian local markets are classified into two groups: high growth and low growth countries. The high growth countries (China, Indonesia, and the Philippines) all display values in excess of 1000 at the end of the sample period (June 27, 2018), whereas the low growth countries (Hong Kong, Singapore, Korea, Malaysia, and Thailand) have values less than 1000. The US and Japan belong to the latter group.

Table 123B.3 reports descriptive statistics for the log-difference of the price indices. Table 123B.3 confirms the stylized facts on asset returns. The stock returns exhibit basically the same characteristics as the log-difference bond yields in Table 123B.1. Table 123B.4 shows the contemporaneous unconditional correlations between the stock returns of the different markets. The unconditional correlations of stock returns are generally much higher in comparison to those
Table 123B.3: Descriptive statistics for the stock returns.

              Mean   Std. dev   Skew   Kurt     Min      Max    Q(4)   Q(4)-2
Global       0.137    2.198    −1.24   8.40   −17.21    10.06    3.3     69.6
Japan        0.070    2.934    −0.71   5.12   −20.46    15.44   14.9    103.3
Hong Kong    0.144    2.871    −0.34   4.76   −16.28    15.51   11.1    197.5
Singapore    0.075    2.427    −0.27   7.40   −15.24    16.60   15.5    206.3
Korea        0.168    2.887    −0.32   5.58   −16.53    19.10   17.6    397.1
Thailand     0.135    3.157    −0.53   6.97   −21.40    20.22    8.0    259.9
Malaysia     0.103    1.826    −0.27   4.29    −9.04    11.34    5.5    132.0
Philippines  0.201    3.841    −0.51   4.30   −18.57    21.57   12.6    228.4
Indonesia    0.224    2.962    −0.39   3.27   −15.54    13.57    5.1     76.5
China        0.270    3.531    −0.90   8.49   −25.46    21.75   27.5    364.5

Note: Q(4) denotes the Ljung–Box statistic with four lags for the log-difference variable, and Q(4)-2 denotes the corresponding statistic for the squares of those variables. The 5% critical value of the Q(4)-statistic is 9.4.

Table 123B.4: Contemporaneous unconditional correlations between stock markets.
              GLO    JPN    HOK    SG     KOR    THA    MAL    PHI    IND    CHI
Global        1.00
Japan         0.56   1.00
Hong Kong     0.60   0.62   1.00
Singapore     0.62   0.64   0.83   1.00
Korea         0.50   0.60   0.72   0.70   1.00
Thailand      0.44   0.51   0.63   0.64   0.61   1.00
Malaysia      0.43   0.48   0.64   0.67   0.56   0.54   1.00
Philippines   0.40   0.47   0.57   0.58   0.52   0.55   0.59   1.00
Indonesia     0.44   0.48   0.66   0.67   0.61   0.64   0.59   0.57   1.00
China         0.53   0.55   0.89   0.75   0.68   0.58   0.60   0.51   0.61   1.00
of the log-difference yields as indicated in Table 123B.2. In particular, the correlation between China and Hong Kong is the highest (0.89) among all entries in Table 123B.4, in contrast to that of the yields (0.10) in Table 123B.2. This phenomenon may arise from the special relation of the Hong Kong market to mainland China.17
17 See footnote 14.
Chapter 124
Review of Difference-in-Difference Analyses in Social Sciences: Application in Policy Test Research

William H. Greene and Min (Shirley) Liu

Contents
124.1 Difference-in-Difference . . . . . . . . . . . . . . . . . . . 4256
    124.1.1 Definition . . . . . . . . . . . . . . . . . . . . . . 4256
    124.1.2 Models . . . . . . . . . . . . . . . . . . . . . . . . 4258
124.2 Differencing . . . . . . . . . . . . . . . . . . . . . . . . . 4267
    124.2.1 Basic set up . . . . . . . . . . . . . . . . . . . . . 4267
    124.2.2 Applications . . . . . . . . . . . . . . . . . . . . . 4268
    124.2.3 Strengths and limitations . . . . . . . . . . . . . . 4274
124.3 Additional Discussion . . . . . . . . . . . . . . . . . . . . 4274
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4277
Appendix 124A Theoretical Models . . . . . . . . . . . . . . . . . 4278
Appendix 124B Empirical Models . . . . . . . . . . . . . . . . . . 4278
Abstract

In this chapter, we review the difference-in-difference (DID) method and the first-difference method, which have been widely used in quantitative research designs in the social sciences (e.g., economics, finance, accounting). First, we define the DID and first-difference methods. Then, we explain the models that may be used in the DID and first-difference
William H. Greene
New York University
e-mail: [email protected]

Min (Shirley) Liu
Brooklyn College, CUNY
e-mail: [email protected]
methods and briefly discuss the critical assumptions required when researchers make a causal inference from the results. Next, we use some examples documented in previous studies to illustrate how to apply the DID and first-difference methods in research related to policy implementation. Finally, we compare the DID method to the comparative interrupted time series (CITS) design and briefly introduce two popular methods that researchers have used to create a control sample in order to reduce sample selection bias in a quasi-experimental design: propensity score matching (PSM) and regression discontinuity design (RDD).

Keywords: Difference-in-differences (DID) • First-difference method • Causal inference • Policy analyses.
124.1 Difference-in-Difference

124.1.1 Definition

In general, the difference in differences (DID) design is a statistical (econometric) method used in the quantitative research of the social sciences; typically, it is employed in quasi-experimental designs.1 Using longitudinal data, DID determines the differences in the effects of a treatment on a "treatment group", which is assumed to receive the real treatment, versus on a "control group", which is assumed not to receive the real treatment, in a quasi-experiment or a natural experiment. Specifically, DID quantifies the effect of a treatment (proxied by an independent variable) on an outcome (proxied by a dependent variable) by comparing the average change over time in the outcome of the treatment group to that of the control group. DID requires two or more different time periods of data for a treatment group and a control group. This means that the sample data must include information from a designated period before treatment (pre-treatment) and at least one period after treatment (post-treatment). Selecting the sample period depends on the research question(s) and can be challenging for researchers because including an improper sample may yield misleading results. Researchers could choose one period immediately before (after) the treatment as the pre-treatment (post-treatment) period to reduce the confounding effect on
1 A quasi-experimental design is similar to a true experimental design but lacks a key element: random assignment. Please refer to Section 124.1.2.4 for a detailed discussion on the application of DID in the quasi-experimental design.
Figure 124.1: DID illustration. [Figure: outcome plotted against time, before and after treatment, showing the observed outcome trend in the treated group, the observed outcome trend in the control group, and the constant difference in outcome.]
the treatment and the threat to the parallel assumption.2,3 In the above illustration, the golden (bottom) line represents the outcome over time for the treatment (control) group. In the pre-treatment period, in which neither the treatment nor the control group receives the treatment, the outcomes of the two groups, which are proxied by dependent variables, are measured. After receiving the treatment, the outcomes of the two groups are measured again in the post-treatment period. Because the treatment and control groups do not start out at the same point in the pre-treatment period, the total difference between the treatment and control groups in the post-treatment period cannot be interpreted as the effect of the treatment. Therefore, to estimate the effect of the treatment, DID first computes the "constant" difference, which would still exist in the outcome variables between the two groups if neither group received the treatment. In Figure 124.1, the constant difference is shown as the distance between the bottom line and the dotted line. Then, DID calculates the treatment effect as

2 The parallel trend assumption is the most important assumption when making a causal inference. Please refer to the detailed discussion in Section 124.1.2.3 on the assumptions that need to hold when drawing a causal inference from the results.
3 In some cases, researchers prefer to use one period before (after) the treatment and exclude the treatment year to allow enough time for the treatment to take effect. However, using a long period before (after) the treatment as the pre-(post-)treatment period may involve many factors that confound the effect of the treatment. Moreover, the parallel trend assumption for DID is less likely to hold over a long time-window. Therefore, researchers should be cautious about extrapolating short-term effects to long-term effects.
the difference between the observed outcome and the constant difference, which is shown as the difference between the golden solid line and the golden dotted line in the post-treatment period. This procedure may remove possible confounding effects, such as potential biases in the post-treatment period comparisons between the treatment and control groups, which could stem from permanent differences between the two groups (e.g., the characteristics of the members of the two groups). DID also addresses biases emerging from comparisons over time in the treatment group, which likely result from time trends.

124.1.2 Models

124.1.2.1 Simplest setting for DID analysis

We can use the general formula below to demonstrate the simplest form of the DID model, which includes only two groups and two periods (i.e., a 2 × 2 setting):

Y = β0 + β1 dT + β2 dB + δ1 (dT ∗ dB) + e,  (124.1)
where Y is the outcome of interest. dT is the dummy variable for the post-treatment period (e.g., one for the post-treatment period, zero otherwise), which captures the aggregate changes in Y without the treatment (or policy change). The dummy variable dB captures the possible differences between the treatment and control groups before the treatment is applied (or during the implementation of the policy). The coefficient of interest on dT ∗ dB (the interaction term), δ1, captures the effect of the treatment (or the implementation of the policy) on the treated group in the post-treatment period. The DID estimate can be rewritten as below:

δ̂1 = (ȲT,1 − ȲT,0) − (ȲC,1 − ȲC,0).

The term (ȲT,1 − ȲT,0) represents the estimated (average) change in the outcome for the treatment group from the pre-treatment to post-treatment period. The term (ȲC,1 − ȲC,0) represents the estimated (average) change in the outcome for the control group from the pre-treatment to post-treatment period. Therefore, the DID estimator, δ̂1, captures the difference between the treated and control groups in the change of outcomes from the pre- to post-treatment period. Thus, δ̂1 indicates the estimated average effect of the treatment on the treatment group. To test whether δ̂1 is statistically different from zero, we can use a regression analysis to estimate equation (124.1) above. Then, we find the standard error of δ̂1 to compute its t-statistic. The intercept, β0, is the average outcome before the treatment for the control group. The parameter β1 captures
the changes in both the treated and control groups in the post-treatment period. The coefficient β2 measures any change in the outcome that is not due to the implementation of the treatment. Lastly, e is the error term.

124.1.2.2 General settings for DID analysis: Multiple groups and time periods

The DID method can be extended to more groups and time periods. Assuming that the treatment (policy) has the same effect on subjects every year, we can add a full set of time dummies to equation (124.1). If we relax this assumption, we can include a full set of dummies for each of the two groups and all time periods and pairwise interactions. Then, a treatment (policy) indicator (or continuous) variable can be used to measure the effects of the treatment (policy). Readers may refer to Meyer (1995) for a more detailed discussion of this application. Drawing from the general framework provided by Bertrand, Duflo, and Mullainathan (2004), equation (124.2) below expresses the DID application for multiple groups and periods:

Yigt = At + Bg + Xgt β + Zigt γgt + vgt + uigt,  (124.2)
where i indicates individual, g indicates group, and t indicates time. In this model, At estimates the time effects and Bg estimates the group effects. Xgt represents the covariates between group and time, Zigt denotes the individual-specific covariates, vgt symbolizes unobserved group and time effects, and uigt signifies individual-specific errors. The coefficient of interest is β. In the case of a cluster sample, we can rewrite equation (124.2) as equation (124.3):

Yigt = δgt + Zigt γgt + uigt,  i = 1, . . . , Mgt.  (124.3)
Equation (124.3) shows a model at the individual level, in which both the intercepts and slopes can differ across all (g, t) pairs. Then, we estimate δgt at the group/time-period level, using equation (124.4) below:

δgt = At + Bg + Xgt β + vgt.  (124.4)
Presuming that the individual-level observations are independent, we can estimate (124.5) and ignore vgt (we assume that vgt is independent across g). Then, we can estimate δgt using OLS at the individual level, assuming that E(Zigt uigt) = 0 and the group/time sizes, Mgt, are large (Bertrand et al., 2004; Greene, 2008):

δ̂gt = At + Bg + Xgt β + vgt.  (124.5)
Recent studies (Hansen, 2007; Greene, 2008) suggest using feasible generalized least squares (GLS) to obtain an efficient estimator, supposing that vgt (t = 1, 2, . . . , T) is serially correlated and Xgt (t = 1, . . . , T) is strictly exogenous.4 This latter restriction is not required when OLS is solely used as T approaches infinity. However, this exogeneity restriction might be violated if the treatment (policy intervention) has been applied time by time. For cases when the number of groups is large enough, Hansen (2007) suggests using the bootstrap to compute consistent standard errors. To reduce estimation errors in δ̂gt, we can aggregate the equations over individuals, if Mgt is not large, and obtain the following equation (124.6):

Ȳgt = At + Bg + Xgt β + Z̄gt γ + vgt + ūgt.  (124.6)

We can estimate equation (124.6) by using a fixed effects method to draw a fully robust inference because the composite error, (vgt + ūgt), is weakly dependent.

124.1.2.3 Causal inference

DID has been widely used to estimate the treatment (causal) effect on the treated (exposed) group. Further, DID can be used to estimate the average treatment effect (ATE), or the causal effect in the population, if the following assumptions hold (Lechner, 2011).

A1. Stable Unit Treatment Value Assumption (SUTVA)
The members of the treated and control groups remain stable across the repeated cross-sections. Therefore, one, and only one, of the potential outcomes of treatment (policy intervention) is observable for all members of the population (Rubin, 1977). Moreover, no relevant interactions between the members of the population are assumed. In other words, there is no spillover effect within the population.

A2. Exogeneity Assumption
The control variables in the DID analysis are not influenced by the treatment in both the pre- and post-treatment periods.
If we relax this assumption, the results can be interpreted as follows: DID captures only the part of the causal effect that was not already captured by the endogenous variable.

A3. Treatment Unrelated to the Outcome at Baseline
The treatment is not related to the outcome of the control group.

4 Please refer to Hansen (2007) for details regarding how to use feasible GLS to estimate an efficient δ̂gt.
A4. Parallel Trends in Outcome Assumption
The parallel trend assumption can also be called the "common trend" or "bias stability" assumption (Lechner, 2010). This is the most important assumption for confirming the internal validity of DID models, and it is the most difficult to satisfy. This assumption states that, without the treatment (policy intervention), the difference between the "treatment" and "control" groups stays constant over time. Visual inspection can be used to determine whether this assumption holds when data over many time periods are available. Violation of this assumption could lead to biased inferences regarding causal effects.

124.1.2.4 Applications

DID can be adopted in both quasi-experimental and natural-experimental designs to obtain causal inferences. A quasi-experimental design is a type of empirical study used to estimate the causal effect of a treatment on its target population without random assignment.5 A natural experiment is a type of empirical study in which participants' exposure to the treatment is determined by nature or by other factors that are not controlled by the researchers, and the assignment of the treatment is random. When there is a clearly defined exposure/treatment/policy change and subpopulation (a treatment/exposure and a control/non-exposure subpopulation), a researcher may be able to isolate the treatment effect on the outcomes of the treated group from other factors. Therefore, it can be inferred that the difference in the outcomes of the treated and control groups is due to the application of the treatment. In this particular case, the results of using the DID method can be used to make a causal inference. In contrast, when applying the DID method to a quasi-experimental design to make causal inferences, it is crucial that the assumptions discussed in Section 124.1.2.3 hold.

5
In a quasi-experimental design, the researcher can use some criterion to assign the treatment to the target group. However, a quasi-experiment may have internal validity concerns, because the treatment and control groups may not be comparable in the pre-treatment period. Random assignment cannot eliminate the concern of internal validity because even though the probability of assigning subjects to the treatment and control groups is the same, the differences between the observable and unobservable outcomes for the two groups may result from random chance, rather than the treatment. Therefore, controlling for confounding factors is particularly important for using the quasi-experimental design to draw causal inferences. In other words, DID research design is a method used to control confounding factors that may affect the outcomes. Ultimately, holding the assumption of parallel trends in outcomes is crucial for generating causal inferences from the results of DID studies.
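Before turning to published applications, the 2 × 2 estimator of equation (124.1) can be illustrated numerically. The simulation below is our own sketch (all variable names and parameter values are illustrative); it verifies that the OLS coefficient on the interaction term equals the difference-in-means estimator δ̂1 = (ȲT,1 − ȲT,0) − (ȲC,1 − ȲC,0).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
dT = rng.integers(0, 2, n).astype(float)   # post-treatment period indicator
dB = rng.integers(0, 2, n).astype(float)   # treatment-group indicator
delta_true = 3.0                           # true treatment effect

# Outcome generated per equation (124.1): Y = b0 + b1*dT + b2*dB + delta*dT*dB + e
y = 1.0 + 0.5 * dT + 2.0 * dB + delta_true * dT * dB + rng.standard_normal(n)

# OLS estimate of the interaction coefficient delta_1
X = np.column_stack([np.ones(n), dT, dB, dT * dB])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# Equivalent difference-in-means form of the same estimator
treat, ctrl = dB == 1, dB == 0
post, pre = dT == 1, dT == 0
did = ((y[treat & post].mean() - y[treat & pre].mean())
       - (y[ctrl & post].mean() - y[ctrl & pre].mean()))

print(b[3], did)   # both close to 3.0, and equal to each other
```

Because the 2 × 2 regression is saturated, the interaction coefficient and the difference of mean changes coincide exactly (up to floating-point error), which is why the two descriptions of δ̂1 in Section 124.1.2.1 are interchangeable.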
The DID method has been widely used in economics, finance, and accounting research. We first illustrate an application of the DID method to policy studies using a 2 × 2 setting and then a multiple-period setting.

Example 1
Card and Krueger (1994) adopt the DID method to study the effect of minimum wage legislation on average employment in New Jersey. On April 1, 1992, New Jersey raised the minimum wage from $4.25 to $5.05. In particular, Card and Krueger (1994) compare employment in fast food restaurants (i.e., Burger King, KFC, Roy Rogers, and Wendy's) in the State of New Jersey to the State of Pennsylvania, studying February 1992 (pre-treatment period) and November 1992 (post-treatment period). Card and Krueger (1994) include Pennsylvania as a natural control group because the two states had the same weather and macroeconomic conditions and Pennsylvania did not experience the recent implementation of new minimum wage legislation. Using Pennsylvania as a control group should implicitly control for confounding factors, even when these factors are unobservable. Moreover, if New Jersey and Pennsylvania show parallel trends over time, it can be inferred that the difference in the employment rate between New Jersey and Pennsylvania from the pre- to post-treatment period occurs due to the implementation of the minimum wage law in New Jersey. In Table 2 of Card and Krueger (1994), the mean values of store characteristics (e.g., percentage of full-time employees, starting wage, wage, price of a full meal, hours open, and recruiting bonus) in New Jersey and Pennsylvania do not show that the parallel trend assumption is violated. Card and Krueger (1994) acknowledge that violation of the stable unit treatment value assumption (SUTVA) may bias their results, but they provide a validation check using a balanced subsample, which meets the SUTVA assumption. Moreover, they do not find a "spillover" effect among the population.

Card and Krueger's (1994) research setting is more likely to be a natural experiment than a quasi-experimental design because they could not control which restaurants in New Jersey (the treatment group) were exposed to the change in law. Table 124.1 shows that there was a slight increase (0.59) in full-time equivalent employment in the post-treatment period in New Jersey compared to the pre-treatment period, while the change in full-time equivalent employment in Pennsylvania shows the opposite trend. Full-time equivalent employment rose by 2.76 more in New Jersey than in Pennsylvania,
Card and Krueger (1994)’s research setting is more likely to be a natural experiment design than a quasi-experimental design because they could not control which restaurants in New Jersey (treatment group) were exposed to the change in law. Table 124.1 shows that there was a slight increase (0.59) in the number of full-time equivalent employment in the post-treatment period in New Jersey compared to the pre-treatment period, while the change in full-time equivalent employment in Pennsylvania shows an opposite trend. There were 2.75 more full-time equivalent employment in New Jersey than in Pennsylvania,
page 4262
July 6, 2020
16:8
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch124
Review of Difference-in-Difference Analyses in Social Sciences
4263
Table 124.1: Average employment per store before and after the rise in the New Jersey minimum wage.

                        New Jersey        Pennsylvania      Difference (NJ−PA)
FTE before regulation   20.44 (0.51)***   23.33 (1.35)***   −2.89 (1.44)**
FTE after regulation    21.03 (0.52)***   21.17 (0.94)***   −0.14 (0.17)
Change                   0.59 (0.54)      −2.16 (1.25)*      2.76 (1.36)**
Notes: This table refers to Table 3 of Card and Krueger (1994). They compute the number of full-time equivalents (FTE) as the number of full-time workers (including managers) plus 0.5 times the number of part-time workers. The standard errors are reported in parentheses. ∗ , ∗∗ , and ∗∗∗ indicate statistical significance in means at the 10%, 5%, and 1% levels (two-tailed), respectively.
from the pre- to post-treatment period. This result directly challenges the economic prediction that employment should decrease when the minimum wage increases.

Example 2
Rodano, Serrano-Velarde, and Tarantino (2016) examine the effects of reorganization and liquidation in bankruptcy on bank financing and firm investment in Italy over the period 2005–2006. The regulatory reforms were triggered by the Parmalat scandal, one of the largest corporate scandals in Europe, and were not driven by trends in small and medium-size enterprises. The reorganization reform (Legislative Decree no. 35 of 2005) introduced a few provisions to facilitate the renegotiation of outstanding loans and to protect debtors, which gives borrowers a strong bargaining position. Therefore, the 2005 reorganization reform increases the cost of bank financing. Furthermore, the authors expect the increase to be greater for debtors who are more likely to default. The liquidation reform (Law no. 5 of 2006) strengthens creditor rights and weakens the power of the court-appointed trustees who manage the liquidation proceeding. After the 2006 reform, creditors could monitor the trustee and coordinate the liquidation proceedings, thereby speeding them up. Therefore, after the 2006 liquidation reform, the bank's expected payoff in renegotiation increases. Moreover, the authors expect interest rates to decrease among borrowers who are more
likely to renegotiate.6 To test their hypotheses, Rodano et al. (2016) implement a DID research design by estimating the following model7:

Yijt = Constant + α Exposedi + β (Exposedi × After Reorganizationt) + γ (Exposedi × Interim Periodt) + δ (Exposedi × After Liquidationt) + · · · + Quarter × Year + εijt,

where Yijt represents the interest rate on the loan issued by bank j to firm i at time t (the interaction between quarter and year). Exposedi is a dummy variable dividing the sample into two subgroups based on the value of the firm's score (e.g., Score, 1–9), which measures the probability of a firm defaulting. A higher score indicates a greater probability of default; as such, the treatment group (with Scores of 5–9) is more sensitive to the legal reforms (treatments) than the control group (with Scores of 1–4). The interaction terms between the exposure and reform indicators (After Reorganization and After Liquidation) measure the impact of each legal reform on loan interest rates. The coefficient of the interaction (Exposedi × After Reorganizationt), β, is the DID estimate for the impact of the reorganization reform, which measures the difference in interest rates between the two groups in the post-reorganization period compared to the pre-reorganization period. Rodano et al. (2016) predict β to be positive. The coefficient of the interaction (Exposedi × After Liquidationt), δ, is the DID estimate for the impact of the liquidation reform, which is expected to be negative. Table 124.2 shows that the results (OLS regression estimated coefficients and firm-clustered standard errors in parentheses) are consistent with their predictions. The research design of Rodano et al. (2016) is more like a quasi-experiment than a natural experiment because the members of the treatment group were not randomly selected. Therefore, it is important that the aforementioned assumptions hold when drawing a causal inference from their results.
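A stripped-down numerical sketch of a specification of this form, with two reform-period interactions, is given below. The data, variable names, and effect sizes are our own illustration (the signs mimic the predictions above); the controls, fixed effects, and firm-level clustering of the original are omitted, and the interim period is folded into the post-reorganization period for brevity.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6000
exposed = rng.integers(0, 2, n).astype(float)   # high-Score (treated) indicator
period = rng.integers(0, 3, n)                  # 0 = pre, 1 = post-reorg., 2 = post-liq.
after_reorg = (period >= 1).astype(float)
after_liq = (period == 2).astype(float)

# True interaction effects chosen to mimic the signs of the two reforms:
# reorganization raises rates for exposed firms, liquidation lowers them
y = (1.0 + 0.5 * exposed + 0.2 * after_reorg - 0.1 * after_liq
     + 0.045 * exposed * after_reorg - 0.048 * exposed * after_liq
     + 0.05 * rng.standard_normal(n))

X = np.column_stack([np.ones(n), exposed, after_reorg, after_liq,
                     exposed * after_reorg, exposed * after_liq])
b = np.linalg.lstsq(X, y, rcond=None)[0]
print(b[4], b[5])   # DID estimates of the two reform effects
```

The two interaction coefficients recover the positive reorganization effect and the negative liquidation effect; in the actual study the inference additionally relies on clustered standard errors and the fixed effects described in the text.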
Figure 124.2 shows that the parallel trend assumption, the most important assumption needed to make a causal inference, holds in their setting.8 Their sample selection procedure does not show any violation of

6 The debtors who are more likely to default are also the ones who are more likely to renegotiate to avoid default and are expected to be more sensitive to the reforms.
7 Interested readers could also look at Autor (2003) for a generalized DID application testing the influence of "unjust dismissal" doctrine on US Temporary Help Services (THS) employment.
8 Using figures/graphs/plots to check whether the parallel assumption holds has been widely used in DID designs applied to social sciences research (e.g., Cerqueiro, Ongena, and Roszbach, 2016; Qin and Ronen, 2018).
Table 124.2: Results of bankruptcy reforms on interest rates of loans.

                                      1–4 versus 5–9
After Reorganization × Exposed        0.045 (0.016)***
After Liquidation × Exposed          −0.048 (0.015)***
Interim Period × Exposed              0.005 (0.014)
Credit Standards SME × Exposed        0.020 (0.022)
Loan and firm time-varying controls   Yes
Firm × bank fixed effects             Yes
Quarter × year fixed effects          Yes
R-squared                             0.559
N                                     183,498
Notes: Table 124.2 replicates column (2) of Table 3 of Rodano et al. (2016). The table reports OLS estimates of the effect of the bankruptcy reforms on loan interest rates. After Reorganization is an indicator variable, which equals one starting in January 2005 (2005.Q1). Interim Period is an indicator variable, which equals one beginning in June 2005 (2005.Q3). After Liquidation is an indicator variable, which equals one beginning in January 2006 (2006.Q1). The degree of exposure to the reforms is measured on the basis of a firm's Score in 2004. Exposed is the Score indicator itself (with values between 1 and 9) in 2004. Credit Standards SME, which represents the expected credit standards applied to Italian small and medium-size enterprises (SMEs), is interacted with the Exposed indicator. The regression also controls for loan and firm time-varying factors, as well as firm × bank and quarter × year fixed effects. ∗, ∗∗, and ∗∗∗ indicate statistical significance in means at the 10%, 5%, and 1% levels (two-tailed), respectively.
the assumption of stable unit treatment value. To isolate the treatment effect (or to make the shock exogenous), they control for a rich set of firm and financial contract characteristics (e.g., age of firm, leverage, total sales, maturity, the presence of collateral, . . .). In addition, they also include firm–bank and quarter fixed effects. Firm–bank fixed effects capture heterogeneities both across borrowers or banks and across each firm–bank pairing. The quarter (time) fixed effects control for macroeconomic and aggregate shocks that influence credit demand or supply. Therefore, Rodano et al. (2016) made a causal inference from their results.

124.1.2.5 Strengths and limitations

The DID research design has the following advantages. First, the results of the DID research design may be interpreted intuitively. Second, researchers can obtain causal inferences from the results if the above assumptions
Figure 124.2: Difference-in-differences plot of interest rates.
Notes: Figure 124.2 replicates Figure 7 of Rodano et al. (2016). Panel A plots average interest rates for firms with a low Score (i.e., between 1 and 4) (represented by the solid line) and those with a high Score (i.e., between 5 and 9) (represented by the dotted line). Panel B plots the difference in average interest rates paid by firms in different categories of Score for each quarter. Vertical lines represent the times of passage of the reforms: the first quarter of 2005 for the reorganization reform (denoted by "Reorg.") and the first quarter of 2006 for the liquidation reform (denoted by "Liq."), respectively. (Interested readers can refer to the detailed interpretation of this figure in the web version of the article.)
(in Section 124.1.2.3) are satisfied. Third, data selection is flexible. Researchers can use either individual- or group-level data. Fourth, the DID allows the comparison groups to start at different outcome levels because it focuses on the difference in the change of outcomes rather than on the absolute levels of the two groups. Fifth, the DID research design can control for confounding factors that may lead to the difference in the outcome of the two groups.
However, the DID research design also has its limitations. First, the DID design requires data from a comparison group (e.g., a baseline or a non-intervention/control group), which must start either before the intervention/treatment or after the intervention/treatment (although before the intervention/treatment is preferred). Second, the DID design cannot be used if the intervention allocation was determined by the baseline outcome. In other words, if the treatment/intervention is not exogenous, we cannot use the DID research design to examine the causal effect of a policy on the treated group. Third, if the comparison group(s) have different outcome trend(s), the DID design cannot be adopted. In the case that the parallel trend assumption is violated, Abadie (2005) suggests that a two-step strategy can be employed to estimate the average effect of the treatment for the treated group. Fourth, if the stable unit treatment value assumption is violated, the DID is not suitable for generating causal inferences.

124.2 Differencing

In this section, we illustrate how to use the first-difference technique to conduct the DID research design. The advantage of using the first-difference method is that it removes latent heterogeneity from the model (Greene, 2008). We first introduce the circumstances under which we can use the first-difference technique. Then, we define the first-difference method and estimator in the regression. We follow with four examples and conclude the section with a discussion of the strengths and limitations of the first-difference method.

124.2.1 Basic set up

In Example 1 (Card and Krueger, 1994), the weather and macroeconomic factors are latent factors that may contribute to the employment rates in New Jersey and Pennsylvania. We assume that these factors do not change over the sample period (i.e., they are time invariant) but may be related to the treatment variable: the implementation of the minimum wage law in New Jersey.
We control for the time-varying factors that might correlate with the treatment variable by including the differenced time-varying variables in the regression equation. In this case, we can write the following equations (124.7)–(124.9) to illustrate the procedure:

Yi1 = (β0 + δ0) + β1 Xi1 + Li + ui1   (t = 1),   (124.7)
Yi0 = β0 + β1 Xi0 + Li + ui0   (t = 0).   (124.8)
If we subtract equation (124.8) from (124.7), we have

(Yi1 − Yi0) = δ0 + β1 (Xi1 − Xi0) + (ui1 − ui0), or
ΔYi = δ0 + β1 ΔXi + Δui,  (124.9)

where Δ denotes the change from period t = 0 to t = 1. The latent factor, Li, disappears in equation (124.9) because it was "differenced away". Moreover, the intercept in equation (124.9), δ0, is the change in the intercept from period 0 to 1. Equation (124.9) is called the first-difference equation. If the strict exogeneity assumption is satisfied (i.e., ΔXi is uncorrelated with Δui), we can use OLS to estimate β1, which is called the first-difference estimator. To take advantage of the first-difference method in the DID research design, we can add a dummy variable to equation (124.9) above to test the influence of a treatment (policy change) on the outcomes of interest. That is, in equation (124.10), θ estimates the policy effect on Yi:

ΔYi = δ0 + β1 ΔXi + θD + Δui.  (124.10)
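A small simulated example of the first-difference estimator in equations (124.9)–(124.10) is sketched below (the data-generating process and all variable names are our own). The latent effect Li is deliberately constructed to be correlated with the treatment indicator, so a levels regression would be biased, yet differencing removes Li and recovers the policy effect θ.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3000
D = rng.integers(0, 2, n).astype(float)     # treatment (policy) indicator
L = 2.0 * D + rng.standard_normal(n)        # latent effect, correlated with D
x0 = rng.standard_normal(n)
x1 = x0 + rng.standard_normal(n)            # time-varying covariate
beta1, theta = 1.5, 0.8                     # true slope and policy effect

# Two-period outcomes per equations (124.7)-(124.8), with theta*D in period 1
y0 = 1.0 + beta1 * x0 + L + rng.standard_normal(n)
y1 = (1.0 + 0.3) + beta1 * x1 + theta * D + L + rng.standard_normal(n)

# First-difference regression (124.10): the latent L_i drops out of dy
dy, dx = y1 - y0, x1 - x0
X = np.column_stack([np.ones(n), dx, D])
coef = np.linalg.lstsq(X, dy, rcond=None)[0]
print(coef)    # approximately [0.3, 1.5, 0.8]: delta_0, beta_1, theta
```

Because Li appears identically in both periods, it cancels in dy even though it is correlated with D; this is the "latent heterogeneity removed" property referred to at the start of this section.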
124.2.2 Applications

Example 1
We again use the Card and Krueger (1994) example to illustrate how to apply the first-difference method in the DID design. They used the equation below to test the effect of the implementation of the minimum wage law on employment in New Jersey:

ΔEi = a + bΔXi + cNJi + εi,

where ΔEi denotes the change in employment from the pre-policy to post-policy period at store i. Xi is a set of characteristics of store i, and NJi is the indicator variable, which is coded as one if the store is in New Jersey and zero otherwise. The coefficient of interest on NJi, c, measures the difference in the change of employment between New Jersey and Pennsylvania from the pre- to post-law period. Holding the assumptions of parallel trends in the two states, strict exogeneity of the implementation of the law, and the treatment being unrelated to the outcomes of the control group, Card and Krueger (1994) make the causal inference that the estimated coefficient, c, captures the effect of the law on employment in New Jersey. Finally, εi is the error term. Table 124.3 reports the results of estimating the above equation. The standard errors are reported in parentheses.
Table 124.3: Results of the effect of the minimum wage requirement on employment.

                                          Model
Independent variable                  (i)             (ii)
New Jersey dummy                      2.33 (1.19)**   2.30 (1.20)**
Controls for chains and ownership^a   No              Yes
Standard error of regression          8.79            8.78
Probability value for controls^b      —               0.34
Notes: Table 124.3 replicates columns (i) and (ii) of Table 4 of Card and Krueger (1994). Standard errors are reported in parentheses. The dependent variable in all models is the change in the number of full-time equivalent (FTE) employees. The mean and standard deviation of the change in FTE are −0.237 and 8.825, respectively. All models include an unrestricted constant. ∗∗ indicates statistical significance in means at the 5% level (two-tailed).
a Includes three dummy variables for chain type (i.e., Burger King, Roy Rogers, Wendy's, or KFC) and whether the store is company-owned (i.e., company-owned or franchisee-owned).
b Probability value of the joint F-test for exclusion of all control variables.
Example 2

The next example of using the first-difference method to study policy effects is Cerqueiro, Ongena, and Roszbach (2016). They examine the impact of a Swedish law, effective on January 1, 2004, that reduced the value of all floating liens. A floating lien (charge) is a way for a company to obtain a loan using as collateral a security interest in prespecified classes of property (e.g., inventories or accounts receivable) in which the individual properties are not specifically identified. Before 2004, floating liens gave creditors special rights to possess the debtors' properties outside bankruptcy and without court intervention. The law implemented in 2004 annulled the special rights of all floating liens and reduced the pool of assets eligible under them; the regulation therefore also reduced the collateral value of loans. Consequently, to seize the secured assets, the holders of floating liens must obtain a court order and declare the debtor's bankruptcy. In response to the regulation, lenders are expected to increase interest rates and tighten credit limits to protect themselves. To test these expectations, Cerqueiro et al. (2016) compare the effect of the law on two groups: (1) the treatment group, comprising borrowers that pledged floating liens to banks before 2004 that were still outstanding on January 1, 2004; and (2) the control group, comprising borrowers that did not register floating liens before January 1, 2004.
W. H. Greene & M. Liu
Table 124.4: Results of the effect of a legal regulation in Sweden on loan spread.
                   Loan spread         Ln(internal loan limit)
Treated            0.20 (5.00)∗∗∗      −0.11 (−2.29)∗∗
Constant           0.61 (60.76)∗∗∗     −0.12 (−10.84)∗∗
Observations       2580                2477
R-squared          0.01                0.00
Notes: Table 124.4 replicates part of Table V of Cerqueiro et al. (2016). The dependent variable is the post–pre difference in the loan spread (and in the log value of the internal loan limit, respectively); the loan spread is measured as the interest rate above the bank's reference rate, and the internal loan limit is in thousands of euro. ∗∗ and ∗∗∗ indicate statistical significance at the 5% and 1% levels (two-tailed), respectively.
The control group was matched with the treated group by industry. Cerqueiro et al. (2016) use the equation below to test the effect of the law on bank debt contracts:

(Ȳ_post − Ȳ_pre)_i = a + β·Treated_i + ε_i,

where (Ȳ_post − Ȳ_pre)_i denotes the average difference in the loan characteristics (e.g., interest rates and tightness of credit limits) of loan i from the pre- to the post-law period. Treated, a dummy variable, denotes the treated loans. The estimated β captures the effect of the law on the treated group. Table 124.4 reports one set of results from estimating this equation (estimated coefficients with t-statistics in parentheses), indicating that in the post-law period the treatment group suffered, on average, a 20 basis point increase in loan spread and an 11% reduction in internal loan limit relative to the control group. Figure 124.2 in Cerqueiro et al. (2016) does not show a serious threat to the parallel trend assumption, and their sample selection procedure does not indicate a violation of the stable unit treatment assumption. Moreover, the passage of the regulation is an exogenous shock to the sample units. Therefore, they draw a causal inference from their results.

Example 3

Differencing multiple-period panel data is very useful in policy research. We can apply differencing to multiple-period data by repeating the first-difference method on each pair of consecutive periods. Example 3 illustrates how to apply the DID research design in an accounting study by using the first-difference method without a control group. Carcello and Li (2013) examine
the effect of the audit engagement partner signature requirement on the audit quality of firms listed on the London Stock Exchange in the United Kingdom (UK). Requiring the signature of the engagement partner, which became effective on April 1, 2009 in the UK, should increase the accountability of the audit engagement partner, thereby increasing the effort the partner devotes to the audit engagement and, consequently, audit quality. To test this hypothesis, one of their tests compares the change in audit quality in the UK from year t − 2 to t − 1 (t being the first year in which the firm implemented the engagement partner signature requirement) with the change from year t − 1 to t. They use an indicator variable, CHG SIGNATURE, which equals one for changes between year t − 1 and year t and zero for changes between year t − 2 and year t − 1. The dependent and control variables are first-differenced as well. Since the first-difference method removes possible time trends in those variables, Carcello and Li (2013) employ a simple before-and-after design that uses the UK firms as both the treatment and the control group. In this setting, the regulation change is an exogenous shock to the treatment group (UK firms) and the experiment is natural. Moreover, the stable unit treatment assumption is met. Thus, this simple first-difference setup is equivalent to the DID research design.9 Specifically, Carcello and Li (2013) use the equation below to quantify the impact of the passage of the rule on audit quality:

ΔY_i = δ0 + θ·CHG SIGNATURE_i + β1·ΔX_i + Δu_i,

where ΔY_i is the dependent variable representing changes in outcomes; ΔX_i are control variables; and θ is the DID estimator (which is also the first-differenced estimator).
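The construction of such first-differenced variables from a three-year panel can be sketched with pandas. Everything below is hypothetical: the firm identifiers, column names, and values are invented, and we assume t = 2009 for both firms.

```python
# Sketch of building first-differenced variables from a three-year panel,
# in the spirit of the Carcello and Li (2013) setup. Each firm is observed
# in years t-2, t-1 and t; values and names are hypothetical.
import pandas as pd

panel = pd.DataFrame({
    "firm": ["A", "A", "A", "B", "B", "B"],
    "year": [2007, 2008, 2009, 2007, 2008, 2009],
    "abs_acc": [0.30, 0.28, 0.22, 0.40, 0.41, 0.33],  # audit-quality proxy
    "size":    [5.0, 5.1, 5.3, 6.0, 6.0, 6.2],
}).sort_values(["firm", "year"])

# Difference each variable between consecutive years within each firm
panel["chg_abs_acc"] = panel.groupby("firm")["abs_acc"].diff()
panel["chg_size"] = panel.groupby("firm")["size"].diff()

# Indicator: 1 for the t-1 -> t change (post-signature), 0 for t-2 -> t-1
panel["chg_signature"] = (panel["year"] == 2009).astype(int)

# Keep only rows that represent a year-over-year change
changes = panel.dropna(subset=["chg_abs_acc"])
print(changes[["firm", "year", "chg_abs_acc", "chg_signature"]])
```

Regressing chg_abs_acc on chg_signature (and the other differenced controls) in a real sample would then yield the estimator θ described above.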
Table 124.5 shows that the estimated θ is −0.028, indicating that the change in the absolute value of abnormal accruals, which is constructed so that a smaller number indicates higher audit quality, is negatively associated with the change in the engagement partner signature requirement in the UK. This result is consistent with the authors' prediction that audit quality improved in the post-signature period relative to the pre-signature period.

Example 4

Example 4 introduces another way to use the first-difference method in DID analyses when employing multiple-period data with a control group.

9
To draw a causal inference from the results, Carcello and Li (2013) also conduct empirical tests with the control sample, which are similar to Example 2 in Section 124.1.2.4.
Table 124.5: Results of the effect of engagement partner signature requirement on audit quality.
                    CHG ABS ACC
Intercept           0.006 (0.79)
CHG SIGNATURE       −0.028 (−2.74)∗∗∗
CHG SIZE            0.055 (4.00)∗∗∗
CHG ROA             0.044 (1.90)∗
CHG LEVERAGE        0.012 (0.27)
CHG LOSS            0.008 (0.79)
CHG MB              0.000 (0.50)
CHG LCACCR          −0.055 (−2.87)∗∗∗
CHG CFO             −0.007 (−0.20)
CHG VOLATILITY      0.056 (1.67)∗
LITIGATE            −0.055 (−0.15)
CHG AUDITOR         −0.011 (−0.31)
N                   1,474
F                   6.79
R-squared           0.04
Notes: Table 124.5 replicates part of Table 6 of Carcello and Li (2013). The variables are the changes in the outcome (ABS ACC) and in the independent variables from year t − 2 to t − 1 and from year t − 1 to t (t being the first year that the firm implemented the engagement partner signature requirement). CHG ABS ACC: the change in performance-matched abnormal accruals. ABS ACC: a measure of audit quality; smaller values indicate better audit quality. CHG SIZE: the change in client firm size. CHG ROA: the change in return on assets (profitability). CHG LEVERAGE: the change in client firm leverage. CHG LOSS: the change in the occurrence of a loss. CHG MB: the change in the market-to-book ratio (a proxy for growth). CHG LCACCR: the change in the prior year's total current accruals, scaled by lagged total assets. CHG CFO: the change in cash flow from operations, scaled by total assets. CHG VOLATILITY: the change in volatility, measured as the standard deviation of annual sales over the prior seven years. LITIGATE: coded 1 if the firm's main operations are in a high-litigation industry (e.g., the biotechnology, computer, electronics, and retail industries), and 0 otherwise. CHG AUDITOR: the change in whether the client was audited by a Big 6 firm in year t. ∗ and ∗∗∗ indicate statistical significance at the 10% and 1% levels (two-tailed), respectively.
Researchers can first average the variables in the pre-treatment and post-treatment windows and then difference the means of the variables over the two windows (e.g., Byard, Li, and Yu, 2011; Liu, 2017). Liu (2017) investigates whether the treatment, the implementation of the requirement that the audit engagement partner (EP) sign his/her name on the audit report, which is an increased disclosure (a piece of new information) in clients' financial statements, improves financial analysts' information environment in the United Kingdom (UK). Liu (2017) uses the following model to test the hypothesis that the implementation of the EP signature requirement in the UK causes an increase in analyst following:

ΔAnCov_i = β0 + β1·EP_i + β_j·ΔX_i + ε_i,

where ΔAnCov_i denotes the difference in AnCov_i from the two-year pre- to the two-year post-signature period. AnCov_i is the mean of the natural log of the number of analysts following firm i in the pre- and post-signature windows. This design is unique because it takes advantage of first-differencing while preserving the properties of panel data. The indicator variable EP_i, which is coded as one for firm-year observations on and after April 1, 2009 (the effective date of the requirement) and zero otherwise, captures the effect of the treatment. Therefore, the coefficient β1 estimates the difference in the effect of the treatment on the outcome between the treatment (UK) group and the control (other European countries) group. ΔX_i are the control variables, measured as the differences in the averages of these variables between the two-year pre- and two-year post-signature windows for each firm. Liu (2017) requires the same firm-analyst pairs over the sample period (i.e., two years before and after the treatment). Therefore, the stable unit treatment value assumption holds. Moreover, the implementation of the EP signature requirement (i.e., the treatment) is an exogenous shock to the sample firms.
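The window-averaging-then-differencing step can be sketched as follows. The firms, years, and AnCov values below are hypothetical, with one UK (treatment) firm and one EU (control) firm.

```python
# Sketch of the averaging-then-differencing design in Liu (2017): average
# each variable over the two-year pre- and post-signature windows per firm,
# then take the post-minus-pre difference. All values are invented.
import pandas as pd

obs = pd.DataFrame({
    "firm":  ["U1"] * 4 + ["E1"] * 4,
    "year":  [2007, 2008, 2009, 2010] * 2,
    "ancov": [1.2, 1.4, 1.8, 2.0,    1.1, 1.3, 1.4, 1.4],  # ln(# analysts)
    "uk":    [1, 1, 1, 1, 0, 0, 0, 0],                     # treatment group
})
obs["post"] = (obs["year"] >= 2009).astype(int)  # requirement effective 2009

# Mean of AnCov within each firm-window, then post-minus-pre difference
window_means = obs.groupby(["firm", "post"])["ancov"].mean().unstack("post")
d_ancov = window_means[1] - window_means[0]      # one dAnCov per firm

# Difference-in-differences across groups: treated minus control change
firm_uk = obs.groupby("firm")["uk"].first()
ep_effect = d_ancov[firm_uk == 1].mean() - d_ancov[firm_uk == 0].mean()
print(f"treatment-control gap in dAnCov = {ep_effect:.2f}")
```

In the full design this group difference is estimated as the coefficient β1 in the regression above, with the differenced controls ΔX_i included.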
This setting resembles a natural experiment. Furthermore, the treatment is not related to the outcomes of the EU control firms. Although the parallel trend assumption is threatened because the characteristics of the treatment and control sample firms are not exactly the same, the author uses the first-difference method to eliminate unobserved individual effects.10

10
This difficulty in meeting the parallel trend assumption is commonly faced by researchers. Byard, Li, and Yu (2011) test whether and how the implementation of IFRS influences analysts' information environment. In Byard et al. (2011), the characteristics of the control sample firms are not completely matched with those of the treatment sample firms. They use the same research design as Liu (2017) to isolate the treatment effect, which could reduce the threat to causality.
Table 124.6 shows the results of estimating the above equation. The estimated coefficient on the indicator variable EP is 0.195, suggesting that, on average, 1.177 more analysts follow the UK treatment sample firms than the EU control sample firms in the post-signature period.11 This evidence indicates that the adoption of the EP signature requirement leads to an increase in the number of analysts following firms in the UK, which is an improvement in the analyst information environment.

124.2.3 Strengths and limitations

In addition to the aforementioned strengths and limitations of the DID research design (as discussed in Section 124.1.2.5), applying the first-difference technique to it can eliminate latent heterogeneity from the model. However, researchers cannot observe the effects of some fixed group characteristics on the outcomes, while such effects can be observed in the DID design when using data at the individual or group level, as shown in Section 124.1.

124.3 Additional Discussion

Some researchers regard the DID design as a simpler version of the comparative interrupted time series (CITS) design. The CITS design studies whether the treatment group experiences a larger change in its trend line from the pre-treatment to the post-treatment period than the control group, while the DID design evaluates the treatment effect by examining whether the treatment group experiences a larger change in its mean from the pre-treatment to the post-treatment period than the control group. The CITS design therefore imposes more data restrictions than the DID design: to estimate the baseline trend, data must be available for at least four time periods prior to the beginning of the treatment, which may not always be possible.

Because of the difficulty of accurately estimating the trends of the two groups and the rigorous data requirements, the DID method is the more popular choice when researchers study the impact of a treatment (policy/program).12 Because the DID analysis is widely used in quasi-experimental designs, researchers commonly use propensity score matching (PSM) and the
Because of the difficulty of accurately estimating the trends of two groups and the rigorous requirements for data, the DID method is a more popular choice when researchers study the impact of a treatment (policy/program).12 When the DID analysis is widely used in a quasi-experimental design, researchers commonly use the propensity score matching (PSM) and the 11 0.195
e = 2.3030.195 = 1.177. Please refer to the detailed comparisons between the CITS and DID design in MarieAndree, Zhu, Jocob, and Bloom (2013). 12
Table 124.6: Results of the effect of engagement partner signature requirement on financial analyst coverage.

                    ΔAnCov
Constant            −0.107 (0.151)
EP                  0.195 (0.000)∗∗∗
ΔSize               0.392 (0.000)∗∗∗
ΔROA                −0.271 (0.105)
ΔMB                 −0.012 (0.010)∗∗
ΔLeverage           0.162 (0.355)
ΔIntangible         −0.144 (0.222)
ΔABAQ               0.088 (0.718)
EP∗ΔABAQ            −0.178 (0.598)
Big4                0.042 (0.157)
Number of Firms     709
R-squared           0.241

Notes: Table 124.6 replicates partial results in Table 4 of Liu (2017). P-values are reported in parentheses. ΔAnCov: the difference in the log value of the average number of analysts following firm i between the two-year pre- and two-year post-signature requirement windows. EP: 1 for the UK firms, 0 for the (other European countries) control firms. ΔSize: difference in the log value of average total assets in US dollars between the two windows. ΔROA: difference in average ROA (the proxy for profitability), measured as the ratio of income before extraordinary items to the lagged value of total assets, between the two windows. ΔMB: difference in average MB (the proxy for growth opportunity), measured as the market-to-book ratio, between the two windows. ΔLeverage: difference in average Leverage (measured as the ratio of total liabilities to total assets) between the two windows. ΔIntangible: difference in average Intangible (measured as intangible assets deflated by the lagged value of total assets) between the two windows. ΔABAQ: difference in average ABAQ, the absolute value of discretionary accruals (the proxy for audit quality), between the two windows; a larger value of ABAQ indicates poorer audit quality. EP∗ΔABAQ: interaction between EP and ΔABAQ. Big4: coded 1 for Big 4 auditors, 0 otherwise. ∗∗ and ∗∗∗ indicate statistical significance at the 5% and 1% levels (two-tailed), respectively.
regression discontinuity design (RDD) to create a valid control group and reduce possible selection bias.13 Selection bias arises when the members of the treatment and control groups differ systematically even before the treatment/policy is implemented on the treated group, which may lead to a biased estimate of the treatment/policy effect. In the presence of selection bias, the difference in the change in outcomes between the treated and control groups may be due to the characteristics that, at least partially, determine whether an individual receives the treatment, rather than to the treatment per se. Since assignment in a quasi-experimental research design is not random, such selection bias may exist. Therefore, PSM and RDD are employed to match the samples and mimic randomization by selecting a treatment sample whose observed characteristics are comparable to those of the control sample.

Propensity score matching (PSM) is a statistical matching technique that tries to estimate the effect of a treatment by accounting for the covariates that predict receiving the treatment. Specifically, PSM matches treated and untreated observations on the estimated probability of being treated (the propensity score). A PSM sample must be large. PSM also requires that the two groups be measured on the same variables and that the data for the two groups be collected during the same period. The main weakness of PSM is that the matching is based on the observed characteristics of the two groups; the estimated treatment effect could be biased if unobserved factors affect the treatment and change over time. The other popular matching method, the regression discontinuity design (RDD), can be used when there is a certain criterion (threshold) that must be met before subjects can receive the treatment.
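The two PSM steps just described (estimating propensity scores, then matching on them) can be sketched on synthetic data. The covariates, the logistic assignment mechanism, and the one-to-one nearest-neighbor matching with replacement are all assumptions for illustration.

```python
# Minimal propensity-score-matching sketch: estimate the probability of
# treatment from observed covariates with a logistic regression, then match
# each treated unit to the control with the closest propensity score.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=(n, 2))                    # observed covariates
# Treatment is more likely for high values of the first covariate
treated = (rng.random(n) < 1 / (1 + np.exp(-x[:, 0]))).astype(int)

# Step 1: propensity scores from a logistic regression of treatment on X
ps = LogisticRegression().fit(x, treated).predict_proba(x)[:, 1]

# Step 2: nearest-neighbor match on the propensity score (with replacement)
t_idx = np.flatnonzero(treated == 1)
c_idx = np.flatnonzero(treated == 0)
matches = c_idx[np.argmin(np.abs(ps[t_idx][:, None] - ps[c_idx][None, :]), axis=1)]

# Matching should shrink the covariate imbalance in x[:, 0]
gap_before = x[t_idx, 0].mean() - x[c_idx, 0].mean()
gap_after = x[t_idx, 0].mean() - x[matches, 0].mean()
print(f"mean covariate gap: {gap_before:.3f} before, {gap_after:.3f} after")
```

Because matching uses only observed covariates, any unobserved factor that drives both treatment and outcomes survives this procedure, which is exactly the weakness of PSM noted above.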
RDD produces causal estimates of the treatment effect by first assigning a cutoff (threshold), above or below which the treatment is assigned, and then comparing the observations that lie closely on either side of the threshold. By doing so, RDD handles unobserved characteristics of the two groups better than PSM. Additionally, RDD may be able to estimate the average treatment effect in cases where randomization is not possible. For brevity, we do not review the details of RDD here; interested empirical researchers can refer to Lee and Lemieux (2010).

13
King and Nielsen (2016) argue that researchers should not use propensity score matching to select a control group for causal inference because their results indicate that PSM increases imbalance, inefficiency, model dependence, and bias. Therefore, researchers should be cautious when using PSM in their research designs, although it is very popular among empirical researchers.
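The sharp-RDD estimator described above (comparing observations just on either side of the cutoff) can be sketched on synthetic data. The cutoff at zero, the bandwidth, and the "true" jump of 1.5 are all assumptions for illustration.

```python
# Sharp regression-discontinuity sketch: units with running variable r >= 0
# receive the treatment. Fit a separate local linear regression on each side
# of the cutoff and estimate the treatment effect as the jump at r = 0.
import numpy as np

rng = np.random.default_rng(2)
r = rng.uniform(-1, 1, size=2000)              # running variable, cutoff at 0
y = 0.8 * r + 1.5 * (r >= 0) + rng.normal(scale=0.3, size=2000)

h = 0.25                                       # bandwidth around the cutoff
left = (r < 0) & (r > -h)
right = (r >= 0) & (r < h)

# Linear fit on each side; the intercepts are the limits of E[y|r] at r = 0
b_left = np.polyfit(r[left], y[left], 1)       # returns [slope, intercept]
b_right = np.polyfit(r[right], y[right], 1)
jump = b_right[1] - b_left[1]                  # discontinuity at the cutoff
print(f"estimated discontinuity at the cutoff: {jump:.2f}")
```

The bandwidth choice trades off bias (wide windows pick up curvature) against variance (narrow windows use few observations); Lee and Lemieux (2010) discuss principled choices.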
Bibliography

Abadie, A. (2005). Semiparametric Difference-in-Differences Estimators. Review of Economic Studies 72, 1–19.
Bertrand, M., Duflo, E. and Mullainathan, S. (2004). How Much Should We Trust Differences-in-Differences Estimates? The Quarterly Journal of Economics 119, 249–275.
Byard, D., Li, Y. and Yu, Y. (2011). The Effect of Mandatory IFRS Adoption on Financial Analysts' Information Environment. Journal of Accounting Research 49, 69–96.
Card, D. and Krueger, A.B. (1994). Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania. American Economic Review 84(4), 772–793.
Carcello, J.V. and Li, C. (2013). Costs and Benefits of Requiring an Engagement Partner Signature: Recent Experience in the United Kingdom. The Accounting Review 88, 1511–1546.
Cerqueiro, G., Ongena, S. and Roszbach, K. (2016). Collateralization, Bank Loan Rates, and Monitoring. Journal of Finance 71, 1295–1322.
Greene, W. (2008). Econometric Analysis. Pearson Education, Inc., Upper Saddle River, NJ.
Hansen, C. (2007). Generalized Least Squares Inference in Panel and Multilevel Models with Serial Correlation and Fixed Effects. Journal of Econometrics 140, 670–694.
King, G. and Nielsen, R. (2016). Why Propensity Scores Should Not be Used for Matching. Working Paper, https://gking.harvard.edu/publications/why-propensity-scores-should-not-be-used-formatching.
Lechner, M. (2010). The Estimation of Causal Effects by Difference-in-Difference Methods. Foundations and Trends in Econometrics 4, 165–224.
Lee, D.S. and Lemieux, T. (2010). Regression Discontinuity Designs in Economics. Journal of Economic Literature 48, 281–355.
Liu, S. (2017). Does the Requirement of an Engagement Partner Signature Improve Financial Analysts' Information Environment in the United Kingdom? Review of Quantitative Finance and Accounting 49, 263–281.
Marie-Andree, S., Zhu, P., Jacob, R. and Bloom, H. (2013). The Validity and Precision of the Comparative Interrupted Time Series Design and the Difference-in-Difference Design in Educational Evaluation. Working Paper, Manpower Demonstration Research Corporation (MDRC), https://www.mdrc.org/sites/default/files/validity precision comparative interrupted time series design.pdf.
Meyer, B. (1995). Natural and Quasi-Experiments in Economics. Journal of Business & Economic Statistics 13, 151–161.
Qin, L. and Ronen, J. (2018). Effects of the Leases Exposure Draft on Loan Spreads and Analyst Dispersion. Working Paper, New York University.
Rodano, G., Serrano-Velarde, N. and Tarantino, E. (2016). Bankruptcy Law and Bank Financing. Journal of Financial Economics 120, 363–382.
Rubin, D.B. (1977). Assignment to Treatment Group on the Basis of a Covariate. Journal of Educational Statistics 2, 1–26.
Wooldridge, J.M. (2016). Introductory Econometrics. Cengage Learning, Boston, MA.
Appendix 124A Theoretical Models
Bertrand, Duflo, and Mullainathan (2004):
Y_igt = A_t + B_g + X_gt·β + Z_igt·γ_gt + v_gt + u_igt
This chapter introduces the model that can be used for DID analyses in a setting with multiple groups and periods. The coefficient(s) β is (are) the DID estimator(s), capturing the treatment effect.

Greene (2008):
Δy_i = θ + (ΔX_i)'β + u_i
This chapter uses the first-difference model to conduct DID analyses. The treatment effect is the constant, θ, in the model.

Wooldridge (2016):
Y = β0 + β1·dT + β2·dB + δ1·dT∗dB + e
This chapter explains the model that can be used for DID analyses in the simplest setting: two groups and two periods. The treatment effect can be estimated by the coefficient on the interaction term, δ1.
Appendix 124B Empirical Models
Card and Krueger (1994):
ΔE_i = a + b·ΔX_i + c·NJ_i + ε_i
This chapter uses the first-difference method to estimate the effect of minimum wage legislation on average employment in New Jersey in 1992. The coefficient c estimates the effect of the legislation on full-time equivalent employment in NJ. The authors find that there were 2.75 more full-time equivalent employees in NJ (treatment group) than in PA (control group).
Carcello and Li (2013):
ΔY_i = δ0 + θ·CHG SIGNATURE_i + β1·ΔX_i + Δu_i
This chapter uses the first-difference method to estimate the effect of the engagement partner signature requirement on audit quality in the UK. The treatment effect is the coefficient θ in the model.

Cerqueiro, Ongena, and Roszbach (2016):
(Ȳ_post − Ȳ_pre)_i = a + β·Treated_i + ε_i
This chapter uses the first-difference method to estimate the impact of the law, effective on January 1, 2004, that regulates banks' use of floating liens, on lenders' responses (e.g., increasing interest rates and tightening credit limits) to the legal reform that exogenously reduces the collateral value of loans. The DID estimator, β, captures the effect of the law on the treated group.

Rodano, Serrano-Velarde, and Tarantino (2016):
Y_ijt = Constant + α·Exposed_i + β·(Exposed_i × After Reorganization_t) + γ·(Exposed_i × Interim Period_t) + δ·(Exposed_i × After Liquidation_t) + · · · + Quarter × Year + ε_ijt
This chapter examines the effects of reorganization and liquidation in bankruptcy on bank financing and firm investment in Italy over the period 2005–2006. The interaction terms between the exposure and reform indicators (After Reorganization and After Liquidation) measure the impact of each legal reform on loan interest rates. The coefficient β on the interaction (Exposed_i × After Reorganization_t) is the DID estimate of the impact of the reorganization reform, measuring the difference in interest rates between the two groups in the post-reorganization period relative to the pre-reorganization period; Rodano et al. (2016) predict β to be positive. The coefficient δ on the interaction (Exposed_i × After Liquidation_t) is the DID estimate of the impact of the liquidation reform, which is expected to be negative. The results are consistent with their predictions.
Liu (2017):
ΔAnCov_i = β0 + β1·EP_i + β_j·ΔX_i + ε_i
This chapter uses the first-difference method to estimate the effect of the engagement partner signature requirement on the number of analysts following firms in the UK. The treatment effect is the coefficient β1 in the model.
Chapter 125
Using Smooth Transition Regressions to Model Risk Regimes∗

Liam A. Gallagher, Mark C. Hutchinson and John O'Brien

Contents

125.1 Introduction
125.2 Smooth Transition Regression
  125.2.1 Smooth transition regression methodology
  125.2.2 Specifying and estimating an STR model
125.3 Applications of Smooth Transition Regression
  125.3.1 Convertible arbitrage
  125.3.2 Application
  125.3.3 Risk factor models
  125.3.4 Data
  125.3.5 Empirical results
125.4 Conclusions
Liam A. Gallagher, Dublin City University, e-mail: [email protected]
Mark C. Hutchinson, University College, Cork, e-mail: [email protected]
John O'Brien, University College, Cork, e-mail: [email protected]
∗
Some of the analyses used in this chapter appear in the paper "Does convertible arbitrage risk exposure vary through time?", forthcoming in Review of Pacific Basin Financial Markets and Policies.
Bibliography
Appendix 125A Smooth Transition Regression
  125A.1 Development of smooth transition regression
  125A.2 Specification
  125A.3 Model estimation
Abstract

The smooth transition regression (STR) methodology was developed to model nonlinear relationships in the business cycle. We demonstrate that the methodology can be used to analyse return series where exposure to financial market risk factors depends on the market regime. The smooth transition between regimes inherent in STR is particularly appropriate for risk models, as it allows for a gradual transition of risk factor exposures. Variations of the methodology and tests of its appropriateness are defined and discussed. We apply the STR methodology to model the risk of the return series of the convertible arbitrage (CA) hedge fund strategy. CA portfolios comprise instruments that have both equity and bond characteristics and alternate between the two depending on the market level (state). These dual characteristics make the CA strategy a strong candidate for nonlinear risk models. Using the STR model, we confirm that the strategy's risk factor exposure changes with the market regime and, using this result, we are able to account for the abnormal returns reported for the strategy in earlier studies.

Keywords: Regime switching • Smooth transition regression • Risk measurement • Hedge funds.
125.1 Introduction

Traditionally, both the academic literature and industry practice have focused on linear models of the relationship between investment returns and risk factors. However, as academic research generated a growing body of evidence of nonlinearity in the returns of hedge funds (and their underlying strategies), the validity of applying linear risk models to the analysis of hedge funds was questioned. Fung and Hsieh (1997) provide early evidence of nonlinear returns in hedge funds. This was quickly followed by a series of papers supporting the initial findings, including Liang (1999), Agarwal and Naik (2000), Fung and Hsieh (2001, 2004), Kat and Brooks (2001), and Kat and Lu (2002). Nonlinearity has been discussed in the context of Value at Risk (Bali et al., 2007), higher moments of equity returns (Agarwal et al., 2009), exposure to extreme events (Jiang and Kelly, 2012), and correlation risk (Buraschi et al., 2013). More recently, Agarwal et al. (2017b) confirm the extent of tail risk in hedge funds in an extensive analysis of a hedge fund database.
In parallel with generating evidence of nonlinearity in returns, researchers began to develop novel methods to model these relationships. One avenue of research modeled nonlinearity in a linear asset-pricing framework by using non-Gaussian risk factors, generally based on financial derivatives. Fung and Hsieh (2001, 2002, 2004) present evidence of hedge fund strategy payoffs sharing characteristics with look-back straddles. Mitchell and Pulvino (2001) document that the returns of a merger arbitrage portfolio exhibit characteristics similar to a short position in a stock index put option. Agarwal and Naik (2004) demonstrate the nonlinear relationship in hedge fund returns and propose the use of option payoffs as risk factors. More recently, Hutchinson and Gallagher (2010) and Agarwal et al. (2011) construct factor portfolios to replicate the performance of the convertible arbitrage strategy. Both Agarwal et al. (2017a) and Chabi-Yo et al. (2018) develop measures of nonlinearity originating in the equity market, based on volatility of volatility and crash sensitivity, respectively.

An alternative approach utilizes models where the functional specification, rather than the factor specification, captures nonlinear relationships. Rather than specifying factors with dynamic return distributions, these studies relax the assumption of a linear relationship between risk factors and hedge fund returns. Kazemi and Schneeweis (2003) explicitly address the dynamics of hedge fund trading strategies by specifying conditional models of hedge fund performance. They employ a stochastic discount factor model previously used in the mutual fund literature. Alternatively, Amin and Kat (2003) evaluate hedge funds from a contingent claims perspective, imposing no restrictions on the distribution of fund returns. Kat and Miffre (2008) employ a conditional model of hedge fund returns that allows the risk coefficients and alpha to vary.
A more recent stream of literature has begun to link hedge fund performance to economic conditions. Kritzman et al. (2012) develop a measure of market turbulence and demonstrate that the performance of a variety of hedge fund strategies varies with it. Hutchinson and O’Brien (2014) show that the performance of time series momentum, a core hedge fund strategy, is related to economic conditions. Bruder et al. (2016) link regime shifts with skewness in returns. In a similar approach, Bali et al. (2014) use their measure of economic uncertainty as a risk factor to explain hedge fund returns.

This chapter proposes a novel method to model the risk of hedge funds: smooth transition regression. Smooth transition regression (STR) is a well-defined methodology for modeling nonlinear relations that
incorporates two regimes¹ and allows for a smooth transition between the two. Since it was first suggested by Chan and Tong (1986) and developed by Teräsvirta and Anderson (1992) and Teräsvirta (1994), it has been widely used for modeling nonlinear systems. STR has been used extensively to model economic time series (see, for example, Sarantis, 1999; Skalin and Teräsvirta, 1999; Öcal and Osborn, 2000; and Holmes and Maghrebi, 2004). Dahlhaus (2017) uses STR to model money supply in financial crises. Allegret et al. (2017) use the technique to explore economic contagion. Bruneau and Cherfouh (2018) model the dynamics of the property market. The model has also been applied to equity markets (see, for example, McMillan, 2001; Bradley and Jansen, 2004; Bredin and Hyde, 2008; Coudert et al., 2011; Aslanidis and Christiansen, 2012; and Andreou et al., 2016).

In addition to a well-understood theory, STR models have two features that make them particularly suitable for modeling nonlinear risk in hedge fund returns. First, they incorporate two distinct regimes, ideal for modeling returns where the risk exposure depends on the market state. The performance of a range of trading strategies has been shown to vary with changes in market state (Kritzman et al., 2012; Bowen and Hutchinson, 2016; and Daniel and Moskowitz, 2016). The second feature is the smooth transition between states. Alternative models, such as the Hamilton (1989) Markov switching model or the threshold autoregressive models introduced by Tong and Lim (1980), assume an abrupt change in market regime. A gradual transition between states better replicates financial markets where, because many participants operate independently and at different time horizons, changes have been shown to be smooth rather than sharp. Merton (1987) attributes this to information asymmetry, while Barberis and Thaler (2003) provide behavioral explanations. More recently, Mitchell et al. (2007) highlight the slow movement of capital, subsequently explained by institutional impediments (Duffie, 2010), capital imbalances (Duffie and Strulovici, 2012) and inter-temporal links with feedback (Fajgelbaum et al., 2017 and Kurlat, 2018). Finally, while this chapter is focused on modeling the risk and return of hedge funds, a better understanding of these has implications across all aspects of the industry, including portfolio selection (Agarwal and Naik, 2004), optimization (Nystrup et al., 2018) and governance (Shadab, 2013).
¹ The methodology can be extended to model more than two regimes (see appendix), but that is outside the scope of the discussion here.
125.2 Smooth Transition Regression

125.2.1 Smooth transition regression methodology

The application of smooth transition regression models to risk modeling is now discussed.² The method was first suggested in Chan and Tong (1986) to overcome the abrupt change of state implicit in the threshold autoregressive model and was developed by Teräsvirta (1994) for modeling nonlinearity in the business cycle. The standard linear model for estimating risk-adjusted performance is

rt = β′xt + εt,  (125.1)

where rt is a fund’s return in excess of the risk-free rate at time t, β′ = (β0, . . . , βm), where β0 is the risk-adjusted excess return and β1 to βm are the fund’s exposures to m risk factors, and xt = (1, x1,t, . . . , xm,t), where x1,t to xm,t are the returns of the m risk factors at time t. This becomes nonlinear with the addition of a time-varying term:

rt = β′xt + ϕ′xt f(zt) + εt,  (125.2)

where ϕ′ = (ϕ0, . . . , ϕm), zt is the transition variable and f(zt) is the transition function. If f(zt) is a smooth continuous function, the regression coefficients will change smoothly with zt. This is a smooth transition regression model.

While there are several definitions of f(zt) which are compatible with a smooth transition regression model, two are particularly useful and well understood: the logistic and exponential transition functions. These transition functions generate logistic (L-STR) and exponential (E-STR) smooth transition regression models, respectively. Figure 125.1 shows the general form of the two functions. The logistic transition function is defined as

f(zt) = [1 + exp(−γ(zt − c))]⁻¹,  γ > 0,  (125.3)

where γ is the smoothness parameter (i.e., the slope of the transition function) and c is the threshold. Choosing this transition function yields the L-STR model. In the limit, as γ → 0 the L-STR model becomes linear, while as γ → ∞ the transition is instant and the model is equivalent to a threshold autoregressive model. For intermediate values of γ, the speed of transition depends upon the value of zt. As zt → −∞, f(zt) → 0 and the behavior of rt is given by rt = β′xt + εt. As zt → +∞, f(zt) → 1 and the behavior of rt is given by rt = (β′ + ϕ′)xt + εt.

² A more detailed discussion is provided in Appendix A.
Figure 125.1: Logistic and exponential transition functions.
Notes: The figure shows the form of the transition functions used in smooth transition regression. Panel A shows the logistic transition function, f(zt) = [1 + exp(−γ(zt − c))]⁻¹, γ > 0, and Panel B shows the exponential function, f(zt) = 1 − exp(−γ(zt − c)²), γ > 0.
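The two transition functions shown in Figure 125.1 can be computed directly. A minimal illustrative sketch in Python/NumPy follows; the parameter values (γ, c, and the grid of zt values) are hypothetical and chosen only to make the shapes visible:

```python
import numpy as np

def logistic_transition(z, gamma, c):
    # f(z) = [1 + exp(-gamma * (z - c))]^-1: rises smoothly from 0 to 1 as z crosses c
    return 1.0 / (1.0 + np.exp(-gamma * (z - c)))

def exponential_transition(z, gamma, c):
    # f(z) = 1 - exp(-gamma * (z - c)^2): symmetric around c, zero at z = c
    return 1.0 - np.exp(-gamma * (z - c) ** 2)

# Illustrative grid of monthly excess market returns
z = np.linspace(-0.10, 0.10, 201)
fl = logistic_transition(z, gamma=100.0, c=-0.02)
fe = exponential_transition(z, gamma=100.0, c=0.0)

# Limiting behavior described in the text:
# gamma -> 0 makes f effectively constant (the model becomes linear);
# gamma -> infinity makes the logistic an abrupt threshold switch.
```

At zt = c the logistic function equals 0.5 (halfway between regimes), while the exponential function equals 0, which is what makes the E-STR model symmetric around the threshold.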
The alternative exponential transition function is

f(zt) = 1 − exp(−γ(zt − c)²),  γ > 0.  (125.4)

Again γ is the smoothness parameter, while the function is symmetric around zt = c. This function produces the E-STR model. The E-STR model becomes linear for both γ → 0 and γ → ∞, as f(zt) becomes constant. Otherwise, the model displays nonlinear behavior. Unlike the L-STR model, E-STR is symmetric: as (zt − c) → ±∞, f(zt) → 1 and the behavior of rt is given by rt = (β′ + ϕ′)xt + εt. As zt → c, f(zt) → 0 and the behavior of rt is given by rt = β′xt + εt.

125.2.2 Specifying and estimating an STR model

Granger and Teräsvirta (1993) set out three stages to specify an STR model: specify a linear model, test for linearity, and select the transition function.

(a) Specification of a linear model. The initial step requires the specification of the linear model. The linear model should be appropriate to the strategy under investigation and is not dependent on the use of STR:

rt = β′xt + ut,  (125.5)

where rt is the excess return on the hedge fund index, β′ is the fund’s exposure to the risk factors, including the intercept β0, and xt is an (n + 1) × t array, consisting of a 1 × t constant array and an n × t matrix of risk factor returns.
(b) Test of linearity. The second step involves testing linearity against STR models, using the linear model specified in equation (125.5) as the null. The test is based on the auxiliary regression

ut = β0′xt + β1′xt zt + β2′xt zt² + β3′xt zt³ + εt,  (125.6)

where the values of ut are the residuals of the linear model specified in the first step and zt is the transition variable. The null hypothesis of linearity is

H0: β1 = β2 = β3 = 0.  (125.7)

This hypothesis test can also be used to select from among a number of candidates for the transition variable, zt. The linear model (equation (125.5)) and the auxiliary regression (equation (125.6)) are run for each potential variable. The hypothesis test is then run for each of these and the variable producing the smallest (significant) P-value is selected.

(c) Selection of L-STR or E-STR. The final step, assuming linearity has been rejected, is the selection between the L-STR and E-STR models. This is based on the results of the auxiliary regression (equation (125.6)) and requires a series of nested F-tests:

H3: β3 = 0,  (125.8)
H2: β2 = 0 | β3 = 0,  (125.9)
H1: β1 = 0 | β2 = β3 = 0.  (125.10)
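The linearity test and the nested F-tests above can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions (ordinary least squares throughout; function and variable names are hypothetical, not the authors' code), returning raw F statistics — p-values then follow from the F distribution with the corresponding degrees of freedom:

```python
import numpy as np

def ols_resid(y, X):
    # OLS residuals via least squares
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def f_stat(y, X_restricted, X_full):
    # Standard F statistic for nested restrictions
    ssr_r = np.sum(ols_resid(y, X_restricted) ** 2)
    ssr_f = np.sum(ols_resid(y, X_full) ** 2)
    q = X_full.shape[1] - X_restricted.shape[1]
    dof = len(y) - X_full.shape[1]
    return (ssr_r - ssr_f) / q / (ssr_f / dof)

def str_linearity_fstats(r, X, z):
    # X: T x (m+1) regressor matrix with constant column; z: transition variable
    u = ols_resid(r, X)                        # step (a): linear-model residuals
    terms = [X, X * z[:, None], X * (z ** 2)[:, None], X * (z ** 3)[:, None]]
    A = np.hstack(terms)                       # auxiliary regression (125.6)
    F_H0 = f_stat(u, X, A)                     # H0: beta1 = beta2 = beta3 = 0
    F_H3 = f_stat(u, np.hstack(terms[:3]), A)  # H3: beta3 = 0
    F_H2 = f_stat(u, np.hstack(terms[:2]), np.hstack(terms[:3]))  # H2 | beta3 = 0
    F_H1 = f_stat(u, X, np.hstack(terms[:2]))  # H1 | beta2 = beta3 = 0
    return F_H0, F_H3, F_H2, F_H1
```

Running the same routine once per candidate transition variable reproduces the variable-selection step: the candidate with the smallest significant H0 p-value is chosen.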
Rejecting H3 leads to an L-STR model. Accepting H3 and rejecting H2 indicates an E-STR model. Accepting both H3 and H2 while rejecting H1 leads to an L-STR model. Granger and Teräsvirta (1993) argue that strict application of this sequence of tests may lead to incorrect conclusions where H3 is accepted and both H2 and H1 are rejected. To overcome this problem, they suggest estimating the P-values of the F-tests of H2 and H1 and selecting the STR model based on the lower of the two.

125.3 Applications of Smooth Transition Regression

125.3.1 Convertible arbitrage

We now analyse the performance of a hedge fund strategy, both to demonstrate the use of STR and to highlight potential insights to be gained from its application. A number of hedge fund strategies have been shown to be sensitive to changes in market state, including cross-sectional momentum
(Cooper et al., 2004; Daniel and Moskowitz, 2016), merger arbitrage (Mitchell and Pulvino, 2001), time series momentum (Hutchinson and O’Brien, 2018) and pairs trading (Bowen and Hutchinson, 2016). Each of these is a good candidate for a nonlinear regime-based risk model. However, we have chosen Convertible Arbitrage (CA) as our sample strategy.

Convertible Arbitrage is a strategy that holds a long position in convertible bonds and generally hedges the equity exposure with short positions in the corresponding equities. The underlying instrument, the convertible bond, has characteristics of both bond and equity securities. After the equity price rises, the value is dominated by the equity component, while after falls in the price of the equity, the relative value of the equity component falls, leaving the instrument with bond-like characteristics. This difference in instrument risk characteristics, depending on the market level, makes convertible arbitrage an ideal candidate to test the STR methodology (Gallagher et al., 2018).

From 1997 to mid-2007, hedge funds pursuing the Convertible Arbitrage strategy generated a Sharpe ratio above 1.50 and assets under management grew from $5bn to $57bn.³ At their peak, CA funds accounted for 75% of the market in convertible bonds (Mitchell et al., 2007). Subsequently, during the financial crisis, CA was the second worst performing hedge fund strategy, losing 35% from September 2007 to December 2008.⁴ Since then, despite relatively strong post-2008 performance, the strategy has generally been shunned by institutional investors, with assets under management (AUM), currently $19.6bn,⁵ remaining well below peak values.

In studies of general hedge fund performance, both Capocci and Hübner (2004) and Fung and Hsieh (2002) provide some evidence of CA performance. Capocci and Hübner (2004) specify a linear factor model for the returns of several hedge fund strategies and estimate that CA hedge funds earn an abnormal return of 0.4% per month. Fung and Hsieh (2002) estimate that the CA hedge fund index generates alpha of 0.7% per month. Coën and Hübner (2009) develop a higher-moment estimation model to improve the accuracy of estimates of abnormal returns and, using this, demonstrate that the abnormal return of CA strategies is underestimated by linear models. Focusing exclusively on CA hedge funds, Hutchinson and Gallagher (2010) find evidence of individual fund abnormal performance but no abnormal
³ Source: BarclayHedge.
⁴ HFRI Convertible Arbitrage Index.
⁵ As at year end 2017, source: BarclayHedge.
returns in the hedge fund indices.

There is significant evidence that companies generally under-price new issue securities; see Welch (1989) and Ritter and Welch (2002) for the US, and Chen et al. (2002), Dimovski and Brooks (2004), Zheng et al. (2005) and Nguyen et al. (2010) for global evidence. A number of studies produce evidence of this behavior in the context of convertible bonds, a potential source of the strategy’s profits. Choi et al. (2010) show that CA funds are the dominant purchasers of new issues and consequently act as suppliers of capital to issuers. Chin et al. (2005) document predictable market patterns around issuance. Chan and Chen (2007) provide evidence of consistent under-pricing of new issues. Similarly, Agarwal et al. (2011) interpret the abnormal returns they find following new issue convertible bonds as evidence of under-pricing.

By identifying a change in risk exposure in different equity market regimes, we demonstrate an appropriate functional model to better explain CA risk and, with this, isolate the true skill of hedge fund managers pursuing CA strategies. Holding a long position in a convertible bond and a corresponding short position in the underlying stock, CA funds are hedged against equity market risk but are left exposed to default and term structure risk. Agarwal and Naik (2004) provide evidence that CA hedge fund indices’ returns are positively related to the payoff from a short equity index option, highlighting the nonlinearity of their returns.

125.3.2 Application

In this section, we discuss the risk factor models and the STR methodology specified in this study to model CA returns. The nature of the underlying instrument, the convertible bond, leads us to specify the excess return of the US equity market as the transition variable.⁶

125.3.3 Risk factor models

The appropriate linear risk model is a function of the strategy under investigation and is independent of the use of STR.
This investigation focuses on two: the generic Fung and Hsieh (2004) model, which is widely used across all classes of hedge funds, and the convertible-arbitrage-specific model developed by Agarwal et al. (2011), described in Table 125.1.

⁶ Specifically, we use the Fama–French excess market return (RMRF), sourced from Kenneth French’s website.
9.61in x 6.69in
Summary statistics and correlation matrix of factors used to analyse fund μ
σ
SR
Skew
Kurt
Panel A: Fung and Hsieh factors SNPRF 6.17 SCMLC −0.50 BD10RET −3.66 BAAMTSY 1.65 PTFSBD −15.54 PTFSFX −6.38 PTFSCOM −5.46
15.52 11.79 24.67 19.81 53.55 68.11 47.43
0.40 −0.04 −0.15 0.08 −0.29 −0.09 −0.12
−0.64 −0.29 0.17 −0.22 1.40 1.35 1.14
3.91 7.82 5.68 6.71 5.53 5.47 5.12
Panel B: Agarwal et al. factors VG 5.03 X 1.53
13.01 9.31
0.39 0.16
−0.94 −0.53
6.85 6.31
Panel C: Fama and French market factor RMRF 6.27 15.97
0.39
−0.69
3.94
Panel D: Correlation matrix
1.00 0.36
RMRF
1.00 0.59 0.89
X
1.00 −0.89 1.00 −0.31 0.34 1.00 −0.17 0.26 0.26 1.00 −0.16 0.22 0.21 0.37 1.00 0.21 −0.44 −0.24 −0.20 −0.17 0.03 −0.19 −0.13 −0.09 −0.14 0.24 −0.40 −0.26 −0.21 −0.17
VG
PTFSCOM
PTFSFX
PTFSBD
1.00 −0.17 0.20 0.12 0.02 0.06 −0.38 −0.48 −0.21
BAAMTSY
SCMLC
1.00 −0.07 0.22 −0.38 −0.25 −0.21 −0.17 0.83 0.28 0.99
BD10RET
SNPRF SNPRF SCMLC BD10RET BAAMTSY PTFSBD PTFSFX PTFSCOM VG X RMRF
1.00
Notes: The summary statistics are the mean monthly return, μ, standard deviation of monthly returns, σ, the Sharpe Ratio, SR, the skewness, Skew and the excess kurtosis, Kurt. SNPRF is the excess total return on the S&P 500, SCMLC is the return on small capitalization minus the return on large capitalization stocks. BD10RET is the excess return on the 10 year US T-Bond. BAAMTSY is the return on BAA rated bonds minus the return on the 10 Year Bond. PTFSBD, PTFSFX and PTFSCOM are the return on trend following factors for Bonds, FX and Commodities. VG is the excess return on the Vanguard convertible bond mutual fund. X is the return on the delta neutral hedge portfolio of convertible bonds. RMRF is the excess return on US equities. The sample period is January 1994 to September 2012.
125.3.3.1 Fung and Hsieh (2004) model

The Fung and Hsieh (2004) model is designed to capture the risks in a broad portfolio of hedge funds. The model contains two equity risk factors, two fixed income risk factors and three option-based risk factors. The equity factors are SNPRF, the excess total return on the Standard & Poor’s 500 index, and SCMLC, the size spread factor (Russell 2000 minus S&P 500 monthly total return).⁷ The two bond-oriented risk factors are BD10RET, the monthly change in the 10-year treasury constant maturity yield (month end to month end), and BAAMTSY, a credit spread factor (the monthly change in the Moody’s Baa yield less the 10-year treasury constant maturity yield, month end to month end). Finally, the three option-based factors are derived from option prices of futures contracts in three underlying markets, specifically Bond (PTFSBD), Currency (PTFSFX) and Commodity (PTFSCOM).⁸

125.3.3.2 Agarwal et al. (2011) model

The Agarwal et al. (2011) model is defined specifically for Convertible Arbitrage. The authors specify two factors corresponding to the two return drivers for convertible arbitrage investors: buy-and-hedge and buy-and-hold. The buy-and-hedge strategy, which they term the X factor, is constructed as a long position in a portfolio of convertible bonds combined with a delta-neutral hedged short position in a portfolio of equities. This hedged position is dynamically rebalanced daily. Agarwal et al. (2011) use a custom dataset of convertible bonds and issue-weighted equities for their hedged portfolio. In this study, we use a long position in the Merrill Lynch Convertible Securities Index combined with a dynamically hedged short position in the S&P 500 future. We use the return series of the Vanguard Convertible Securities mutual fund (VG) to proxy for the performance of a passive buy-and-hold component of the strategy, as specified in Agarwal et al. (2011).

125.3.4 Data

The convertible arbitrage return series used in the study are obtained from two sources. We create one set from a database of CA fund returns, while the other set consists of commercially available hedge fund indices.

⁷ This is the most up to date definition of the size factor; the original paper defines the factor as (Wilshire Small Cap 1750 − Wilshire Large Cap 750 monthly total return).
⁸ Details on the construction of the option based factors are available in Fung and Hsieh (2001) and the data from http://faculty.fuqua.duke.edu/∼dah7/DataLibrary/TF-FAC.xls.
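The monthly return of a buy-and-hedge position of the kind described above (long a convertible index, short delta units of the index future, with the hedge reset each period) can be sketched as follows. This is only an illustration of the arithmetic — the function name, column layout and the delta series are hypothetical, and the actual X factor uses daily rebalancing:

```python
import pandas as pd

def x_factor(convert_ret: pd.Series, future_ret: pd.Series, delta: pd.Series) -> pd.Series:
    # Long the convertible index, short delta_{t-1} units of the index future:
    # the hedge ratio used in month t is the delta set at the end of month t-1.
    return convert_ret - delta.shift(1) * future_ret
```

The first observation is undefined (NaN) because no prior-period delta exists, mirroring any lagged-hedge construction.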
Our fund database is created from the monthly net-of-fee returns of live and dead funds in the union of the Bloomberg, HFR and Lipper/TASS databases from January 1994 to September 2012.⁹ The three databases contain 728 funds in total classified as CA. However, this broad sample contains multiple share classes of the same fund, and there are significant overlaps across the three databases. Our sample is reduced to 288 unique funds after allowing for this and further removing funds that report only gross returns and funds that do not report monthly returns. Finally, we remove funds with less than a 24-month return history, leaving a sample of 254 funds.¹⁰

The merged database is used to create four portfolios. The first portfolio (EQL) is an equally weighted portfolio of all CA funds. We then create three sub-portfolios based on size. We rank hedge funds each month based on t−1 assets under management before dividing the sample into tercile portfolios. We calculate the equally weighted return for each of these portfolios to produce three return series based on large, medium and small funds, denoted LRG, MID and SML, respectively.

We also specify four CA indices to ensure the results are robust. These are the CSFB Tremont CA Index, the HFRI CA Index, the Barclay Group CA Index and the CISDM CA Index. The CSFB Tremont CA Index is an asset-weighted index (rebalanced quarterly) of CA hedge funds beginning in 1994, the CISDM CA Index represents the median fund performance, whereas the HFRI and Barclay Group CA Indices are both equally weighted indices of fund performance. The Barclay Group index begins in January 1997 and all other series begin in January 1994.

We report descriptive statistics of the eight hedge fund series in Table 125.2 and their cumulative returns in Figure 125.2. Mean returns range from 5.8% (SML) to 9.1% (MID) and the annualized standard deviations of the series are typically in the range of 5% to 7%, with the exception of SML, where the standard deviation is much larger at 13%. All the hedge fund series, with the exception of SML, have a Sharpe ratio greater than 1. Also notable are the large negative skewness and excess kurtosis reported for all series. This is the first evidence that the returns of the CA strategy have non-normal statistical characteristics. This characteristic of CA is captured quite dramatically in the cumulative returns for the series, reported in Figure 125.2. The financial crisis period
⁹ The database vendors typically do not keep information on funds that died before December 1993, which may lead to survivorship bias. To avoid this, our sample of fund returns begins in January 1994.
¹⁰ CA has had significant attrition rates, particularly during the 2008 financial crisis. This is very evident in our sample, with only 52 unique live funds at the end of the period.
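The size-tercile portfolio construction described above can be sketched with pandas. This is a schematic under stated assumptions — the function and column names are hypothetical, and funds are equally weighted within each tercile after ranking on the previous month's assets under management:

```python
import pandas as pd

def tercile_portfolios(returns: pd.DataFrame, aum: pd.DataFrame) -> pd.DataFrame:
    # returns, aum: month x fund frames; rank each month on t-1 AUM,
    # then take the equal-weighted mean return within each tercile.
    ranks = aum.shift(1).rank(axis=1, pct=True)
    out = {}
    for name, lo, hi in [("SML", 0.0, 1 / 3), ("MID", 1 / 3, 2 / 3), ("LRG", 2 / 3, 1.0)]:
        mask = (ranks > lo) & (ranks <= hi)
        out[name] = returns.where(mask).mean(axis=1)
    out["EQL"] = returns.mean(axis=1)          # equal-weighted portfolio of all funds
    return pd.DataFrame(out)
```

Because the sort uses t−1 assets, the first month of each tercile series is undefined, matching the lagged-sorting convention described in the text.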
Table 125.2: Summary statistics of hedge fund returns.

                  μ       σ      SR     Skew    Kurt
Panel A: Hedge fund portfolios
EQL             7.91    6.76    1.17   −2.46   19.58
SML             5.83   13.05    0.45   −1.28   17.11
MID             9.14    6.69    1.37   −2.19   17.50
LRG             8.64    5.82    1.48   −2.73   21.01
Panel B: Hedge fund indices
HFRI            7.80    7.23    1.08   −2.80   26.87
CSFB Tremont    7.50    6.87    1.09   −2.68   19.01
BarclayHedge    8.07    6.71    1.20   −2.69   22.47
CISDM           8.64    5.09    1.70   −3.45   29.42
Notes: The summary statistics are the mean monthly return, μ, standard deviation of monthly returns, σ, the Sharpe Ratio, SR, the skewness, Skew and the excess kurtosis, Kurt. EQL is an equally weighted portfolio of convertible arbitrage hedge funds from the unified database, LRG, MED & SML are equal weighted portfolios of large, medium and small size (assets under management) convertible arbitrage hedge funds from the unified CA database. HFRI is the HFR Convertible Arbitrage Index of hedge funds, CSFB is the CSFB Tremont Convertible Arbitrage Index of hedge funds, BCLY is the Barclay Group Convertible Arbitrage Index of hedge funds and CISDM is the CISDM Convertible Arbitrage Index of hedge funds. The sample period is January 1994 to September 2012.
from mid-2007 to late 2008 was one of extremely poor performance for CA, with investors losing between 30% and 50% of historical cumulative returns in an extremely short period. It is also quite notable that performance has been strong since the start of 2009, with all portfolios (except SML) surpassing their previous peaks by early 2010.

125.3.5 Empirical results

The empirical results of the analysis are presented here. Results are generated for the eight CA return series and for the two risk models. We use RATS (Regression Analysis of Time Series) software to estimate the model. RATS specifies the Marquardt variation of the Gauss–Newton method to solve the nonlinear least squares regression.

125.3.5.1 Linear model

We first estimate the linear regression model. This is required to examine the base case and to provide data for the auxiliary regression required for the hypothesis test to specify the form of the model. Results for the linear factor model are presented in Table 125.3.
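Outside of RATS, the L-STR model of equation (125.2) can be fitted by nonlinear least squares with any Levenberg–Marquardt-style solver. The sketch below uses SciPy's `least_squares` on simulated data; it is an illustration, not the authors' estimation code, and all parameter values are hypothetical. Following the chapter's convention, the threshold c is held fixed for convergence:

```python
import numpy as np
from scipy.optimize import least_squares

def lstr_residuals(theta, r, X, z, c):
    # theta stacks beta (k), phi (k) and log(gamma); c is fixed,
    # as in the chapter (c = 0.00 or c = -0.02 depending on the risk model).
    k = X.shape[1]
    beta, phi = theta[:k], theta[k:2 * k]
    gamma = np.exp(theta[2 * k])               # enforces gamma > 0
    f = 1.0 / (1.0 + np.exp(-gamma * (z - c))) # logistic transition (125.3)
    return r - (X @ beta + (X @ phi) * f)      # residuals of equation (125.2)

# Simulated illustration (all numbers hypothetical)
rng = np.random.default_rng(1)
T = 240
X = np.column_stack([np.ones(T), rng.normal(0.0, 0.04, T)])  # constant + one factor
z = X[:, 1]                                                  # transition variable
f_true = 1.0 / (1.0 + np.exp(-120.0 * (z + 0.02)))
r = X @ np.array([0.010, 0.20]) + (X @ np.array([-0.015, 0.30])) * f_true \
    + rng.normal(0.0, 0.002, T)

theta0 = np.concatenate([np.zeros(2), np.zeros(2), [np.log(100.0)]])
fit = least_squares(lstr_residuals, theta0, args=(r, X, z, -0.02))
beta_hat, phi_hat = fit.x[:2], fit.x[2:4]
```

Parameterizing γ through its logarithm keeps the smoothness parameter positive without explicit bounds, a common trick for this class of model.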
Figure 125.2: Cumulative returns of the convertible arbitrage series (Panel A: hedge fund portfolios; Panel B: hedge fund indices).
Notes: This figure plots the cumulative returns for each of the convertible arbitrage series over the sample period January 1994 to September 2012. EQL is an equally weighted portfolio of convertible arbitrage hedge funds from the unified database; LRG, MED & SML are equal weighted portfolios of large, medium and small size (assets under management) convertible arbitrage hedge funds from the unified CA database. HFRI is the HFR Convertible Arbitrage Index of hedge funds, CSFB is the CSFB Tremont Convertible Arbitrage Index of hedge funds, BCLY is the Barclay Group Convertible Arbitrage Index of hedge funds and CISDM is the CISDM Convertible Arbitrage Index of hedge funds.
Table 125.3: Linear model.

Panel A: Fung and Hsieh model
         α    βSNPRF  βSCMLC  βBD10RET  βBAAMTSY  βPTFSBD  βPTFSFX  βPTFSCOM    R̄²
EQL     0.29   0.16   −0.08    −0.24     −0.38     −0.01     0.00      0.00     67%
SML     0.04   0.30   −0.12    −0.39     −0.70      0.00     0.00     −0.01     61%
MED     0.39   0.16   −0.11    −0.23     −0.37     −0.01     0.00      0.00     67%
LRG     0.38   0.11   −0.04    −0.22     −0.32     −0.01    −0.01      0.00     53%
HFRI    0.31   0.09   −0.03    −0.30     −0.48     −0.01     0.00     −0.01     57%
CSFB    0.31   0.03   −0.03    −0.28     −0.44     −0.01     0.00     −0.01     46%
BCLY    0.38   0.05   −0.03    −0.29     −0.43     −0.01     0.00     −0.01     55%
CISDM   0.41   0.06   −0.03    −0.23     −0.33      0.00     0.00     −0.01     54%

Panel B: Agarwal et al. model
         α      βX     βVG     R̄²
EQL     0.23   0.39    0.10    69%
SML    −0.09   0.76    0.03    58%
MED     0.34   0.38    0.09    67%
LRG     0.34   0.27    0.13    56%
HFRI    0.25   0.32    0.13    48%
CSFB    0.25   0.24    0.18    38%
BCLY    0.31   0.25    0.17    47%
CISDM   0.36   0.22    0.11    48%
Notes: This table reports the OLS estimation of the linear models. Coefficients in bold are significant at the 5% level. Panel A reports results for the Fung and Hsieh model. Panel B reports results for the Agarwal et al. model. EQL is an equally weighted portfolio of convertible arbitrage hedge funds from the unified database, LRG, MED and SML are equal weighted portfolios of large, medium and small size (assets under management) convertible arbitrage hedge funds from the unified CA database. HFRI is the HFR Convertible Arbitrage Index of hedge funds, CSFB is the CSFB Tremont Convertible Arbitrage Index of hedge funds, BCLY is the Barclay Group Convertible Arbitrage Index of hedge funds and CISDM is the CISDM Convertible Arbitrage Index of hedge funds. The sample period is January 1994 to September 2012.
The table lists the factor coefficients for both the Fung and Hsieh (2004) and the Agarwal et al. (2011) models. Both perform well for the CA fund portfolios and the indices, with adjusted R² values ranging from 38% to 69%. For the Fung and Hsieh (2004) model, the equity and bond market risk factors exhibit statistical significance, whereas, using the Agarwal et al. (2011) model, both X, the delta-neutral hedged CA factor, and VG, the long-only convertible bond factor, are statistically significant for all the series (with the exception of X for SML). From a practitioner’s perspective, the most important coefficient is the intercept, which is a measure of skill by CA hedge fund managers (Jensen, 1968). All of the hedge fund portfolios and hedge fund indices have significantly positive alpha, with the exception of SML, the portfolio of small CA hedge funds, for which alpha is insignificantly different from zero in the Fung and Hsieh model and significantly negative in the Agarwal et al. model.
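The intercept test behind these alpha estimates is the standard OLS t-test of Jensen (1968). A minimal sketch follows — an illustration with hypothetical names, using the conventional homoskedastic standard error rather than whatever adjustment the original study may apply:

```python
import numpy as np

def ols_alpha_tstat(r, F):
    # r: T excess fund returns; F: T x m matrix of factor returns (no constant)
    T, m = F.shape
    X = np.column_stack([np.ones(T), F])
    beta, *_ = np.linalg.lstsq(X, r, rcond=None)
    resid = r - X @ beta
    s2 = resid @ resid / (T - m - 1)           # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)          # OLS coefficient covariance
    alpha, se = beta[0], np.sqrt(cov[0, 0])
    return alpha, alpha / se                   # intercept and its t-statistic
```

With monthly data, an alpha of 0.3–0.4 (percent per month, as in Table 125.3) and a t-statistic above roughly two would be read as significant manager skill.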
The evidence from the linear model suggests that managers pursuing the CA strategy do generate alpha for their investors, with the exception of managers with relatively limited assets under management. Later we consider results for the nonlinear model to see if these conclusions hold.

125.3.5.2 Linearity tests

The linearity tests for each of the series are displayed in Table 125.4. For both factor models, linearity (H0) is rejected for all of the hedge fund series. Generally, H1 and H2 are rejected for all series, with the exception of CSFB for the Fung and Hsieh model. H3 is rejected for the majority of series, indicating an L-STR model. H3 is not rejected for three of the series (SML, LRG and CSFB) in the Agarwal et al. model and for one of the series (BCLY) in the Fung and Hsieh model, but both H1 and H2 are rejected, indicating either L-STR or E-STR.
Table 125.4: Linearity and STR tests.

       EQL    SML    MID    LRG    HFRI   CSFB   BCLY   CISDM
Panel A: Fung and Hsieh model
H0     0.00   0.01   0.00   0.00   0.00   0.00   0.02   0.00
H3     0.00   0.00   0.00   0.00   0.00   0.00   0.11   0.00
H2     0.00   0.00   0.03   0.04   0.05   0.02   0.06   0.01
H1     0.00   0.01   0.02   0.02   0.00   0.19   0.10   0.00
Panel B: Agarwal et al. model
H0     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
H3     0.06   0.12   0.04   0.25   0.05   0.25   0.06   0.00
H2     0.00   0.00   0.00   0.00   0.00   0.07   0.00   0.00
H1     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00

Notes: The table presents results from a sequence of F-tests carried out for each of the convertible arbitrage series following estimation of the auxiliary regression ut = β0′xt + β1′xt zt + β2′xt zt² + β3′xt zt³ + εt. In Panel A (Panel B), for each convertible arbitrage series, ut are the residuals from estimating the Fung and Hsieh (Agarwal et al.) linear model, zt is RMRF, the excess equity market return, and xt is an n × t matrix of risk factors, where n is the number of factors in the Fung and Hsieh (Agarwal et al.) model. The null hypothesis of linearity is H0: β1 = β2 = β3 = 0. The selection between L-STR and E-STR models is based on the following series of nested F-tests: H3: β3 = 0; H2: β2 = 0 | β3 = 0; H1: β1 = 0 | β2 = β3 = 0. P-values in bold are significant at the 5% level.
Taken together the results of the STR tests suggest that the most appropriate nonlinear model is L-STR. 125.3.5.3 Smooth transition regression model The results of the estimation of the L-STR model are presented in Table 125.5. Panels A and B show the results for the Fung and Hsieh (2004) and the Agarwal et al. (2011) risk models, respectively. Consistent with theoretical expectations, the estimated parameters of the L-STR model provide evidence of the existence of a nonlinear relationship between CA returns and explanatory risk factors; this result is consistent across all eight of the CA return series. We identify two alternate risk regimes for the strategy.11 Figure 125.3 shows plots of the transition function against the transition variable (Panel A) and time (Panel B). The first regime is defined by the transition variable, zt , being less than the threshold constant, c, i.e., the current month’s excess equity market returns are below the threshold level. This regime is characterized by statistically significant positive abnormal returns (alpha). From Figure 125.2, Panel B, it is clear that this regime coincides with incidences of market stress, with a corresponding decrease in liquidity, such as the 1994 Peso crisis, the 1998 Asian currency crisis, the 2001 Dotcom crash and the 2007–2008 financial crisis. The second risk regime is defined by the transition variable, zt , being greater than the threshold constant, c, i.e., the current month’s excess equity market returns are above the threshold level and regime is characterized by statistically significant negative abnormal returns (alpha) and is associated with relatively benign financial markets. The relationship between the CA return series and the risk factors diverges between the two regimes. 
In the case of the Fung and Hsieh model, the relationship between the return series and the equity size spread, bond yield and credit spread is significantly negative for all series in the high alpha regime, while in the low alpha regime the relationship is positive in all cases and statistically significant in twenty out of twenty-four. The magnitude of the relationship is also greater in the high alpha regime in all cases. A similar pattern is seen in the case of the Agarwal et al. model, where the exposure to the risk factors changes sign and increases in magnitude when moving from the low alpha to the high alpha regime.

11. For estimation convergence we set c = 0.00 for the Agarwal et al. (2011) model and c = −0.02 for the Fung and Hsieh (2004) model.
Table 125.5: Smooth transition regression (STR) model.

[The table reports, for each convertible arbitrage series (EQL, SML, MED, LRG, HFRI, CSFB, BCLY, CISDM), the estimated coefficients in the zt < c and zt > c regimes, together with the threshold c, the smoothness parameter γ and the adjusted R². Panel A (Fung and Hsieh model) reports α, βSNPRF, βSCMLC, βBAAMTSY, βBD10RET, βPTFSBD, βPTFSFX and βPTFSCOM in each regime; Panel B (Agarwal et al. model) reports α, βx and βVG in each regime.]

Notes: This table reports the NLLS estimation of the logistic smooth transition regression models for each of the convertible arbitrage series. Coefficients in bold are significant at the 5% level. Panel A reports results for the Fung and Hsieh model. Panel B reports results for the Agarwal et al. model. EQL is an equally weighted portfolio of convertible arbitrage hedge funds from the unified database; LRG, MED and SML are equally weighted portfolios of large, medium and small size (assets under management) convertible arbitrage hedge funds from the unified CA database. HFRI is the HFR Convertible Arbitrage Index of hedge funds, CSFB is the CSFB Tremont Convertible Arbitrage Index of hedge funds, BCLY is the Barclay Group Convertible Arbitrage Index of hedge funds and CISDM is the CISDM Convertible Arbitrage Index of hedge funds. The sample period is January 1994 to September 2012.
Table 125.6: Smooth transition regression (STR) model — Unsmoothed series.

[The table has the same layout as Table 125.5: for each unsmoothed convertible arbitrage series, the estimated coefficients in the zt < c and zt > c regimes, together with the threshold c, the smoothness parameter γ and the adjusted R². Panel A (Fung and Hsieh model) reports α, βSNPRF, βSCMLC, βBAAMTSY, βBD10RET, βPTFSBD, βPTFSFX and βPTFSCOM in each regime; Panel B (Agarwal et al. model) reports α, βx and βVG in each regime.]

Note: This table reports the NLLS estimation of the logistic smooth transition regression models for each of the unsmoothed convertible arbitrage series. Returns are unsmoothed using the Getmansky et al. (2004) methodology. Coefficients in bold are significant at the 5% level. Panel A reports results for the Fung and Hsieh model. Panel B reports results for the Agarwal et al. model. The sample period is January 1994 to September 2012.
Figure 125.3: Transition function for the smooth transition regression (L-STR) models.
Notes: Panel A plots the transition function f(zt) against the transition variable zt, where zt is RMRF, the excess return on the aggregate US equity market. The transition function is defined as f(zt) = [1 + exp(−γ(zt − c))]⁻¹. Panel B plots the transition function against time. The sample period is January 1994 to September 2012.
In Table 125.6, we repeat the analysis of Table 125.5 using the Getmansky et al. (2004) specification to unsmooth hedge fund returns. According to Getmansky et al. (2004), the most likely reason for the serial correlation observed in hedge fund returns is illiquidity exposure. Hedge funds, particularly those engaged in CA, trade in securities which are not actively traded and whose market prices are not readily available. To remove the effects of serial correlation induced by illiquidity exposure, we estimate θ0, θ1, and θ2 from their model using maximum likelihood for each of the CA portfolios and then use these estimates to unsmooth fund returns. Then,
we re-estimate the L-STR model on the unsmoothed sample of CA hedge fund returns. The results are almost identical to those reported in Table 125.5.
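The unsmoothing step can be sketched as follows. This is a minimal illustration of the Getmansky et al. (2004) MA(2) smoothing model, in which the observed return is a weighted average of the current and two lagged economic returns, and of its recursive inversion. It assumes the weights θ0, θ1, θ2 are already in hand (in the chapter they are obtained by maximum likelihood); the function names and example weights below are purely illustrative.

```python
import numpy as np

def smooth_returns(true_r, theta):
    """Apply the MA(2) smoothing model: the observed return is
    R_obs_t = theta0*R_t + theta1*R_{t-1} + theta2*R_{t-2},
    with theta_j >= 0 and theta0 + theta1 + theta2 = 1.
    The first two observations are left unchanged."""
    t0, t1, t2 = theta
    obs = np.copy(true_r)
    obs[2:] = t0 * true_r[2:] + t1 * true_r[1:-1] + t2 * true_r[:-2]
    return obs

def unsmooth_returns(obs_r, theta):
    """Invert the smoothing filter recursively to recover the
    economic (unsmoothed) return series from the observed one."""
    t0, t1, t2 = theta
    r = np.copy(obs_r)
    for t in range(2, len(obs_r)):
        r[t] = (obs_r[t] - t1 * r[t - 1] - t2 * r[t - 2]) / t0
    return r
```

Applying `unsmooth_returns` to a series produced by `smooth_returns` with the same weights recovers the original series exactly, which is a convenient check on the implementation.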
125.4 Conclusions

The analysis provides evidence of a nonlinear relationship between CA hedge fund return series and risk factors. The first piece of evidence is the rejection of the hypothesis of a linear relationship. This is strengthened when the L-STR model provides a satisfactory description of the nonlinearity found in CA hedge fund returns, with superior explanatory power (higher adjusted R² values) to linear models. These results are consistent across multiple variations of the convertible arbitrage return series and risk factors. Previous research identified one risk regime for CA. The success of the L-STR model in explaining the returns provides evidence of two risk regimes, defined as a function of market return. When excess equity market returns move below the threshold level, convertible securities, hybrid fixed income/equity instruments, in aggregate become more fixed-income-like in their characteristics (as their probability of being converted to equity decreases). This is reflected in increased exposure to the fixed income risk factors (default and term structure risk) during this regime. We use the results to provide insights into the performance of CA fund managers in different regimes and show that the skill of managers is better evaluated when considered in a nonlinear framework. Prior research has documented funds following the strategy generating either significantly positive alpha or alpha insignificantly different from zero. Using the STR model, we identify two distinct regimes. There is a positive alpha regime, with higher fixed income risk, when equity returns are below a threshold level, and a negative alpha regime, with lower fixed income risk, when excess equity market returns are above a threshold level. Convertible arbitrageurs outperform a passive investment in risk factors in relatively volatile financial markets, when arbitrageurs are exposed to more risk.
Historically, the switch into the positive alpha regime coincides with several severe financial crises. While these insights into the performance characteristics are interesting in themselves, that is not the purpose of the analysis. Instead, it was to demonstrate the methodology and effectiveness of smooth transition regressions in modeling nonlinear risks. The demonstration of the L-STR model's power to improve explanation of the return series of the convertible arbitrage strategy and provide insights into the risk exposure and manager performance of
the strategy shows the potential of the methodology to improve understanding of risk and return across the range of hedge fund strategies.
Bibliography

Agarwal, V., Arisoy, Y.E. and Naik, N.Y. (2017a). Volatility of Aggregate Volatility and Hedge Fund Returns. Journal of Financial Economics 125, 491–510.
Agarwal, V., Bakshi, G. and Huij, J. (2009). Do Higher-Moment Equity Risks Explain Hedge Fund Returns? Centre for Financial Research Working Paper 10–07.
Agarwal, V., Fung, W.H., Loon, Y.C. and Naik, N.Y. (2011). Risk and Return in Convertible Arbitrage: Evidence from the Convertible Bond Market. Journal of Empirical Finance 18, 175–194.
Agarwal, V. and Naik, N.Y. (2000). Multi-Period Performance Persistence Analysis of Hedge Funds. Journal of Financial and Quantitative Analysis 35, 327–342.
Agarwal, V. and Naik, N.Y. (2004). Risks and Portfolio Decisions Involving Hedge Funds. Review of Financial Studies 17, 63–98.
Agarwal, V., Ruenzi, S. and Weigert, F. (2017b). Tail Risk in Hedge Funds: A Unique View from Portfolio Holdings. Journal of Financial Economics 125, 610–636.
Allegret, J.-P., Raymond, H. and Rharrabti, H. (2017). The Impact of the European Sovereign Debt Crisis on Banks Stocks: Some Evidence of Shift Contagion in Europe. Journal of Banking & Finance 74, 24–37.
Amin, G.S. and Kat, H.M. (2003). Hedge Fund Performance 1990–2000: Do the “Money Machines” Really Add Value? Journal of Financial and Quantitative Analysis 38, 251–274.
Andreou, P.C., Louca, C. and Savva, C.S. (2016). Short-Horizon Event Study Estimation with a STAR Model and Real Contaminated Events. Review of Quantitative Finance and Accounting 47, 673–697.
Aslanidis, N. and Christiansen, C. (2012). Smooth Transition Patterns in the Realized Stock–Bond Correlation. Journal of Empirical Finance 19, 454–464.
Bali, T.G., Brown, S.J. and Caglayan, M.O. (2014). Macroeconomic Risk and Hedge Fund Returns. Journal of Financial Economics 114, 1–19.
Bali, T.G., Gokcan, S. and Liang, B. (2007). Value at Risk and the Cross-Section of Hedge Fund Returns. Journal of Banking & Finance 31, 1135–1166.
Barberis, N. and Thaler, R. (2003).
A Survey of Behavioral Finance. Handbook of the Economics of Finance 1, 1053–1128.
Bowen, D.A. and Hutchinson, M.C. (2016). Pairs Trading in the UK Equity Market: Risk and Return. European Journal of Finance 22, 1363–1387.
Bradley, M.D. and Jansen, D.W. (2004). Forecasting with a Nonlinear Dynamic Model of Stock Returns and Industrial Production. International Journal of Forecasting 20, 321–342.
Bredin, D. and Hyde, S. (2008). Regime Change and the Role of International Markets on the Stock Returns of Small Open Economies. European Financial Management 14, 315–346.
Breusch, T.S. and Pagan, A.R. (1980). The Lagrange Multiplier Test and Its Applications to Model Specification in Econometrics. Review of Economic Studies 47, 239–253.
Bruder, B., Kostyuchyk, N. and Roncalli, T. (2016). Risk Parity Portfolios with Skewness Risk: An Application to Factor Investing and Alternative Risk Premia. SSRN.
Bruneau, C. and Cherfouh, S. (2018). Modelling the Asymmetric Behaviour of Property Yields: Evidence from the UK Office Market. Journal of Property Research 35, 1–27.
Buraschi, A., Kosowski, R. and Trojani, F. (2013). When There is No Place to Hide: Correlation Risk and the Cross-Section of Hedge Fund Returns. Review of Financial Studies 27, 581–616.
Capocci, D. and Hübner, G. (2004). Analysis of Hedge Fund Performance. Journal of Empirical Finance 11, 55–89.
Chabi-Yo, F., Ruenzi, S. and Weigert, F. (2018). Crash Sensitivity and the Cross Section of Expected Stock Returns. Journal of Financial and Quantitative Analysis 53, 1059–1100.
Chan, A.W. and Chen, N.-F. (2007). Convertible Bond Underpricing: Renegotiable Covenants, Seasoning, and Convergence. Management Science 53, 1793–1814.
Chan, K.-S. (1993). Consistency and Limiting Distribution of the Least Squares Estimator of a Threshold Autoregressive Model. Annals of Statistics 21, 520–533.
Chan, K.S. and Tong, H. (1986). On Estimating Thresholds in Autoregressive Models. Journal of Time Series Analysis 7, 179–190.
Chen, A., Hung, C.C. and Wu, C.-S. (2002). The Underpricing and Excess Returns of Initial Public Offerings in Taiwan Based on Noisy Trading: A Stochastic Frontier Model. Review of Quantitative Finance and Accounting 18, 139–159.
Chin, C.-L., Lin, T.T. and Lee, C.-C. (2005). Convertible Bonds Issuance Terms, Management Forecasts, and Earnings Management: Evidence from Taiwan Market. Review of Pacific Basin Financial Markets and Policies 8, 543–571.
Choi, D., Getmansky, M., Henderson, B. and Tookes, H. (2010). Convertible Bond Arbitrageurs as Suppliers of Capital. Review of Financial Studies 23, 2492–2522.
Coën, A. and Hübner, G. (2009). Risk and Performance Estimation in Hedge Funds Revisited: Evidence from Errors in Variables. Journal of Empirical Finance 16, 112–125.
Cooper, M.J., Gutierrez, R.C. and Hameed, A. (2004). Market States and Momentum. Journal of Finance 59, 1345–1365.
Coudert, V., Couharde, C. and Mignon, V. (2011). Exchange Rate Volatility Across Financial Crises. Journal of Banking & Finance 35, 3010–3018.
Dahlhaus, T. (2017). Conventional Monetary Policy Transmission During Financial Crises: An Empirical Analysis. Journal of Applied Econometrics 32, 401–421.
Daniel, K. and Moskowitz, T.J. (2016). Momentum Crashes. Journal of Financial Economics 122, 221–247.
Dimovski, W. and Brooks, R. (2004). Initial Public Offerings in Australia 1994 to 1999, Recent Evidence of Underpricing and Underperformance. Review of Quantitative Finance and Accounting 22, 179–198.
Duffie, D. (2010). Presidential Address: Asset Price Dynamics with Slow-Moving Capital. Journal of Finance 65, 1237–1267.
Duffie, D. and Strulovici, B. (2012). Capital Mobility and Asset Pricing. Econometrica 80, 2469–2509.
Enders, W. (2008). Applied Econometric Time Series. John Wiley & Sons.
Fajgelbaum, P.D., Schaal, E. and Taschereau-Dumouchel, M. (2017). Uncertainty Traps. Quarterly Journal of Economics 132, 1641–1692.
Fung, W. and Hsieh, D.A. (1997). Empirical Characteristics of Dynamic Trading Strategies: The Case of Hedge Funds. Review of Financial Studies 10, 275–302.
Fung, W. and Hsieh, D.A. (2001). The Risk in Hedge Fund Strategies: Theory and Evidence from Trend Followers. Review of Financial Studies 14, 313–341.
Fung, W. and Hsieh, D.A. (2002). Hedge-Fund Benchmarks: Information Content and Biases. Financial Analysts Journal 58, 22–34.
Fung, W. and Hsieh, D.A. (2004). Hedge Fund Benchmarks: A Risk-Based Approach. Financial Analysts Journal 60, 65–80.
Gallagher, L.A., Hutchinson, M.C. and O’Brien, J. (2018). Does Convertible Arbitrage Risk Exposure Vary Through Time? Review of Pacific Basin Financial Markets and Policies 21(4).
Getmansky, M., Lo, A.W. and Makarov, I. (2004). An Econometric Model of Serial Correlation and Illiquidity in Hedge Fund Returns. Journal of Financial Economics 74, 529–609.
Granger, C.W. and Teräsvirta, T. (1993). Modelling Non-linear Economic Relationships. Oxford University Press, Oxford.
Hamilton, J.D. (1989). A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle. Econometrica 57, 357–384.
Holmes, M.J. and Maghrebi, N. (2004). Asian Real Interest Rates, Nonlinear Dynamics, and International Parity. International Review of Economics & Finance 13, 387–405.
Hutchinson, M.C. and Gallagher, L.A. (2010). Convertible Bond Arbitrage: Risk and Return. Journal of Business Finance & Accounting 37, 206–241.
Hutchinson, M.C. and O’Brien, J.J. (2018). Time Series Momentum and Macroeconomic Risk. SSRN.
Hutchinson, M.C. and O’Brien, J.J. (2014). Is This Time Different? Trend-Following and Financial Crises. Journal of Alternative Investments 17, 82–102.
Jensen, M.C. (1968). The Performance of Mutual Funds in the Period 1945–1964. Journal of Finance 23, 389–416.
Jiang, H. and Kelly, B. (2012). Tail Risk and Hedge Fund Returns. Chicago Booth Research Paper.
Kat, H. and Lu, S. (2002). An Excursion into the Statistical Properties of Hedge Fund Returns. CASS Business School Research Paper.
Kat, H.M. and Brooks, C. (2001). The Statistical Properties of Hedge Fund Index Returns and their Implications for Investors. CASS Business School Research Paper.
Kat, H.M. and Miffre, J. (2008). The Impact of Non-normality Risks and Tactical Trading on Hedge Fund Alphas. Journal of Alternative Investments 10, 8–22.
Kazemi, H. and Schneeweis, T. (2003).
Conditional Performance of Hedge Funds. Working Paper, University of Massachusetts.
Kritzman, M., Page, S. and Turkington, D. (2012). Regime Shifts: Implications for Dynamic Strategies. Financial Analysts Journal 68, 22–39.
Kurlat, P. (2018). Liquidity as Social Expertise. Journal of Finance 73, 619–656.
Liang, B. (1999). On the Performance of Hedge Funds. Financial Analysts Journal 55(4), 72–85.
Luukkonen, R., Saikkonen, P. and Teräsvirta, T. (1988). Testing Linearity Against Smooth Transition Autoregressive Models. Biometrika 75, 491–499.
McMillan, D.G. (2001). Nonlinear Predictability of Stock Market Returns: Evidence from Nonparametric and Threshold Models. International Review of Economics & Finance 10, 353–368.
Merton, R.C. (1987). A Simple Model of Capital Market Equilibrium with Incomplete Information. Journal of Finance 42, 483–510.
Mitchell, M., Pedersen, L.H. and Pulvino, T. (2007). Slow Moving Capital. National Bureau of Economic Research.
Mitchell, M. and Pulvino, T. (2001). Characteristics of Risk and Return in Risk Arbitrage. Journal of Finance 56, 2135–2175.
Nguyen, H., Dimovski, W. and Brooks, R. (2010). Underpricing, Risk Management, Hot Issue and Crowding Out Effects: Evidence from the Australian Resources Sector Initial Public Offerings. Review of Pacific Basin Financial Markets and Policies 13, 333–361.
Nystrup, P., Madsen, H. and Lindström, E. (2018). Dynamic Portfolio Optimization Across Hidden Market Regimes. Quantitative Finance 18, 83–95.
Öcal, N. and Osborn, D.R. (2000). Business Cycle Non-linearities in UK Consumption and Production. Journal of Applied Econometrics 15, 27–43.
Potter, S. (1999). Nonlinear Time Series Modelling: An Introduction. Journal of Economic Surveys 13, 505–528.
Ritter, J.R. and Welch, I. (2002). A Review of IPO Activity, Pricing, and Allocations. Journal of Finance 57, 1795–1828.
Saikkonen, P. and Luukkonen, R. (1988). Lagrange Multiplier Tests for Testing Nonlinearities in Time Series Models. Scandinavian Journal of Statistics 15, 55–68.
Sarantis, N. (1999). Modeling Non-linearities in Real Effective Exchange Rates. Journal of International Money and Finance 18, 27–45.
Shadab, H.B. (2013). Hedge Fund Governance. Stanford Journal of Business and Finance 19, 141.
Skalin, J. and Teräsvirta, T. (1999). Another Look at Swedish Business Cycles, 1861–1988. Journal of Applied Econometrics 14, 359–378.
Teräsvirta, T. (1994). Specification, Estimation, and Evaluation of Smooth Transition Autoregressive Models. Journal of the American Statistical Association 89, 208–218.
Teräsvirta, T. and Anderson, H.M. (1992). Characterizing Nonlinearities in Business Cycles Using Smooth Transition Autoregressive Models. Journal of Applied Econometrics 7, 119–136.
Tong, H. and Lim, K.S. (1980). Threshold Autoregression, Limit Cycles and Cyclical Data. Journal of the Royal Statistical Society, Series B (Methodological) 42, 245–292.
Van Dijk, D. and Franses, P.H. (1999). Modeling Multiple Regimes in the Business Cycle. Macroeconomic Dynamics 3, 311–340.
Van Dijk, D., Teräsvirta, T. and Franses, P.H.
(2002). Smooth Transition Autoregressive Models — A Survey of Recent Developments. Econometric Reviews 21, 1–47.
Welch, I. (1989). Seasoned Offerings, Imitation Costs, and the Underpricing of Initial Public Offerings. Journal of Finance 44, 421–449.
Zheng, S.X., Ogden, J.P. and Jen, F.C. (2005). Pursuing Value Through Liquidity in IPOs: Underpricing, Share Retention, Lockup, and Trading Volume Relationships. Review of Quantitative Finance and Accounting 25, 293–312.
Appendix 125A Smooth Transition Regression

125A.1 Development of smooth transition regression

This appendix provides a more general discussion of smooth transition regression than found in Section 125.2. In addition to a more detailed description of the methodology, the development of STR in the context of other state-dependent methods and suggested extensions to STR are discussed. Smooth transition regression is a well-established methodology, which is a key advantage of adopting it for risk modeling; consequently, it is accessible in a number
of texts on nonlinear time series including Granger and Terasvirta (1993) and, more recently, Enders (2008). A standard linear regression model can be defined as yt = β′xt + εt,
(125A.1)
where yt is a vector of t observations of the dependent variable, xt = (1, x1,t, . . . , xm,t), x1,t to xm,t are t observations of m independent variables and β = (β0, . . . , βm)′ defines the linear relationship between the dependent and explanatory variables. In the special case where the values of xt are lagged values of yt, so xt = (1, yt−1, . . . , yt−m), the model becomes autoregressive of order m (AR(m)). Equation (125A.1) becomes a nonlinear model with the addition of a time-varying term: yt = β1′xt (1 − f(zt)) + β2′xt f(zt) + εt.
(125A.2)
The class of nonlinear relationship defined depends on the specification of f(·). Three different realisations of f(·) are common in econometrics: Markov switching, threshold models and smooth transition regressions (Potter, 1999). The first of the three is the Markov switching model. See Hamilton (1989) for the development and early use of this technique in economic modeling. In a Markov model, the relationship becomes yt = β1′xt (1 − S) + β2′xt S + εt,
(125A.3)
where S takes the values 1 or 0, depending on the state. The relationship can be expressed as
yt = β′xt + εt,  S = 0,
yt = ϕ′xt + εt,  S = 1.   (125A.4)
The Markov switching model assumes exogenous regime switches, that is, no knowledge of the mechanism underlying the change in state is required. Transition between states is based on a fixed set of probabilities defined by the transition matrix:
Γ = [p00  p01; p10  p11],   (125A.5)
where pij is the probability of moving from state i to state j. These models are useful for systems where there are large shocks that push a system between states, e.g., low volatility to extremely high volatility.
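As a concrete illustration of equations (125A.3) through (125A.5), the following sketch draws a regime path from a transition matrix and generates a series whose mean depends on the regime. The transition probabilities, regime means and function name are purely illustrative, not values from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative transition matrix: entry (i, j) is p_ij, the probability
# of moving from state i to state j; each row sums to one.
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])

def simulate_markov_states(P, n, s0=0, rng=rng):
    """Draw a path S_1, ..., S_n of the two-state Markov chain."""
    states = np.empty(n, dtype=int)
    s = s0
    for t in range(n):
        s = rng.choice(len(P), p=P[s])   # next state drawn from row P[s]
        states[t] = s
    return states

# Regime-dependent mean, as in equation (125A.4) with a constant regressor.
S = simulate_markov_states(P, 240)       # 20 years of monthly regimes
y = np.where(S == 0, 0.01, -0.02) + rng.normal(scale=0.02, size=240)
```

With rows this persistent, the simulated path spends long stretches in each state, which is the behavior the text describes for calm versus high-volatility markets.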
To include an endogenous trigger for regime change we must move to threshold models, introduced by Tong and Lim (1980). These models were introduced in the context of autoregressive series and are consequently known as threshold autoregressive (TAR) models. The transition function is defined as
f(zt) = 1, zt > c,
f(zt) = 0, zt < c,   (125A.6)
where zt is the transition variable and c is the critical (threshold) value of zt that defines the change of state. Using this, the model becomes
yt = β′xt + εt,  zt > c,
yt = ϕ′xt + εt,  zt < c.   (125A.7)
In much of the work on this model it was assumed that zt was a lagged value of yt, but this is not a necessary part of the model. The key advance of TAR over Markov models is the specification of the transition variable that triggers the transition between states. It is worth noting that it is only necessary to specify the transition variable; the critical value can be derived from the data (Chan, 1993). The TAR model is still limited to instantaneous changes of state. In order to allow a gradual move between states, f(zt) must change continuously as zt changes. There are a number of definitions of f(zt) that meet the definition of smooth transition; two are particularly useful and well understood, the logistic and exponential transition functions. These transition functions generate logistic (L-STR) and exponential (E-STR) smooth transition regression models, respectively. The logistic transition function is defined as
f(zt) = [1 + exp(−γ(zt − c))]⁻¹,  γ > 0,   (125A.8)
where γ is the smoothness parameter (i.e., the slope of the transition function) and c is the threshold. Choosing this transition function yields the L-STR model. In the limit, as γ → 0, the L-STR model becomes linear, as the value of f(zt) is constant; as γ → ∞, the transition becomes instantaneous and the model approaches the two-regime TAR model. For intermediate values of γ, the speed of transition depends upon the value of zt. As zt → −∞, f(zt) → 0 and the behavior of yt is given by yt = β′xt + εt. As zt → +∞, f(zt) → 1 and the behavior of yt is given by yt = (β + ϕ)′xt + εt. The alternative exponential transition function is
f(zt) = 1 − exp(−γ(zt − c)²),  γ > 0.   (125A.9)
Again γ is the smoothness parameter while the function is symmetric around zt = c. This function produces the E-STR model. Like before as γ → 0
or γ → ∞, the E-STR model becomes linear as f(zt) becomes constant. Otherwise, the model displays nonlinear behavior. Unlike the L-STR model, E-STR is symmetric: as (zt − c) → ±∞, f(zt) → 1 and the behavior of yt is given by yt = (β + ϕ)′xt + εt. As zt → c, f(zt) → 0 and the behavior of yt is given by yt = β′xt + εt. Both the logistic and exponential transition functions are bounded by 0 and 1. Before moving on, it is useful to note that the specification for the STR model in equation (125A.2) can be rewritten as
yt = β1′xt + (β2 − β1)′xt f(zt) + εt.   (125A.10)
Setting β = β1 and ϕ = β2 − β1 generates the form used in the analysis in the main text:
yt = β′xt + ϕ′xt f(zt) + εt.   (125A.11)
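A direct implementation of the two transition functions and the STR form in equation (125A.11) might look as follows. This is a sketch; the function names are our own, not from the chapter.

```python
import numpy as np

def f_logistic(z, gamma, c):
    """Logistic transition function, equation (125A.8):
    increases monotonically from 0 to 1 in z, equals 0.5 at z = c."""
    return 1.0 / (1.0 + np.exp(-gamma * (z - c)))

def f_exponential(z, gamma, c):
    """Exponential transition function, equation (125A.9):
    symmetric around c, equals 0 at z = c and tends to 1 far from c."""
    return 1.0 - np.exp(-gamma * (z - c) ** 2)

def str_fitted(X, beta, phi, z, gamma, c, f=f_logistic):
    """Fitted values of the STR model in equation (125A.11):
    y = X beta + f(z) * (X phi)."""
    return X @ beta + f(z, gamma, c) * (X @ phi)
```

At the threshold, `f_logistic` equals 0.5 and `f_exponential` equals 0; for large γ the logistic function approximates the TAR step function in equation (125A.6).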
While this discussion has focused on two-state STR models, the method can easily be extended to combine multiple regimes (Van Dijk et al., 2002). In discussing multiple states, it is convenient to define the transition function as dependent on the transition variable, smoothness factor and critical value. It is denoted as f(zi,t, γj, ck), where the subscripts i, j and k index multiple transition variables, smoothness factors and critical values, respectively. It is possible to categorize the extended models into two groups. In the first case, the transition functions are functions of the same transition variable, with different smoothness factors and threshold values. In the case of a three-state model (two critical values) this becomes
yt = β′xt + ϕ1′xt f(zt, γ1, c1) + ϕ2′xt f(zt, γ2, c2) + εt.   (125A.12)
The alternative is multiple transition variables. Van Dijk and Franses (1999) develop the multiple regime STR (MRSTR) model for this case. Their model generates 2^m regimes for m transition variables. In the case of two transition variables the model can be simplified to
yt = β′xt + ϕ1′xt f(z1,t, γ1, c1) + ϕ2′xt f(z2,t, γ2, c2) + ϕ3′xt f(z1,t, γ1, c1) f(z2,t, γ2, c2) + εt.   (125A.13)
125A.2 Specification

Tests for the E-STR and L-STR were presented in Saikkonen and Luukkonen (1988) and Luukkonen et al. (1988), respectively. Tests of linearity are based on Lagrange Multiplier (LM) tests, first proposed to test econometric models by Breusch and Pagan (1980). The LM tests are useful as they require only estimation of the null (linear) model. Both methods rely on a two-step
process that tests the residuals of a regression of the linear model against a Taylor expansion of the transition function. In the case of the E-STR model (Luukkonen et al., 1988) a quadratic function is used:
ut = β0′xt + β1′xt zt + β2′xt zt² + εt.   (125A.14)
In the case of the L-STR model the series is expanded to the third power (equation (125A.17)). These two tests do not allow selection between the L-STR and E-STR models. Teräsvirta (1994) combines the two tests to produce a three-step procedure that tests the null hypothesis of linearity and, assuming linearity is rejected, then selects between E-STR and L-STR models. Granger and Terasvirta (1993) provide an accessible description of the procedure in their textbook and we follow this methodology closely. The first step is to estimate the linear model:
yt = β′xt + ut.   (125A.15)
The residuals of this regression are calculated:
ût = yt − β̂′xt.   (125A.16)
The auxiliary regression can then be estimated:
ut = β0′xt + β1′xt zt + β2′xt zt² + β3′xt zt³ + εt.   (125A.17)
The LM statistic is
LM = T R².   (125A.18)
T is the sample size (number of time periods) and R² is the coefficient of multiple determination from the estimation of the auxiliary regression (equation (125A.17)). The LM statistic has a χ²-distribution. Teräsvirta (1994) suggests that using an F-test may provide a more accurate method of testing the hypothesis. This tests the null hypothesis:
H0: β1 = β2 = β3 = 0.   (125A.19)
If we cannot reject this hypothesis, we can conclude that there is no nonlinear relationship defined in terms of the transition variable, zt. It remains possible that another transition variable could result in rejecting linearity. More detailed analysis of the procedure is provided in Van Dijk et al. (2002). Once the hypothesis of a linear relationship has been rejected, the next step is the selection of the appropriate nonlinear model, L-STR or E-STR.
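The two-step linearity test just described can be sketched on simulated data as follows. Everything here is illustrative: the data are generated with a genuine nonlinearity in the candidate transition variable, so the F-test should reject H0.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated example: y depends nonlinearly on x through the
# candidate transition variable z (all coefficients illustrative).
n = 240
x = rng.normal(size=n)
z = rng.normal(size=n)
y = 0.5 + 0.8 * x + 1.0 * x * z + rng.normal(scale=0.25, size=n)

# Step 1: estimate the linear model (125A.15) and form residuals (125A.16).
X0 = np.column_stack([np.ones(n), x])
b0, *_ = np.linalg.lstsq(X0, y, rcond=None)
u = y - X0 @ b0

# Step 2: auxiliary regression (125A.17) of u on x, x*z, x*z^2, x*z^3.
Z = z[:, None]
XA = np.column_stack([X0, X0 * Z, X0 * Z**2, X0 * Z**3])
bA, *_ = np.linalg.lstsq(XA, u, rcond=None)
uA = u - XA @ bA

# F-test of linearity, H0: beta1 = beta2 = beta3 = 0 (equation (125A.19)).
q = XA.shape[1] - X0.shape[1]          # number of restrictions
df = n - XA.shape[1]                   # residual degrees of freedom
F = ((u @ u - uA @ uA) / q) / ((uA @ uA) / df)
p_value = stats.f.sf(F, q, df)         # a small p-value rejects linearity
```

The asymptotically equivalent LM form is `n * R²` of the auxiliary regression, compared against a χ² distribution with `q` degrees of freedom.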
Handbook of Financial Econometrics,. . . (Vol. 4)
L. A. Gallagher, M. C. Hutchinson & J. O’Brien
The tests for this are based on the results of the auxiliary regression (equation 125A.17) and are carried out through a series of nested F -tests: H3 : β3 = 0,
(125A.20)
H2 : β2 = 0|β3 = 0,
(125A.21)
H1 : β1 = 0|β2 = β3 = 0.
(125A.22)
From Saikkonen and Luukkonen (1988) we see that the E-STR expansion, equation (125A.14), does not have a cubic term; therefore rejecting H3 also rejects the E-STR model, and rejecting H3 is consistent with an L-STR model. Accepting H3 and rejecting H2 indicates a function with a symmetric, quadratic form, consistent with the E-STR transition function. Finally, accepting both H3 and H2 while rejecting H1 indicates an L-STR model. It should be noted that the tests of H1 and H2 provide only indicative answers. Granger and Teräsvirta (1993) argue that strict application of this sequence of tests may lead to incorrect conclusions where H3 is accepted and both H2 and H1 are rejected. They suggest an alternative approach to the selection problem in this case: both hypotheses, H1 and H2, are tested, and if both are rejected, the form of the transition function is selected based on the stronger of the two results, that is, the test with the lowest p-value. A similar approach can be used to select from different candidates for the transition variable, zt . The linear model (equation 125A.15) and the auxiliary regression (equation 125A.17) are run for each potential variable. The hypothesis H3 is tested for all candidate variables, and the variable generating the strongest test, i.e., the smallest (significant) p-value, is selected.

125A.3 Model estimation

Estimating an STR model is a standard optimization problem and most statistical software packages provide the necessary functionality. The algorithms are usually based on nonlinear least squares, but maximum likelihood methods are sometimes used (Granger and Teräsvirta, 1993). As further discussion of the underlying algorithms is outside the scope of this work, we finish with a discussion of a number of practical issues that can arise during estimation, highlighted in Teräsvirta (1994). These occur when the optimization converges at a local minimum, struggles to estimate a parameter, or fails to converge.
As with most optimization problems, the algorithm can converge at a local rather than global solution. It is possible to test for this by repeating
the optimization with different initial values and confirming that the same solution is reached. The optimizer may be slow to converge when γ is large: very large changes in the value of γ have little effect on the value of the transition function, and a large number of observations in the region of c would be needed to estimate γ accurately. Teräsvirta (1994) suggests rescaling the parameters (scaling γ down and c up) to solve this problem. Finally, under certain circumstances, the optimizer may fail to converge due to the complexity of the problem. In this event, Teräsvirta (1994) suggests simplifying the problem by solving it in two stages. In the first stage, the value of γ is fixed (setting γ = 1 is recommended) and the other regression coefficients are estimated. These estimated coefficients are then fixed in a second regression and the value of γ is estimated.
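The two-stage simplification can be sketched as follows, assuming the standard logistic (L-STR) transition G(z; γ, c) = 1/(1 + exp(−γ(z − c))). The simulated data, parameter values, and function names are illustrative only, not taken from the chapter.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)

# Simulated L-STR data: y_t = b0*x_t + b1*x_t*G(z_t) + noise.
T = 400
x = rng.normal(size=T)
z = rng.normal(size=T)
G_true = 1.0 / (1.0 + np.exp(-4.0 * (z - 0.2)))
y = 0.5 * x + 1.5 * x * G_true + rng.normal(scale=0.3, size=T)

def transition(z, gamma, c):
    """Logistic (L-STR) transition function."""
    return 1.0 / (1.0 + np.exp(-gamma * (z - c)))

# Stage 1: fix gamma = 1 (Terasvirta's suggestion) and estimate the
# remaining parameters (b0, b1, c) by nonlinear least squares.
def resid_stage1(theta):
    b0, b1, c = theta
    return y - (b0 * x + b1 * x * transition(z, 1.0, c))

stage1 = least_squares(resid_stage1, x0=[0.0, 1.0, 0.0])
b0_hat, b1_hat, c_hat = stage1.x

# Stage 2: hold the stage-1 coefficients fixed and estimate gamma alone.
def resid_stage2(theta):
    (gamma,) = theta
    return y - (b0_hat * x + b1_hat * x * transition(z, gamma, c_hat))

stage2 = least_squares(resid_stage2, x0=[1.0], bounds=(0.01, 100.0))
gamma_hat = stage2.x[0]

print(f"b0={b0_hat:.2f}, b1={b1_hat:.2f}, c={c_hat:.2f}, gamma={gamma_hat:.2f}")
```

In practice the two stages can be iterated until the estimates stabilize; fixing γ in stage 1 removes the parameter that is hardest for the optimizer to pin down.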
Chapter 126

Application of Discriminant Analysis, Factor Analysis, Logistic Regression, and KMV-Merton Model in Credit Risk Analysis∗

Cheng Few Lee and Hai-Chin Yu

Contents
126.1 Introduction
126.2 Credit Analysis
126.3 Bankruptcy and Financial Distress Analysis
126.4 Applications of Factor Analysis to Select Useful Financial Ratios
126.5 Bond Rating Forecasting
126.6 Ohlson's and Shumway's Methods for Estimating Default Probability
    126.6.1 MIDAS Logit Model
126.7 Merton Distance Model and KMV-Merton Model
    126.7.1 Merton distance model
    126.7.2 A Naïve alternative
    126.7.3 KMV-Merton model
126.8 Capital Structure Determination and Naïve Default Probability
126.9 Empirical Comparison
126.10 Summary and Concluding Remarks
Bibliography
Appendix 126A Logistic Model, Probit Model and MIDAS Logit Model
Appendix 126B SAS Code for Hazard Model in Bankruptcy Forecasting
Appendix 126C Application of Merton Distance to Default Model

Cheng Few Lee
Rutgers University
e-mail: cfl[email protected]

Hai-Chin Yu
Chung Yuan University
e-mail: [email protected]

∗This paper draws upon Chapter 4, "Application of discriminant analysis and factor analysis in financial management," in Financial Analysis, Planning and Forecasting: Theory and Application, 3rd edition, World Scientific, Singapore.
Abstract

The main purposes of this paper are to review and integrate the applications of discriminant analysis, factor analysis, and logistic regression in credit risk management. First, we discuss how discriminant analysis can be used for credit rating, such as calculating a financial z-score to determine the chance of bankruptcy of a firm. In addition, we also discuss how discriminant analysis can be used to classify banks into problem banks and non-problem banks. Secondly, we discuss how factor analysis can be combined with discriminant analysis to perform bond rating forecasting. Thirdly, we show how logistic and generalized regression techniques can be used to calculate the default risk probability. Fourthly, we discuss the KMV-Merton model and the Merton distance model for calculating default probability. Finally, we compare all techniques discussed in previous sections, draw conclusions, and give suggestions for future research. We propose using the CEV option model to improve the original Merton DD model. In addition, we also propose a modified naïve model to improve Bharath and Shumway's (2008) naïve model.

Keywords: Discriminant analysis • Factor analysis • Logistic regression • KMV-Merton model • Probit model • MIDAS logit model • Hazard model • Merton distance model • Financial z-score • Default probability.
126.1 Introduction

Discriminant analysis, factor analysis, logistic regression, and the KMV-Merton model have been used to perform credit analysis for almost four decades. The main purposes of this paper are to review and integrate the applications of discriminant analysis, factor analysis, and logistic and generalized regressions in credit risk management. First, we will discuss how discriminant analysis can be used for credit rating, such as calculating a financial z-score to determine the chance of bankruptcy of a firm. In addition, we also discuss how discriminant analysis can be used to classify banks into problem banks and non-problem banks. Secondly, we will discuss how factor analysis can be combined with discriminant analysis to perform bond
rating forecasting. Thirdly, we will show how both static and dynamic logistic regression techniques can be used to calculate the default risk probability. Fourthly, we will discuss the KMV-Merton model and the Merton distance model for calculating default probability. Finally, we will compare all techniques discussed in this paper, draw conclusions, and give suggestions for future research.

Section 126.2 discusses how discriminant analysis can be used in loan credit analysis. Section 126.3 applies discriminant analysis to bankruptcy and financial distress analysis. Section 126.4 demonstrates how the factor-analysis technique can be used to select useful financial ratios. Section 126.5 discusses how k-group discriminant-analysis methods can be used to forecast bond ratings; a new method developed by Kao and Lee (2012) is also briefly discussed. Section 126.6 discusses how alternative logistic regressions can be used to calculate the default probability. Section 126.7 discusses the Merton distance model and the KMV-Merton model. In Section 126.8 we discuss how the capital structure determination model can be used to improve Bharath and Shumway's (2008) naïve default probability. Section 126.9 discusses the performance of alternative models. Finally, Section 126.10 summarizes the results of this chapter. Appendix 126A discusses the logistic model, probit model, and MIDAS logit model; Appendix 126B presents SAS code for the hazard model in bankruptcy forecasting; and Appendix 126C shows an application of the Merton distance to default model.

126.2 Credit Analysis

Two-group multiple discriminant analysis (MDA) can be used to determine whether a customer's credit should be authorized or not, or to determine the financial soundness of an industrial firm, a bank, or an insurance company. In determining the trade credit policy of a firm, Mehta (1974) and Van Horne (2001) proposed a two-group discriminant-analysis model to distinguish the "good accounts" from the "bad accounts".
A linear discriminant function can be defined as Yi = AX1i + BX2i ,
(126.1)
where Yi is the index value for the ith account; X1i is the ith firm’s quick ratio; X2i is the ith firm’s total sales/inventory ratio; and A and B are the parameters or weights to be determined. For purposes of formulating the original model, we extend open-book credit to all new credit applicants for a sample period, recording for each account the quick ratio and the total sales/inventory ratio and whether or not the account defaults in payment after a specified length of time. If the account defaults, it is classified as a bad account and the index is assigned the
value of zero. If the account pays on time, it is classified as a good account and the index is assigned the value of one. With this information, we are able to apply a linear discriminant analysis with two independent variables. Based upon the sample data of X1 and X2 , the coefficients A and B can be calculated by the following procedure. Two equations used in solving for A and B of equation (126.1) are defined as follows: S11 A + S12 B = D1 ,
(126.2)
S12 A + S22 B = D2 .
(126.3)
Using Cramer's rule, we obtain

A = (S22 D1 − S12 D2) / (S11 S22 − S12^2),

(126.4a)

B = (S11 D2 − S12 D1) / (S11 S22 − S12^2),

(126.4b)
where

S11 = Variance of X1;
S22 = Variance of X2;
S12 = Covariance between X1 and X2;
D1 = Difference between the average of X1 for good accounts and the average of X1 for bad accounts; and
D2 = Difference between the average of X2 for good accounts and the average of X2 for bad accounts.

Based upon the estimated parameters, A and B, we need to determine the minimum cutoff value of the discriminant function. The aim here is to refuse credit to accounts with a value of Y below the cutoff value and to extend credit to accounts with a Y value above the cutoff. Theoretically, we wish to find the discriminant-function value denoted by Y ∗ in Figure 126.1, where Y (B) and Y (G) are the average discriminant-function values for bad and good accounts, respectively. Following Van Horne (2001), we start by calculating Yi for twelve accounts, as shown in Table 126.1, in ascending order of magnitude. We see that there is an area of overlap for accounts 6, 12, 11, and 4, as shown graphically in Figure 126.1. We know that the cutoff value must lie between 1.64 and 1.96. For simplicity, we may want to use the midpoint, 1.80, as our cutoff value. In Figure 126.2, we are able to draw a discriminant boundary
Table 126.1: Status and index values of the accounts.

Account number   Account status   Yi
7                Bad              0.81
10               Bad              0.89
2                Bad              1.30
3                Bad              1.45
6                Bad              1.64
12               Good             1.77
11               Bad              1.83
4                Good             1.96
1                Good             2.25
8                Good             2.50
5                Good             2.61
9                Good             2.80

Figure 126.1: Universes of good and bad accounts (probability of occurrence plotted against the discriminant function value, showing Y(B), Y*, and Y(G)).

Figure 126.2: Discriminant analysis of accounts receivable (quick ratio plotted against percent net worth, with the discriminant boundary line).
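A minimal sketch of equations (126.2) through (126.4b) and the cutoff rule follows. The account data are made up for illustration and are not Van Horne's original figures; the midpoint-of-group-means cutoff is likewise a simplification of the overlap-region choice described above.

```python
import numpy as np

# Hypothetical accounts: quick ratio X1, sales/inventory X2, good=1/bad=0.
x1 = np.array([0.6, 0.7, 0.9, 1.0, 1.1, 1.3, 1.4, 1.6, 1.8, 2.0])
x2 = np.array([2.0, 2.5, 2.2, 3.0, 2.8, 3.5, 3.2, 4.0, 4.2, 4.8])
good = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])

# Sample moments entering equations (126.2) and (126.3).
S11 = np.var(x1)                        # variance of X1
S22 = np.var(x2)                        # variance of X2
S12 = np.cov(x1, x2, bias=True)[0, 1]   # covariance of X1 and X2
D1 = x1[good == 1].mean() - x1[good == 0].mean()
D2 = x2[good == 1].mean() - x2[good == 0].mean()

# Cramer's rule, equations (126.4a) and (126.4b).
det = S11 * S22 - S12**2
A = (S22 * D1 - S12 * D2) / det
B = (S11 * D2 - S12 * D1) / det

# Index values and a cutoff halfway between the group averages.
Y = A * x1 + B * x2
cutoff = 0.5 * (Y[good == 1].mean() + Y[good == 0].mean())
accept = Y > cutoff
print(f"A={A:.3f}, B={B:.3f}, cutoff={cutoff:.3f}")
print("accept credit:", accept.astype(int))
```

Accounts with index values near the cutoff would, as the text notes, be candidates for further investigation rather than automatic acceptance or rejection.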
that most accurately classifies accounts into good and bad categories. Note, however, that two of the accounts, 11 and 12, are misclassified, given the cutoff value. Account 11 is classified by the graph as a good account when, in fact, we know it to be a bad account, while account 12 is classified as a bad account when, in fact, it was a good account. These are Type I classification errors and Type II classification errors, respectively. Type I errors involve the rejection of the null hypothesis when it is actually true, while Type II errors involve the acceptance of the null hypothesis when it is actually false. For the sake of practicality, the analysis should assume an area of possible misclassification for indexes between 1.64 and 1.96. Accounts or firms falling within this range require further investigation and analysis. If one has reason to believe that new credit applicants will not differ significantly from past ones whose performance has been analyzed, discriminant analysis can be used to select and reject credit-sales customers. Using a minimum cutoff value, we reject all credit sales if the Y value for the credit applicant is less than 1.78. Using the 0.32-point range, we accept all credit sales where the prospective customer has a Y-value over 1.96, and reject applicants with Y-values below 1.64. For applicants with Y-values lying between those two values, we might want to obtain additional credit information, along with information as to the profitability of the sale, before making a decision.

In this credit analysis, Mehta (1974) used the quick ratio and the inventory-turnover ratio to construct the index. He shows that there are four tasks to be faced by the manager in using the MDA for credit analysis. They are:

(i) determining significant factors;
(ii) selecting the sample;
(iii) assigning weights to factors in order to develop an index;
(iv) establishing cutoff values for the index.

Mehta regards the discriminant approach, the decision-tree approach, and the conventional method as the three major approaches generally used by the credit department of a firm in making the credit-granting decision. In assessing the risk of a request for credit, the conventional approach regards the three C's as relevant: character, capital, and capacity. This is a subjective credit-analysis method and it generally gives indeterminate and misleading results. Both the discriminant approach and the decision-tree approach can supply financial managers with objective credit-analysis results, and therefore are generally more useful than the conventional method in
credit analysis. However, these two methods use a static analytic framework and assume that collection measures are given.

126.3 Bankruptcy and Financial Distress Analysis

For the past 15 years, academicians and practitioners have used the linear discriminant function to analyze bankruptcy and financial distress for both industrial and financial firms. Here we will discuss only the three most important studies: (1) Altman's (1968) bankruptcy analysis for industrial firms, (2) Sinkey's (1975) study of identifying problem banks from non-problem banks, and (3) Trieschmann and Pinches' (1973) analysis of the financial insolvency of insurance companies.

Altman's study included 33 manufacturers who filed bankruptcy petitions under Chapter X of the Bankruptcy Act during 1946-1965. These 33 firms were paired with 33 nonbankrupt firms on the basis of similar industry and asset size. Asset size ranged between $1 million and $25 million. For each firm, 22 variables were initially chosen for analysis on the basis of their popularity in the literature and their potential relevance to the study. These ratios were classified into five categories: liquidity, profitability, leverage, solvency, and activity. One ratio from each category was chosen for inclusion in the discriminant model. The variables used to obtain the final results are:

X1 = Working capital/Total assets;
X2 = Retained earnings/Total assets;
X3 = EBIT/Total assets;
X4 = Market value of equity/Book value of total debt; and
X5 = Sales/Total assets.

The mean ratios of the two groups of firms one year prior to the filing for bankruptcy are as follows:

a. Mean ratios of bankrupt firms: X1 = −0.061; X2 = −0.626; X3 = −0.318; X4 = 0.401; and X5 = 1.500.
b. Mean ratios of nonbankrupt firms: X1 = 0.414; X2 = 0.355; X3 = 0.153; X4 = 2.477; and X5 = 1.900.

Altman's final estimated discriminant function is: Yi = 0.012X1 + 0.014X2 + 0.033X3 + 0.006X4 + 0.999X5 .
(126.5a)
His results show that this discriminant function has done well in predicting nonbankrupt firms.
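Equation (126.5a) can be applied directly to the two groups' mean ratios reported above. One assumption is made here that is not stated in the text: X1 through X4 are entered in percent and X5 as a plain ratio, the unit convention commonly attributed to Altman's original coefficients.

```python
# Altman's discriminant function (equation 126.5a) applied to the
# reported group means. Assumption (not stated in the text above):
# X1-X4 enter in percent, X5 as a plain ratio.
COEFFS = (0.012, 0.014, 0.033, 0.006, 0.999)

def altman_y(x1, x2, x3, x4, x5):
    """Yi = 0.012*X1 + 0.014*X2 + 0.033*X3 + 0.006*X4 + 0.999*X5."""
    return sum(c * x for c, x in zip(COEFFS, (x1, x2, x3, x4, x5)))

# Group mean ratios from the text, with X1-X4 rescaled to percent.
bankrupt = altman_y(-6.1, -62.6, -31.8, 40.1, 1.5)
nonbankrupt = altman_y(41.4, 35.5, 15.3, 247.7, 1.9)

print(f"bankrupt group mean score:    {bankrupt:.2f}")
print(f"nonbankrupt group mean score: {nonbankrupt:.2f}")
```

Under this unit convention the two group means score roughly −0.26 and 4.88, falling on opposite sides of the cutoff region discussed next.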
Using (i) the formula for the overall cost of misclassification (E) and (ii) the expected error rate as discussed in the previous chapters, Altman classified the initial sample. He concluded that all firms having a z-score greater than 2.99 clearly fall into the nonbankrupt sector, while those firms having a z-score below 1.81 are all bankrupt. The area between 1.81 and 2.99 is defined as the zone of "ignorance" or the "gray" area. Although these are the original values proposed by Altman, in a more recent interview he stated that a negative z-score is now an indicator of potential bankruptcy.

Altman's original z-score model requires a firm to have publicly traded equity and be a manufacturer. He uses a revised model to make it applicable to private firms and nonmanufacturers. The resulting model is:

Z = 6.56 (Net working capital/Total assets) + 3.26 (Accumulated retained earnings/Total assets) + 6.72 (EBIT/Total assets) + 1.05 (Book value of equity/Total liabilities),

(126.5b)

where Z < 1.23 indicates a bankruptcy prediction, 1.23 ≤ Z ≤ 2.90 indicates a gray area, and Z > 2.90 indicates no bankruptcy.

Sinkey's study of problem banks is another example of the use of a discriminant function in financial analysis. Sinkey draws a profile of characteristics associated with banks that may be in danger of failing, in an attempt to develop an early warning system to help predict problem banks. The Federal Deposit Insurance Corporation (FDIC) wants to predict problem banks as early as possible in order to be able to advise banks of necessary financial-management changes in time for them to maintain solvent operations. "Problem" banks are categorized according to the likelihood of their needing financial assistance. The three classes of problem banks and the composition of the firms studied are presented below.

(a) Class 1 (Serious problem-potential payoff): An advanced problem bank that has at least a 50 percent chance of requiring financial assistance in the near future.
(b) Class 2 (Serious problem): A bank whose financial condition threatens ultimately to obligate financial outlay by the FDIC unless drastic changes occur.
(c) Class 3 (Other problem): A bank with some significant weakness, with vulnerability less than Class 2, but still calling for aggressive supervision and extraordinary concern by the FDIC.
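Returning to Altman's revised model, equation (126.5b) and its cutoffs can be sketched as follows. The balance-sheet figures in the example are hypothetical, chosen only to exercise the three zones.

```python
def altman_z_private(nwc, re, ebit, bve, ta, tl):
    """Revised Z-score (equation 126.5b) for private firms and
    nonmanufacturers. Inputs are raw statement items."""
    return (6.56 * nwc / ta + 3.26 * re / ta
            + 6.72 * ebit / ta + 1.05 * bve / tl)

def zone(z):
    """Altman's cutoffs: Z < 1.23 distress, 1.23-2.90 gray, > 2.90 safe."""
    if z < 1.23:
        return "bankruptcy predicted"
    if z <= 2.90:
        return "gray area"
    return "no bankruptcy predicted"

# A hypothetical firm (figures are illustrative only, in $ millions).
z = altman_z_private(nwc=12, re=30, ebit=15, bve=40, ta=100, tl=60)
print(f"Z = {z:.2f} -> {zone(z)}")
```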
Of the 110 banks analyzed, 90 were identified as "problems" in 1972 and 20 in 1973. Each problem bank was matched with a non-problem bank similar to it by (1) geographic market area, (2) total deposits, (3) number of banking offices, and (4) Federal Reserve membership status. Over 100 variables were initially examined to see if there was a significant difference between the ratios of problem and non-problem banks on a univariate basis. Five variables used by Sinkey (1975) represent the following dimensions of bank finances: (1) loan volume, represented by loans/assets; (2) capital adequacy, by loans/capital plus reserves; (3) efficiency, by operating expense/operating income; (4) sources of revenue, by loan revenue/total revenue; and (5) uses of revenue, by other expenses/total revenue. Sinkey then applied the multiple-discriminant-analysis technique to classify the banks on the basis of the above-mentioned financial ratios. Separate discriminant functions were estimated for each year from 1969 through 1972. In general, six or seven variables were included in each function. Recall that a Type I error represents the prediction of a problem bank as a non-problem one, and that a Type II error represents the prediction of a non-problem bank as a problem one. Obviously, from the viewpoint of financial outlay, a Type I error is more costly to the FDIC. The prediction errors for each year are presented in Table 126.2. In the years closer to a bank's classification as a problem bank, the discriminant model becomes better able to classify banks that were termed problems in 1971 and 1972. It is apparent that the potential exists for the banking agency to more efficiently allocate resources and analyze preexamination data through the implementation of an effective early warning system. A third example of the application of a discriminant function in financial analysis is Orgler's (1970, 1975) dummy regression model for examining loan quality.
Table 126.2: Type I and Type II errors.

Year   Type I error   Type II error   Total error
1969   46.36%         25.45%          35.91%
1970   42.73%         27.27%          35.00%
1971   38.18%         24.55%          31.36%
1972   28.15%         21.36%          24.76%

His research sample contained 75 criticized loans and 225 noncriticized loans. The validation sample contained 40 criticized loans and 80
non-criticized loans. A final regression model is:

Zi = 1.1018 + 0.1017X1i − 0.3966X2i − 0.0916X3i − 0.1573X4i − 0.0199X5i − 0.4533X6i ,

(126.6)

where

X1 = 0 if unsecured loan, 1 if secured loan;
X2 = 0 if past interest payment due, 1 if current loan;
X3 = 0 if not an audited firm, 1 if an audited firm;
X4 = 0 if a net-loss firm, 1 if a net-profit firm;
X5 = Working capital/Current assets;
X6 = 0 if loan criticized by bank examiner, 1 if loan not criticized by bank examiner.

Orgler used two cutoff points to classify loans: C1 = 0.08 and C2 = 0.25. These two cutoff points gave three predicted categories for commercial loans:

Ẑi > C2: "Bad" loan,
C1 ≤ Ẑi ≤ C2: "Marginal" loan,
Ẑi ≤ C1: "Good" loan.

A fourth example of the application of a discriminant function in financial analysis is Trieschmann and Pinches' (1973) multivariate model for predicting financially distressed property-liability (P-L) insurers. Their paper is concerned with insurance company insolvency and the identification of firms with a high probability of financial distress. They use multiple discriminant analysis to classify firms into two groups (solvent or distressed). The model was able to classify correctly 49 out of 52 firms in the study. One solvent firm was classified as being distressed, while two of the distressed firms were classified as being solvent. Of the 70 variables the researchers felt might be important, the final model included only six. The discriminant model for
identifying property-liability firms with a high potential for financial distress is: Z = −11.08576X1 − 1.50752X2 + 3.53606X3 − 2.49824X4 − 2.45352X5 − 0.24492X6 ,
(126.7)
where

X1 = Agents' balances/Total assets; a measure of the firm's accounts-receivable management;
X2 = Stocks at cost (preferred and common)/Stocks at market (preferred and common); measures investment management;
X3 = Bonds at cost/Bonds at market; measures the firm's age;
X4 = (Loss adjustment expenses paid + underwriting expenses paid)/Net premiums written; a measure of a firm's funds flow from insurance operations;
X5 = Combined ratio; the traditional measure of underwriting profitability; and
X6 = Premiums written direct/Surplus; a measure of the firm's sales aggressiveness.

The final discriminant model has an F-ratio of 12.559, which, with the appropriate degrees of freedom (6 and 45), is significant beyond the 0.005 level. Once it is established that the model does discriminate between distressed and solvent property-liability insurance firms, the individual contribution of each of the variables can be examined through the use of the t-test. The individual contribution is dominated by X6, which is significant at the 0.0005 level; X1, which is significant at the 0.005 level; and X2 and X4, which are significant at the 0.025 level. The discriminant model has proved to be quite applicable to this sort of analysis. With further development of databases for analysis, future regulators should be in a better position to step in and help potentially distressed firms well before bankruptcy threatens.

126.4 Applications of Factor Analysis to Select Useful Financial Ratios

Farrar (1962) and King (1966) used the factor-analysis technique to determine the important factors for security rates-of-return determination. King (1966) found that there are market factors, industry factors, and other factors that determine stock price behavior over time. Lloyd and Lee (1975) used factor analysis to identify the subgroups for the Dow-Jones thirty
industrial firms and to test the existence of Block Recursive Systems in asset pricing models. To test their arbitrage pricing theory (APT), Roll and Ross (1980) used factor loadings of individual securities to show that many factors are involved in the return-generating process. In financial ratio analysis, Johnson (1979) used information from factor loadings to test the cross-sectional stability of financial ratio patterns; Pinches and Mingo (1973) used factor-loading information to determine which variables should be used to estimate their n-group multivariate discriminant analysis for industry bond ratings. In addition, Chen and Shimerda (1981) used factor analysis in the empirical determination of useful financial ratios. The properties and characteristics of financial ratios have received considerable attention in recent years, with interest focused primarily on determining the predictive ability of financial ratios and related financial data. Principal areas of investigation have included (i) prediction of corporate bond ratings, as explored by Horrigan (1966), Pinches and Mingo (1973), and Pogue and Soldofsky (1969); (ii) prediction of financial ratios, by Lev (1969) and Frecka and Lee (1983); (iii) prediction of the suitability of a firm to merge, through an examination of the characteristics of merged firms, by Simkowitz and Monroe (1971) and Stevens (1973); and (iv) anticipation of financial impairment, by Altman (1968), Beaver (1968), Blume (1974), Tinsley (1970), and Wilcox (1971). Using the information from factor loadings, Johnson (1979) classified 61 financial ratios for 306 primary manufacturers and 159 retailers into eight financial groups, which are: 1. Return on Investment, 2. Financial Leverage, 3. Capital Intensiveness, 4. Inventory Intensiveness, 5. Cash Position, 6. Receivables Intensiveness, 7. Short-Term Liquidity, and 8. Decomposition Measures.

The results of Johnson's study, and the results obtained by Pinches, Mingo, and Caruthers (1973) [PMC] and Pinches, Eubank, and Mingo (1975), suggest that meaningful, empirically based classifications of financial ratios can be isolated, and that the composition of these financial ratio groups is reasonably stable over the time periods investigated and across the two industry classifications examined. Similarly, Chen and Shimerda (1981) used factor loadings to study useful financial ratios, obtaining conclusions similar to those obtained by Johnson (1979). Overall, the above-mentioned empirical studies indicate that factor-loading information can be objectively used to determine which ratios are useful for security analysis and financial management. The detailed procedure for using factor analysis to analyze financial ratios can be found in the above-mentioned papers and Chapter 4 of Lee and Lee (2017).
126.5 Bond Rating Forecasting

Pinches and Mingo (1973) (P&M) developed and tested a factor-analysis/multiple-discriminant model for predicting bond ratings. They used factor analysis to screen the data in the manner discussed in the previous sections. Here, we will use P&M's multiple-discriminant results to show how a k-group discriminant function can be used to analyze the performance of bond ratings and predict future bond ratings. The following six variables enter the final discriminant functions:

X1 = Subordination; the legal status of the bonds;
X2 = Years of consecutive dividends; indicates the stability of the firm's earnings;
X3 = Issue size; reflects the size of the firm;
X4 = (Net income + interest)/interest, five-year mean; indicates the ability of the firm to meet debt obligations;
X5 = Long-term debt/total assets, five-year mean; a measure of the capital structure of the firm; and
X6 = Net income/total assets; a measure of management's ability to earn a satisfactory return on investment.

These variables were chosen from a group of 35 through factor analysis. In all, 180 bonds were analyzed and randomly assigned to one of two sample groups: 132 forming the original sample for model development and 48 forming a holdout sample for testing purposes. P&M's model includes three discriminant functions that were significant at the 0.001 level (based upon Bartlett's V-statistic). The structural form of these three functions is presented below:

Y1 = −0.329X1 + 0.107X2 + 0.100X3 + 0.005X4 − 0.270X5 + 0.893X6 , (126.8)
Y2 = 0.046X1 + 0.218X2 − 0.212X3 − 0.264X4 − 0.505X5 − 0.762X6 ,
Y3 = −0.128X1 − 0.044X2 − 0.138X3 + 0.001X4 + 0.320X5 − 0.928X6 .

The overall discriminating power of the model was determined by testing the equality of the group means. The calculated F-value for the MDA model is 17.225.
As the tabled value at the 0.001 level is 2.13, the calculated F-value permits rejection of the null hypothesis that the bonds came from the same population. With the overall conclusion that, a priori, the groups of bonds are
C. F. Lee and H.-C. Yu
Table 126.3: Ranks of different variables in terms of three different discriminant functions.

Variable    Function 1 (Y1)    Function 2 (Y2)    Function 3 (Y3)
X1          1                  6                  2
X2          2                  2                  5
X3          3                  3                  1
X4          6                  1                  6
X5          5                  4                  4
X6          4                  5                  3
significantly different, the six variables entering the final MDA model can be examined. The individual discriminating ability of each variable, and its rank of importance in the three discriminant functions defined in equation (126.8), are presented in Table 126.3. The table indicates the relative importance of each variable for the different discriminant functions. For example, variable 1 is the most important in discriminant function 1, yet it is the least important in discriminant function 2 and the second most important in discriminant function 3.

The MDA model correctly rated 92 of the 132 bonds in the original sample, a correct prediction rate of 69.7 percent; this rate is analogous to the coefficient of determination (R²) in regression analysis. If an analyst were interested only in the ability of the model to predict ratings within one classification (either higher or lower) of the actual rating, the model performed very accurately, classifying all but two of the bonds within one rating of the actual rating. In another test of validity, the model was applied to the holdout sample. With this group, the model correctly rated 31 of 48 bonds (64.58 percent) and rated all 48 bonds within one rating (higher or lower) of the actual rating.

To further validate the MDA model, a stratified random sample of 48 companies issuing bonds during the first six months of 1969 was gathered. The MDA model (developed from a sample of newly rated bonds issued in 1967 and 1968) was employed to predict the ratings of the new bonds issued in 1969. Twenty-seven of the 48 bonds were rated correctly, indicating that the model possesses future predictive ability. The subordination status of a corporate bond [represented by a binary (0-1) variable] appears to be the most important variable among those examined in this study.
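As a quick illustration, equation (126.8) can be evaluated in a short script. This is a sketch only: the coefficients come from the text above, while any input values are hypothetical.

```python
def pm_scores(x):
    """x = (X1, ..., X6); returns (Y1, Y2, Y3) per equation (126.8)."""
    x1, x2, x3, x4, x5, x6 = x
    y1 = -0.329*x1 + 0.107*x2 + 0.100*x3 + 0.005*x4 - 0.270*x5 + 0.893*x6
    y2 =  0.046*x1 + 0.218*x2 - 0.212*x3 - 0.264*x4 - 0.505*x5 - 0.762*x6
    y3 = -0.128*x1 - 0.044*x2 - 0.138*x3 + 0.001*x4 + 0.320*x5 - 0.928*x6
    return y1, y2, y3
```

Each bond is then assigned to the rating group whose centroid is closest in the space spanned by (Y1, Y2, Y3).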
If one were interested only in rating bonds as investment quality (AA, A, and BAA) or noninvestment quality (BA and B),
the best single predictor is the subordinated status of the bond. Based on this variable alone, correct ratings (investment vs. noninvestment quality) would have occurred 88.6 percent of the time for the original sample (117/132) and 83.3 percent of the time for the holdout sample (40/48). The model performed very poorly for BAA-rated bonds. An analysis of the multiple range tests indicated that the inability of the MDA model to accurately predict BAA-rated bonds appears to be due to a lack of statistically significant differences in the quantifiable variables included in the model. Overall, though, the model's ability to discriminate between different bond ratings is fairly good, and it shows potential for application to the prediction of future bond ratings.

There are two major technical differences between the methods and procedures used in this section and those used in Section 126.3. First, the MDA used in this section is a k-group rather than a two-group analysis. Second, Pinches and Mingo's MDA analysis is predictive rather than descriptive in nature. Joy and Tollefson (1974) argue that MDA can be used for either predictive or descriptive purposes. They regard Altman's (1968) MDA application as a descriptive rather than a predictive analysis. This comment also applies to both Sinkey's and Trieschmann and Pinches' (1973) MDA analyses. Other related arguments can be found in Joy and Tollefson (1978) and Altman and Eisenbeis (1978).

The financial-ratio-based credit-scoring model for a bond rating system requires the simultaneous maximization of two conflicting objectives (i.e., explanatory and discriminatory power), which had not been directly addressed in the paper by Pinches (1973).
Therefore, Kao and Lee (2012) developed a hybrid multivariate credit-scoring model that combines principal component analysis and Fisher's discriminant analysis using the MINIMAX goal programming technique, so that the maximization of the two conflicting objectives can be compromised. The performance of alternative credit-scoring models is analyzed and compared using datasets from previous studies. They found that the proposed hybrid credit-scoring model outperforms the alternative models in both explanatory and discriminatory power.

For multiple-class prediction, a frequently used approach is based on the ordered probit model. Hwang et al. (2008) show that this approach is not optimal in the sense that it is not designed to minimize the error rate of the prediction. Based upon the works by Altman (1968), Ohlson (1980), and Begley et al. (1996) on two-class prediction, Hwang et al. (2008) propose a modified ordered probit model. The modified approach depends on an
optimal cutoff value and can be easily applied in practice. An empirical study demonstrates that the prediction accuracy rate of the modified classifier is better than that obtained from the usual ordered probit model. In addition, they also show that not only are the usual accounting variables useful for predicting issuer credit ratings; market-driven variables and industry effects are also important determinants.

126.6 Ohlson's and Shumway's Methods for Estimating Default Probability1

Beginning in the 1980s, more complex estimation methods such as logit and probit were used to determine the likelihood of company bankruptcy. The logit model is a methodology that uses maximum likelihood estimation, the so-called conditional logit model. Two well-known models using logit to estimate the probability of bankruptcy are Ohlson (1980) and Shumway (2001), which we discuss next.

Ohlson discusses the following econometric advantages of the logit model over the multivariate discriminant analysis (MDA) used for the development of the Altman Z-score. MDA imposes certain statistical requirements on the distributional properties of the predictors, violation of which will result in invalid or poor approximations. For example, MDA assumes the same variance-covariance matrix of the predictors for both the failed and non-failed groups; it also requires normal distributions of failure predictors, which biases against the use of indicator independent variables. Moreover, the output of an MDA model is a score that has little intuitive interpretation. The matching procedure typically used in the MDA technique can result in losing useful information, such as meaningful predictors. The use of logistic regression, on the other hand, essentially overcomes the weaknesses of MDA discussed above. Ohlson (1980) used data available prior to the date of bankruptcy to ensure strict forecasting relationships.
His sample included 105 bankruptcies and 2,058 non-bankruptcies from the 1970s (1970–1976). Among the 105 bankrupt firms, 8 were traded on the New York Stock Exchange, 43 on the American Stock Exchange, and 54 on the over-the-counter market or regional exchanges. Nine variables, defined below, are used to develop the logit model:

X1 = Natural log of (Total Assets/GNP Implicit Price Deflator Index); the index assumes a base value of 100 for 1968;
This section was written by Professors Lili Sun and Bi-Huei Tsai.
X2 = Total Liabilities/Total Assets;
X3 = (Current Assets − Current Liabilities)/Total Assets;
X4 = Current Assets/Current Liabilities;
X5 = One if total liabilities exceed total assets, zero otherwise;
X6 = Net income/total assets;
X7 = Funds provided by operations/total liabilities;
X8 = One if net income was negative for the last two years, zero otherwise; and
X9 = (Net income in year t − Net income in year t − 1)/(Absolute net income in year t + Absolute net income in year t − 1).

Three sets of coefficients are estimated using data one year and/or two years prior to bankruptcy. Intuitively, the model with estimates computed using data one year prior to bankruptcy performs the best; it is expressed as follows:

Y = −1.32 − 0.407X1 + 6.03X2 − 1.43X3 + 0.0757X4 − 2.37X5 − 1.83X6 + 0.285X7 − 1.72X8 − 0.521X9,  (126.9)
where Y = log[P/(1 − P)] and P = the probability of bankruptcy. Thus, the probability of bankruptcy is calculated as P = exp(Y)/[1 + exp(Y)], and the model becomes relatively easy to interpret. Ohlson found that a probability cutoff of 3.8 percent for classifying firms as bankrupt minimized the Type I and Type II errors of the model presented in equation (126.9). At this cutoff, the model correctly classified 87.6 percent of his bankrupt firm sample and 82.6 percent of the non-bankrupt firms.

Begley et al. (1996) applied Ohlson's (1980) logit model to predict bankruptcy for a holdout sample of 65 bankrupt and 1,300 non-bankrupt firms in the 1980s. They found substantially higher Type I and Type II error rates than those in the original study. They re-estimated the coefficients of the model using data for a portion of their 1980s sample but found no performance improvement for the re-estimated Ohlson model.

The logit model used by Ohlson (1980) is a single-period static logit model. Shumway (2001) employed a discrete hazard model, or multiple-period dynamic logit model. The concept of the discrete-time hazard model originates
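The conversion from the logit score in equation (126.9) to a bankruptcy probability, together with the 3.8 percent cutoff, can be sketched as follows. The coefficients are those of equation (126.9); any input vector passed to these functions is hypothetical.

```python
import math

# Coefficients of Ohlson's one-year model, equation (126.9).
INTERCEPT = -1.32
COEFS = (-0.407, 6.03, -1.43, 0.0757, -2.37, -1.83, 0.285, -1.72, -0.521)

def ohlson_probability(x):
    """x = (X1, ..., X9); returns P = exp(Y) / (1 + exp(Y))."""
    y = INTERCEPT + sum(c * xi for c, xi in zip(COEFS, x))
    return math.exp(y) / (1.0 + math.exp(y))

def classify(x, cutoff=0.038):
    # Ohlson's 3.8 percent cutoff minimizes Type I plus Type II errors.
    return "bankrupt" if ohlson_probability(x) >= cutoff else "non-bankrupt"
```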
from the survival model and is widely used in the biomedical field. It was not until recent years that social science researchers started using it to analyze variables' effects on survival (e.g., Lancaster, 1992). Cox and Oakes (1984) calculate the hazard rate to estimate the likelihood of survival and survival time.

Shumway (2001) elaborates the econometric advantages of a hazard model over a static binary logit model. First, hazard models control for each firm's period at risk, while static models do not. Second, hazard models exploit each firm's time-series data by including annual observations as time-varying covariates. Third, hazard models produce more efficient out-of-sample forecasts by utilizing much more data. Shumway (2001) proves that a multi-period logit model is equivalent to a discrete-time hazard model. Therefore, his model used multiple years of data for each sample firm while treating each firm as a single observation. Moreover, Shumway (2001) corrected a problem in the traditional approaches to bankruptcy forecasting: previous studies used only one observation per firm, and most of the accounting ratios used in previous bankruptcy studies are found not to be significant. Shumway (2001) incorporated not only financial ratios but also market variables such as market size, past stock returns, and idiosyncratic return variability as bankruptcy predictors.

The dependent variable in the prediction models is each firm-year's bankruptcy status (0, 1) in a given sample year. In a hazard analysis, for a bankrupt firm the dependent variable equals 1 for the year in which it files for bankruptcy and 0 for all sample years prior to the bankruptcy-filing year. Non-bankrupt firms are coded 0 for every year they are in the sample. The Shumway (2001) study employed all available firms in a broad range of industries, resulting in a sample of 300 bankrupt firms for the period 1962–1992.
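The firm-year coding described above can be sketched as follows; firm horizons and filing years in the example are hypothetical.

```python
# Dependent-variable coding for a discrete-time hazard (multi-period
# logit) setup: each firm-year is one observation; a bankrupt firm is
# coded 1 only in its filing year and 0 before; a non-bankrupt firm is
# coded 0 in every sample year.

def hazard_coding(first_year, last_year, bankruptcy_year=None):
    """Return {year: 0/1}; the firm exits the sample at the filing year."""
    rows = {}
    for year in range(first_year, last_year + 1):
        if bankruptcy_year is not None and year > bankruptcy_year:
            break  # no observations after the bankruptcy filing
        rows[year] = 1 if year == bankruptcy_year else 0
    return rows
```

For example, a firm observed from 1985 that files in 1988 contributes four firm-year rows, the last coded 1.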
The study found that a multi-period logit model outperformed MDA and single-period logit models, and that a combination of market-based and accounting-based independent variables outperformed models that were only accounting-based. The Shumway model incorporating market-based and accounting-based predictors is expressed as follows:

Y = −13.303 − 1.982X1 + 3.593X2 − 0.467X3 − 1.809X4 + 5.79X5,  (126.10)

where
Y = log[P/(1 − P)], P = the probability of bankruptcy;
X1 = Net Income/Total Assets;
X2 = Total Liabilities/Total Assets;
X3 = The logarithm of (each firm's market capitalization at the end of the year prior to the observation year/total market capitalization of the NYSE and AMEX markets);
X4 = Past excess return, i.e., the return of the firm in year t − 1 minus the value-weighted CRSP NYSE/AMEX index return in year t − 1; and
X5 = Idiosyncratic standard deviation of each firm's stock returns, defined as the standard deviation of the residuals of a regression of each stock's monthly returns in year t − 1 on the value-weighted NYSE/AMEX index return for the same year.

To evaluate the forecast accuracy of the hazard model presented in equation (126.10), Shumway (2001) divided the test sample into ten groups based upon their predicted probabilities of bankruptcy under the model. The hazard model classifies almost 70 percent of all bankruptcies in the highest bankruptcy-probability decile and classifies 96.6 percent of bankrupt firms above the median probability. Following Shumway (2001), researchers of bankruptcy prediction have been using multiple-period logit regression. For instance, Sun (2007) developed a multi-period logistic regression model and found that it outperforms auditors' going-concern opinions in predicting bankruptcy. Saunders and Allen (2002) and Saunders and Cornett (2006) have discussed alternative methods for determining credit risk.

Hwang et al. (2008) use the discrete-time survival model of Cox and Oakes (1984) to predict the probability of financial distress for each firm under study. The maximum likelihood method is employed to estimate the model parameters. The resulting estimates are analyzed via their asymptotic normal distributions and are used to estimate the in-sample probability of financial distress for each firm under study. Using these estimated probabilities, a strategy is developed to identify failing firms and is applied to study the future probability of bankruptcy for firms listed on the Taiwan Stock Exchange.
Empirical studies demonstrate that the strategy developed from the discrete-time survival model can yield more accurate forecasts than the alternative methods based on the logit model of Ohlson (1980) and the probit model of Zmijewski (1984). Appendix 126A presents both the logit and probit models. In addition, Audrino et al.'s (2019) generalized logit model is also presented in this appendix, along with some of their empirical results.
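As a numerical illustration, the Shumway score in equation (126.10) and its logistic transformation can be sketched as follows; any predictor values supplied to the function are hypothetical.

```python
import math

def shumway_probability(x1, x2, x3, x4, x5):
    """Equation (126.10): Y = log[P/(1 - P)]; returns the implied P."""
    y = -13.303 - 1.982 * x1 + 3.593 * x2 - 0.467 * x3 - 1.809 * x4 + 5.79 * x5
    return math.exp(y) / (1.0 + math.exp(y))
```

Higher leverage (X2), lower profitability (X1), lower past excess returns (X4), and higher idiosyncratic volatility (X5) all raise the implied bankruptcy probability.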
Table 126.4: Instance confusion matrix.

                           Actual Outcome
Predicted Outcome          Survive (y = 0)           Failure (y = 1)
Survive (y^ = 0)           6,000                     50 (Type I errors)
Failure (y^ = 1)           1,000 (Type II errors)    100

Source: Audrino et al. (2019).
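The accuracy measures reported for Table 126.4 can be recomputed directly from the four cells of the matrix; a minimal sketch, with the cell labels following the table:

```python
# Cells of Table 126.4: rows are predictions, columns are actual outcomes.
true_survive = 6000   # predicted survive, actually survived
type1_errors = 50     # predicted survive, actually failed
type2_errors = 1000   # predicted failure, actually survived
true_failure = 100    # predicted failure, actually failed

total = true_survive + type1_errors + type2_errors + true_failure  # 7,150

specificity = true_survive / (true_survive + type1_errors)         # ~0.99
accuracy = (true_failure + true_survive) / total                   # ~0.85
recall_failures = true_failure / (true_failure + type1_errors)     # ~0.67
```

Despite the favorable headline numbers, 1,000 of the 1,100 predicted failures are false alarms, which is the imbalance problem discussed in the text.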
126.6.1 MIDAS Logit Model

Audrino et al. (2019) propose a new approach based on a generalization of the logit model to improve prediction accuracy for U.S. bank failures. Mixed-data sampling (MIDAS) is introduced into the logistic regression. Using the MIDAS logit model, their results show significantly higher accuracy in predicting bank failure compared with the traditional logit model, under which the largest bank failures had been misclassified.

The evaluation of classification accuracy is important for comparing classifiers. Sun, Kamel, Wong, and Wang (2007) show that standard indicators are not valid when the classes in the data are imbalanced. The example below illustrates the problem, considering 7,150 classified banks of which 150 belong to a rare class (e.g., bank failure); see Audrino et al. (2019). Table 126.4 presents a sample confusion matrix for this case. Here, the authors predict 1,100 banks to belong to class "1" (to fail), but the prediction is incorrect in 1,000 cases (Type II errors). Moreover, 50 banks are not suspected but fail (Type I errors). Although the standard measures indicate a generally high accuracy of the classifier, the number of false negatives is large compared to the sample size:

Specificity = 6,000/(50 + 6,000) = 0.99, almost perfect,
Accuracy = (100 + 6,000)/7,150 = 0.85, very good,
Share of correctly predicted failures = 100/(100 + 50) = 0.67, good.

126.7 Merton Distance Model and KMV-Merton Model

126.7.1 Merton distance model

The Merton DD model produces a probability of default for each firm in the sample at any given point in time. To calculate the probability, the model subtracts the face value of the firm's debt from an estimate of the market
value of the firm and then divides this difference by an estimate of the volatility of the firm (scaled to reflect the horizon of the forecast). The resulting z-score, sometimes referred to as the distance to default, is then substituted into a cumulative distribution function to calculate the probability that the value of the firm will be less than the face value of its debt at the forecasting horizon. The market value of the firm is simply the sum of the market values of the firm's debt and equity. If both of these quantities were readily observable, calculating default probabilities would be simple. While equity values are readily available, reliable data on the market value of firm debt are generally unavailable. Appendix 126C explains how to resolve this issue. Following Appendix 126C, the Merton distance to default can be calculated as

DD = [ln(V/F) + (μ − 0.5σ_V²)T] / (σ_V √T),  (126.11)

where μ is an estimate of the expected annual return on the firm's assets, V is the total value of the firm, σ_V is the volatility of firm value, T is the time to maturity, F is the face value of the firm's debt, and r is the instantaneous risk-free rate. The corresponding implied probability of default, sometimes called the expected default frequency (or EDF), is

π_Merton = N(−[ln(V/F) + (μ − 0.5σ_V²)T] / (σ_V √T)) = N(−DD),  (126.12)

where N(·) is the cumulative standard normal distribution function. There are five variables in equation (126.11), i.e., μ, V, σ_V, T, and F. Of these five variables, only V and σ_V are not observable. Following Bharath and Shumway (2008), in Appendix 126C we discuss how V and σ_V can be numerically estimated, as well as how DD can be estimated.

126.7.2 A naïve alternative

To avoid solving equations (126C.2) and (126C.5) via the iterative procedure described in Appendix 126C, Bharath and Shumway (2008) construct a naive predictor with two objectives.
First, they want their naive predictor to have a reasonable chance of performing as well as the Merton DD predictor, so they want it to capture the same information the Merton DD predictor uses. They also want their naive probability to approximate the functional form of the Merton DD probability. Second, they want their naive probability to be simple, so they avoid solving any equations or estimating
any difficult quantities in its construction. They propose a form for the naive probability similar to that of the Merton DD model. None of the numerical choices in the naive model is the result of any type of estimation or optimization, as required by the original Merton DD model defined in equation (126.11). To construct a naive probability, they first approximate the market value of each firm's debt with the face value of its debt:

naive D = F.  (126.13)
Since they assume that firms close to default have very risky debt whose risk is correlated with their equity risk, they approximate the volatility of each firm's debt as

naive σ_D = 0.05 + 0.25 · σ_E.  (126.14)
They include the five percentage points in this term to represent term-structure volatility, and they include the 25% times equity volatility to allow for volatility associated with default risk. This gives them an approximation to the total volatility of the firm of

naive σ_V = [E/(E + naive D)] σ_E + [naive D/(E + naive D)] naive σ_D
          = [E/(E + F)] σ_E + [F/(E + F)] (0.05 + 0.25 σ_E).  (126.15)
Next, they set the expected return on the firm's assets equal to the firm's stock return over the previous year:

naive μ = r_{i,t−1}.  (126.16)
This allows them to capture some of the same information captured by the Merton DD iterative procedure described above, which is able to condition on an entire year of equity return data. By allowing their naïve estimate of μ to depend on past returns, they incorporate the same information used by the Merton DD model. The naive distance to default is then

naive DD = [ln((E + F)/F) + (r_{i,t−1} − 0.5 naive σ_V²)T] / (naive σ_V √T).  (126.17)

This naive alternative is easy to compute; however, it retains the structure of the Merton DD distance to default and expected default frequency. It also captures approximately the same quantity of information as the Merton DD probability. Thus, examining the forecasting ability of this quantity helps them separate the value of solving the Merton model from
the value of the functional form of π_Merton. Bharath and Shumway (2008) define a naive probability estimate as

π_naive = N(−naive DD).  (126.18)
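Equations (126.13)–(126.18) can be combined into one short routine; a minimal sketch, where the inputs (equity value E, debt face value F, equity volatility, and past return) are hypothetical.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def naive_default_probability(E, F, sigma_E, r_prev, T=1.0):
    """Naive EDF of Bharath and Shumway (2008), equations (126.13)-(126.18)."""
    naive_D = F                                    # (126.13): debt at face value
    naive_sigma_D = 0.05 + 0.25 * sigma_E          # (126.14): debt volatility
    w_E = E / (E + naive_D)
    naive_sigma_V = w_E * sigma_E + (1 - w_E) * naive_sigma_D   # (126.15)
    mu = r_prev                                    # (126.16): past equity return
    dd = (math.log((E + naive_D) / F) + (mu - 0.5 * naive_sigma_V ** 2) * T) / (
        naive_sigma_V * math.sqrt(T))              # (126.17)
    return norm_cdf(-dd)                           # (126.18)
```

Note that no equation solving or iteration is required, which is precisely the point of the naive construction.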
Bharath and Shumway (2008) argue that it is fairly easy to criticize their naive probability, because their choices for modeling firm volatility are not particularly well motivated and their decision to use past returns for μ is arbitrary at best. However, to quibble with their naive probability is to miss the point of their exercise. In their paper, they empirically demonstrate that the predictive power of the naïve model is as good as that of the Merton DD model. However, they suggest that a theoretically improved naïve model should be considered in future research.

Bharath and Shumway (2008) found that their naïve model performs as well as the original Merton DD model. I would like to offer several explanations of their findings, as follows. (i) It is well known that the functional form of the original Black–Scholes model is subject to specification biases; for example, the original Black–Scholes model assumes that firm value is lognormally distributed. In addition, Bharath and Shumway's (2008) empirical work assumes that firm volatility is constant. Therefore, it is worthwhile to use a stochastic-volatility or CEV-type model to re-examine the performance of the Merton DD model. (ii) It is well known that an estimator can be contaminated by estimation risk. Therefore, the estimation risk might be larger than the measurement error introduced by using the face value of debt in place of its market value. In Section 126.8, we will propose a theoretically improved naïve model in terms of the capital structure determination models suggested by Titman and Wessels (1988), Chang and Lee (2009), Yang et al. (2010), Lee and Tai (2016), and others.

126.7.3 KMV-Merton model

The KMV Corporation developed the KMV-Merton model, which is now frequently used in both practice and academic research. This model is an application of the Merton DD model discussed in Section 126.7.1.
Bharath and Shumway (2008) point out that a number of things differentiate the Merton DD model they test from the model actually employed by Moody's KMV. One important difference is that they use Merton's (1974) model while Moody's KMV uses a proprietary model
that they call the VK model. Apparently, the VK model is a generalization of the Merton model that allows for various classes and maturities of debt. Bharath and Shumway (2008) use the cumulative normal distribution to convert distances to default into default probabilities, whereas Moody's KMV uses its large historical database to estimate the empirical distribution of changes in distances to default and calculates default probabilities based on that distribution. The distribution of distances to default is an important input to default probabilities, but it is not required for ranking firms by their relative probability of default. Therefore, several of Bharath and Shumway's results emphasize the model's ability to rank firms by default risk rather than its ability to calculate accurate probabilities. Moody's KMV may also make proprietary adjustments to the accounting information used to calculate the face value of debt. The methods of Moody's KMV cannot be perfectly replicated because several of its modeling choices are proprietary information. As I discussed in Section 126.7.2, the cumulative normal distribution function might not be appropriate for estimating the Merton DD model. I strongly believe that the KMV model's use of a distribution estimated from historical data is more appropriate than the cumulative normal distribution function used by Bharath and Shumway (2008).

126.8 Capital Structure Determination and Naïve Default Probability

In this section, we discuss how capital structure determination models can be used to theoretically improve the naïve default probability model defined in equation (126.17). Following Titman and Wessels (1988), Chang et al. (2009), Yang et al. (2010), and Lee and Tai (2016), we define the capital structure determination model as follows. Titman and Wessels (1988) and Chang et al. (2009) used a single equation to estimate the determinants of capital structure. Yang et al.
(2010) and Lee and Tai (2016) have incorporated stock return information to estimate the capital structure simultaneously. We can use Lee and Tai's (2016) model to estimate the capital structure, and then use the estimated capital structure and the market value of the firm to estimate the value of debt as follows:

Estimated Debt = (Estimated Debt/Equity) × Market value of equity.  (126.19)

To obtain the theoretically improved naïve model, we can replace equation (126.13) by equation (126.19). Then we follow Bharath and Shumway
(2008)'s approach to obtain the modified naïve default probability estimator. The performance of this modified model relative to the original naïve model will be empirically tested in future research.

126.9 Empirical Comparison

Bharath and Shumway (2008) empirically compared the performance of three alternative models for estimating default probability: the Merton DD model, the naïve DD model, and Moody's KMV model. They found that the naïve model performs at least as well as the two other models. The main reason for this finding was discussed in Section 126.7.2.

Mai's (2010) dissertation re-examines the four most commonly employed default prediction models: the Z-score model (Altman, 1968), the logit model (Ohlson, 1980), the probit model (Zmijewski, 1984), and hazard analysis (Shumway, 2001). Her empirical results show that the discrete-time hazard model adopted by Shumway (2001), combined with a new set of accounting-ratio and market-driven variables, improves bankruptcy forecasting power. Using hand-collected business default events from the Compustat Annual Industrial database and publicly available press news, Mai (2010) constructed a sample of publicly traded companies listed in one of the three US stock markets between 1991 and 2006. With a carefully chosen cutoff at the 0.021 implied bankruptcy probability level, the out-of-sample hazard model with stepwise methodology correctly classifies 82.7% of default firms and 82.8% of non-default firms. Compared to the best results in Shumway (2001), which classifies 76.5% of default firms, 55.2% in Altman (1993), 66.1% in Ohlson (1980), and 65.4% in Zmijewski (1984), it can be concluded that the results from her dissertation outperform the other four models. However, the most recent results in Audrino et al. (2019) provide 85% prediction accuracy.
The specifications of the logit, probit, and MIDAS logit models can be found in Appendix 126A, the SAS code for the hazard model in bankruptcy forecasting in Appendix 126B, and the application of the Merton distance to default model in Appendix 126C.

126.10 Summary and Concluding Remarks

In this paper, we first discussed applications of discriminant analysis and factor analysis. Examples of using two-group discriminant functions to perform credit analysis, predict corporate bankruptcy, and determine
problem banks and distressed P-L insurers were discussed in detail. Basic concepts of factor analysis were presented, showing their application in determining useful financial ratios. In addition, the combination of factor analysis and discriminant analysis to analyze industrial bond ratings was also explored. Furthermore, Ohlson's and Shumway's methods for estimating default probability were discussed. Finally, the KMV-Merton DD model was discussed and critiqued in detail. In sum, this paper reviews in detail the application of discriminant analysis, factor analysis, logistic regression, the MIDAS logit model, and the KMV-Merton model in credit risk analysis. I strongly believe this paper will be useful for academic researchers and practitioners in understanding how default probability can be estimated and applied in the real world.

Bibliography

Altman, E.I. (1968). Financial Ratios, Discriminant Analysis, and the Prediction of Corporate Bankruptcy. Journal of Finance 23, 589–609.
Altman, E.I., Haldeman, R. and Narayanan, P. (1977). ZETA Analysis: A New Model for Bankruptcy Classification. Journal of Banking and Finance 1, 29–54.
Altman, E.I. and McGough, T.P. (1974). Evaluation of a Company as a Going-Concern. Journal of Accountancy 138(6), 50–57.
Altman, E.I. (1982). Accounting Implications of Failure Prediction Models. Journal of Accounting, Auditing, Finance 6, 4–19.
Altman, E.I. (1984). The Success of Business Failure Prediction Models: An International Survey. Journal of Banking and Finance 8(2), 171–198.
Altman, E.I. (1989). Measuring Corporate Bond Mortality and Performance. Journal of Finance 44(4), 909–922.
Altman, E.I. (1993). Corporate Financial Distress and Bankruptcy: A Complete Guide to Predicting and Avoiding Distress and Profiting from Bankruptcy. Wiley, New York.
Altman, E.I. (2000). Predicting Financial Distress of Companies: Revisiting the Z-score and ZETA Models. Working Paper.
Altman, E.I., Marco, G. and Varetto, F. (1994). Corporate Distress Diagnosis: Comparisons Using Linear Discriminant Analysis and Neural Networks. Journal of Banking & Finance 18, 505–529.
Amemiya, T. (1981). The Qualitative Response Models: A Survey. Journal of Economic Literature 19, 1483–1536.
Anandarajan, M., Lee, P. and Anandarajan, A. (2001). Bankruptcy Prediction of Financially Stressed Firms: An Examination of the Predictive Accuracy of Artificial Neural Networks. International Journal of Intelligent Systems in Accounting, Finance and Management 10, 69–81.
Anderson, J.A. (1972). Separate Sample Logistic Discrimination. Biometrika 59, 19–35.
Audrino, F., Kostrov, A. and Ortega, J.-P. (2019). Predicting U.S. Bank Failures with MIDAS Logit Models. Journal of Financial & Quantitative Analysis 54, 2575–2603.
Barth, M.E., Beaver, W.H. and Landsman, W.R. (1998). Relative Valuation Roles of Equity Book Value and Net Income as a Function of Financial Health. Journal of Accounting & Economics 25, 1–34.
page 4338
July 6, 2020
16:8
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
Application of Discriminant, Factor Analysis, Logistic Regression
b3568-v4-ch126
4339
Beaver, W.H. (1966). Financial Ratios as Predictors of Failure. Journal of Accounting Research 4, 71–111.
Beaver, W. (1968). Market Prices, Financial Ratios and Prediction of Failure. Journal of Accounting Research 6(2), 179–192.
Beaver, W.H. (1968). Alternative Accounting Measures as Predictors of Failure. Accounting Review 43, 113–122.
Beaver, W.H., McNichols, M.F. and Rhie, J. (2005). Have Financial Statements Become Less Informative? Evidence from the Ability of Financial Ratios to Predict Bankruptcy. Review of Accounting Studies 10(1), 93–122.
Begley, J., Ming, J. and Watts, S. (1996). Bankruptcy Classification Errors in the 1980s: An Empirical Analysis of Altman's and Ohlson's Models. Review of Accounting Studies 1, 267–284.
Bhandari, S.B., Soldofsky, R.M. and Boe, W.J. (1979). Bond Quality Rating Changes for Electric Utilities: A Multivariate Analysis. Financial Management 8, 74–81.
Bharath, S.T. and Shumway, T. (2004). Forecasting Default with the KMV-Merton Model. Working Paper, University of Michigan.
Bharath, S.T. and Shumway, T. (2008). Forecasting Default with the Merton Distance to Default Model. Review of Financial Studies 21(3), 1339–1369.
Billings, B. (1999). Revisiting the Relation between the Default Risk of Debt and the Earnings Response Coefficient. Accounting Review 74(4), 509–522.
Blume, M.P. (1974). The Failing Company Doctrine. Journal of Accounting Research 43, 1–25.
Chen, K.H. and Shimerda, T.A. (1981). An Empirical Analysis of Useful Financial Ratios. Financial Management 10, 51–60.
Cielen, A., Peeters, L. and Vanhoof, K. (2004). Bankruptcy Prediction Using a Data Envelopment Analysis. European Journal of Operational Research 154(2), 526–532.
Clark, K. and Ofek, E. (1994). Mergers as a Means of Restructuring Distressed Firms: An Empirical Investigation. Journal of Financial and Quantitative Analysis 29(4), 541–565.
Cox, D.R. and Oakes, D. (1984). Analysis of Survival Data. Chapman & Hall, New York.
Crosbie, P. and Bohn, J. (2003). Modeling Default Risk. Moody's KMV Company.
Denis, D., Denis, D. and Sarin, A. (1997). Ownership Structure and Top Executive Turnover. Journal of Financial Economics 45, 193–221.
Diamond, H., Jr. (1976). Pattern Recognition and the Detection of Corporate Failure. Ph.D. dissertation, New York University.
Dichev, I. (1998). Is the Risk of Bankruptcy a Systematic Risk? Journal of Finance 53, 1131–1148.
Dimitras, A.I., Zanakis, S.H. and Zopounidis, C. (1996). A Survey of Business Failure with an Emphasis on Prediction Methods and Industrial Applications. European Journal of Operational Research 90, 487–513.
Drehmann, M., Patton, A.J. and Sorensen, S. (2005). Corporate Defaults and Large Macroeconomic Shocks. Working Paper.
Duffie, D. and Singleton, K. (1999). Modeling Term Structures of Defaultable Bonds. Review of Financial Studies 12, 687–720.
Falkenstein, E.G., Boral, A. and Carty, L. (2000). RiskCalc for Private Companies: Moody's Default Model. Global Credit Research.
Farrar, F.S. (1962). The Investment Decision Under Uncertainty. Prentice-Hall, Englewood Cliffs, NJ.
Fitzpatrick, P. (1932). A Comparison of the Ratios of Successful Industrial Enterprises with Those of Failed Companies. The Accountants Publishing Company.
Foster, G. (1998). Financial Statement Analysis, 2nd edn. Prentice-Hall, Englewood Cliffs, NJ.
Foster, B., Ward, T. and Woodroof, J. (1998). An Analysis of the Usefulness of Debt Defaults and Going Concern Opinions in Bankruptcy Risk Assessment. Journal of Accounting, Auditing & Finance 13(3), 351–371.
Frecka, T. and Lee, C.F. (1983). Generalized Ratio Generation Process and Its Implications. Journal of Accounting Research 21, 308–316.
Ghysels, E., Plazzi, A. and Valkanov, R. (2016). Why Invest in Emerging Markets? The Role of Conditional Return Asymmetry. Journal of Finance 71, 2145–2192.
Ghysels, E. and Qian, H. (2019). Estimating MIDAS Regressions via OLS with Polynomial Parameter Profiling. Econometrics and Statistics 9, 1–16.
Glennon, D. and Nigro, P. (2005). Measuring the Default Risk of Small Business Loans: A Survival Analysis Approach. Journal of Money, Credit, and Banking 37(5), 923–947.
Hillegeist, S.A., Keating, E.K. and Cram, D.P. (2004). Assessing the Probability of Bankruptcy. Review of Accounting Studies 9, 5–34.
Hol, S., Westgaard, S. and Wijst, N. (2002). Capital Structure and the Prediction of Bankruptcy. Working Paper.
Honjo, Y. (2000). Business Failure of New Firms: An Empirical Analysis Using a Multiplicative Hazards Model. International Journal of Industrial Organization 18(4), 557–574.
Hopwood, W.S., McKeown, J.C. and Mutchler, J.P. (1989). A Test of the Incremental Explanatory Power of Opinions Qualified for Consistency and Uncertainty. Accounting Review 64, 28–48.
Hopwood, W., McKeown, J.C. and Mutchler, J.F. (1994). A Reexamination of Auditor versus Model Accuracy within the Context of the Going-Concern Opinion Decision. Contemporary Accounting Research 10, 409–431.
Horrigan, J.O. (1965). Some Empirical Bases of Financial Ratio Analysis. Accounting Review 40, 558–586.
Hwang, R.C. and Cheng, K.F. (2009). On Multiple-Class Prediction of Issuer Credit Ratings. Applied Stochastic Models in Business and Industry 25(5), 535–550.
Hwang, R.C., Wei, H.C., Lee, J.C. and Lee, C.F. (2008). On Prediction of Financial Distress Using the Discrete-Time Survival Model. Journal of Financial Studies 16, 99–129.
Johnson, W.B. (1979). The Cross-Sectional Stability of Financial Ratio Patterns. Journal of Financial and Quantitative Analysis 14, 1035–1048.
Jones, F. (1987). Current Techniques in Bankruptcy Prediction. Journal of Accounting Literature 6, 131–164.
Jones, S. and Hensher, D.A. (2004). Predicting Firm Financial Distress: A Mixed Logit Model. Accounting Review 79(4), 1011–1038.
Joy, O.M. and Tollefson, J.O. (1975). On the Financial Applications of Discriminant Analysis. Journal of Financial and Quantitative Analysis 10, 723–739.
Kao, L.-J. and Lee, C.-F. (2012). Alternative Method for Determining Industrial Bond Ratings: Theory and Empirical Evidence. International Journal of Information Technology & Decision Making 11(6), 1215–1235.
Kiefer, N.M. (1988). Economic Duration Data and Hazard Functions. Journal of Economic Literature 26, 646–679.
King, B.F. (1966). Market and Industry Factors in Stock Price Behavior. Journal of Business (Suppl.), 139–190.
Lachenbruch, P.A. (1967). An Almost Unbiased Method of Obtaining Confidence Intervals for the Probability of Misclassification in Discriminant Analysis. Biometrics 23, 639–645.
Lancaster, T. (1992). The Econometric Analysis of Transition Data. Cambridge University Press, New York.
Lane, W.R., Looney, S.W. and Wansley, J.W. (1986). An Application of the Cox Proportional Hazards Model to Bank Failure. Journal of Banking and Finance 10, 511–531.
Lau, A.H.L. (1987). A Five-State Financial Distress Prediction Model. Journal of Accounting Research 18, 109–131.
Lee, C.F. and Lee, A. (2017). Application of Discriminant Analysis and Factor Analysis in Financial Management. In Financial Analysis, Planning and Forecasting: Theory and Application, 3rd edn. World Scientific, Singapore.
Lloyd, W.P. and Lee, C.F. (1975). Block Recursive Systems in Asset Pricing Models. Journal of Finance 31, 1101–1113.
Mai, J.S. (2010). Alternative Approaches to Business Failure Prediction Models. Essay I of Dissertation, Rutgers University.
Mehta, D.R. (1974). Working Capital Management. Prentice-Hall, Englewood Cliffs, NJ.
Merton, R.C. (1974). On the Pricing of Corporate Debt: The Risk Structure of Interest Rates. Journal of Finance 29, 449–470.
Molina, C.A. (2005). Are Firms Underleveraged? An Examination of the Effect of Leverage on Default Probabilities. Journal of Finance 60(3), 1427–1459.
Moyer, R.C. (1977). Forecasting Financial Failure: A Re-examination. Financial Management, Spring, 111–117.
Ohlson, J.A. (1980). Financial Ratios and the Probabilistic Prediction of Bankruptcy. Journal of Accounting Research 18, 109–131.
Orgler, Y.E. (1970). A Credit-Scoring Model for Commercial Loans. Journal of Money, Credit and Banking 2, 435–445.
Orgler, Y.E. (1975). Analytical Methods in Loan Evaluation. Lexington Books, Lexington, MA.
Penman, S.H. (2006). Financial Statement Analysis and Security Valuation, 3rd edn. McGraw-Hill/Irwin, New York.
Pinches, G.E. and Mingo, K.A. (1973). A Multivariate Analysis of Industrial Bond Ratings. Journal of Finance 28, 1–18.
Pinches, G.E., Eubank, A.A. and Mingo, K.A. (1975). The Hierarchical Classification of Financial Ratios. Journal of Business Research 3, 295–310.
Pinches, G.E., Mingo, K.A. and Caruthers, J.K. (1973). The Stability of Financial Patterns in Industrial Organizations. Journal of Finance 28, 389–396.
Pinches, G.E., Singleton, J.C. and Jahankhani, A. (1978). Fixed Coverage as a Determinant of Electric Utility Bond Ratings. Financial Management 8, 45–55.
Pogue, T.F. and Soldofsky, R.M. (1969). What's in a Bond Rating? Journal of Financial and Quantitative Analysis 4, 201–228.
Rao, C.R. (1952). Advanced Statistical Methods in Biometric Research. Wiley, New York.
Roll, R. and Ross, S.A. (1980). An Empirical Investigation of the Arbitrage Pricing Theory. Journal of Finance 35, 1073–1103.
Saretto, A.A. (2005). Predicting and Pricing the Probability of Default. Working Paper.
Sarkar, S. and Sriram, R.S. (2001). Bayesian Models for Early Warning of Bank Failures. Management Science 47(11), 1457–1475.
Saunders, A. and Allen, L. (2002). Credit Risk Measurement: New Approaches to Value at Risk and Other Paradigms, 2nd edn. Wiley, New York.
Saunders, A. and Cornett, M.M. (2013). Financial Institutions Management: A Risk Management Approach, 8th edn. McGraw-Hill/Irwin, New York.
Scott, J. (1981). The Probability of Bankruptcy: A Comparison of Empirical Predictions and Theoretical Models. Journal of Banking and Finance 5, 317–344.
Shumway, T. (2001). Forecasting Bankruptcy More Accurately: A Simple Hazard Model. Journal of Business 74, 101–124.
Simkowitz, M.A. and Monroe, R.J. (1971). A Discriminant Analysis Function for Conglomerate Targets. Southern Journal of Business, November, 1–16.
Singer, J.D. and Willett, J.B. (1993). It's About Time: Using Discrete-Time Survival Analysis to Study Duration and the Timing of Events. Journal of Educational Statistics 18, 155–195.
Sinkey, J.F. (1975). A Multivariate Statistical Analysis of the Characteristics of Problem Banks. Journal of Finance 30, 21–36.
Stevens, D.L. (1970). Financial Characteristics of Merged Firms: A Multivariate Analysis. Journal of Financial and Quantitative Analysis 5, 36–62.
Sun, L. (2007). A Re-evaluation of Auditors' Opinions versus Statistical Models in Bankruptcy Prediction. Review of Quantitative Finance and Accounting 28(1), 55–78.
Tam, K.Y. and Kiang, M.Y. (1992). Managerial Applications of Neural Networks: The Case of Bank Failure Predictions. Management Science 38(7), 926–947.
Trieschmann, J.S. and Pinches, G.E. (1973). A Multivariate Model for Predicting Financially Distressed P-L Insurers. Journal of Risk and Insurance, September, 327–338.
Van Horne, J.C. (2001). Financial Management and Policy, 12th edn. Prentice-Hall, Englewood Cliffs, NJ.
Vassalou, M. and Xing, Y. (2004). Default Risk in Equity Returns. Journal of Finance 59, 831–868.
Venuti, E.K. (2004). The Going-Concern Assumption Revisited: Assessing a Company's Future Viability. CPA Journal, May.
Vuong, Q. (1989). Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses. Econometrica 57(2), 307–333.
Wilcox, J.W. (1971). A Simple Theory of Financial Ratios as Predictors of Failure. Journal of Accounting Research, Fall, 389–395.
Zavgren, C.V. (1983). The Prediction of Corporate Failure: The State of the Art. Journal of Accounting Literature 2, 1–38.
Zmijewski, M.E. (1984). Methodological Issues Related to the Estimation of Financial Distress Prediction Models. Journal of Accounting Research 22 (Supplement), 59–68.
Appendix 126A Logistic Model, Probit Model and MIDAS Logit Model

The likelihood function for the binary sample space of bankruptcy and non-bankruptcy is

$$l = \prod_{i \in S_1} P(X_i, \beta) \prod_{i \in S_2} \big(1 - P(X_i, \beta)\big), \qquad (126A.1)$$

where $P$ is some probability function, $0 \le P \le 1$, and $P(X_i, \beta)$ denotes the probability of bankruptcy for any given $X_i$ and $\beta$. Since it is not easy to work with the probability function $P$ directly, for simplicity one can solve the likelihood function in (126A.1) by taking the
natural logarithm. The logarithm of the likelihood function is then

$$L(l) = \sum_{i \in S_1} \log P(X_i, \beta) + \sum_{i \in S_2} \log\big(1 - P(X_i, \beta)\big), \qquad (126A.2)$$

where $S_1$ is the set of bankrupt firms and $S_2$ is the set of non-bankrupt firms. The maximum likelihood estimators for the $\beta$'s can be obtained by solving $\max_{\beta} L(l)$. In the logistic model, the probability of company $i$ going bankrupt given independent variables $X_i$ is defined as

$$P(X_i, \beta) = \frac{1}{1 + \exp(-\beta' X_i)}. \qquad (126A.3)$$
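As an illustration (ours, not the chapter's), the log-likelihood (126A.2) under the logistic specification (126A.3) can be maximized numerically; the sketch below fits a two-parameter logit to synthetic firm data by plain gradient ascent, so the variable names and the data-generating process are assumptions.

```python
import math
import random

def logistic_p(x, beta):
    # P(X_i, beta) = 1 / (1 + exp(-beta'X_i)), equation (126A.3)
    z = sum(b * xj for b, xj in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(X, y, beta):
    # L(l) = sum over bankrupt firms of log P plus sum over healthy
    # firms of log(1 - P), equation (126A.2)
    ll = 0.0
    for x, yi in zip(X, y):
        p = logistic_p(x, beta)
        ll += math.log(p) if yi == 1 else math.log(1.0 - p)
    return ll

def fit(X, y, lr=0.1, steps=800):
    # gradient ascent on L(l); the gradient is sum_i (y_i - p_i) x_i
    beta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, yi in zip(X, y):
            p = logistic_p(x, beta)
            for j, xj in enumerate(x):
                grad[j] += (yi - p) * xj
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

random.seed(0)
# synthetic firms: an intercept plus one financial ratio; a low ratio signals bankruptcy
X = [[1.0, random.gauss(0.0, 1.0)] for _ in range(300)]
y = [1 if x[1] + random.gauss(0.0, 0.5) < 0.0 else 0 for x in X]
beta_hat = fit(X, y)
```

Because bankruptcy in this toy setup is driven by a low ratio, the fitted slope comes out negative, and the fitted log-likelihood exceeds its value at $\beta = 0$.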
Two implications of (126A.3) are that (1) $P(\cdot)$ is increasing in $\beta' X_i$, and (2) $\beta' X_i$ is equal to $\log\big(\tfrac{P}{1-P}\big)$. We then classify bankrupt and non-bankrupt firms by setting a "cut-off" probability that attempts to minimize the Type I and Type II errors. In probit models, the probability of company $i$ going bankrupt given independent variables $X_i$ is defined as

$$P(X_i, \beta) = \int_{-\infty}^{\beta' X_i} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz, \qquad (126A.4)$$
the cumulative standard normal distribution function evaluated at $\beta' X_i$. Maximum likelihood estimators for the probit model can be obtained in the same way as for the logistic model. Although probit and logistic models are similar, logistic models are often preferred because probit estimation is nonlinear (Gloubos and Grammatikos, 1998). Moreover, the MIDAS logit model generalizes the simple logit model and provides better prediction accuracy. MIDAS logit regression introduces mixed-data sampling (MIDAS) aggregation to construct financial predictors in a logistic regression. MIDAS regression is a relatively new technique in which variables sampled at different frequencies, for example quarterly and annual, enter the same regression. Traditional methods require the dependent and independent variables to be measured at the same frequency, with no mixing of higher and lower frequencies, and this restriction becomes even more binding when panel data are used (to achieve strongly balanced panels). The MIDAS logit method relaxes the limitation of conventional annual aggregation, in which an equal weight is assumed for each of the four quarters
in the annual aggregate. The MIDAS weighting scheme instead assigns an individual weight to each quarter, so that the relevant periods can be selected and weighted differently. In addition to capturing the mixed-frequency information, an algorithm can be applied to reweight observations in the log-likelihood function and mitigate the class-imbalance problem, which arises when one class of observations is severely under-sampled (Audrino et al., 2019). For example, the generalized logit model with a MIDAS-weighted part (MIDAS logit) is given by

$$y_{t+h} = \Lambda\big(\alpha\iota + Z_t\beta + \gamma \tilde{X}_d(\theta_1, \theta_2)\big) + \varepsilon_{t+h}, \qquad (126A.5)$$

where the model includes a MIDAS-weighted factor $\tilde{x}_d$ with a temporal aggregation period $d$. Compared with the logit model $y_{t+h} = \Lambda(\alpha\iota + Z_t\beta) + \varepsilon_{t+h}$, the MIDAS logit requires estimating the extra parameters $\theta_1$, $\theta_2$, and $\gamma$. We note that one could generalize the above model by adding multiple MIDAS-weighted variables.

Appendix 126B SAS Code for Hazard Model in Bankruptcy Forecasting

libname jess 'D:\Documents and Settings\MAI\Desktop\sas code';
*test modified model's performance in the test sample;
*** Logistic Regression Analysis ***;
options pageno=1;

* Add prediction data set to original data;
data jess._prddata;
  _FREQ_ = 1;
  set jess.train_winsor1 jess.test_winsor1(in=inprd);
  _inprd_ = inprd;
  * set freq variable to 0 for additional data;
  if inprd then _FREQ_ = 0;
run;

proc logistic data=jess._prddata DESCEND;
  freq _FREQ_;
  model BPTSTATUS = NITA CASALES CACL CATA CASHTA LTDTA LSALES CAR LNMCP
        / ctable pprob=0.001 to 0.99 by 0.01;
  ** Create output data set for predictions **;
  output out=jess.pred p=phat;
run;

proc means data=jess._prddata;
  var NITA CASALES CACL CATA CASHTA LTDTA LSALES stress3 CAR LNMCP;
run;

data jess.out_sample;
  set jess.pred;
  if _FREQ_ = 0;
run;

data jess.count1_1;
  set jess.out_sample;
  if phat >= 0.021 and bptstatus = 1;
run;

data jess.count1_2;
  set jess.out_sample;
  if bptstatus = 1;
run;

data jess.count2_1;
  set jess.out_sample;
  if phat < 0.021 and bptstatus = 0;
run;

data jess.count2_2;
  set jess.out_sample;
  if bptstatus = 0;
run;
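For readers without SAS, the cut-off classification carried out by the count1_1 through count2_2 data sets above can be mirrored in a short Python sketch; the cut-off 0.021 is taken from the SAS code, while the toy predicted probabilities and outcomes below are hypothetical.

```python
def cutoff_table(phat, status, cutoff=0.021):
    # Mirrors the four SAS count data sets:
    #   count1_1: phat >= cutoff and bankrupt;  count1_2: all bankrupt firms
    #   count2_1: phat <  cutoff and healthy;   count2_2: all healthy firms
    tp = sum(1 for p, s in zip(phat, status) if p >= cutoff and s == 1)
    tn = sum(1 for p, s in zip(phat, status) if p < cutoff and s == 0)
    n_bankrupt = sum(1 for s in status if s == 1)
    n_healthy = len(status) - n_bankrupt
    return {
        "sensitivity": tp / n_bankrupt,         # bankrupt firms correctly flagged
        "specificity": tn / n_healthy,          # healthy firms correctly passed
        "type_I_error": 1.0 - tp / n_bankrupt,  # bankrupt classified as healthy
        "type_II_error": 1.0 - tn / n_healthy,  # healthy classified as bankrupt
    }

# hypothetical out-of-sample predicted probabilities and true statuses
phat = [0.90, 0.40, 0.015, 0.30, 0.01, 0.02, 0.60, 0.005]
status = [1, 1, 1, 0, 0, 0, 0, 1]
table = cutoff_table(phat, status)
```

Raising the cut-off trades a higher Type I error (missed bankruptcies) against a lower Type II error, which is the trade-off the text describes.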
Appendix 126C Application of Merton Distance to Default Model

Following Bharath and Shumway (2008), we will discuss how $V$ and $\sigma_V$ can be numerically estimated. The Merton DD model estimates the market value of debt by applying the classic Merton (1974) bond pricing model. The Merton model makes two particularly important assumptions. The first is that the total value of a firm follows geometric Brownian motion,

$$dV = \mu V\, dt + \sigma_V V\, dW, \qquad (126C.1)$$
where V is the total value of the firm, μ is the expected continuously compounded return on V , σV is the volatility of firm value and dW is a standard Wiener process. The second critical assumption of the Merton model is that the firm has issued just one discount bond maturing in T periods. Under these assumptions, the equity of the firm is a call option on the underlying value of the firm with a strike price equal to the face value of the firm’s debt and a time-to-maturity of T . Moreover, the value of equity as a function of the total value of the firm can be described by the Black–Scholes–Merton formula. By put–call parity, the value of the firm’s debt is equal to the value of a risk-free discount bond minus the value of a put option written on the
firm, again with a strike price equal to the face value of debt and a time-to-maturity of T. Symbolically, the Merton model stipulates that the equity value of a firm satisfies

$$E = V N(d_1) - e^{-rT} F N(d_2), \qquad (126C.2)$$

where $E$ is the market value of the firm's equity, $F$ is the face value of the firm's debt, $r$ is the instantaneous risk-free rate, $N(\cdot)$ is the cumulative standard normal distribution function, and $d_1$ is given by

$$d_1 = \frac{\ln(V/F) + (r + 0.5\sigma_V^2)T}{\sigma_V \sqrt{T}}, \qquad (126C.3)$$

and $d_2$ is just $d_1 - \sigma_V \sqrt{T}$.

The Merton DD model makes use of two important equations. The first is the Black–Scholes–Merton equation (126C.2), expressing the value of a firm's equity as a function of the value of the firm. The second relates the volatility of the firm's value to the volatility of its equity. Under Merton's assumptions the value of equity is a function of the value of the firm and time, so it follows directly from Ito's lemma that

$$\sigma_E = \left(\frac{V}{E}\right) \frac{\partial E}{\partial V}\, \sigma_V. \qquad (126C.4)$$

In the Black–Scholes–Merton model, it can be shown that $\partial E/\partial V = N(d_1)$, so that under the Merton model's assumptions, the volatilities of the firm and its equity are related by

$$\sigma_E = \left(\frac{V}{E}\right) N(d_1)\, \sigma_V, \qquad (126C.5)$$

where $d_1$ is defined in equation (126C.3).

The Merton DD model basically uses these two nonlinear equations, (126C.2) and (126C.5), to translate the value and volatility of a firm's equity into an implied probability of default. In most applications, the Black–Scholes–Merton model describes the unobserved value of an option as a function of four variables that are easily observed (strike price, time-to-maturity, underlying asset price, and the risk-free rate) and one variable that can be estimated (volatility). In the Merton DD model, however, the value of the option is observed as the total value of the firm's equity, while the value of the underlying asset (the total value of the firm) is not directly observable. Thus, while V must be inferred, E is easy to observe in the marketplace by multiplying the firm's shares outstanding by its current stock price. Similarly,
where d1 is defined in equation (126C.3). The Merton DD model basically uses these two nonlinear equations (126C.2) and (126C.5), to translate the value and volatility of a firm’s equity into an implied probability of default. In most applications, the Black– Scholes–Merton model describes the unobserved value of an option as a function of four variables that are easily observed (strike price, time-to-maturity, underlying asset price, and the risk-free rate) and one variable that can be estimated (volatility). In the Merton DD model, however, the value of the option is observed as the total value of the firm’s equity, while the value of the underlying asset (the total value of the firm) is not directly observable. Thus, while V must be inferred, E is easy to observe in the marketplace by multiplying the firm’s shares outstanding by its current stock price. Similarly,
page 4346
July 6, 2020
16:8
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
Application of Discriminant, Factor Analysis, Logistic Regression
b3568-v4-ch126
4347
in the Merton DD model, the volatility of equity, $\sigma_E$, can be estimated, but the volatility of the underlying firm, $\sigma_V$, must be inferred. The first step in implementing the Merton DD model is to estimate $\sigma_E$ from either historical stock returns data or from option-implied volatility data. The second step is to choose a forecasting horizon and a measure of the face value of the firm's debt. For example, it is common to use historical returns data to estimate $\sigma_E$, assume a forecasting horizon of 1 year ($T = 1$), and take the book value of the firm's total liabilities to be the face value of the firm's debt. The third step is to collect values of the risk-free rate and the market equity of the firm. After performing these three steps, we have values for each of the variables in equations (126C.2) and (126C.5) except for $V$ and $\sigma_V$, the total value of the firm and the volatility of firm value, respectively. The fourth, and perhaps most significant, step in implementing the model is to solve equation (126C.2) numerically for values of $V$ and $\sigma_V$. Once this numerical solution is obtained, the distance to default can be calculated as

$$DD = \frac{\ln(V/F) + (\mu - 0.5\sigma_V^2)T}{\sigma_V \sqrt{T}}, \qquad (126C.6)$$

where $\mu$ is an estimate of the expected annual return of the firm's assets. The corresponding implied probability of default, sometimes called the expected default frequency (or EDF), is

$$\pi_{\text{Merton}} = N\left(-\frac{\ln(V/F) + (\mu - 0.5\sigma_V^2)T}{\sigma_V \sqrt{T}}\right) = N(-DD). \qquad (126C.7)$$
If the assumptions of the Merton model really hold, the Merton DD model should give very accurate default forecasts. In fact, if the Merton model holds completely, the implied probability of default defined above, $\pi_{\text{Merton}}$, should be a sufficient statistic for default forecasts. Simultaneously solving equations (126C.2) and (126C.5) is reasonably straightforward. However, Crosbie and Bohn (2003) explain that "in practice the market leverage moves around far too much for [equation (126C.5)] to provide reasonable results." To resolve this problem, we follow Crosbie and Bohn (2003) and Vassalou and Xing (2004) by implementing an iterative procedure. First, we propose an initial value of $\sigma_V = \sigma_E[E/(E + F)]$ and use this value of $\sigma_V$ together with equation (126C.2) to infer the market value of each firm's assets every day for the previous year. We then calculate the
implied log return on assets each day and use that return series to generate new estimates of $\sigma_V$ and $\mu$. We iterate on $\sigma_V$ in this manner until it converges (i.e., until the absolute difference between successive estimates of $\sigma_V$ is less than $10^{-3}$). Unless specified otherwise, in the rest of the paper values of $\pi_{\text{Merton}}$ are calculated by following this iterative procedure and then computing the corresponding implied default probability using equation (126C.7).
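As a numerical illustration of the fourth step (our own sketch, not the chapter's code), equations (126C.2) and (126C.5) can be solved jointly by fixed-point iteration for a single hypothetical firm, starting from the initial value $\sigma_V = \sigma_E[E/(E + F)]$ used above, after which (126C.6) and (126C.7) give DD and $\pi_{\text{Merton}}$; proxying the asset drift $\mu$ by the risk-free rate is our simplifying assumption.

```python
from math import log, exp, sqrt, erf

def N(x):
    # cumulative standard normal distribution function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def merton_dd(E, F, sigma_E, r, T=1.0, mu=None, tol=1e-8, max_iter=1000):
    """Solve (126C.2) and (126C.5) for V and sigma_V by fixed-point
    iteration, then return (V, sigma_V, DD, pi_Merton)."""
    mu = r if mu is None else mu              # crude stand-in for the asset drift
    V = E + F                                 # starting values
    sigma_V = sigma_E * E / (E + F)
    for _ in range(max_iter):
        d1 = (log(V / F) + (r + 0.5 * sigma_V ** 2) * T) / (sigma_V * sqrt(T))
        d2 = d1 - sigma_V * sqrt(T)
        V_new = (E + exp(-r * T) * F * N(d2)) / N(d1)   # rearranged (126C.2)
        sigma_V_new = sigma_E * E / (V_new * N(d1))     # rearranged (126C.5)
        if abs(V_new - V) < tol and abs(sigma_V_new - sigma_V) < tol:
            V, sigma_V = V_new, sigma_V_new
            break
        V, sigma_V = V_new, sigma_V_new
    dd = (log(V / F) + (mu - 0.5 * sigma_V ** 2) * T) / (sigma_V * sqrt(T))  # (126C.6)
    return V, sigma_V, dd, N(-dd)                                            # (126C.7)

# hypothetical firm: equity 5, debt face value 10, equity volatility 50%, r = 5%
V, sigma_V, dd, pd = merton_dd(E=5.0, F=10.0, sigma_E=0.5, r=0.05)
```

At convergence the implied asset value and volatility satisfy both (126C.2) and (126C.5) simultaneously, which is exactly the condition the text's numerical solution imposes.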
July 17, 2020
14:54
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch127
Chapter 127
Predicting Credit Card Delinquencies: An Application of Deep Neural Networks∗

Ting Sun and Miklos A. Vasarhalyi

Contents
127.1 Introduction . . . 4350
127.2 Literature Review . . . 4352
  127.2.1 Statistical modeling . . . 4352
  127.2.2 Machine learning . . . 4353
  127.2.3 Artificial neural networks . . . 4354
  127.2.4 Deep neural networks . . . 4356
127.3 Deep Learning Approach . . . 4357
  127.3.1 The basic idea of deep learning . . . 4357
  127.3.2 The differences between deep learning and conventional machine learning approaches . . . 4358
127.4 Data . . . 4359
127.5 Experimental Analysis . . . 4361
  127.5.1 Splitting the data . . . 4362
  127.5.2 Tuning the hyperparameters . . . 4362
  127.5.3 Techniques handling data imbalance . . . 4365

Ting Sun, The College of New Jersey, email: [email protected]
Miklos A. Vasarhalyi, Rutgers University, email: [email protected]

∗This chapter is an updated version of the paper, "Predicting credit card delinquencies: An application of deep neural networks", which was published in Intelligent Systems in Accounting, Finance and Management, Vol. 25, No. 4, pp. 174–189, 2018.
127.6 Results . . . 4365
  127.6.1 The predictor importance . . . 4365
  127.6.2 The predictive result for cross validation . . . 4367
  127.6.3 Z-Test . . . 4370
  127.6.4 Prediction on test set . . . 4370
127.7 Conclusion and Limitations . . . 4373
Bibliography . . . 4374
Appendix 127A Variable Definition . . . 4377
Appendix 127B Summary Statistics . . . 4379
Appendix 127C Differences of the Mean for Important Variables between Groups . . . 4380
Abstract

The objective of this paper is twofold. First, it develops a prediction system to help credit card issuers model credit card delinquency risk. Second, it explores the potential of deep learning (also called deep neural networks), an emerging artificial intelligence technology, in the credit risk domain. Using real-life credit card data linked to 711,397 credit card holders from a large bank in Brazil, this study develops a deep neural network to evaluate the risk of credit card delinquency based on clients' personal characteristics and spending behaviors. Compared to the machine learning algorithms of logistic regression, naïve Bayes, traditional artificial neural networks, and decision trees, the deep neural network achieves better overall predictive performance, with the highest F scores and AUC. The successful application of deep learning implies that artificial intelligence has great potential to support and automate credit risk assessment for financial institutions and credit bureaus.

Keywords: Credit card delinquency • Deep neural network • Artificial intelligence • Risk assessment • Machine learning.
127.1 Introduction

Credit card debt is climbing rapidly. For example, the total amount of outstanding revolving credit debt in the US reached more than $1 trillion in 2017, according to a report of the Federal Reserve.¹ This is the highest level of credit card debt since January 2009 (Porche, 2017). In the UK, the annual rate of credit card lending in September 2017 expanded by 9.2% over the same month a year earlier (Chu, 2017). This trend serves as an alarm for a high risk of credit card delinquencies. The S&P/Experian Bankcard Default Index shows that the credit card delinquency rate in the US in March 2017 has reached
¹Information source: https://www.federalreserve.gov/releases/g19/hist/cc_hist_sa_levels.html.
the highest point since June 2013, at 3.31% (Durden, 2017). Currently, credit risk assessment has become a critical basis for credit card issuers' decision-making. Failure to evaluate the risk effectively can result in high non-performing ratios, increased debt collection costs, and growing bad debt counts, which threaten the health of the credit card industry (Twala, 2010; Chen and Huang, 2011). Due to the significance of credit risk (especially delinquency risk) assessment, many techniques have been proposed with promising results, e.g., discriminant analysis, logistic regression, decision trees, and support vector machines (Marqués, García, and Sánchez, 2012). Artificial neural networks have also been employed to forecast credit risk (Koh and Chan, 2002; Thomas, 2000). Over the past decade, deep learning (also called deep neural networks), an emerging artificial intelligence technique, has been applied to a variety of areas. It has achieved excellent predictive performance in areas like healthcare and computer games, where data are complex and big (Hamet and Tremblay, 2017), and has exhibited great potential for many other areas where human decision-making is inadequate (Ohlsson, 2017). However, this approach has not been applied to predict credit card delinquencies, and it is unclear whether it is superior to other machine learning approaches. Today's business environment is increasingly complex. The scale and complexity of the data make the decision-making of financial institutions more challenging than ever before, even with the help of traditional data analytical technology (Turban et al., 2005). Hence, it is necessary to apply this state-of-the-art technology to develop intelligent systems that support and automate the decision-making of credit card issuers based on large volumes of data (Ohlsson, 2017). This paper bridges this gap by developing a deep neural network (DNN) for the prediction of credit card delinquency risk.
Using real-life credit card data from a large bank in Brazil, this research demonstrates the effectiveness of the DNN in assisting financial institutions to quantify and manage credit risk for the decision-making of credit card issuance and loan approval. Prior research suggests that financial statements and customers' transactional records are useful for credit risk assessment (Yeh and Lien, 2009). As for credit card delinquency, researchers have associated it with the personal characteristics of credit card holders and their spending behaviors (e.g., Khandani, Kim, and Lo, 2010; Chen and Huang, 2011). To date, as data storage has become more convenient and inexpensive, financial institutions have accumulated a massive amount of data about credit card transactions and clients' personal characteristics. This provides an excellent opportunity to use deep learning to establish delinquency risk prediction models. This
page 4351
July 6, 2020
16:8
4352
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch127
T. Sun & M. A. Vasarhalyi
paper constructs models using the personal information of the credit card holder (e.g., occupation, age, and region of residence), the accumulated transactional information (e.g., the frequency with which the client has been billed, and the total amount of cash withdrawals) based on the bank's record in September 2013, and data on transactions that occurred in June 2013. After comparing deep learning to other machine learning algorithms with regard to predictive performance on our data, we find that the DNN outperforms the other models in terms of F1 and AUC, which measure overall predictive accuracy. The result suggests that deep learning can be a useful addition to the current toolkit for credit risk assessment for financial institutions and regulators in the credit card industry, especially for data with a severe imbalance issue, large size, and complex structure. The remainder of the paper is organized as follows. Section 127.2 reviews the prior literature and addresses the research gap. Section 127.3 overviews the deep learning method and discusses the differences between deep learning and other machine learning approaches. Section 127.4 introduces the data and variables. The modeling process and results are presented in Section 127.5 and Section 127.6, respectively. Section 127.7 concludes the paper and discusses the limitations and future research.

127.2 Literature Review

Prior research has proposed a variety of data mining techniques to predict credit card delinquencies. Those techniques include statistical modeling approaches, such as discriminant analysis, logistic regression and k-nearest neighbors (KNN) (Abdou and Pointon, 2011), and machine learning approaches such as decision trees, naïve Bayes, and support vector machines. In addition, artificial neural networks are considered an important alternative method.
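For readers unfamiliar with the F1 measure used in our model comparison, a minimal sketch of how it combines precision and recall (the counts below are toy values for illustration, not results from this paper):

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)   # share of flagged cases that are truly delinquent
    recall = tp / (tp + fn)      # share of delinquent cases that are flagged
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 80 delinquents caught, 20 missed, 40 false alarms.
print(round(f1_score(tp=80, fp=40, fn=20), 3))  # → 0.727
```

Because F1 ignores true negatives, it remains informative when, as here, the legitimate class vastly outnumbers the delinquent class.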
127.2.1 Statistical modeling

The standard approach to estimating the probability of credit card delinquencies is logistic regression (Crook, Edelman, and Thomas, 2007; Kruppa et al., 2013). Wiginton (1980) conducted one of the earliest studies comparing logistic regression with discriminant analysis for credit scoring and found that the logistic regression model exhibited a better accuracy rate than the discriminant analysis model. Leonard (1993) used logistic regression with random effects to evaluate commercial loans for a major Canadian bank. Logistic regression was also applied by Abdou, Pointon, and El-Masry (2008), who
Predicting Credit Card Delinquencies
investigated the credit risk for Egyptian banks with logistic regression, discriminant analysis, probit analysis, and ANNs. However, it has been argued that the underlying assumptions of logistic regression are rather strict (Malley et al., 2012). For instance, multicollinearity should not exist among independent variables.

127.2.2 Machine learning

Unlike statistical modeling, which has pre-defined structures and assumptions, machine learning allows the computer to learn the particular structure of the model from the data (Huang et al., 2004). As a result, another research stream has explored the problem of credit risk assessment using machine learning algorithms. With three data sets from the UCI Machine Learning Repository, Lahmiri (2016) constructed five machine learning models, including support vector machine (SVM), backpropagation neural network (BPNN), radial basis function neural network (RBNN), linear discriminant analysis (LDA) and naïve Bayes (NB), to assess credit risk, and found that SVM provided the best predictive accuracy for all three data sets. Using credit bureau, transactional, and account-balance data from January 2005 to April 2009 of a major commercial bank, Khandani, Kim, and Lo (2010) employed generalized classification and regression trees (CART), initially proposed by Breiman et al. (1984), to forecast consumers' delinquencies. The authors asserted that CART was superior to logistic regression, discriminant analysis models, and credit scores with regard to identifying subtle nonlinear relationships underlying the massive data set. Butaru et al. (2016) analyzed another large sample of account-level credit card data from six major commercial banks from January 2009 to December 2013. They developed and compared prediction models for credit card delinquencies with logistic regression, decision tree (C4.5), and random forest.
They concluded that, while all models performed reasonably well, the decision tree and the random forest outperformed the logistic regression. Other studies investigated credit default. For example, Kruppa et al. (2013) applied random forest, optimized logistic regression, KNN, and bagged k-nearest neighbors to estimate the credit default probability on a large data set of short-term installment credits for a company producing household appliances. This study demonstrated the superiority of random forest over the other models. The data set consisted of 64,524 transactions, with 13% of the total financed amounts remaining uncollectible (i.e., in default). However, in a real-world case, the ratio of credit card default is usually much lower than 13%. Fitzpatrick and Mues (2016) focused on the
mortgage default. They applied boosted regression trees, random forests, and penalized linear and semi-parametric logistic regression models to four portfolios of over 300,000 Irish owner-occupier mortgages. The results showed that while those models had varying degrees of predictive capability, the boosted regression trees significantly outperformed logistic regression. Other research concentrated on ensemble approaches. For instance, Twala (2014) ensembled five classifiers to predict credit risk, consisting of decision trees, artificial neural networks, naïve Bayes, logistic regression, and KNN, and showed that the ensemble of classifiers improved the predictive accuracy over each individual classifier for data sets with different types of noise. The data sets used in that work include loan payments, Texas banks, Australian credit approval, and German credit. Nevertheless, the size of those data sets was relatively small. For example, the loan payments data set had only 32 examples for training. Moreover, instead of using all data points, it used matched samples (e.g., the data set of Texas banks had 59 failed banks matched with the same number of non-failed banks), and the number of attributes of the data sets was relatively small (e.g., the data set of Australian credit approval had 15 attributes). Another comparative study was conducted by Abellán and Mantas (2014), who applied the Bagging scheme to ensembles of several decision tree models for bankruptcy prediction and credit scoring.

127.2.3 Artificial neural networks

Due to their ability to model complex and nonlinear relationships between inputs and outputs (Angelini, Tollo, and Roli, 2008; Abdou and Pointon, 2011), artificial neural networks (ANNs) have been applied to a wide variety of problems. Barniv, Agarwal, and Leach (1997) utilized ANNs, logistic regression, and discriminant analysis to develop three-state classification models to predict the outcome following bankruptcy filings.
The empirical results indicated that ANNs provided significantly better overall classification than the other two algorithms. Etheridge and Sriram (1997) employed the same three techniques to analyze financial distress. They found that the ANN outperformed logistic regression and discriminant analysis when the relative error costs were considered. ANNs have also been applied to stock price prediction. Kohara et al. (1997) collected historical stock price data and information from newspaper articles discussing domestic and foreign events. Their work demonstrated the effectiveness of the use of event knowledge and ANNs, as the predictive accuracy of their approach was higher than that of multiple regression analysis.
Trinkle and Baldwin (2016) reviewed prior studies and discussed research opportunities in the use of ANNs for credit risk assessment and credit scoring. They documented that ANN models could help firms make better investment decisions by predicting the creditworthiness of a customer. Baesens et al. (2003) used neural rule extraction to evaluate credit risk with three real-life data sets and claimed that their approach was able to clarify the neural network decisions by using explanatory rules that captured the learned knowledge embedded in the networks. Angelini, Tollo, and Roli (2008) developed two ANNs to predict credit risk on an Italian data set of small businesses. The overall performance of the proposed models showed that ANNs can be applied successfully to credit risk assessment. Using a German data set, Khashman (2010) compared the predictive accuracy of three neural networks with nine learning schemes for credit risk prediction; the highest overall accuracy rate achieved was 83.6%. Huang et al. (2004) predicted the credit ratings of companies' bonds with ANNs and SVM. They used a backpropagation neural network (BNN) as a benchmark and found a slight improvement with SVM. Akkoç (2012) compared discriminant analysis, logistic regression, ANNs, and a three-stage hybrid system based on statistical techniques and neuro-fuzzy models for credit scoring. The credit card data set was obtained from an international bank operating in Turkey. The results showed that the proposed hybrid system worked more effectively than the other algorithms in terms of average correct classification rate and estimated misclassification cost. Fu et al. (2016) proposed a convolutional neural network (CNN) to predict credit card fraud for a major commercial bank. Before training the CNN, the data set was processed through feature engineering, a sampling method, and feature transformation. The data set contained over 260 million transactions in one year.
However, only 4000 transactions were fraudulent. To reduce data imbalance, they generated synthetic fraudulent samples from real frauds with a cost-based sampling method. The experimental results, in terms of F1 score, showed that the CNN model performed more effectively than other state-of-the-art techniques. In the case of credit loans, Bekhet and Eletter (2014) employed a relatively small data set of both accepted and rejected applications from different Jordanian commercial banks from 2006 to 2011. The total number of observations was 492, among which 292 (59.3%) applications were creditworthy while 200 (40.7%) were not. They applied logistic regression and ANNs and found that the logistic regression model performed better than the ANN with regard to the overall classification rate, while the ANN model outperformed the logistic regression model in identifying rejected risky applications.
127.2.4 Deep neural networks

Deep neural networks (DNNs) have been successfully applied to a wide range of areas, from self-driving cars to game playing, from voice recognition to computer vision, and from medical diagnostics to natural language understanding. Galeshchuk and Mukherjee (2017) applied DNNs in the foreign exchange market. They proposed deep convolutional neural networks (CNNs) to predict the direction of change in foreign exchange rates for the currency pairs EUR/USD, GBP/USD and JPY/USD. The prediction results showed that CNNs are significantly better than time series models and traditional machine learning classifiers such as shallow networks and SVMs. Pandey (2017) employed this technique to detect credit card fraud. The data were selected from the UCSD-FICO Data Mining Contest 2009 data set and consisted of 94,682 transactions with 17 input attributes. The developed model contained two hidden layers. The predictive performance of the DNN model was measured by mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and root mean squared log error (RMSLE), without further explanation. However, those error measures are more suitable for regression problems than for classification problems (Borovicka et al., 2012). Furthermore, it is unclear how many false positives and false negatives were detected by the model, as well as the AUC, precision, recall, and other important metrics. Unlike traditional ANNs, which have no more than two hidden layers, DNNs usually consist of three or more hidden layers. This deep hierarchical structure enables a neural network to gradually learn complex representations of the data from one layer to another. Thus, deep learning is suitable for analyzing bigger or more complex data sets, such as textual data, audio, images, or data sets with many input variables or large size (Sun and Vasarhelyi, 2017).
We believe that, in order to better demonstrate the effectiveness of deep learning in predicting credit card delinquencies, it is necessary to train a DNN with more than two hidden layers against a relatively big and complex data set. After reviewing 214 publications that applied various statistical or machine learning methods, Abdou and Pointon (2011) indicated that there was no overall best technique for credit scoring. Deep learning, a pioneering machine learning approach, has not been applied to predict credit card delinquencies, and its predictive performance compared to conventional approaches is also unclear. The discussion of this section is summarized in Table 127.1.
Table 127.1: A summary of the literature.

Statistical modeling: Wiginton (1980), Leonard (1993), Abdou, Pointon, and El-Masry (2008).

Machine learning: For credit risk: Lahmiri (2016), Khandani, Kim, and Lo (2010), Butaru et al. (2016). For default: Kruppa et al. (2013), Fitzpatrick and Mues (2016). Ensemble approaches: Twala (2014), Abellán and Mantas (2014).

Traditional ANN: For credit risk: Trinkle and Baldwin (2016), Baesens et al. (2003), Angelini, Tollo, and Roli (2008), Khashman (2010). For other problems: Barniv et al. (1997), Etheridge and Sriram (1997), Kohara et al. (1997), Huang et al. (2004), Akkoç (2012), Fu et al. (2016), Bekhet and Eletter (2014).

DNN: Galeshchuk and Mukherjee (2017), Pandey (2017).
127.3 Deep Learning Approach

127.3.1 The basic idea of deep learning

Deep learning is also referred to as deep neural networks. The central idea of deep learning is that layers of virtual neurons in the DNN automatically learn from massive amounts of observational data, recognize the underlying pattern, and classify the data into different categories. As shown in Figure 127.1, a DNN consists of interconnected layers of neurons (represented by circles in Figure 127.1): one input layer, multiple hidden layers, and one output layer. The input layer receives the raw data, identifies the most basic elements of the data, and passes them to the hidden layers. Each hidden layer further analyzes and extracts data representations and sends its output to the next layer. After receiving the data representations from the last hidden layer, the output layer categorizes the data into predefined classes. Within each layer, complex nonlinear computations are executed by the neurons, and each output is assigned a weight. The weighted outputs are then combined through a linear transformation and transferred to the next layer. As the data is processed and transmitted from one layer to another, a DNN extracts higher-level data representations (e.g., class 1 and class 2) defined in terms of other, lower-level representations (Bengio, 2012; Goodfellow et al., 2016; Sun and Vasarhelyi, 2017).
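The layer-by-layer computation described above (weighted sum, nonlinear activation, pass onward) can be sketched in a few lines of NumPy; this is an illustrative toy network with made-up layer sizes, not the model trained in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Rectifier activation: keeps positive signals, zeroes out the rest
    return np.maximum(0.0, x)

def forward(x, weights, biases):
    """Propagate input x through each hidden layer, then a softmax output."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(W @ x + b)              # linear transformation + nonlinearity
    logits = weights[-1] @ x + biases[-1]
    e = np.exp(logits - logits.max())
    return e / e.sum()                   # probabilities over the output classes

# Hypothetical small network: 4 inputs -> 8 -> 6 -> 2 classes.
sizes = [4, 8, 6, 2]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes, sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
probs = forward(rng.normal(size=4), weights, biases)
print(probs)  # two class probabilities that sum to 1
```

Training then consists of adjusting `weights` and `biases` so that these output probabilities match the observed class labels, as described next.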
Figure 127.1: A simple deep neural network. [The figure depicts an input layer, several hidden layers connected by weighted links, and an output layer with classes 1 through i.]
Deep learning has its genesis in ANNs, which are defined by Gately (1996) as "an artificial intelligence problem solving computer program that learns through a training process of trial and error". In a DNN, with each iteration of model training, the final classification result provided by the output layer is compared to the actual observation to compute the error, and the DNN gradually "learns" from the data by adjusting the weights and other parameters in the next rounds of training. After numerous (i.e., millions or even billions of) rounds of model training, the algorithm iterates through the data until the error cannot be reduced any further (Sun and Vasarhelyi, 2017). Then the validation data is used to check for overfitting, and the selected model is used to make predictions on the holdout data, which is the out-of-sample test.

127.3.2 The differences between deep learning and conventional machine learning approaches

Although the concept of the neural network is decades old, ANNs long failed to achieve solid progress due to technical limitations. Nowadays, cheaper data storage, more powerful computational capability (e.g., the availability of GPUs), distributed processing, and the availability of data in various
structures facilitate the advancement of this technology. Computer scientists can now establish deeper hierarchical layers of virtual neurons than ever before. With the great depth of layers and the massive number of neurons, a DNN has much greater representational power than a traditional ANN with only one or two hidden layers. Another important difference between deep learning and traditional machine learning techniques is its performance as the scale of the data increases. Deep learning algorithms learn from past examples and therefore need a sufficiently large amount of data to understand the complex patterns underlying it. A DNN may not perform better than traditional machine learning algorithms such as decision trees when the data set is small or simple, but its performance improves significantly as the data scale increases (Shaikh, 2017). Deep learning also performs excellently in terms of feature engineering. While traditional machine learning usually relies on human experts' knowledge to identify critical data features, reduce the complexity of the data, and eliminate the noise created by irrelevant attributes, deep learning automatically learns highly abstract features from the data itself without human intervention (Sun and Vasarhelyi, 2017). For example, a convolutional neural network (CNN) trained for face recognition can identify basic elements such as pixels and edges in the first and second layers, then parts of faces in successive layers, and finally a high-level representation of a face as the output. This characteristic of DNNs is seen as "a major step ahead of traditional Machine Learning" (Shaikh, 2017). Therefore, deep learning performs excellently in unstructured data analysis and has produced remarkable breakthroughs.
It can now automatically detect objects in images (Szegedy, 2014), translate speech (Levy, 2016), understand text (Abdulkader, Lakshmiratan, and Zhang, 2016), and play the board game Go (Silver et al., 2016) on a real-time basis at better than human-level performance (Heaton, Polson, and Witte, 2016). Professionals in leading accounting firms are delving into this technology. KPMG is using Watson to analyze substantial financial data (such as data about bank loans) for anomaly detection. Focusing on text analysis, Deloitte cooperates with Kira to perform document analysis tasks, including investigations, mergers, contract management and so on (Kepes, 2016).

127.4 Data

The credit card data in the experiment is provided by a large bank in Brazil. The final data set consists of three subsets, including (1) a data set describing
the personal characteristics of the credit card holder (i.e., gender, age, annual income, residential location, occupation, account age, and credit score); (2) a data set providing the accumulated transactional information at the account level recorded by the bank in September 2013 (i.e., the frequency with which the account has been billed, the count of payments, and the number of domestic cash withdrawals); and (3) a data set containing account-level transactions in June 2013 (i.e., the credit card revolving payment made, the amount of authorized transactions that exceeded the revolving limit of the credit card, and the number of days past due). The original transaction set contains 6,516,045 records at the account level based on transactions made in June 2013, among which 45,017 were made with delinquent credit cards, and 6,471,028 are legitimate. For each credit card holder, we link the original transaction set with the personal characteristics set and the accumulated transactional set. The objective of this work is to investigate the credit card holders' characteristics and spending behaviors and use them to develop an intelligent prediction model for credit card delinquency. As a result, we summarize some transactional data at the level of the credit card holder. For example, we aggregate all the transactions made by the client on all credit cards owned and generate a new variable, TRANS ALL. Another derived variable, TRANS OVERLMT, is the average amount of authorized transactions that exceed the credit limit made by the client on all credit cards owned. We also use the latest value for certain variables, such as the balance of authorized unpaid transactions made by the client on all credit cards owned (BALANCE ALL).
After summarization, standardization, elimination of observations with missing variables, and discarding of variables with zero variation, we have 44 input data fields (among which 15 fields are related to credit card holders' characteristics, 6 variables provide accumulative information for all past transactions made by the credit card holder based on the bank's record as of September 2013, and 23 attributes summarize the account-level records in June 2013), which are linked to 711,397 credit card holders. In other words, for each credit card holder, we have 15 variables describing his or her personal characteristics, 6 variables summarizing his or her past spending behavior, and 23 variables reporting the transactions the client made with all credit cards owned in June 2013. The final data is imbalanced because only 6537 clients are delinquent. In this study, a credit card client is defined as delinquent when any of his or her credit card accounts was permanently blocked by the bank in September 2013 due to credit card delinquency. Table 127.2 summarizes the input data. The input data fields are listed and explained in Appendix 127A,
Table 127.2: The data structure.

Panel A: Delinquent vs. legitimate observations

Dataset: Credit Card Data
  Delinquent Obs. (Percentage): 6,537 (0.92%)
  Legitimate Obs. (Percentage): 704,860 (99.08%)
  Total (Percentage): 711,397 (100%)

Panel B: Data content

Data Categories2                            No. of Data Fields   Time Period
Client Characteristics                      15                   as of September 2013
Accumulative Transactional Information      6                    as of September 2013
Transactional Information                   23                   June 2013
Total                                       44
and the descriptive statistics of numerous input variables are reported in Appendix 127B.
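The client-level aggregation described in this section (e.g., deriving TRANS ALL, TRANS OVERLMT, and BALANCE ALL from account-level records) can be sketched with pandas; the column names below are hypothetical stand-ins for the bank's actual fields:

```python
import pandas as pd

# Toy account-level transaction records (hypothetical columns and values).
tx = pd.DataFrame({
    "client_id":      [1, 1, 2, 2, 2],
    "amount":         [100.0, 50.0, 20.0, 30.0, 10.0],
    "over_limit_amt": [0.0, 10.0, 5.0, 0.0, 0.0],
    "balance":        [500.0, 480.0, 90.0, 80.0, 70.0],
})

features = tx.groupby("client_id").agg(
    TRANS_ALL=("amount", "sum"),               # total of all transactions
    TRANS_OVERLMT=("over_limit_amt", "mean"),  # average amount over the limit
    BALANCE_ALL=("balance", "last"),           # latest recorded balance
)
print(features.loc[1, "TRANS_ALL"])  # → 150.0
```

Each row of `features` then corresponds to one credit card holder, which is the unit of analysis for the delinquency prediction model.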
127.5 Experimental Analysis

The data analysis is performed on an Intel(R) Xeon(R) CPU (64 GB RAM, 64-bit OS). The software used in this paper is H2O, an open-source machine learning and predictive analytics platform. H2O provides deep learning algorithms to help users train their DNNs for different problems (Candel et al., 2016). We use H2O Flow, a notebook-style user interface for H2O. It is a browser-based interactive environment allowing users to import files, split data, develop models, iteratively improve them, and make predictions. H2O Flow blends command-line computing with a graphical user interface, providing a point-and-click interface for every operation (e.g., selecting hyperparameters).3 This feature enables users with limited programming skills, such as auditors, to build their own machine learning models much more easily than they can with other tools.
2. A description of the attributes in each data category is provided in Appendix 127A.
3. https://www.h2o.ai/h2o-old/h2o-flow/.
127.5.1 Splitting the data

First, we hold out 20% of the data as a test set,4 which will be used to give a confident estimate of the performance of the final tuned model. The stratified sampling method is applied to ensure that the test set has the same distribution of both classes (delinquent vs. legitimate) as the overall data set. The test set is different from the validation set, which is used to estimate the prediction ability of the constructed model while tuning the model's parameters (Brownlee, 2017). Second, for the remaining 80% of the data (hereafter called the "remaining set"), we use 5-fold cross-validation. In H2O, 5-fold cross-validation works as follows. In total, six models are built. The first five models are called cross-validation models; the last model is called the main model. In order to develop the five cross-validation models, the remaining set is divided into 5 groups using stratified sampling to ensure each group has the same class distribution. To construct the first cross-validation model, groups 2, 3, 4, and 5 are used as training data, and the constructed model is used to make predictions on group 1; to construct the second cross-validation model, groups 1, 3, 4, and 5 are used as training data, and the constructed model is used to make predictions on group 2, and so on. This yields five holdout predictions. Next, the entire remaining set is used to train the main model, with training metrics and cross-validation metrics that will be reported later. The cross-validation metrics are computed as follows. The five holdout predictions are combined into one prediction for the full training data set. This "holdout prediction" is then scored against the true labels, and the overall cross-validation metrics are computed. This approach scores the holdout predictions freshly rather than taking the average of the five metrics of the cross-validation models (H2O.ai, 2018).
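This cross-validation scheme can be sketched with scikit-learn as a stand-in for H2O: stratified folds, one holdout prediction per observation, and a single score computed over the combined predictions rather than an average of five per-fold scores (synthetic data for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic imbalanced binary data (~90% negative class), for illustration only.
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Each observation is predicted exactly once, by the fold model that never saw it.
holdout = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                            cv=cv, method="predict_proba")[:, 1]

# One metric over the combined holdout predictions, scored against true labels.
auc = roc_auc_score(y, holdout)
print(auc)
```

Scoring the combined holdout predictions once mirrors the H2O behavior described above and avoids the small bias that averaging five fold-level metrics can introduce.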
127.5.2 Tuning the hyperparameters

There are numerous hyperparameters that need to be configured before fitting the model (Tartakovsky, Clark, and McCourt, 2017). The choice of hyperparameters is critical, as it determines the structure (e.g., the number of hidden layers) and the variables controlling how the network is trained (e.g., the learning rate) (Radhakrishnan, 2017), which in turn makes the

4. We use an 80:20 ratio of data splitting as it is a common rule of thumb (Guller, 2015; Giacomelli, 2013; Nisbet, Elder, and Miner, 2009; Kloo, 2015).
difference between poor and superior predictive performance (Tartakovsky, Clark, and McCourt, 2017). In this research, we use a prevalent hyperparameter optimization technique, Grid Search, to select key hyperparameters and other settings in deep learning, such as the number of hidden layers and neurons as well as the activation function. The basic idea of Grid Search is that the user selects several grid points for each hyperparameter and trains the neural network using every combination of those hyperparameters; the combination that performs the best is selected. In our case, we select the combination of hyperparameters that produces the lowest validation error. This leads to the choice of three hidden layers. In other words, the DNN consists of five fully connected layers (one input layer, three hidden layers, and one output layer). The input layer contains 322 neurons. The first hidden layer contains 175 neurons, the second hidden layer contains 350 neurons, and the third hidden layer contains 150 neurons. Finally, the output layer has 2 output neurons, corresponding to the classification result of this research (whether or not the credit card holder is delinquent). The number of hidden layers and the number of neurons determine the complexity of the structure of the neural network. It is critical to build a network with an appropriate structure that fits the complexity of the data. While a small number of layers or neurons may cause underfitting, an extremely complex DNN would lead to overfitting (Radhakrishnan, 2017). We use the default uniform distribution initialization method to initialize the network weights to small random numbers between 0 and 0.05 generated from a uniform distribution, then forward propagate the weights throughout the network. At each neuron, the weights and the input data are multiplied, aggregated, and transmitted through the activation function. The activation function is used to introduce nonlinearity to the DNN.
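The Grid Search procedure just described can be sketched as follows, using scikit-learn's MLPClassifier on synthetic data as a stand-in for H2O's deep learning; the candidate grid below is illustrative, not our actual search space:

```python
import itertools
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

# Candidate grid points (hypothetical); every combination is trained once.
grid = {"hidden_layer_sizes": [(50,), (100, 50), (175, 350, 150)],
        "activation": ["relu", "logistic"]}

best = None
for hidden, act in itertools.product(*grid.values()):
    model = MLPClassifier(hidden_layer_sizes=hidden, activation=act,
                          max_iter=300, random_state=0).fit(X_tr, y_tr)
    err = 1 - model.score(X_val, y_val)      # validation error for this combo
    if best is None or err < best[0]:
        best = (err, hidden, act)
print(best[1:])   # the winning combination of hyperparameters
```

The combination with the lowest validation error is retained, exactly as in the selection rule stated above.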
It is the nonlinear transformation performed over the input data, and the transformed output is then sent to the next layer as input (Radhakrishnan, 2017). Without activation functions, the layers of the neural network would simply execute a linear transformation, which is too simple to learn complex data (Gupta, 2017). Our model uses the Rectifier activation function on the three hidden layers to mitigate the exploding/vanishing gradient problem introduced by Bengio, Simard, and Frasconi (1994) (Jin et al., 2016; Baydin, Pearlmutter, and Siskind, 2016). The Sigmoid activation function is applied to the output layer as this is a binary prediction. Table 127.3 depicts the neural network's structure. The number of epochs is the number of times the entire data set is passed (forward and backward) through the neural network during training. Since we
Table 127.3: The structure of the DNN.

Layer   Number of neurons   Type             Initial weight distribution/activation function
1       322                 Input            Uniform
2       175                 Hidden Layer 1   Rectifier
3       350                 Hidden Layer 2   Rectifier
4       150                 Hidden Layer 3   Rectifier
5       2                   Output           Sigmoid
are using a limited data set, to optimize the learning we use Gradient Descent, an iterative optimization process used with many machine learning algorithms (Brownlee, 2016). Updating the weights and other parameters over only one epoch is not enough, as it will lead to underfitting (Sharma, 2017). As the number of epochs increases, the weights and other parameters are updated more times, and the training accuracy as well as the validation accuracy will increase. However, when the number of epochs reaches a certain point, the validation accuracy starts decreasing while the training accuracy is still increasing; this means the model is overfitting. Thus, the optimal number of epochs is the point where the validation accuracy reaches its highest value. The number of epochs in our DNN model is 10. The learning rate defines how quickly a network updates its parameters. Instead of using a constant learning rate to update the parameters (e.g., network weights) for each training epoch, we employ an adaptive learning rate, which allows specification of different learning rates per layer (Brownlee, 2016; Lau, 2017). Two parameters, Rho and Epsilon, need to be specified to implement the adaptive learning rate algorithm. Rho is similar to momentum and relates to the memory of prior weight updates; typical values are between 0.9 and 0.999, and in this study we use the value 0.99. Epsilon is similar to learning rate annealing during initial training and to momentum at later stages, where it allows forward progress; it prevents the learning process from being trapped in local optima. Typical values are between 1e−10 and 1e−4, and the value of epsilon is 1e−8 in our study. Because it is impossible to pass the entire data set into the deep neural network at once, the data set is divided into a number of parts called batches. The batch size is the total number of training examples present in a single batch. The batch size used here is 32.
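The adaptive learning rate controlled by Rho and Epsilon follows an ADADELTA-style update rule; a minimal sketch on a toy one-dimensional quadratic (illustrative only, not the H2O implementation) shows how the two parameters enter:

```python
import math

rho, eps = 0.99, 1e-8      # the values used in this study
w = 5.0                    # parameter to learn; the optimum of f(w) = w**2 is 0
Eg2 = Edx2 = 0.0           # running averages of squared gradient / squared update

for _ in range(2000):
    g = 2 * w                               # gradient of f(w) = w**2
    Eg2 = rho * Eg2 + (1 - rho) * g * g     # rho: memory of past gradients
    # eps keeps the ratio numerically stable and gets the first steps moving
    dx = -math.sqrt(Edx2 + eps) / math.sqrt(Eg2 + eps) * g
    Edx2 = rho * Edx2 + (1 - rho) * dx * dx  # memory of past updates
    w += dx
print(abs(w) < 5.0)   # w has moved toward the optimum without a hand-set rate
```

The step size adapts per parameter from the history of gradients and updates, which is why no single global learning rate needs to be tuned.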
page 4364
July 6, 2020
16:8
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch127
Predicting Credit Card Delinquencies

Table 127.4: The distributions of classes.

                           Training (over-balanced)   5 cross-validation sets      Test
Delinquency observations                    563,744                     5,260     1,277
Legitimate observations                     563,766                   563,786   141,074
Overall                                   1,127,530                   569,046   142,351
127.5.3 Techniques handling data imbalance

The entire data set has imbalanced classes: the vast majority of the credit card holders have no delinquency. A total of 6,537 instances are labeled with the class "delinquent", while the remaining 704,860 are labeled with the class "legitimate". To address the class imbalance, over-sampling and under-sampling are two popular resampling techniques. Over-sampling adds copies of instances from the under-represented class (the delinquency class in our case), whereas under-sampling deletes instances from the over-represented class (the legitimate class in our case). We apply Grid Search again to try both approaches and find that over-sampling works better for our data. Table 127.4 summarizes the distributions of classes in the training, 5 cross-validation, and test sets.5

To compare the predictive performance of the DNN to that of ANN, logistic regression, naïve Bayes, and decision tree, we analyze the same data set and use the same data splitting and preprocessing method to develop similar prediction models. The results of cross validation are reported in the next section.

127.6 Results

127.6.1 The predictor importance

This paper evaluates the independent contribution of each predictor in explaining the variance of the target variable. Figure 127.2 lists the top 10
5 When splitting frames, H2O does not give an exact split. It is designed to be efficient on big data, using a probabilistic splitting method rather than an exact split. For example, when specifying a 0.75/0.25 split, H2O will produce a test/train split with an expected value of 0.75/0.25 rather than exactly 0.75/0.25. On small data sets, the sizes of the resulting splits will deviate from the expected value more than on big data, where they will be very close to exact. http://h2o-release.s3.amazonaws.com/h2o/master/3552/docs-website/h2odocs/datamunge/splitdatasets.html.
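The random over-sampling step described above can be sketched as follows. This is an illustrative NumPy version with toy data and hypothetical array names, not the exact routine used with H2O:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labels: 1 = delinquent (minority class), 0 = legitimate (majority class).
y = np.array([0] * 95 + [1] * 5)
X = rng.normal(size=(y.size, 3))  # stand-in feature matrix

minority = np.flatnonzero(y == 1)
majority = np.flatnonzero(y == 0)

# Add copies of minority rows (drawn with replacement) until the classes balance.
extra = rng.choice(minority, size=majority.size - minority.size, replace=True)
idx = np.concatenate([majority, minority, extra])
rng.shuffle(idx)

X_bal, y_bal = X[idx], y[idx]
print((y_bal == 0).sum(), (y_bal == 1).sum())  # 95 95
```

Because over-sampling only duplicates existing minority rows, it must be applied after the train/validation/test split, so copies of the same observation never appear on both sides of a split.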
T. Sun & M. A. Vasarhalyi
Figure 127.2: The importance of top ten predictors. [Bar chart; horizontal axis: relative importance. TRANS_ALL 1, LOCATION 0.9622, CASH_LIM 0.9383, GRACE_PERIOD 0.6859, BALANCE_CSH 0.6841, PROFESSION 0.6733, BALANCE_ROT 0.6232, FREQUENCY 0.6185, TRANS_OVERLMT 0.5866, LATEDAYS 0.5832.]
important indicators and their importance scores, measured by the relative importance as compared to that of the most important variable. The most powerful predictor is TRANS_ALL, the total amount of all authorized transactions on all credit cards held by the client in June, which indicates that the more the client spent, the more likely the client is to have a severe delinquency issue later in September. The second most important predictor is LOCATION, suggesting that clients living in some regions are more likely to default on credit card debt. Compared to TRANS_ALL, whose relative importance is 1 as the most important indicator, LOCATION's relative importance is 0.9622. It is followed by the limit of cash withdrawal (CASH_LIM) and the number of days given to the client to pay off the new balance without paying finance charges (GRACE_PERIOD). This result suggests that the flexibility the bank provides to the client facilitates the occurrence of delinquencies. Other important data fields include BALANCE_CSH (the current balance of cash withdrawal), PROFESSION (the occupation of the client), BALANCE_ROT (the current balance of credit card revolving payment), FREQUENCY (the number of times the client has been billed until September 2013), and TRANS_OVERLMT (the average amount by which the authorized transactions exceeded the limit on all credit card accounts
owned by the client). The last predictor, LATEDAYS, is the average number of days the client's payments (on all credit cards) in June 2013 have passed the due dates. Appendix 127C presents the difference in means between the delinquent and legitimate (control) groups for each of the numerical variables in Figure 127.2. We observe that the conditional means differ significantly.

127.6.2 The predictive result for cross validation

We use a list of metrics to evaluate the predictive performance of the constructed DNN for cross validation. Furthermore, we use a traditional ANN algorithm with a single hidden layer and a comparable number of neurons to build a similar prediction model. Logistic regression, naïve Bayes, and decision tree techniques are also employed for the same task. Next, we use those metrics to compare the prediction results of the DNN and ANN as well as logistic regression, naïve Bayes, and decision tree. As shown in Table 127.5, the DNN has an overall accuracy of 99.54%, slightly lower than the ANN and decision tree, but higher than the other two approaches. Since there is a large class imbalance in the validation data, classification accuracy alone cannot provide useful information for model selection: a model could predict the value of the majority class for all observations and still achieve a high classification accuracy. Therefore, we consider a set of additional metrics.

Table 127.5: Predictive performance.6

Metrics               DNN                     ANN                    Decision tree (J48)   Naïve Bayes   Logistic regression
Overall accuracy      0.9954                  0.9955                 0.9956                0.5940        0.9938
Recall                0.6042                  0.5975                 0.5268                0.8774        0.4773
Precision             0.8502                  0.8739                 0.9922                0.0196        0.7633
Specificity           0.9990                  0.9980                 0.9999                0.5913        0.9986
F1                    0.7064                  0.6585                 0.6882                0.0383        0.5874
F2                    0.6413                  0.6204                 0.5813                0.0898        0.5166
F0.5                  0.7862                  0.7016                 0.8432                0.0243        0.6816
FNR                   0.3958                  0.4027                 0.4732                0.1226        0.5227
FPR                   0.0010                  0.0020                 0.0001                0.4087        0.0014
AUC                   0.9547                  0.9485                 0.881                 0.7394        0.8889
Model building time   8 hours 3 mins 13 secs  13 minutes 56 seconds  0.88 seconds          9 seconds     34 seconds
6 We choose the threshold that gives the highest F1 score, and the reported values of the metrics are based on the selected threshold.
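The threshold selection described in this footnote can be sketched as a simple grid sweep over candidate thresholds. The following is an illustrative version on a toy score vector (hypothetical names, not the H2O routine):

```python
import numpy as np

def best_f1_threshold(y_true, scores, grid=None):
    """Sweep candidate thresholds and keep the one with the highest F1."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    t_best, f1_best = 0.5, -1.0
    for t in grid:
        pred = (scores >= t).astype(int)
        tp = int(((pred == 1) & (y_true == 1)).sum())
        fp = int(((pred == 1) & (y_true == 0)).sum())
        fn = int(((pred == 0) & (y_true == 1)).sum())
        if tp == 0:
            continue  # F1 undefined without true positives
        p, r = tp / (tp + fp), tp / (tp + fn)
        f1 = 2 * p * r / (p + r)
        if f1 > f1_best:
            t_best, f1_best = float(t), f1
    return t_best, f1_best

# Toy example: four delinquents, mostly with high scores.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.10, 0.20, 0.30, 0.35, 0.55, 0.70, 0.45, 0.60, 0.80, 0.90])
t_best, f1_best = best_f1_threshold(y_true, scores)
print(round(t_best, 2), round(f1_best, 2))  # 0.36 0.8
```

Because the threshold is tuned for F1, it trades some specificity for recall relative to the default 0.5 cutoff, which matters under heavy class imbalance.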
Specificity, also called the true negative rate (TNR), measures the proportion of negatives that are correctly identified as such; in this case, it is the percentage of legitimate holders who are correctly identified as non-delinquent. The TNR of the DNN is 0.9990, the second highest score of all algorithms. This result shows that the DNN classifier performs excellently in correctly identifying legitimate clients. Decision tree has a slightly higher specificity, 0.9999, and ANN and logistic regression also have high specificity scores. However, naïve Bayes has a low TNR of 0.5913, which means that many legitimate observations are mistakenly identified by the naïve Bayes model as delinquent ones. The false negative rate (FNR) is the Type II error rate: the proportion of positives that are incorrectly identified as negatives. The DNN's FNR of 0.3958 indicates that 39.58% of delinquent clients are undetected by the classifier. This is the second lowest score; the lowest, 0.1226, is generated by naïve Bayes. So far, it appears that the naïve Bayes model tends to classify most observations as delinquent, given its low TNR and low FNR. The false positive rate (FPR) is the Type I error rate: the proportion of negatives that are incorrectly classified as positives. In Table 127.5, we can see that the Type I error rate of the decision tree is 0.01%, lower than that of the DNN, which is 0.1%. This result suggests that it is unlikely that a normal client will be identified by the decision tree or the DNN as a problematic one. Precision and recall are two important measures of a classifier's ability to detect delinquency, where precision7 measures the percentage of actual delinquencies among all perceived ones. The precision score of the DNN, 0.8502, is lower than that of the decision tree and the ANN (0.9922 and 0.8739, respectively), but higher than that of the other two algorithms.
Specifically, the naïve Bayes model receives an extremely low score, 0.0196, indicating that nearly all perceived delinquencies are actually legitimate observations. Recall,8 on the other hand, indicates how many of the actual delinquencies are successfully identified by the classifier. It is also called sensitivity or the true positive rate (TPR) and can be thought of as a measure of a classifier's completeness. The recall score of the DNN is 0.6042, the highest of all models except naïve Bayes. This number also means that 39.58% of delinquent observations are not identified by our model, which is consistent with the FNR result.
7 Precision = true positive/(true positive + false positive).
8 Recall = true positive/(true positive + false negative).
While the decision tree and ANN models perform better than the DNN in terms of precision, the DNN outperforms them in terms of recall. Thus, it is necessary to evaluate the performance of models by considering both precision and recall. Three F scores, F1, F2, and F0.5, are frequently used in existing data mining research for this purpose (Powers, 2011). The F1 score9 is the harmonic mean of precision and recall, treating precision and recall equally. F2 10 treats recall as more important by weighting it higher than precision, whereas F0.5 11 weighs recall lower than precision. The F1, F2, and F0.5 scores of the DNN are 0.7064, 0.6413, and 0.7862, respectively. The result shows that, with the exception of F0.5, the DNN exhibits higher overall performance than decision tree, ANN, naïve Bayes, and logistic regression. The overall capability of the classifier can also be measured by the area under the receiver operating characteristic (ROC) curve, AUC. The ROC curve (see Figure 127.3) plots the recall versus the false positive rate as the discrimination threshold is varied between 0 and 1.

Figure 127.3: The ROC curve — cross-validation metric. [Plot of true positive rate versus false positive rate.]

Again, the DNN
9 F1 = 2 × (precision × recall)/(precision + recall).
10 F2 = 5 × (precision × recall)/(4 × precision + recall).
11 F0.5 = (5/4) × (precision × recall)/((1/4) × precision + recall).
provides the highest AUC, 0.9547, of all the models, showing its strong ability to discern between the two classes. Finally, the model building time shows that developing a DNN is a time-consuming procedure (more than 8 hours) due to the complexity of the computation.
127.6.3 Z-Test

The results of cross validation show that, for some metrics (e.g., overall accuracy), the DNN is less effective than other algorithms. Furthermore, it takes much longer to train and validate the DNN than the other models. However, the key predictive measures, AUC and F1,12 suggest that the overall performance of the DNN is better than that of the other models. It is therefore difficult to rank the performance of the DNN against the other approaches from these results alone. To resolve this issue, we follow O'Leary (1998) as well as Coats and Fant (1993) and use the normally distributed Z-test of equality of proportions to examine the probability that there is a difference between the key predictive measures of the DNN and those of the other models.

We first compare the performance metrics of the DNN and the ANN (see Table 127.6). The null hypothesis is that the proportion of hits (F1) for the ANN model is greater than or equal to the proportion of hits (F1) for the DNN model. Table 127.6 shows that the F1 of the DNN is significantly greater than that of the ANN at the significance level of 0.01. Additionally, as measured by the same hit rate, F1, the DNN is significantly (at the level of 0.01) more effective than logistic regression, naïve Bayes, and decision tree. Furthermore, the results of the Z-tests for AUC, which are not tabulated, are consistent with those for F1. To summarize, as measured by F1 and AUC, the overall predictive performance of the DNN is significantly superior to that of ANN, logistic regression, naïve Bayes, and decision tree.
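The two-proportion Z-test used above can be reproduced directly from the reported hit rates. The sketch below is an illustrative pooled-variance version, assuming equal sample sizes of N = 569,046 and using the rounded percentages from Table 127.6; it recovers the DNN-vs-ANN statistic up to rounding of the inputs:

```python
import math

def z_two_proportions(p1, p2, n1, n2):
    """Z-test of equality of two proportions (pooled variance under H0)."""
    p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

n = 569046
z = z_two_proportions(0.7064, 0.6585, n, n)  # DNN vs. ANN percentage of hit (F1)
print(round(z, 1))  # 54.9, close to the reported 54.9312
```

The enormous Z values in Table 127.6 are driven by the very large N; with half a million observations, even a small gap in hit rates is many standard errors wide.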
127.6.4 Prediction on test set

Table 127.7 is the confusion matrix for the test set. A total of 85 legitimate credit card holders are classified as delinquent by the DNN. In addition, 773 out of 1277 delinquent clients are successfully detected.

12 We do not consider overall accuracy, as it is a misleading performance measure for highly imbalanced data. We focus on F1 and AUC as they measure the overall predictive performance of the model. Specifically, F1, the harmonic mean of precision and recall, weighs these two metrics equally.
Table 127.6: Tests for differences between proportions.

DNN vs. ANN. H0: P_ANN ≥ P_DNN versus H1: P_ANN < P_DNN
Percentage of hit (F1): N = 569046, DNN = 70.64, ANN = 65.85, Z = 54.9312***, p < 0.01

DNN vs. Logistic Regression. H0: P_Logit ≥ P_DNN versus H1: P_Logit < P_DNN
Percentage of hit (F1): N = 569046, DNN = 70.64, Logit = 58.74, Z = 133.8884***, p < 0.01

DNN vs. Naïve Bayes. H0: P_NaiveBayes ≥ P_DNN versus H1: P_NaiveBayes < P_DNN
Percentage of hit (F1): N = 569046, DNN = 70.64, Naïve Bayes = 3.83, Z = 1020.0000***, p < 0.01

DNN vs. Decision Tree (J48). H0: P_J48 ≥ P_DNN versus H1: P_J48 < P_DNN
Percentage of hit (F1): N = 569046, DNN = 70.64, J48 = 68.82, Z = 21.1382***, p < 0.01

Table 127.7: The confusion matrix of DNN (test set).

Actual / Predicted    Legitimate obs.   Delinquent obs.    Total
Legitimate obs.                140989                85   141074
Delinquent obs.                   504               773     1277
Total                          141493               858   142351
The result of the out-of-sample test in Table 127.8 and the ROC curve in Figure 127.4 both show that the DNN model generally performs effectively in detecting delinquencies, as reflected by the highest AUC value, 0.9246. The recall is 0.6053, the second highest value; the highest recall, 0.8677, belongs to the naïve Bayes model. The precision of the DNN, 0.9009, is also the second highest. Considering both precision and recall, the DNN outperforms the other models with the highest F1 score, 0.7241. This result is consistent with the result for all models on the cross-validation
Table 127.8: The result of out-of-sample test.

Metrics               DNN       ANN       Naïve Bayes   Logistic   Decision tree (J48)
Overall accuracy      0.9959    0.9941    0.6428        0.9949     0.9944
Recall                0.6053    0.5521    0.8677        0.5770     0.4527
Precision             0.9009    0.7291    0.0217        0.8047     0.9080
Specificity           0.9994    0.9981    0.6407        0.9987     0.9996
F1                    0.7241    0.6283    0.0424        0.6721     0.6042
F2                    0.6478    0.5802    0.0987        0.6116     0.5032
F0.5                  0.8208    0.6851    0.0270        0.7459     0.7559
False negative rate   0.3947    0.4479    0.1323        0.4230     0.5473
False positive rate   0.0006    0.0019    0.3593        0.0013     0.0004
AUC                   0.9246    0.9202    0.7581        0.8850     0.8630
Figure 127.4: The ROC curve — testing metrics. [Plot of true positive rate versus false positive rate.]
sets. Specifically, the F1 score for the test set is higher than that for the cross-validation set. The remaining metrics support that, compared to traditional ANN, naïve Bayes, logistic regression, and decision tree, the DNN performs more effectively in identifying credit card delinquency.
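As a sanity check, the DNN column of Table 127.8 can be recomputed from the confusion matrix in Table 127.7 using the metric definitions above (a small sketch, with TP = 773, FN = 504, FP = 85, TN = 140,989):

```python
# DNN confusion-matrix counts on the test set (Table 127.7).
tp, fn, fp, tn = 773, 504, 85, 140989

precision = tp / (tp + fp)                  # 0.9009
recall = tp / (tp + fn)                     # 0.6053 (sensitivity / TPR)
specificity = tn / (tn + fp)                # 0.9994 (TNR)
accuracy = (tp + tn) / (tp + fn + fp + tn)  # 0.9959

def f_beta(beta, p, r):
    """F-beta score: beta > 1 favors recall, beta < 1 favors precision."""
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

print(round(f_beta(1.0, precision, recall), 4))  # 0.7241
print(round(f_beta(2.0, precision, recall), 4))  # 0.6478
print(round(f_beta(0.5, precision, recall), 4))  # 0.8208
```

Each value matches the DNN column of Table 127.8 to four decimal places, confirming that the reported F scores are derived from the tabulated confusion matrix.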
127.7 Conclusion and Limitations

Due to the scarcity of real-life credit data as well as the complexity of the algorithm, research on applications of deep learning in the credit card industry has only recently begun to evolve (e.g., Pandey, 2017). This paper demonstrates the effectiveness of deep learning in predicting credit card delinquencies. Using real-life credit card data from a large bank in Brazil, we develop a DNN to predict severe delinquencies based on the clients' personal information and spending characteristics. We compare the predictive performance of the DNN with that of traditional ANN, logistic regression, naïve Bayes, and decision tree models. The results of cross validation and testing show that the DNN generally works better than the other models as measured by F scores and AUC. The success of the DNN model implies that deep learning is a promising technique that may have much to contribute to the effective management of credit card risk for financial institutions and regulators, especially when the available data are big and complex. A model like this can be utilized to automate credit risk assessment, with follow-up investigation and monitoring applied to clients the DNN assesses as risky. We believe that our findings are indicative of considerably more powerful credit risk prediction models that can be developed with deep learning using broader and more complex data (e.g., unstructured data such as text, images, and audio) in the future.

The paper is subject to the following limitations. First, although it performs better than the other algorithms, the DNN approach leaves 39.58% of delinquencies undetected. In addition, the precision score of the DNN is lower than that of the traditional ANN. This is because our data set is still not large enough: deep learning algorithms need a sufficiently large amount of data to learn the underlying complex patterns, and the performance of deep learning models improves significantly as the scale of the data increases (Shaikh, 2017). Therefore, future work can explore more data from other sources to provide a more complete picture of the spending patterns and characteristics of credit card holders and allow deep learning to better exhibit its superiority over other algorithms. Second, although the data set covers accumulated transactional data, the spending activity data cover only one month, June 2013. As a result, future work may consider obtaining data with a longer time frame and combining multiple machine learning algorithms to obtain better predictive performance. Lastly, it is worth examining the possibility of utilizing deep learning to automate the tests of details on suspicious card holders to reduce the cost of follow-up investigation.
Bibliography

Abdou, H.A. and Pointon, J. (2011). Credit Scoring, Statistical Techniques and Evaluation Criteria: A Review of the Literature. Intelligent Systems in Accounting, Finance and Management 18, 59–88.
Abdou, H., Pointon, J. and El-Masry, A. (2008). Neural Nets Versus Conventional Techniques in Credit Scoring in Egyptian Banking. Expert Systems with Applications 35, 1275–1292.
Abdulkader, A., Lakshmiratan, A. and Zhang, J. (2016). Introducing DeepText: Facebook's Text Understanding Engine. https://backchannel.com/an-exclusive-look-at-how-ai-and-machine-learning-work-at-apple-8dbfb131932b.
Abellán, J. and Mantas, C.J. (2014). Improving Experimental Studies About Ensembles of Classifiers for Bankruptcy Prediction and Credit Scoring. Expert Systems with Applications 41, 3825–3830.
American Bankers Association (2017). ABA Report: Consumer Delinquencies Mixed in Fourth Quarter. Consumer Credit Delinquency Bulletin. http://www.aba.com/Press/Pages/040617DelinquencyBulletin.aspx.
Angelini, E., Tollo, G. and Roli, A. (2008). A Neural Network Approach for Credit Risk Evaluation. Quarterly Review of Economics and Finance 48, 733–755.
Akkoç, S. (2012). An Empirical Comparison of Conventional Techniques, Neural Networks and the Three-Stage Hybrid Adaptive Neuro Fuzzy Inference System (ANFIS) Model for Credit Scoring Analysis: The Case of Turkish Credit Card Data. European Journal of Operational Research 222, 168–178.
Baesens, B., Setiono, R., Mues, C. and Vanthienen, J. (2003). Using Neural Network Rule Extraction and Decision Tables for Credit-Risk Evaluation. Management Science 49, 312–329.
Barniv, R., Agarwal, A. and Leach, R. (1997). Predicting the Outcome Following Bankruptcy Filing: A Three-State Classification Using Neural Networks. Intelligent Systems in Accounting, Finance and Management 6, 177–194.
Baydin, A.G., Pearlmutter, B.A. and Siskind, J.M. (2016). Tricks from Deep Learning.
arXiv preprint arXiv:1611.03777.
Bekhet, H.A. and Eletter, S.F.K. (2014). Credit Risk Assessment Model for Jordanian Commercial Banks: Neural Scoring Approach. Review of Development Finance 4, 20–28.
Bengio, Y., Simard, P. and Frasconi, P. (1994). Learning Long-Term Dependencies with Gradient Descent is Difficult. IEEE Transactions on Neural Networks 5, 157–166.
Bengio, Y. (2012). Deep Learning of Representations for Unsupervised and Transfer Learning. In Proceedings of ICML Workshop on Unsupervised and Transfer Learning, June, 17–36.
Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees. Wadsworth and Brooks Cole Advanced Books and Software, Pacific Grove, CA.
Brownlee, J. (2017). What is the Difference Between Test and Validation Datasets? https://machinelearningmastery.com/difference-test-validation-datasets/.
Brownlee, J. (2016). Using Learning Rate Schedules for Deep Learning Models in Python with Keras. Machine Learning Mastery. https://machinelearningmastery.com/using-learning-rate-schedules-deep-learning-models-python-keras/.
Brownlee, J. (2016). Gradient Descent for Machine Learning. Machine Learning Mastery. https://machinelearningmastery.com/gradient-descent-for-machine-learning/.
Butaru, F., Chen, Q., Clark, B., Das, S., Lo, A.W. and Siddique, A. (2016). Risk and Risk Management in the Credit Card Industry. Journal of Banking and Finance 72, 218–239.
Candel, A., Parmar, V., LeDell, E. and Arora, A. (2016). Deep Learning with H2O. Working Paper. http://h2o.ai/resources.
Chen, S.C. and Huang, M.Y. (2011). Constructing Credit Auditing and Control and Management Model with Data Mining Technique. Expert Systems with Applications 38, 5359–5365.
Chu, B. (2017). Credit Card Lending Picks up Again Amid Warnings Over UK Household Debt. Independent. http://www.independent.co.uk/news/business/news/credit-card-lending-uk-household-debt-unsecured-september-bank-of-england-interest-rates-a8026896.html.
Crook, J.N., Edelman, D.B. and Thomas, L.C. (2007). Recent Developments in Consumer Credit Risk Assessment. European Journal of Operational Research 183, 1447–1465.
Durden, T. (2017). Credit Card Defaults Surge Most Since Financial Crisis. ZeroHedge. https://www.zerohedge.com/news/2017-06-09/credit-card-defaults-surge-most-financial-crisis.
Etheridge, H.L. and Sriram, R.S. (1997). A Comparison of the Relative Costs of Financial Distress Models: Artificial Neural Networks, Logit and Multivariate Discriminant Analysis. Intelligent Systems in Accounting, Finance and Management 6, 235–248.
Fitzpatrick, T. and Mues, C. (2016). An Empirical Comparison of Classification Algorithms for Mortgage Default Prediction: Evidence from a Distressed Mortgage Market. European Journal of Operational Research 249, 427–439.
Fu, K., Cheng, D., Tu, Y. and Zhang, L. (2016). Credit Card Fraud Detection Using Convolutional Neural Networks. In International Conference on Neural Information Processing 10, 483–490.
Galeshchuk, S. and Mukherjee, S. (2017). Deep Networks for Predicting Direction of Change in Foreign Exchange Rates. Intelligent Systems in Accounting, Finance and Management 24, 100–110.
Gately, E. (1996).
Neural Networks for Financial Forecasting: Top Techniques for Designing and Applying the Latest Trading Systems. John Wiley and Sons, Inc.: New York.
Giacomelli, P. (2013). Apache Mahout Cookbook. Packt Publishing Ltd.
Goodfellow, I., Bengio, Y. and Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org.
Guller, M. (2015). Big Data Analytics with Spark: A Practitioner's Guide to Using Spark for Large Scale Data Analysis. Apress, 155.
Gupta, D. (2017). Fundamentals of Deep Learning — Activation Functions and When to Use Them? Analytics Vidhya. https://www.analyticsvidhya.com/blog/2017/10/fundamentals-deep-learning-activation-functions-when-to-use-them/.
H2O.ai (2018). Cross-Validation. H2O Documents. http://docs.h2o.ai/h2o/latest-stable/h2o-docs/cross-validation.html.
Hamet, P. and Tremblay, J. (2017). Artificial Intelligence in Medicine. Metabolism 1–5.
Heaton, J.B., Polson, N.G. and Witte, J.H. (2016). Deep Learning in Finance. arXiv preprint arXiv:1602.06561.
Huang, Z., Chen, H., Hsu, C.J., Chen, W.H. and Wu, S. (2004). Credit Rating Analysis with Support Vector Machines and Neural Networks: A Market Comparative Study. Decision Support Systems 37, 543–558.
Jin, X., Xu, C., Feng, J., Wei, Y., Xiong, J. and Yan, S. (2016). Deep Learning with S-Shaped Rectified Linear Activation Units. In AAAI 2, 1737–1743.
Kepes, B. (2016). Big Four Accounting Firms Delve Into Artificial Intelligence. Computerworld from IDG. http://www.computerworld.com/article/3042536/big-data/big-four-accounting-firms-delve-into-artificial-intelligence.html.
Khashman, A. (2010). Neural Networks for Credit Risk Evaluation: Investigation of Different Neural Models and Learning Schemes. Expert Systems with Applications 37, 6233–6239.
Khandani, A.E., Kim, A.J. and Lo, A.W. (2010). Consumer Credit-Risk Models via Machine-Learning Algorithms. Journal of Banking and Finance 34, 2767–2787.
Kloo, I. (2015). Textmining: Clustering, Topic Modeling, and Classification. http://dataanalytics.net/cep/Schedule files/Textmining%20%20Clustering,%20Topic%20Modeling,%20and%20Classification.htm.
Koh, H.C. and Chan, K.L.G. (2002). Data Mining and Customer Relationship Marketing in the Banking Industry. Singapore Management Review 24, 1–27.
Kohara, K., Ishikawa, T., Fukuhara, Y. and Nakamura, Y. (1997). Stock Price Prediction Using Prior Knowledge and Neural Networks. Intelligent Systems in Accounting, Finance and Management 6, 11–22.
Kruppa, J., Schwarz, A., Arminger, G. and Ziegler, A. (2013). Consumer Credit Risk: Individual Probability Estimates Using Machine Learning. Expert Systems with Applications 40, 5125–5131.
Lahmiri, S. (2016). Features Selection, Data Mining and Financial Risk Classification: A Comparative Study. Intelligent Systems in Accounting, Finance and Management 23, 265–275.
Lau, S. (2017). Learning Rate Schedules and Adaptive Learning Rate Methods for Deep Learning. Towards Data Science. https://towardsdatascience.com/learning-rate-schedules-and-adaptive-learning-rate-methods-for-deep-learning-2c8f433990d1.
Leonard, K.J. (1993). Empirical Bayes Analysis of the Commercial Loan Evaluation Process. Statistics Probability Letters 18, 289–296.
Levy, S. (Aug 24, 2016). An Exclusive Inside Look at How Artificial Intelligence and Machine Learning Work at Apple. Backchannel.
https://backchannel.com/an-exclusive-look-at-how-ai-and-machine-learning-work-at-apple-8dbfb131932b.
Malley, J.D., Kruppa, J., Dasgupta, A., Malley, K.G. and Ziegler, A. (2012). Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines. Methods of Information in Medicine 51, 74.
Marqués, A.I., García, V. and Sánchez, J.S. (2012). Exploring the Behavior of Base Classifiers in Credit Scoring Ensembles. Expert Systems with Applications 39, 10244–10250.
Nisbet, R., Elder, J. and Miner, G. (2009). Handbook of Statistical Analysis and Data Mining Applications. Academic Press.
Ohlsson, C. (2017). Exploring the Potential of Machine Learning: How Machine Learning can Support Financial Risk Management. Master's Thesis, Uppsala University.
Pandey, Y. (2017). Credit Card Fraud Detection Using Deep Learning. International Journal of Advanced Research in Computer Science 8(5).
Porche, B. (2017). Americans' Credit Card Debt Hits $1 Trillion. Creditcards.com. https://www.creditcards.com/credit-card-news/americans-card-debt-1-trillion.php.
Powers, D.M. (2011). Evaluation: From Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation.
Radhakrishnan, P. (2017). What are Hyperparameters and How to Tune the Hyperparameters in a Deep Neural Network? Towards Data Science. https://towardsdatascience.com/what-are-hyperparameters-and-how-to-tune-the-hyperparameters-in-a-deep-neural-network-d0604917584a.
Silver, D., et al. (2016). Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature 529, 484–489.
Sun, T. and Vasarhelyi, M.A. (2017). Deep Learning and the Future of Auditing: How an Evolving Technology Could Transform Analysis and Improve Judgment. The CPA Journal 6, 24–29.
Shaikh, F. (2017). Deep Learning vs. Machine Learning - the Essential Differences You Need to Know. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2017/04/comparison-between-deep-learning-machine-learning/.
Sharma, S. (2017). Epoch vs Batch Size vs Iterations. Towards Data Science. https://towardsdatascience.com/epoch-vs-iterations-vs-batch-size-4dfb9c7ce9c9.
Szegedy, C. (2014). Building a Deeper Understanding of Images. Google Research Blog (September 5, 2014). https://research.googleblog.com/2014/09/building-deeper-understanding-of-images.html.
Tartakovsky, S., Clark, S. and McCourt, M. (2017). Deep Learning Hyperparameter Optimization with Competing Objectives. NVIDIA Developer Blog. https://devblogs.nvidia.com/parallelforall/sigopt-deep-learning-hyperparameter-optimization/.
Thomas, L.C. (2000). A Survey of Credit and Behavioral Scoring: Forecasting Financial Risk of Lending to Consumers. International Journal of Forecasting 16, 149–172.
Trinkle, B.S. and Baldwin, A.A. (2016). Research Opportunities for Neural Networks: The Case for Credit. Intelligent Systems in Accounting, Finance and Management 23, 240–254.
Turban, E., Aronson, J.E., Liang, T.-P. and McCarthy, R. (2005). Decision Support Systems and Intelligent Systems (7th ed.). New Delhi: Asoke K. Ghosh, Prentice-Hall of India Private Limited.
Twala, B. (2010). Multiple Classifier Application to Credit Risk Assessment. Expert Systems with Applications 37, 3326–3336.
Wiginton, J.C. (1980). A Note on the Comparison of Logit and Discriminant Models of Consumer Credit Behavior. Journal of Financial and Quantitative Analysis 15, 757–770.
Yeh, I.C. and Lien, C.H. (2009). The Comparisons of Data Mining Techniques for the Predictive Accuracy of Probability of Default of Credit Card Clients.
Expert Systems with Applications 36, 2473–2480.
Appendix 127A: Variable Definitions13

Target Variable
INDICATOR: Indicates whether any of the client's credit cards is permanently blocked in September 2013 due to credit card delinquency.

Input Variables

1. Personal characteristics
SEX: The gender of the credit card holder.
INDIVIDUAL: The code indicating whether the holder is an individual or a corporation.

(Continued)

13 The unit of the amount is Brazilian Real.
page 4377
July 6, 2020
16:8
Handbook of Financial Econometrics,. . . (Vol. 4)
4378
9.61in x 6.69in
b3568-v4-ch127
T. Sun & M. A. Vasarhelyi

(Continued)

Input Variable    Description
AGE               The age of the credit card holder
INCOME CL         The annual income claimed by the holder
INCOME CF         The annual income of the holder confirmed by the bank
ADD ASSET         The number of additional assets owned by the holder
LOCATION          The code indicating the holder's region of residence
PROFESSION        The code indicating the occupation of the holder
ACCOUNT AGE       The oldest age of the credit card accounts owned by the client (in months)
CREDIT SCORE      The credit score of the holder
SHOPPING CRD      The number of products in shopping cards
VIP               The VIP code of the holder
CALL              Equals 1 if the client requested an increase of the credit limit; 0 otherwise
PRODUCT           The number of products purchased
CARDS             The number of credit cards held by the client (issued by the same bank)

2. Information about cumulative transactional activities (as of September 2013)
FREQUENCY         The frequency with which the client has been billed
PAYMENT ACC       The frequency of the payments made by the client
WITHDRAWAL        The accumulated amount of cash withdrawals (domestic)
BEHAVIOR          The behavior code of the client determined by the bank
BEHAVIOR SIMPLE   The simplified behavior score provided by the bank
CREDIT LMT PRVS   The maximum credit limit in the last period

3. Transactions in June 2013
CREDIT LMT CRT    The maximum credit limit
LATEDAYS          The average number of days that the client's credit card payments have passed the due date
UNPAID DAYS       The average number of days that previous transactions have remained unpaid
BALANCE ROT       The current balance of credit card revolving payment
BALANCE CSH       The current balance of cash withdrawal
GRACE PERIOD      The remaining number of days the bank gives the credit card holder to pay off the new balance without paying finance charges; the time window runs from the end of June 2013 to the next payment due date
INSTALL LIM ACT   The available installment limit; equals the installment limit plus the installment paid14
CASH LIM          The limit of cash withdrawal

14
The actual installment limit can exceed the installment limit the bank provides for the customer. This happens when the customer has made some payments, so those funds become available for borrowing again.
Predicting Credit Card Delinquencies
(Continued)

Input Variable            Description
INSTALL LIM               The limit of installment
ROT LIM                   The revolving limit of credit card payment
DAILY TRANS               The maximum number of authorized daily transactions
TRANS ALL                 The amount of all authorized transactions (including all credit card revolving payments, installments, and cash withdrawals) on all credit card accounts owned by the client
TRANS OVERLMT             The average amount of authorized transactions exceeding the limit on all credit card accounts owned by the client
BALANCE ALL               The average balance of authorized unpaid transactions (including all credit card revolving payments, installments, and cash withdrawals) on all credit card accounts owned by the client
BALANCE PROCESSING        The average balance of all credit card transactions under the authorization process
ROT PAID                  The total amount of credit card revolving payment that has been made
CASH OVERLMT PCT          The average percentage of cash withdrawals exceeding the limit on all credit card accounts owned by the client
PAYMENT PROCESSING        The average payment under processing
INSTALLMENT PAID          The total installment amount that has been paid
INSTALLMENT               The total number of installments, including both paid and unpaid ones
ROT OVERLMT               The average amount of credit card revolving payment exceeding the revolving limit
INSTALLMENT OVERLMT PCT   The average percentage of installments exceeding the limit
Appendix 127B Summary Statistics

Variable       Mean       Std. dev    Minimum   Maximum
AGE            43.3187    12.6952     12        130
INCOME CL      10922.52   220775.79   0         9999999
INCOME CF      6174.55    220775.79   0         9999999
ADD ASSET      0.3671     0.6181      0         9
ACCOUNT AGE    55.8679    49.3723     2         452
CREDIT SCORE   0.2203     4.5965      0         115
SHOPPING CRD   232.0902   375.8267    0         8651
PRODUCT        3793.27    1413.98     11        4962
CARDS          2.7291     4.3065      0         76
FREQUENCY      34.7120    15.8502     0         106
PAYMENT ACC    36.0636    19.4309     0         529
(Continued)
(Continued)

Variable                  Mean       Std. dev   Minimum     Maximum
WITHDRAWAL                2.9804     95.5705    0           24350
CREDIT LMT PRVS           3671.84    5143.26    0           108000
CREDIT LMT CRT            5335.56    6605.82    0           140000
LATEDAYS                  −5.5506    3.8964     −16         316
UNPAID DAYS               0.1358     0.7025     0           14
BALANCE ROT               3002.71    5143.78    −91516.48   133106.17
BALANCE CSH               339.2920   953.9062   −91516.48   25000
GRACE PERIOD              6.1622     2.9438     0           16
INSTALL LMT ACT           5217.95    6514       0           140000
ROT LIM ACT               5217.95    6514       0           140000
CASH LIM                  524.1415   378.2815   0           25000
INSTALL LIM               5217.95    6514       0           140000
ROT LIM                   5217.95    6514       0           140000
DAILY TRANS               3.3941     6.6249     0           25
TRANS ALL                 1472.47    21383.44   0           8855034.05
TRANS OVERLMT             66.3997    75.3326    0           240.01
BALANCE ALL               0.0004     0.3063     0           258.3283
BALANCE PROCESSING        2153.41    3145.50    0           110124.03
ROT PAID                  3027.58    5001.97    −70612.11   136969.63
CASH OVERLMT PCT          0.3225     25.1814    0           19097.50
PAYMENT PROCESSING        3.8951     50.4655    0           14419.36
INSTALLMENT PAID          5180.92    6413.81    0           140000
INSTALLMENT               1.4030     1.8683     0           48
ROT OVERLMT               0.6482     2.8986     0           924.1360
INSTALLMENT OVERLMT PCT   2.2248     13.0488    0           999
Appendix 127C Differences of the Mean for Important Variables between Groups

                                Mean          Mean diff   T-value    P-value
TRANS ALL
  Delinquency group             534.5305
  Control group                 1481.166***   946.6354    3.5628     0.001
CASH LIM
  Delinquency group             164.0217
  Control group                 527.4814***   363.4597    77.6530    0.001
GRACE PERIOD
  Delinquency group             1.6508
  Control group                 6.2040***     4.5533      125.8563   0.001
(Continued)

                                Mean           Mean diff   T-value    P-value
BALANCE CSH
  Delinquency group             −163.1281
  Control group                 343.9515***    507.0796    42.8365    0.001
BALANCE ROT
  Delinquency group             2735.2580
  Control group                 3005.1900***   269.9324    4.2234     0.001
FREQUENCY
  Delinquency group             28.7505
  Control group                 34.7672***     6.0167      30.5701    0.001
TRANS OVERLMT
  Delinquency group             14.0538
  Control group                 66.8851***     52.8312     56.5675    0.001
LATEDAYS
  Delinquency group             −1.3528
  Control group                 −6.0697***     −4.7170     −1.3e+02   0.001
***, **, *: significantly different from the delinquency group at a one-tailed p-value ≤ 0.01, 0.05, and 0.10, respectively, under a t-test on the equality of means.
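The comparisons above are ordinary two-sample t-tests on the equality of group means. For readers who want the mechanics, a minimal Python sketch; the function and the two short sample arrays are ours, standing in for the chapter's microdata, which are not reproduced here:

```python
import math

def two_sample_t(x, y):
    """Pooled-variance two-sample t-statistic for the difference in means."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)   # sample variance of x
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)   # sample variance of y
    sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)  # pooled variance
    t = (my - mx) / math.sqrt(sp2 * (1.0 / nx + 1.0 / ny))
    return my - mx, t

# Hypothetical stand-ins for the delinquency and control groups:
delinq = [1.2, 1.9, 1.5, 1.7, 1.3]
control = [6.0, 6.4, 5.9, 6.3, 6.1]
mean_diff, t_stat = two_sample_t(delinq, control)
```

Large |t| values such as those reported in the table correspond to one-tailed p-values well below 0.01.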
Chapter 128

Estimating the Tax-Timing Option Value of Corporate Bonds

Peter Huaiyu Chen
Youngstown State University
e-mail: [email protected]

Sheen Liu
Washington State University, Pullman
e-mail: [email protected]

Chunchi Wu
State University of New York at Buffalo
e-mail: chunchiw@buffalo.edu

Contents

128.1 Introduction ......................................................... 4384
128.2 The Tax Environment .................................................. 4385
128.3 The Model ............................................................ 4387
      128.3.1 Buy-and-hold strategy ........................................ 4388
      128.3.2 Optimal trading strategy ..................................... 4389
128.4 Numerical Results .................................................... 4392
      128.4.1 Prices and tax-timing option values of default-free bonds ... 4393
      128.4.2 Equilibrium prices and tax-timing option values for
              defaultable bonds ............................................ 4396
      128.4.3 Effects of transaction costs ................................. 4400
      128.4.4 Changes in interest rate volatility and tax regime .......... 4400
      128.4.5 Sensitivity of tax-timing option values to default risk ..... 4405
      128.4.6 Multiple trading dates ....................................... 4406
128.5 Implications for Empirical Estimation ................................ 4414
      128.5.1 Effects of ignoring tax-timing options on estimation of
              default probability .......................................... 4414
      128.5.2 Effects of ignoring tax-timing options on estimation of
              implied tax rates ............................................ 4415
128.6 Conclusion ........................................................... 4417
Bibliography ............................................................... 4418
Abstract

US tax laws provide investors an incentive to time the sales of their bonds to minimize tax liability. This grants a tax-timing option that affects bond value. In reality, corporate bond investors' tax-timing strategy is complicated by the risk of default. In this chapter, we assess the effects of taxes and stochastic interest rates on the timing option value and equilibrium price of corporate bonds by considering discount and premium amortization, multiple trading dates, transaction costs, and changes in the level and volatility of interest rates. We find that the value of the tax-timing option accounts for a substantial proportion of the corporate bond price and that the option value increases with bond maturity and credit risk.

Keywords: Tax timing • Option • Capital gain • Default risk • Transaction cost • Asymmetric taxes.
128.1 Introduction

In the financial economics literature, taxes are considered an important determinant of yield spreads (see Liu et al., 2007; Lin, Liu and Wu, 2011). Corporate bond returns are subject to both state and federal taxes, so these bonds must provide additional returns to compensate investors for their tax liability. Corporate bonds also expose investors to default risk, and the loss upon default is tax deductible. Bonds with higher default risk also carry higher coupons, which subject investors to higher income taxes. In this chapter, we examine the timing option value of corporate bonds with transaction costs under the conditions of asymmetric taxation, stochastic interest rates, premium (or discount) amortization rules, and changing tax rates. We find that, on the one hand, the effect of default risk on the timing option value is greater for long-term bonds because the tax-timing option is a compound option. On the other hand, default risk reduces the effective maturity of long-term bonds, making the compound nature of the tax-timing option much less valuable. Moreover, asymmetric tax treatment of long- and short-term gains increases the tax-timing option value. Taken together, we
find that the tax-timing option is sizable. Ignoring this option value leads to biased estimates of yield spreads, the marginal investor's tax rate, and the implied default probability. Our study is related to several important papers, including Constantinides and Ingersoll (1984), Dammon, Dunn and Spatt (1989), Dammon and Spatt (1996), Liu and Wu (2004), Chay, Choi and Pontiff (2006), Dai et al. (2015), and Ball, Creedy and Scobie (2018). In particular, we generalize the model of Constantinides and Ingersoll (1984) to incorporate the effects of default risk, recovery rates, and the amortization of premiums (or discounts) in a setting with multiple trading dates and time-varying level and volatility of interest rates, and we show that the tax-timing option value depends critically on the pattern of realized returns and the tax treatment of unrealized gains at the end of the investment horizon.

This chapter is organized as follows. In Section 128.2, we briefly review important tax provisions for corporate bond investments. In Section 128.3, we propose a model to examine the effects of personal taxes and amortization on the pricing of defaultable bonds under alternative trading strategies. Section 128.4 provides simulations, and Section 128.5 explores the implications of omitting taxes for the empirical estimation of default probability and marginal tax rates. Finally, Section 128.6 summarizes the findings and concludes.

128.2 The Tax Environment

Under the current US tax laws, coupon payments on corporate bonds are taxed at the ordinary income tax rate, while capital gains (or losses) are subject to the capital gains tax rate. If the bond is purchased at par and held to maturity, only the coupon interest is taxed, at the ordinary income tax rate. If a bond is issued below (or above) par, the difference between the price and par is called Original Issue Discount (or premium).
An investor may ignore the discount (or premium) amortization rule if the discount is less than one-fourth of 1% of the stated redemption price at maturity multiplied by the number of full years from the date of original issue to maturity. This is known as "de minimis" OID. At maturity, this discount (or premium) is recognized as a capital gain (or loss). If a bond is issued substantially above the par value,1 an investor has two choices. The investor could treat the difference as a capital loss and claim a tax rebate at

1
That is, the premium is greater than the "de minimis" amount.
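The "de minimis" cutoff just described is mechanical: one-quarter of 1% of the stated redemption price per full year from issue to maturity. A small sketch of the rule (the function names and the example bond are ours):

```python
def de_minimis_threshold(redemption_price, full_years_to_maturity):
    """De minimis OID cutoff: 0.25% of the stated redemption price at
    maturity, multiplied by the number of full years from original
    issue to maturity."""
    return 0.0025 * redemption_price * full_years_to_maturity

def is_de_minimis(issue_price, redemption_price, full_years_to_maturity):
    """True if the original issue discount is small enough to be ignored."""
    discount = redemption_price - issue_price
    return discount < de_minimis_threshold(redemption_price, full_years_to_maturity)

# Hypothetical bond: redemption price 1000, issued 10 full years before
# maturity. The threshold is 0.0025 * 1000 * 10 = 25, so a 20-point
# discount is de minimis while a 30-point discount is not.
```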
maturity. Alternatively, the investor could amortize the premium annually until the maturity date. For bonds issued since September 27, 1985, the bond premium must be amortized using a constant yield method on the basis of the bond's yield to maturity. For bonds issued prior to September 27, 1985, the premium can be amortized by any reasonable method, including straight-line amortization. The amount amortized in a tax year is a deduction against interest income, and the basis of the bond is reduced accordingly. If the bond is sold before maturity, the difference between the sale price and the basis is treated as a capital gain (or loss). Conversely, if a bond is issued substantially below the par value (by more than the "de minimis" amount), the discount is treated as ordinary interest income. The IRS allows an investor to pay ordinary income tax on the discount when the bond matures or is sold. If the bond is held to maturity, the total amount of the discount is taxed as ordinary income at maturity. If the bond is sold before maturity, the difference between the proceeds from the sale and the basis is a gain or loss.2 If there is a loss, it is treated as a capital loss. If there is a gain and the gain is greater than the accrued market discount (the amortized portion of the discount), the difference between the proceeds from the sale and the amortized basis is taxed at the capital gains rate, and the accrued market discount portion is taxed as ordinary income. On the other hand, if the gain is positive but less than or equal to the accrued market discount, the entire gain is taxed as ordinary income (see Sec. 1276 of the Internal Revenue Code, 2002). Capital gains or losses are classified as long-term or short-term according to the investor's holding period.
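Under the constant yield method described above, each year's amortization equals the coupon received minus the yield to maturity earned on the remaining basis, so the basis glides down to par at maturity. A hedged sketch with annual periods (the function name and the example bond are ours):

```python
def constant_yield_amortization(price, par, coupon_rate, ytm, n_years):
    """Annual premium amortization under the constant yield method:
    each year's amortization is the coupon cash flow minus the yield
    to maturity earned on the remaining basis; the basis is reduced
    accordingly and reaches par at maturity."""
    basis = price
    schedule = []
    for _ in range(n_years):
        amort = par * coupon_rate - ytm * basis  # deduction against interest income
        basis -= amort
        schedule.append((amort, basis))
    return schedule

# Hypothetical example: a 3-year bond with a 10% coupon priced to yield 8%.
price = sum(10.0 / 1.08 ** t for t in range(1, 4)) + 100.0 / 1.08 ** 3
schedule = constant_yield_amortization(price, 100.0, 0.10, 0.08, 3)
```

For this bond the premium of roughly 5.15 is amortized over the three years, and the total amortization equals the premium, so the final basis is par.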
The current tax law requires at least a one-year holding period to obtain long-term capital gain status.3 Short-term capital gains are taxed at the ordinary income tax rate.4 For corporations, the long-term capital gains tax rate is the same as the ordinary income tax rate. The tax laws prohibit an investor from deducting a loss from the sale of securities in a wash sale. A wash sale occurs when an investor sells securities at a loss and, within 30 days before or after the sale, (1) buys substantially

2
In our model, we assume no other purchase costs.
3
The current law has a lower rate for assets held more than five years (currently effective for taxpayers in the lowest regular tax bracket).
4
The new tax bill was passed on May 23, 2003, and the new tax rates were retroactive to January 1, 2003. For lower-income individuals, the top capital gains rate was reduced from 10% to 5%, effective May 6, 2003. For low-income taxpayers, the capital gains tax is phased out in 2007. The new law also reduced the top dividend tax rate from 38.5% to 15%, retroactive to January 1, 2003. However, both the capital gains and dividend tax cuts are "sunset" provisions: they will expire on December 31, 2008 unless future Congresses extend them.
identical securities, (2) acquires substantially identical securities in a fully taxable trade, or (3) acquires a contract or option to buy substantially identical securities. If the loss is disallowed because of the wash sale rules, the investor can add the loss to the cost of the new securities to form the new basis. This adjustment postpones the loss deduction until the disposition of the new securities.

Besides federal taxes, corporate bond returns are subject to state and local taxes, which vary across states. In some states, there are no income or capital gains taxes. State and local taxes are deducted from income for federal tax purposes.

In view of these complicated tax rules, it is necessary to abstract from the nuances of the tax codes by focusing on the most important aspects of the tax. We assume that bond investors face an asymmetric capital gains tax τx(t, tˆ), which depends on the time of purchase, tˆ, and the time of sale, t. Short-term capital gains (or losses), with a holding period of less than one year, are subject to a higher tax rate than long-term capital gains. Specifically, at the selling time t,

    τx(t, tˆ) = τs  if t − tˆ < 1 year,
                τl  if t − tˆ ≥ 1 year,                                  (128.1)

where the subscripts x = s, l stand for short- and long-term, respectively, and 0 ≤ τl ≤ τ. We assume that premiums and discounts are amortized linearly until maturity and that regular interest income is adjusted by these amortizations each period. We ignore the complication of the offset rule for gains and losses and assume that all return realizations are taxed separately. In addition, we assume no capital loss deduction limitation and no restrictions on wash sales, since the wash sale rule can be easily circumvented (see Green, 1993). Default losses are treated as capital losses (Altman and Kishore, 1998). If investors sell bonds before the maturity date, they buy the same bonds back immediately. Given this setting, we next present the model.
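Equation (128.1) is a simple holding-period switch and can be transcribed directly (times measured in years; the function name is ours):

```python
def capital_gains_rate(t, t_hat, tau_s, tau_l):
    """Tax rate applied at sale time t to a position bought at t_hat
    (eq. 128.1): the short-term rate if held less than one year,
    the long-term rate otherwise."""
    return tau_s if t - t_hat < 1.0 else tau_l

# Under scenario II of Section 128.4 (tau_s = 0.50, tau_l = 0.25):
# a sale after six months is taxed at 0.50, after eighteen months at 0.25.
```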
128.3 The Model

Corporate bonds are noncallable and have a par value equal to one, coupon rate ct, and maturity date T. v(t, T) is the time-t price of a corporate bond maturing at time T > t. Discounts and premiums are amortized, and the basis is adjusted each period accordingly. The probability of default for the bond is λ, and the recovery rate is δ. We adopt the recovery-of-face-value formulation; that is, the residual value upon default is expressed as a fraction of the face value of the bond. Investors receive a tax rebate from the loss upon default.
128.3.1 Buy-and-hold strategy

Under the buy-and-hold strategy, the investor buys a bond and holds it to maturity, or claims the residual value if default occurs before maturity. The value of the bond to the investor at time t, U(t, T; v(t, T), tˆ), is the expected present value of the bond U(t + 1, T, v(t + 1, T), tˆ) at the end of the period, t + 1, and the after-tax coupon (1 − τ)ct in the event of no default, plus the expected present value of the residual value and the tax rebate in the event of default at t + 1. For a premium bond, the amortization of the premium is deducted from the interest income in each period. Let vˆ(tˆ, T) be the basis established at the time tˆ when the trading takes place (in the buy-and-hold case, tˆ = t0) and v(t, T) be the basis at time t. The amortization rate at current time t is

    a(t) = (1 − vˆ(tˆ, T)) / (T − tˆ),                                   (128.2)

and the basis is

    v(t, T) = v(t − 1, T) + a(t).                                        (128.3)

The value of the premium bond to the buy-and-hold investor is

    U(t, T, v(t, T), tˆ) = EtP{e−(1−τ)rt [U(t + 1, T, v(t + 1, T), tˆ) + (1 − τ)ct − a(t + 1)τ] 1[t∗>t+1]}
                         + EtP{e−(1−τ)rt [δt + τx(t + 1, tˆ)(v(t + 1, T) − δt)] 1[t∗∈{t,t+1}]},       (128.4)

where EtP denotes the conditional expectation under a risk-neutral probability measure P at time t, and 1{·} is the point process indicating the events of no default and default. The first and second components on the right-hand side represent the expected payoffs without and with default, respectively. At the initial purchase time t0, when the investor first acquires the bond,

    vˆ(t0, T) = v(t0, T) = U(t0, T, v(t0, T), t0).                       (128.5)

At maturity,

    v(T, T) = 1,                                                         (128.6)
    U(T, T, v(T, T), tˆ) = 1 − a(t)τ + (1 − τ)ct.                        (128.7)
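Equations (128.2) and (128.3) define a straight-line glide of the basis from the purchase price to par. A minimal sketch with integer periods (the function names are ours; par is normalized to 1 as in the model):

```python
def amortization_rate(basis_at_purchase, t_hat, T):
    """a(t) from eq. (128.2): the straight-line per-period adjustment
    of the basis toward par (par = 1)."""
    return (1.0 - basis_at_purchase) / (T - t_hat)

def basis_path(basis_at_purchase, t_hat, T):
    """The basis v(t, T) rolled forward by eq. (128.3); reaches par at T."""
    a = amortization_rate(basis_at_purchase, t_hat, T)
    path = [basis_at_purchase]
    for _ in range(T - t_hat):
        path.append(path[-1] + a)
    return path

# A premium bond bought at 1.10 with 5 periods to maturity amortizes at
# a = (1 - 1.10)/5 = -0.02 per period, so the basis declines to par.
```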
For a discount bond, an investor will pay ordinary income tax on the total amortized amount of the discount when the bond matures. The value of the discount bond to the buy-and-hold investor can be expressed as

    U(t, T, v(t, T), tˆ) = EtP{e−(1−τ)rt [U(t + 1, T, v(t + 1, T), tˆ) + (1 − τ)ct] 1[t∗>t+1]}
                         + EtP{e−(1−τ)rt [δt + τx(t + 1, tˆ)(vˆ(t0, T) − δt)] 1[t∗∈{t,t+1}]}.         (128.8)

At maturity, the difference between the face value and the original purchase price is taxed at the ordinary income tax rate:

    v(T, T) = 1,                                                         (128.9)
    U(T, T, vˆ(T, T), tˆ) = 1 − (1 − vˆ(t0, T))τ + (1 − τ)ct.            (128.10)
128.3.2 Optimal trading strategy

We next consider investors who may trade before bond maturity. In each period, investors evaluate the market conditions to see whether they would be better off selling their bonds. If the value of holding the bond is greater than that of selling it, the investor continues to hold the bond. Otherwise, the investor sells the bond and repurchases it from the market. Thus, the worth of the bond to the investor is the maximum of the no-trading (UN) and trading (UT) values:

    U(t, T, v(t, T), tˆ(t)) = max{UN(t, T, v(t, T), tˆ(t)), UT(t, T, v(t, T), tˆ(t))}.               (128.11)

For a premium bond, if the bondholder chooses not to trade, the bond's value to this investor at time t is

    UN(t, T, v(t, T), tˆ(t)) = EtP{e−(1−τ)rt [U(t + 1, T, v(t + 1, T), tˆ(t + 1)) + (1 − τ)ct − aN(t + 1)τ] 1[t∗>t+1]}
                             + EtP{e−(1−τ)rt [δt + τx(t + 1, tˆ)(v(t + 1, T) − δt)] 1[t∗∈{t,t+1}]}.   (128.12)

As there is no trading at time t, the trading time indicator does not change; that is,

    tˆ(t + 1) = tˆ(t),                                                   (128.13)
and consequently,

    aN(t + 1) = (1 − vˆ(tˆ, T)) / (T − tˆ(t)),                           (128.14)
    v(t + 1, T) = v(t, T) + aN(t + 1).                                   (128.15)
Conversely, if the bondholder sells the bond and repurchases it back immediately, the bond value to the investor at time t will be

    UT(t, T, v(t, T), tˆ(t)) = −τx(t, tˆ)[UT(t, T, v(t, T), tˆ(t)) − v(t, T)]
                             + EtP{e−(1−τ)rt [U(t + 1, T, v(t + 1, T), tˆ(t + 1)) + (1 − τ)ct − aT(t + 1)τ] 1[t∗>t+1]}
                             + EtP{e−(1−τ)rt [δt + τx(t + 1, tˆ)(v(t + 1, T) − δt)] 1[t∗∈{t,t+1}]}.   (128.16)

As there is trading at time t, the purchase time is updated:

    tˆ(t + 1) = t,                                                       (128.17)
    vˆ(tˆ, T) = UT(t, T, v(t, T), tˆ(t)),                                (128.18)
    aT(t + 1) = (1 − vˆ(tˆ, T)) / (T − t),                               (128.19)
    v(t + 1, T) = vˆ(tˆ, T) + aT(t + 1),                                 (128.20)
where the basis is reinitialized to reflect the new purchase price. The basis, vˆ(t), the amount of amortization, and the time at which the basis is set, tˆ(t), are all time-dependent. For example, the basis at time t + 1, vˆ(t + 1, T), depends on whether the bond is traded at time t. If there is no trade at time t, the basis is adjusted at the old amortization rate, vˆ(t + 1, T) = vˆ(t, T) + aN(t + 1); otherwise, the basis is adjusted based on the new price and the new amortization rate, vˆ(t + 1, T) = v(t, T) + aT(t + 1). This example illustrates a complicated dynamic programming problem in which the value of the bond to the investor, U(t + 1, T, vˆ(t + 1), tˆ(t + 1)), depends not only on what happens after t + 1, 1{t∗>t+1}, but also on what happens before t + 1, 1{t∗∈{t,t+1}}. The following boundary conditions, parallel to (128.5)–(128.7), must hold:

    vˆ(t0, T) = v(t0, T) = U(t0, T, vˆ(t0, T), t0).                      (128.21)
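The bookkeeping in (128.13)–(128.20) reduces to a small state update: with no trade, the old amortization rate keeps running; with a trade, the purchase time, basis, and amortization rate are all reset at the sale price. A sketch (the state layout and names are ours; par = 1):

```python
def update_state(t, T, basis, a, t_hat, traded, sale_price=None):
    """One-period update of (purchase time, basis, amortization rate).

    No trade (eqs. 128.13-128.15): keep t_hat and the rate a, and roll
    the basis forward by a. Trade (eqs. 128.17-128.20): reset the
    purchase time to t, re-base at the sale price, and recompute the
    straight-line rate over the remaining life T - t."""
    if not traded:
        return t_hat, basis + a, a
    new_a = (1.0 - sale_price) / (T - t)
    return t, sale_price + new_a, new_a
```

The dynamic program in (128.11) then evaluates both branches each period and keeps whichever yields the higher bond value.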
At maturity,

    v(T, T) = 1,                                                         (128.22)
    U(T, T, v(T, T), tˆ(T)) = 1 − a(t)τ + (1 − τ)ct.                     (128.23)
For a discount bond, the holding value to an investor will be

    UN(t, T, v(t, T), tˆ(t)) = EtP{e−(1−τ)rt [U(t + 1, T, v(t + 1, T), tˆ(t + 1)) + (1 − τ)ct] 1[t∗>t+1]}
                             + EtP{e−(1−τ)rt [δt + τx(t + 1, tˆ)(vˆ(t0, T) − δt)] 1[t∗∈{t,t+1}]}.     (128.24)
Once the bondholder sells the bond and repurchases it back immediately, the bond value to the investor at time t will be

    UT(t, T, v(t, T), tˆ(t)) = −τx(t, tˆ)[UT(t, T, v(t, T), tˆ(t)) − v(t, T)] + (1 − τ)[v(t, T) − vˆ(t0, T)]
                             + EtP{e−(1−τ)rt [U(t + 1, T, v(t + 1, T), tˆ(t + 1)) + (1 − τ)ct] 1[t∗>t+1]}
                             + EtP{e−(1−τ)rt [δt + τx(t + 1, tˆ)(v(t + 1, T) − δt)] 1[t∗∈{t,t+1}]}.   (128.25)
As there is trading at time t, the purchase time is updated:

    tˆ(t + 1) = t,                                                       (128.26)
    vˆ(tˆ, T) = UT(t, T, v(t, T), tˆ(t)),                                (128.27)
    aT(t + 1) = (1 − vˆ(tˆ, T)) / (T − t),                               (128.28)
    v(t + 1, T) = vˆ(tˆ, T) + aT(t + 1).                                 (128.29)
Let vBH(t0, T) denote the bond price under the buy-and-hold strategy and vOP(t0, T) the bond price under the optimal trading strategy. The tax-timing option value of this bond in percentage terms is

    TO(t0, T) = (vOP(t0, T) − vBH(t0, T)) / vOP(t0, T).                  (128.30)
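Equation (128.30) is a one-line computation once the two prices are available:

```python
def tax_timing_option_value(v_op, v_bh):
    """TO(t0, T) from eq. (128.30): the share of the optimally traded
    bond price attributable to the tax-timing option."""
    return (v_op - v_bh) / v_op
```

With illustrative prices v_op = 1.0 and v_bh = 0.95, the timing option is worth 5% of the traded price; this is the quantity reported in Table 128.2.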
128.4 Numerical Results

The short-term interest rate, r, is the only state variable in the model, and we assume that it follows a driftless binomial random walk with two reflecting barriers (see Figure 128.1). In the beginning, we adopt the same parameter values (interest rates, coupons, etc.) and tax regime as Constantinides and Ingersoll (1984) for direct comparison with their results. Later, we adopt parameters and a tax regime that are closer to the current economic environment. The short rate r in the binomial system goes up or down with equal probability. We initially consider two volatility scenarios for the interest rate process. In the low-variance process, the interest rate takes on twenty-one values, 0.04, 0.05, . . . , 0.24. At each point in time, the interest rate either increases or decreases by 0.01, each with a probability of one-half. If the interest rate hits one of the reflecting barriers, at the next time point it either remains unchanged or takes on the value 0.05 or 0.23, each with a probability of one-half. In the high-variance process, the interest rate takes on one of eleven values, 0.04, 0.06, . . . , 0.24. The interest rate increases or decreases by 0.02, and the probabilities of an increase and a decrease are the same as in the low-variance process, with the reflecting barriers set at 0.04 and 0.24. Following Constantinides and Ingersoll (1984), we adopt four tax scenarios, except that both premiums and discounts are now amortized for each bond.
Figure 128.1: The interest rate process. [Figure: a recombining binomial tree in which the short rate r moves to ru or rd, and then to ruu, rud = rdu, or rdd; default can occur at each node with probability λ, and U1–U6 denote the corresponding bond values.]
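The driftless binomial walk with reflecting barriers in Figure 128.1 (ignoring default) is straightforward to simulate. A minimal sketch of the low-variance specification (step 0.01 on [0.04, 0.24]; the function name is ours), using an integer tick grid to avoid floating-point drift at the barriers:

```python
import random

def simulate_short_rate(r0=0.14, n_steps=1000, step=0.01, lo=0.04, hi=0.24, seed=0):
    """Driftless binomial walk for the short rate with reflecting barriers.
    Interior nodes move up or down by one step with equal probability; at a
    barrier the rate either stays put or moves one step back inside, each
    with probability 1/2 (Section 128.4, low-variance specification)."""
    rng = random.Random(seed)
    n_ticks = round((hi - lo) / step)   # number of grid intervals
    tick = round((r0 - lo) / step)      # integer position on the grid
    path = [lo + tick * step]
    for _ in range(n_steps):
        up = rng.random() < 0.5
        if tick == 0:                   # lower reflecting barrier
            tick += 1 if up else 0
        elif tick == n_ticks:           # upper reflecting barrier
            tick -= 1 if up else 0
        else:
            tick += 1 if up else -1
        path.append(lo + tick * step)
    return path
```

Setting step = 0.02 reproduces the high-variance process on the eleven-point grid.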
(I) Under scenario I, the marginal investor is an individual. Coupon income is taxed at the marginal income tax rate τ = 0.5. Realized short-term and long-term gains and losses are taxed at the same rate, τs = τl = 0.25.

(II) Under scenario II, the marginal investor is an individual. Coupon income is taxed at the rate τ = 0.5. Realized short-term gains and losses are taxed at the rate τs = 0.5. Realized long-term gains and losses are taxed at the rate τl = 0.25.

(III) Under scenario III, the marginal investor is an individual. Coupon income is taxed at the rate τ = 0.5. Short- and long-term gains and losses are untaxed, i.e., τs = τl = 0.

(IV) Under scenario IV, the marginal bondholder is a bank or a bond dealer. Coupon income and all capital gains and losses are taxed at the rate τ = τs = τl = 0.5.

Scenario I assumes symmetric taxation of long- and short-term capital gains, whereas Scenario II adopts an asymmetric tax treatment of capital gains. Scenario III assumes no capital gains taxes (e.g., individual retirement accounts), and Scenario IV is intended to capture the tax effect when the trader is a dealer or a bank. Since the current US tax laws require amortization for both premium and discount bonds, we incorporate these amortization rules in each of the above four tax scenarios.5 We keep Constantinides and Ingersoll's (1984) assumption for amortization only when we replicate their results for the case of default-free taxable bonds.

128.4.1 Prices and tax-timing option values of default-free bonds

We first replicate Constantinides and Ingersoll's (1984) results for default-free taxable bonds (i.e., λ = 0 in Figure 128.1) under their original setting. The parameter choices are the same as in Constantinides and Ingersoll (1984). The initial short-term interest rate is 14%, and its standard deviation σr is equal to 0.02 and 0.01 in the high- and low-variance cases, respectively. The coupon rates are 6%, 10%, 14%, and 18%.
Unlike their study, we assume no capital loss limitation. Tables 128.1 and 128.2 show the simulated prices and tax-timing option values of default-free bonds. These results are comparable to theirs.6 Minor

5
Note that, similar to our procedure, Constantinides and Ingersoll (1984) use the straight-line method to amortize premiums.
6
The slight difference under the optimal trading strategy may be due to the dynamic programming estimation procedure.
Table 128.1: Treasury bond prices under optimal trading strategy.

                High variance process (σr = 0.02/yr)    Low variance process (σr = 0.01/yr)
Maturity        I       II      III     IV              I       II      III     IV

Coupon = 0.06
 5              0.803   0.804   0.838   0.749           0.801   0.801   0.837   0.746
10              0.689   0.699   0.726   0.640           0.682   0.683   0.722   0.629
15              0.619   0.641   0.648   0.586           0.608   0.614   0.642   0.569
20              0.573   0.607   0.593   0.554           0.563   0.579   0.587   0.536
25              0.543   0.586   0.554   0.534           0.533   0.561   0.550   0.518
30              0.522   0.572   0.526   0.521           0.514   0.553   0.524   0.507

Coupon = 0.10
 5              0.905   0.912   0.923   0.879           0.901   0.901   0.919   0.874
10              0.856   0.890   0.879   0.839           0.845   0.855   0.864   0.822
15              0.828   0.889   0.849   0.825           0.814   0.843   0.832   0.801
20              0.811   0.892   0.826   0.819           0.797   0.847   0.813   0.794
25              0.801   0.896   0.809   0.818           0.789   0.858   0.804   0.793
30              0.795   0.898   0.796   0.817           0.785   0.869   0.798   0.795

Coupon = 0.14
 5              1.020   1.036   1.037   1.012           1.009   1.017   1.017   1.005
10              1.047   1.103   1.087   1.041           1.023   1.055   1.047   1.020
15              1.065   1.160   1.119   1.070           1.037   1.098   1.076   1.039
20              1.076   1.201   1.135   1.092           1.049   1.138   1.100   1.057
25              1.083   1.230   1.141   1.109           1.061   1.174   1.118   1.074
30              1.087   1.248   1.142   1.121           1.070   1.203   1.131   1.088

Coupon = 0.18
 5              1.162   1.175   1.182   1.150           1.150   1.155   1.161   1.143
10              1.268   1.329   1.325   1.254           1.245   1.275   1.281   1.233
15              1.331   1.441   1.416   1.328           1.303   1.369   1.366   1.295
20              1.369   1.519   1.469   1.381           1.340   1.442   1.425   1.340
25              1.392   1.570   1.497   1.416           1.367   1.499   1.465   1.375
30              1.407   1.604   1.510   1.441           1.387   1.543   1.491   1.401

Notes: The initial short-term interest rate is 14%. σr is the annual standard deviation of changes in the short-term interest rate. If the trading price of the bond is above par, this difference is amortized linearly to the maturity date and the basis is increased by the amount amortized. If the trading price of the bond is below par, the difference is treated as a capital gain at the next trading time or at maturity, and the basis is equal to the previous trading price. Tax scenarios are described by their marginal income tax rate τ = 0.50, short-term capital gains tax rate τs, and long-term rate τl: I. τs = τl = 0.25; II. τs = 0.50, τl = 0.25; III. τs = τl = 0; IV. τs = τl = 0.5.
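The straight-line amortization rule in the table notes moves the tax basis linearly from the purchase price toward par over the bond's remaining life. A minimal sketch of that rule; the function name, signature, and "linear toward par" reading are ours, not the chapter's:

```python
def amortized_basis(purchase_price: float, par: float,
                    total_years: float, years_held: float) -> float:
    """Straight-line amortization: the premium (or discount) relative to
    par is written off evenly, so the basis reaches par at maturity.
    A premium basis amortizes down toward par; a discount basis accretes up.
    """
    fraction = min(years_held / total_years, 1.0)
    return purchase_price + (par - purchase_price) * fraction

# A bond bought at 1.10 with 10 years to maturity: after 5 years the
# basis has amortized halfway toward par, to 1.05.
assert abs(amortized_basis(1.10, 1.00, 10, 5) - 1.05) < 1e-12
```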
Estimating the Tax-Timing Option Value of Corporate Bonds

Table 128.2: Tax-timing option value of Treasury bonds under optimal trading strategy.

                High variance process (σr = 0.02/yr)    Low variance process (σr = 0.01/yr)
Maturity        I       II      III     IV              I       II      III     IV

Coupon = 0.06
 5              0.0%    0.1%    0.0%    0.1%            0.0%    0.0%    0.0%    0.0%
10              0.6%    2.0%    0.1%    1.6%            0.1%    0.3%    0.0%    0.4%
15              1.3%    4.8%    0.2%    3.4%            0.5%    1.3%    0.1%    1.4%
20              1.8%    7.3%    0.1%    4.6%            0.9%    3.7%    0.1%    2.4%
25              2.3%    9.5%    0.1%    5.6%            1.4%    6.3%    0.2%    3.4%
30              2.7%   11.2%    0.1%    6.2%            1.8%    8.6%    0.2%    4.2%

Coupon = 0.10
 5              0.2%    1.0%    0.3%    0.4%            0.0%    0.0%    0.0%    0.0%
10              1.1%    4.9%    1.4%    2.2%            0.3%    1.5%    0.2%    0.8%
15              1.8%    8.5%    2.2%    4.0%            0.8%    4.3%    0.8%    2.0%
20              2.3%   11.1%    2.3%    5.2%            1.4%    7.2%    1.6%    3.1%
25              2.7%   12.9%    2.3%    6.1%            2.0%    9.8%    2.4%    4.0%
30              2.9%   14.1%    2.0%    6.7%            2.4%   11.8%    2.9%    4.7%

Coupon = 0.14
 5              1.5%    3.1%    3.2%    0.8%            0.8%    1.5%    1.5%    0.4%
10              3.4%    8.3%    6.9%    2.8%            1.8%    4.7%    4.1%    1.5%
15              4.2%   12.0%    8.7%    4.6%            2.5%    7.9%    6.1%    2.7%
20              4.3%   14.3%    9.3%    5.7%            3.0%   10.5%    7.4%    3.7%
25              4.3%   15.7%    9.2%    6.5%            3.2%   12.6%    8.2%    4.4%
30              4.1%   16.5%    8.7%    7.0%            3.4%   14.0%    8.5%    5.0%

Coupon = 0.18
 5              1.6%    2.7%    3.3%    0.6%            0.9%    1.3%    1.8%    0.3%
10              3.0%    7.4%    7.2%    1.9%            1.9%    4.2%    4.7%    0.9%
15              3.4%   10.8%    9.2%    3.2%            2.2%    7.0%    6.8%    1.6%
20              3.3%   12.9%    9.9%    4.2%            2.3%    9.2%    8.1%    2.3%
25              3.2%   14.2%   10.0%    4.8%            2.4%   11.0%    8.9%    2.9%
30              3.0%   14.9%    9.7%    5.3%            2.5%   12.3%    9.3%    3.4%

Notes: The initial short-term interest rate is 14%. σr is the annual standard deviation of changes in the short-term interest rate. If the trading price of the bond is above par, this difference is amortized linearly to the maturity date and the basis is increased by the amount amortized. If the trading price of the bond is below par, the difference is treated as a capital gain at the next trading time or at maturity, and the basis is equal to the previous trading price. Tax scenarios are described by their marginal income tax rate τ = 0.50, short-term capital gains tax rate τs, and long-term rate τl: I. τs = τl = 0.25; II. τs = 0.50, τl = 0.25; III. τs = τl = 0; IV. τs = τl = 0.5. The tax-timing option value is defined, in percent, as TO(t0, T) = [vOP(t0, T) − vBH(t0, T)]/vOP(t0, T), where vBH(t0, T) and vOP(t0, T) are bond prices under the buy-and-hold and optimal trading strategies, respectively.
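The timing-option definition used throughout these tables translates directly into a one-line helper (function and variable names are ours):

```python
def tax_timing_option(v_op: float, v_bh: float) -> float:
    """TO(t0, T) = (v_OP - v_BH) / v_OP, where v_OP and v_BH are prices
    under the optimal-trading and buy-and-hold strategies."""
    return (v_op - v_bh) / v_op

# Consistency check against a table entry: a 30-year, 14%-coupon bond
# priced at 1.248 under optimal trading with a 16.5% timing option
# implies a buy-and-hold price of 1.248 * (1 - 0.165).
assert abs(tax_timing_option(1.248, 1.248 * (1 - 0.165)) - 0.165) < 1e-12
```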
differences arise because Constantinides and Ingersoll (1984) employed the following formulas to calculate the buy-and-hold prices:

P = (1 − τc)·c·∑_{t=1}^{T} πt + (1 − τl + τl·P)·πT,   for P ≤ 1,   (128.31a)

P = [(1 − τc)·c + τc·(P − 1)]·∑_{t=1}^{T} πt + πT,   for P ≥ 1,   (128.31b)

where πt is the price at time zero of the after-tax cash flow at time t. The main difference between this method and ours is that we allow the interest (or discount) rate to be stochastic.

128.4.2 Equilibrium prices and tax-timing option values for defaultable bonds

We next examine the effects of default risk on the equilibrium price and tax-timing option value. We set λ equal to 1% and δ equal to 50%. The equilibrium prices of defaultable bonds and their tax-timing option values are reported in Tables 128.3 and 128.4, respectively. Bond prices decrease after incorporating the effect of default because default risk reduces the expected payoff of the bond. Bond prices are only slightly higher when the interest rate volatility is higher. Comparing Table 128.3 with Table 128.1, we find that the percentage decrease in bond value is large under Scenario III, especially for par and premium bonds. Since both short- and long-term capital gains taxes are zero under this scenario, an investor has no opportunity to receive a tax rebate on the loss from default; thus the drop in bond price is larger under Scenario III. Bond prices in Scenario II are higher than in the other scenarios. Two factors contribute to the higher prices in Scenario II. First, realization of short-term losses provides valuable rebates, and a short-term holding period is not difficult to establish. Second, compared to Scenario IV, Scenario II gives investors an opportunity to realize long-term capital gains at a lower tax rate whenever a realization of capital gains is optimal. In some circumstances investors may want to realize capital gains in order to establish a short-term holding status. Alternatively, investors may postpone capital gains until maturity. In any event, these benefits are lowest for short-maturity bonds because their market prices are least volatile.
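As printed, each of equations (128.31a) and (128.31b) is linear in the unknown price P, so the buy-and-hold price can be solved in closed form once the discount factors πt are given. A minimal sketch under a hypothetical flat after-tax discount curve (the chapter's model instead uses stochastic rates; the function name and arguments are illustrative):

```python
def buy_and_hold_price(c, tau_c, tau_l, pi):
    """Solve eqs. (128.31a)/(128.31b) for the buy-and-hold price P.

    pi[t-1] is the time-0 price of the after-tax cash flow at t = 1..T,
    so pi[-1] = pi_T.  We try the discount branch (P <= 1) first and
    fall back to the premium branch.
    """
    s, pi_T = sum(pi), pi[-1]
    # (128.31a): P = (1 - tau_c) c s + (1 - tau_l + tau_l P) pi_T
    p = ((1 - tau_c) * c * s + (1 - tau_l) * pi_T) / (1 - tau_l * pi_T)
    if p > 1:
        # (128.31b): P = [(1 - tau_c) c + tau_c (P - 1)] s + pi_T
        p = ((1 - tau_c) * c * s - tau_c * s + pi_T) / (1 - tau_c * s)
    return p

# Hypothetical flat 7% curve, 10-year 4% coupon, tau_c = 0.5, tau_l = 0.25:
# a discount bond, so branch (128.31a) applies and the fixed point holds.
pi = [1.07 ** -t for t in range(1, 11)]
p = buy_and_hold_price(0.04, 0.5, 0.25, pi)
assert p <= 1
rhs = (1 - 0.5) * 0.04 * sum(pi) + (1 - 0.25 + 0.25 * p) * pi[-1]
assert abs(p - rhs) < 1e-12
```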
For this reason, prices of short-maturity bonds under Scenario II are very close to prices under the other scenarios, particularly when interest rate volatility is low. Default risk tends to lower the tax-timing option value. As shown in Table 128.4, the tax-timing option value decreases under Scenarios I, II,
Table 128.3: Defaultable bond prices under optimal trading strategy.

                High variance process (σr = 0.02/yr)    Low variance process (σr = 0.01/yr)
Maturity        I       II      III     IV              I       II      III     IV

Coupon = 0.06
 5              0.789   0.790   0.821   0.738           0.787   0.787   0.820   0.735
10              0.670   0.678   0.703   0.627           0.664   0.666   0.699   0.616
15              0.600   0.614   0.624   0.572           0.591   0.596   0.619   0.557
20              0.556   0.575   0.571   0.541           0.546   0.555   0.566   0.525
25              0.528   0.552   0.536   0.522           0.519   0.532   0.531   0.507
30              0.509   0.537   0.511   0.510           0.502   0.519   0.508   0.497

Coupon = 0.10
 5              0.887   0.893   0.902   0.866           0.884   0.886   0.899   0.860
10              0.828   0.851   0.842   0.818           0.820   0.828   0.833   0.803
15              0.795   0.833   0.802   0.800           0.784   0.801   0.792   0.779
20              0.777   0.825   0.774   0.792           0.764   0.792   0.768   0.769
25              0.765   0.821   0.755   0.788           0.754   0.792   0.753   0.766
30              0.758   0.819   0.742   0.786           0.749   0.794   0.744   0.766

Coupon = 0.14
 5              0.994   1.004   0.998   0.993           0.984   0.992   0.983   0.987
10              1.003   1.040   1.018   1.011           0.982   1.006   0.986   0.992
15              1.010   1.073   1.029   1.031           0.986   1.026   0.994   1.003
20              1.013   1.095   1.031   1.047           0.991   1.048   1.003   1.016
25              1.015   1.111   1.030   1.058           0.997   1.067   1.011   1.027
30              1.016   1.120   1.025   1.066           1.002   1.082   1.016   1.037

Coupon = 0.18
 5              1.127   1.135   1.135   1.127           1.116   1.122   1.115   1.120
10              1.208   1.249   1.239   1.214           1.187   1.212   1.200   1.194
15              1.253   1.329   1.298   1.274           1.227   1.277   1.256   1.244
20              1.278   1.381   1.328   1.315           1.253   1.324   1.292   1.279
25              1.292   1.414   1.340   1.341           1.270   1.359   1.314   1.305
30              1.299   1.434   1.342   1.359           1.282   1.384   1.326   1.324

Notes: The initial short-term interest rate is 14%. σr is the annual standard deviation of changes in the short-term interest rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. Tax scenarios are described by their marginal income tax rate τ = 0.50, short-term capital gains tax rate τs, and long-term rate τl: I. τs = τl = 0.25; II. τs = 0.50, τl = 0.25; III. τs = τl = 0; IV. τs = τl = 0.5. The default process is exogenously specified, with default probability λ equal to 1% and recovery rate δ equal to 50%.
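The exogenous default process in the notes (λ = 1%, δ = 50%) lowers prices because it lowers expected payoffs. A back-of-envelope illustration of a single period, ignoring taxes and discounting (names and the simplification are ours, not the chapter's model):

```python
def expected_one_period_payoff(v_no_default, par=1.0, lam=0.01, delta=0.5):
    """With probability lam the bond defaults and pays the recovery rate
    delta on par; otherwise it is worth v_no_default.  Illustrates why
    default risk reduces the prices in Table 128.3 relative to Table 128.1."""
    return (1 - lam) * v_no_default + lam * delta * par

# A bond worth 1.02 absent default loses roughly half a point of value:
assert abs(expected_one_period_payoff(1.02) - 1.0148) < 1e-12
```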
Table 128.4: Tax-timing option value of defaultable bonds under optimal trading strategy.

                High variance process (σr = 0.02/yr)    Low variance process (σr = 0.01/yr)
Maturity        I       II      III     IV              I       II      III     IV

Coupon = 0.06
 5              0.1%    0.2%    0.0%    0.1%            0.1%    0.1%    0.0%    0.0%
10              0.5%    1.8%    0.1%    1.5%            0.2%    0.4%    0.0%    0.4%
15              1.2%    3.6%    0.2%    3.2%            0.5%    1.3%    0.0%    1.3%
20              1.6%    5.2%    0.1%    4.3%            0.7%    2.3%    0.0%    2.2%
25              2.1%    6.8%    0.1%    5.2%            1.2%    3.7%    0.0%    3.1%
30              2.4%    7.9%    0.1%    5.9%            1.5%    5.0%    0.0%    3.8%

Coupon = 0.10
 5              0.1%    0.8%    0.1%    0.4%            0.0%    0.2%    0.0%    0.1%
10              0.8%    3.5%    0.6%    2.1%            0.2%    1.2%    0.1%    0.8%
15              1.4%    6.2%    0.9%    3.7%            0.7%    2.9%    0.3%    1.8%
20              1.9%    8.3%    0.8%    4.9%            1.1%    4.9%    0.7%    2.8%
25              2.3%    9.8%    0.6%    5.8%            1.6%    6.6%    1.1%    3.6%
30              2.5%   10.8%    0.5%    6.3%            1.9%    8.0%    1.3%    4.2%

Coupon = 0.14
 5              1.1%    2.1%    1.7%    0.6%            0.3%    1.1%    0.4%    0.3%
10              2.7%    6.5%    4.8%    2.6%            1.1%    3.5%    2.0%    1.3%
15              3.5%    9.9%    6.4%    4.4%            1.8%    5.9%    3.5%    2.4%
20              3.6%   12.1%    6.8%    5.6%            2.3%    8.1%    4.6%    3.4%
25              3.6%   13.4%    6.6%    6.4%            2.6%    9.8%    5.4%    4.1%
30              3.6%   14.2%    6.0%    6.9%            2.8%   10.9%    5.7%    4.6%

Coupon = 0.18
 5              1.4%    2.1%    3.1%    0.5%            0.7%    1.3%    1.6%    0.2%
10              2.8%    6.3%    6.9%    1.8%            1.6%    3.8%    4.3%    0.8%
15              3.1%    9.4%    8.7%    3.1%            1.9%    6.0%    6.1%    1.5%
20              3.0%   11.4%    9.1%    4.0%            2.0%    7.8%    7.2%    2.2%
25              2.8%   12.6%    8.8%    4.6%            2.0%    9.2%    7.7%    2.7%
30              2.6%   13.2%    8.2%    5.1%            2.0%   10.1%    7.7%    3.1%

Notes: The initial short-term interest rate is 14%. σr is the annual standard deviation of changes in the short-term interest rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. Tax scenarios are described by their marginal income tax rate τ = 0.50, short-term capital gains tax rate τs, and long-term rate τl: I. τs = τl = 0.25; II. τs = 0.50, τl = 0.25; III. τs = τl = 0; IV. τs = τl = 0.5. The tax-timing option value is defined, in percent, as TO(t0, T) = [vOP(t0, T) − vBH(t0, T)]/vOP(t0, T), where vBH(t0, T) and vOP(t0, T) are bond prices under the buy-and-hold and optimal trading strategies, respectively. The default process is exogenously specified, with default probability λ equal to 1% and recovery rate δ equal to 50%.
III and IV, compared to the results in Table 128.2. This is intuitive: once a bond defaults, the investor can no longer exploit tax-trading opportunities in future trading periods, so the tax-timing option value decreases. Timing option values under Scenario II are expected to be highest for two reasons. First, investors are able to choose their positions optimally to fully exploit all available tax benefits, since the restrictive offset rule does not apply here. Second, the asymmetric capital gains taxes in this scenario allow investors to benefit most from the tradeoff between the tax rebate from short-term capital loss realization and the tax payment on long-term capital gains at a lower rate. Scenario III allows investors to maneuver their amortization basis without incurring any tax penalty on capital gain realization, but investors lose the opportunity to receive tax rebates from capital losses. The former effect dominates the latter for par and premium bonds. This explains why timing option values for discount bonds under Scenario III in Table 128.4 are close to zero, whereas for par and premium bonds the tax-timing option values are much larger under this scenario. Conversely, for discount bonds under Scenarios I and IV, investors have a greater chance to realize capital losses and receive tax rebates because the basis of discount bonds increases over time. Under Scenario IV, the tax rebate on capital losses is larger due to the higher capital gains tax rate. This is why tax-timing option values for discount bonds are higher under Scenario IV than under Scenario III. In general, the tax-timing option value increases with the coupon rate. This is because the basis of premium bonds decreases by the amount amortized each period, giving investors an incentive to trade to establish a higher basis. Whenever the bond price exceeds the current basis, investors are inclined to establish a higher basis to reduce taxes via future amortization.
The situation is different for discount bonds. Under Scenario III, investors still want to trade whenever the trading price exceeds the current basis, but their main purpose is to reduce the income tax burden, since the amortized amount of a discount bond is taxed at the regular income tax rate. Our results show that the tax advantage of premium bonds outweighs that of discount bonds. In summary, the tax-timing option value remains sizable when the default effect is accounted for. The tax-timing option value is higher for long-term discount bonds when interest rate volatility is higher. This value can be more than 14% for long-maturity premium bonds under asymmetric taxation. By contrast, the tax-timing option value is modest for short-maturity (five-year) bonds under all scenarios.
128.4.3 Effects of transaction costs

Transaction costs may reduce the frequency of trades and hence the tax-timing option value. To incorporate the effects of transaction costs, we set the trading cost at 0.5%, 1% and 2% of the transaction price. In the interest of brevity, we report the results only for par bonds at high interest rate volatility (σr = 0.02). Table 128.5 reports timing option values in the presence of transaction costs. For ease of comparison, we also report the tax-timing option values with no transaction cost in the first column of each scenario. The results show that transaction costs have only a modest effect on the tax-timing option value when they are low. For example, when the transaction cost is 0.5%, the tax-timing option value decreases only slightly. Even at the 1% level, the tax-timing option value drops a little more but remains sizable, especially under Scenarios II, III and IV. As the transaction cost increases to 2%, the timing option value drops substantially. For example, under Scenario II the timing option value drops from 14.2% to 4.2% for 30-year bonds. The results show that the timing option value is relatively small when transaction costs are high. The sensitivity of the timing-option value to transaction costs is higher in the case of asymmetric taxation (Scenario II). Schultz (2001) finds that round-trip trading costs average about 0.27% for institutional trades. Hong and Warga (1998) report institutional bond trading costs of 0.13% for investment-grade bonds and 0.19% for non-investment-grade bonds. Edwards, Harris and Piwowar (2004) report an average round-trip cost of 0.54% for a representative institutional trade and 1.38% for a retail trade. Within the range of these transaction costs, our results show that the tax-timing option value is sizable.

128.4.4 Changes in interest rate volatility and tax regime

Tax regimes and interest rates change over time.
Changes in tax rates and interest rates affect investors' trading strategies, which in turn change the value of the tax-timing option. We next examine the sensitivity of bond prices and the tax-timing option value to changes in interest and tax rates. Table 128.6 provides summary statistics of one-month Treasury bill rates from 1981 to 2001. The average short rate drops from 7.75% in 1981–1991 to 4.51% in 1991–2001. The standard deviation of interest rates in the latter period (0.95%) is less than half that in the former (2.60%). In 1996–2001, it further decreases to 0.59%. To accommodate this trend, we change the initial short-term interest rate to
Table 128.5: Effects of transaction costs on the tax-timing option value of par bonds.

                Timing option values at different transaction costs
Maturity        0.0%    0.5%    1%      2%

Tax scenario I
 5              1.1%    0.6%    0.5%    0.6%
10              2.7%    1.8%    1.3%    1.1%
15              3.5%    2.6%    2.0%    1.5%
20              3.6%    2.8%    2.2%    1.7%
25              3.6%    2.9%    2.3%    1.8%
30              3.6%    2.8%    2.2%    1.8%

Tax scenario II
 5              2.1%    1.6%    1.2%    0.8%
10              6.5%    5.3%    3.7%    2.3%
15              9.9%    8.1%    5.8%    3.4%
20             12.1%    9.9%    7.1%    3.9%
25             13.4%   11.1%    7.8%    4.2%
30             14.2%   11.7%    8.2%    4.2%

Tax scenario III
 5              1.7%    1.5%    1.2%    0.9%
10              4.8%    4.1%    3.2%    2.4%
15              6.4%    5.4%    4.3%    3.3%
20              6.8%    5.7%    4.5%    3.6%
25              6.6%    5.4%    4.3%    3.5%
30              6.0%    4.9%    3.9%    3.2%

Tax scenario IV
 5              0.6%    0.4%    0.3%    0.5%
10              2.6%    2.2%    1.7%    1.4%
15              4.4%    3.9%    3.3%    2.6%
20              5.6%    5.1%    4.4%    3.5%
25              6.4%    5.9%    5.2%    4.2%
30              6.9%    6.4%    5.7%    4.6%

Notes: Computed at the midpoint of the interest rate range, r = 0.14. The interest rate follows the high-variance process with a standard deviation of 0.02 per year. The coupon rate is 0.14. Tax scenarios are described by their marginal income tax rate τ = 0.50, short-term capital gains tax rate τs, and long-term rate τl: I. τs = τl = 0.25; II. τs = 0.50, τl = 0.25; III. τs = τl = 0; IV. τs = τl = 0.5. The tax-timing option value is defined, in percent, as TO(t0, T) = [vOP(t0, T) − vBH(t0, T)]/vOP(t0, T), where vBH(t0, T) and vOP(t0, T) are bond prices under the buy-and-hold and optimal trading strategies, respectively. The default process is exogenously specified, with default probability λ equal to 1% and recovery rate δ equal to 50%.
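Why a proportional transaction cost erodes the timing option can be seen in a back-of-envelope check of a tax-motivated loss sale: the trade is worthwhile only while the rebate exceeds the round-trip cost. A hypothetical helper (ours, not the chapter's dynamic program):

```python
def net_gain_from_loss_sale(price, basis, tau_s, roundtrip_cost):
    """Tax rebate from realizing a capital loss, net of a proportional
    round-trip transaction cost.  Back-of-envelope only: ignores the
    option value of waiting and the re-established basis."""
    rebate = tau_s * max(basis - price, 0.0)
    cost = roundtrip_cost * price
    return rebate - cost

# A 5-point loss taxed at 50% yields a 2.5-point rebate, which still
# beats a 2% round-trip cost on a 0.95 price (cost = 0.019) ...
assert net_gain_from_loss_sale(0.95, 1.00, 0.50, 0.02) > 0
# ... but a 1-point loss does not cover the same cost.
assert net_gain_from_loss_sale(0.99, 1.00, 0.50, 0.02) < 0
```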
Table 128.6: Summary statistics of one-month Treasury bill rates (%).

Years                   1981–1991   1991–2001   1996–2001
Mean                       7.75        4.51        4.86
Standard deviation         2.60        0.95        0.59
6% and lower the coupon rates to 2%, 4%, 6% and 8%. In addition, the standard deviation of interest rates is reduced to 1% (high) and 0.5% (low). The highest individual income tax bracket was 39.6% during the Clinton Administration. The maximum federal income tax rate (τF) was reduced to 35% for both individuals and corporations during the Bush Administration,7 and the maximum state income tax rate (τS) ranges from 5% to 10%. The effective income tax rate for corporate bond investors is τ = τF + τS(1 − τF), as the state tax is a deduction against the federal income tax. Combining both federal and state taxes, the maximum effective tax rate is about 40%. We thus lower the marginal income tax rate to 40% and the capital gains tax rate to 20% in Scenario I, to 20% and 40% for the long- and short-term gains tax rates in Scenario II, and to 40% for both short- and long-term gains tax rates in Scenario IV. Both short- and long-term capital gains tax rates remain zero in Scenario III.

Table 128.7 shows bond prices under the new tax and interest rate setting. Compared to the results in Table 128.3, prices for discount bonds are higher whereas prices for par and premium bonds are somewhat lower. These results hold even if the volatility level is fixed at 1%. Prices are not very sensitive to interest rate volatility. As shown, an increase in interest rate volatility from 0.5% to 1% affects bond prices only marginally.

Table 128.8 reports tax-timing option values. Compared to those in Table 128.4 with the same interest rate volatility (σr = 1%), lower tax and interest rates generally increase the timing option value. The effect on the tax-timing option value is larger for long-term bonds, for premium bonds, and in the case of asymmetric taxation. Figure 128.2 shows the tax-timing option values for par bonds (6% coupon rate and 14% coupon rate) given 1% interest rate volatility under Scenario II. The tax-timing option value increases by about 0% to 3% as maturity increases.
Nevertheless, tax-timing option values continue to account for a notable portion of bond prices under all scenarios. Unlike its effect on bond prices, interest rate volatility strongly affects the tax-timing option value. Lowering the interest rate volatility from 1% to 0.5% reduces the value of the timing option substantially. The impact of interest rate volatility on the timing option value is larger for longer-term bonds.
7. According to the tax bill passed on May 23, 2003, the highest income tax rate for individuals was reduced to 35%, retroactive to January 1, 2003. The next three rates are 33%, 28% and 25%. This act accelerates the tax reductions scheduled for 2004 through 2006 by the Economic Growth and Tax Relief Reconciliation Act of 2001 (see the report in the Wall Street Journal, May 23, 2003).
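The effective-rate formula τ = τF + τS(1 − τF) from Section 128.4.4 can be verified directly (the helper name is ours):

```python
def effective_tax_rate(tau_federal: float, tau_state: float) -> float:
    """Combined rate when the state tax is deductible against federal
    income tax: tau = tau_F + tau_S * (1 - tau_F)."""
    return tau_federal + tau_state * (1 - tau_federal)

# 35% federal plus a 10% state tax gives 0.35 + 0.10 * 0.65 = 0.415,
# i.e. roughly the 40% effective rate used in the text.
assert abs(effective_tax_rate(0.35, 0.10) - 0.415) < 1e-12
```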
Table 128.7: Defaultable bond prices under optimal trading strategy.

                High variance process (σr = 0.01/yr)    Low variance process (σr = 0.005/yr)
Maturity        I       II      III     IV              I       II      III     IV

Coupon = 0.02
 5              0.853   0.854   0.873   0.823           0.852   0.853   0.872   0.822
10              0.747   0.754   0.774   0.709           0.743   0.744   0.772   0.703
15              0.670   0.684   0.697   0.634           0.662   0.665   0.692   0.623
20              0.612   0.635   0.637   0.582           0.602   0.608   0.630   0.567
25              0.569   0.598   0.590   0.544           0.557   0.568   0.582   0.527
30              0.535   0.571   0.553   0.516           0.524   0.540   0.545   0.499

Coupon = 0.04
 5              0.916   0.920   0.926   0.902           0.915   0.917   0.925   0.900
10              0.860   0.875   0.872   0.844           0.854   0.859   0.866   0.837
15              0.820   0.849   0.832   0.810           0.809   0.821   0.821   0.795
20              0.792   0.835   0.801   0.788           0.778   0.799   0.787   0.769
25              0.771   0.826   0.777   0.775           0.756   0.786   0.763   0.752
30              0.756   0.822   0.757   0.766           0.740   0.780   0.745   0.741

Coupon = 0.06
 5              0.983   0.989   0.984   0.982           0.979   0.984   0.978   0.980
10              0.980   1.001   0.986   0.980           0.967   0.981   0.966   0.971
15              0.981   1.022   0.990   0.986           0.962   0.987   0.962   0.969
20              0.985   1.045   0.995   0.996           0.960   0.998   0.962   0.972
25              0.988   1.067   0.998   1.007           0.962   1.014   0.965   0.978
30              0.991   1.087   1.001   1.017           0.965   1.031   0.970   0.985

Coupon = 0.08
 5              1.062   1.066   1.062   1.064           1.057   1.061   1.053   1.062
10              1.118   1.139   1.127   1.119           1.103   1.118   1.102   1.110
15              1.162   1.207   1.180   1.168           1.140   1.169   1.145   1.150
20              1.197   1.268   1.221   1.210           1.169   1.215   1.180   1.184
25              1.223   1.319   1.252   1.246           1.193   1.258   1.210   1.213
30              1.244   1.363   1.275   1.275           1.213   1.296   1.235   1.238

Notes: The initial short-term interest rate is 6%. σr is the annual standard deviation of changes in the short rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. Tax scenarios are described by their marginal income tax rate τ = 0.40, short-term capital gains tax rate τs, and long-term rate τl: I. τs = τl = 0.2; II. τs = 0.40, τl = 0.2; III. τs = τl = 0; IV. τs = τl = 0.4. The default process is exogenously specified, with default probability λ equal to 1% and recovery rate δ equal to 50%.
Table 128.8: Tax-timing option value of defaultable bonds under optimal trading strategy.

                High variance process (σr = 0.01/yr)    Low variance process (σr = 0.005/yr)
Maturity        I       II      III     IV              I       II      III     IV

Coupon = 0.02
 5              0.0%    0.1%    0.0%    0.0%            0.0%    0.1%    0.0%    0.0%
10              0.3%    1.1%    0.1%    0.6%            0.0%    0.2%    0.0%    0.1%
15              0.6%    2.7%    0.1%    1.3%            0.1%    0.6%    0.0%    0.4%
20              0.9%    4.6%    0.2%    2.2%            0.3%    1.4%    0.0%    0.8%
25              1.2%    6.4%    0.1%    3.0%            0.5%    2.5%    0.1%    1.4%
30              1.4%    8.1%    0.1%    3.7%            0.8%    4.0%    0.1%    2.0%

Coupon = 0.04
 5              0.0%    0.4%    0.0%    0.1%            0.0%    0.2%    0.0%    0.0%
10              0.5%    2.2%    0.5%    0.8%            0.1%    0.7%    0.0%    0.2%
15              0.9%    4.4%    0.8%    1.7%            0.2%    1.7%    0.1%    0.6%
20              1.4%    6.8%    1.1%    2.6%            0.5%    3.2%    0.3%    1.2%
25              1.6%    8.9%    1.3%    3.5%            0.9%    4.8%    0.6%    1.8%
30              1.9%   10.8%    1.3%    4.2%            1.2%    6.6%    1.0%    2.4%

Coupon = 0.06
 5              0.4%    1.0%    0.6%    0.2%            0.1%    0.6%    0.1%    0.0%
10              1.4%    3.6%    2.4%    1.0%            0.4%    1.8%    0.6%    0.4%
15              2.3%    6.5%    3.9%    2.0%            0.9%    3.5%    1.5%    0.9%
20              3.0%    9.4%    5.0%    3.0%            1.4%    5.4%    2.4%    1.6%
25              3.4%   11.7%    5.7%    3.9%            1.9%    7.3%    3.3%    2.2%
30              3.7%   13.7%    6.1%    4.6%            2.3%    9.2%    4.1%    2.8%

Coupon = 0.08
 5              0.7%    1.1%    1.4%    0.1%            0.3%    0.8%    0.7%    0.1%
10              1.9%    3.8%    4.0%    0.8%            1.0%    2.3%    2.1%    0.3%
15              2.6%    6.6%    5.9%    1.5%            1.5%    4.0%    3.5%    0.7%
20              3.1%    9.2%    7.2%    2.3%            1.8%    5.9%    4.8%    1.1%
25              3.3%   11.4%    8.0%    2.9%            2.1%    7.6%    5.7%    1.6%
30              3.4%   13.2%    8.4%    3.5%            2.3%    9.3%    6.5%    2.0%

Notes: The initial short-term interest rate is 6%. σr is the annual standard deviation of changes in the short-term interest rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. Tax scenarios are described by their marginal income tax rate τ = 0.40, short-term capital gains tax rate τs, and long-term rate τl: I. τs = τl = 0.2; II. τs = 0.40, τl = 0.2; III. τs = τl = 0; IV. τs = τl = 0.4. The tax-timing option value is defined, in percent, as TO(t0, T) = [vOP(t0, T) − vBH(t0, T)]/vOP(t0, T), where vBH(t0, T) and vOP(t0, T) are bond prices under the buy-and-hold and optimal trading strategies, respectively. The default process is exogenously specified, with default probability λ equal to 1% and recovery rate δ equal to 50%.
Figure 128.2: Tax-timing option values under different initial interest rates. [The figure, titled "Effect of Interest Rate Level Change on Tax Timing Option Value (Tax Scenario II)", plots the tax-timing option value (vertical axis, 0% to 16%) against maturity (5 to 30 years) for two series: the par-bond tax-timing option value at a 14% initial discount rate and at a 6% initial discount rate.]
Summarizing, lower tax and interest rates increase the timing option value for individual investors under asymmetric taxation and for dealers and banks (Scenario IV). Despite low interest rates and volatility, the tax-timing option value remains sizable for longer-maturity bonds under asymmetric taxation. The tax-timing option value is above 10% for par and premium bonds with maturities longer than 20 years when interest rate volatility is around 1%.
128.4.5 Sensitivity of tax-timing option values to default risk

We next assess the impact of changes in default probabilities and recovery rates on tax-timing option values. The initial short rate is set at 6%, its standard deviation is 1%, and the recovery rate is 50%. The tax regime is identical to that in Table 128.8. We vary the default probability from 0% to 4%. Other things being equal, the tax-timing option value should decrease as the default probability increases, since default forces investors to close their position before maturity and stops the trading process. This default effect is expected to be stronger for longer-maturity bonds, since the tax-timing option is a compound option. However, the situation becomes more complicated when default-related tax effects are taken into account. For example,
investors may sell the bond to establish a short-term status and thereby receive a higher tax rebate upon default. Table 128.9 reports the tax-timing option values associated with different default probabilities. We plot the timing option values for bonds with a 6% coupon rate under all scenarios in Figure 128.3. The curve with diamonds represents the results for zero default probability; squares represent 1% default probability, triangles 2%, and crosses 4%. As shown, the timing option value decreases with default probability under all four scenarios. Default risk has a larger impact on the timing option value under Scenarios I and III and for longer-maturity bonds. For example, for 6% coupon bonds with a 30-year maturity, the tax-timing option value under Scenario III decreases from 10.7% to 0.1% as the default probability increases from zero to 4%. Default risk puts investors under Scenario III at the greatest disadvantage because they cannot claim tax rebates on default losses. In contrast, investors under Scenarios II and IV are in a better position to shield themselves from the burden of default risk due to relatively high capital gains tax rates and larger expected tax rebates from default losses. This may explain why the effect of default risk on the tax-timing option value is milder under these two scenarios. The results suggest that the tax-timing option value is lower for speculative-grade bonds. For short-maturity bonds, the timing option value is low when the default probability is high. In contrast, the tax-timing option is still above 10% for bonds with maturities longer than 20 years even when the default probability is high. Not surprisingly, for high-quality bonds with negligible default risk, the tax-timing option becomes more valuable. Table 128.10 reports the effects of changes in recovery rates on the tax-timing option value. Given maturity, the tax-timing option value is almost flat across recovery rates.
In some cases there are slight increases or decreases, but the magnitude is generally small, indicating that the recovery rate does not have a material effect for these two tax regimes.

128.4.6 Multiple trading dates

The preceding analyses assume that there is only one transaction a year, similar to the setting in Constantinides and Ingersoll (1984). We next extend our analysis to allow for two transactions a year. Table 128.11 reports the tax-timing option values when investors can trade twice a year (i.e., two trading periods). We report tax-timing option values for bonds with maturities up to 15 years. As shown, the tax-timing option values under all scenarios are larger
Table 128.9: Tax-timing option value with different default probabilities (all entries in %).

           Default = 0%          Default = 1%          Default = 2%          Default = 4%
Maturity   I    II   III   IV    I    II   III   IV    I    II   III   IV    I    II   III   IV

Coupon = 2%
 5        0.0  0.0  0.0  0.0   0.0  0.1  0.0  0.0   0.0  0.3  0.0  0.0   0.0  0.6  0.0  0.0
10        0.3  1.0  0.1  0.7   0.3  1.1  0.1  0.6   0.2  1.3  0.1  0.5   0.2  1.6  0.1  0.4
15        0.6  2.7  0.2  1.5   0.6  2.7  0.1  1.3   0.4  2.7  0.1  1.1   0.3  2.8  0.1  0.9
20        1.1  4.8  0.3  2.6   0.9  4.6  0.2  2.2   0.8  4.4  0.2  1.9   0.6  3.9  0.1  1.4
25        1.4  6.9  0.2  3.6   1.2  6.4  0.1  3.0   1.0  5.9  0.1  2.5   0.7  4.9  0.1  1.8
30        1.9  9.0  0.2  4.6   1.4  8.1  0.1  3.7   1.2  7.3  0.1  3.1   0.8  5.6  0.1  2.2

Coupon = 4%
 5        0.1  0.3  0.1  0.1   0.0  0.4  0.0  0.1   0.0  0.6  0.0  0.1   0.0  1.0  0.0  0.0
10        0.7  2.1  1.1  0.9   0.5  2.2  0.5  0.8   0.3  2.3  0.2  0.7   0.2  2.6  0.1  0.5
15        1.4  4.6  2.2  1.9   0.9  4.4  0.8  1.7   0.6  4.3  0.2  1.4   0.4  4.3  0.1  1.1
20        2.1  7.3  3.1  3.1   1.4  6.8  1.1  2.6   0.9  6.4  0.3  2.2   0.6  6.0  0.1  1.6
25        2.6  9.8  3.8  4.2   1.6  8.9  1.3  3.5   1.1  8.2  0.2  2.9   0.7  7.2  0.0  2.0
30        3.0 12.2  4.3  5.2   1.9 10.8  1.3  4.2   1.3  9.7  0.2  3.4   0.9  8.1  0.0  2.4

Coupon = 6%
 5        0.7  1.0  1.5  0.2   0.4  1.0  0.6  0.2   0.2  1.1  0.2  0.1   0.0  1.4  0.0  0.1
10        2.2  3.7  4.5  1.1   1.4  3.6  2.4  1.0   0.9  3.6  1.1  0.8   0.3  3.7  0.2  0.7
15        3.4  6.8  6.9  2.3   2.3  6.5  3.9  2.0   1.4  6.2  1.8  1.7   0.6  5.9  0.2  1.2
20        4.2  9.8  8.7  3.5   3.0  9.4  5.0  3.0   1.9  8.6  2.4  2.5   0.8  7.7  0.2  1.8
25        4.8 12.3  9.9  4.5   3.4 11.7  5.7  3.9   2.2 10.6  2.6  3.2   0.9  9.1  0.2  2.2
30        5.1 14.7 10.7  5.4   3.7 13.7  6.1  4.6   2.3 12.2  2.7  3.7   1.0 10.1  0.1  2.5

Coupon = 8%
 5        0.7  0.7  1.5  0.2   0.7  1.1  1.4  0.1   0.6  1.6  1.3  0.1   0.2  2.0  0.2  0.2
10        2.1  3.1  4.4  0.8   1.9  3.8  4.0  0.8   1.7  4.5  3.5  0.7   0.9  4.9  1.0  0.8
15        3.0  5.8  6.8  1.6   2.6  6.6  5.9  1.5   2.4  7.4  5.0  1.4   1.4  7.6  1.5  1.4
20        3.6  8.6  8.5  2.4   3.1  9.2  7.2  2.3   2.8  9.8  5.9  2.2   1.6  9.6  1.7  2.0
25        3.9 10.8  9.8  3.2   3.3 11.4  8.0  2.9   2.9 11.8  6.2  2.7   1.7 11.1  1.8  2.3
30        4.0 12.9 10.7  3.9   3.4 13.2  8.4  3.5   2.9 13.3  6.3  3.2   1.7 12.1  1.7  2.6

Notes: The initial short-term interest rate is 6%. σr = 1% is the annual standard deviation of changes in the short-term interest rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. The four tax scenarios are defined as in Table 128.8. The tax-timing option value is defined, in percent, as TO(t0, T) = [vOP(t0, T) − vBH(t0, T)]/vOP(t0, T), where vBH(t0, T) and vOP(t0, T) are bond prices under the buy-and-hold and optimal trading strategies, respectively. The default process is exogenously specified, with default probability λ equal to 0%, 1%, 2% and 4%, respectively, and recovery rate δ equal to 50%.
July 6, 2020
Figure 128.3: Tax timing option values for bonds with different default rates (top panel: Tax Scenario I; bottom panel: Tax Scenario II; y-axis: tax timing option value in percent; x-axis: maturity in years, 5–30).
Notes: Diamonds — 0% default probability, squares — 1% default probability, triangles — 2% default probability and crosses — 4% default probability. The recovery rate δ equals 50%. The coupon rate equals 6%. The initial short-term interest rate is 6%. σr = 1% is the annual standard deviation of changes in the short-term interest rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. Four tax scenarios are defined as in Tables 128.8 and 128.9. The value of a tax-timing option is defined as TO(t0, T) = [vOP(t0, T) − vBH(t0, T)]/vOP(t0, T) in percentage, where vBH(t0, T) and vOP(t0, T) are bond prices under the buy-and-hold and optimal trading strategies, respectively.
Figure 128.3: (Continued) Tax timing option values under Tax Scenarios III and IV (y-axis: tax timing option value in percent; x-axis: maturity in years, 5–30).
than those reported in Table 128.8 as investors now have more opportunities to trade. The increase in the tax-timing option values tends to be higher for Scenario II and for low-coupon (discount) bonds. In general, the timing option value increases substantially as the number of transactions doubles.8 For example, compared to the results for Scenario II in Table 128.10, the timing option value increases from 6.5% to 8.3% for 15-year bonds when interest rate volatility is 1% and coupon rate is 8
This result holds even with moderate transaction costs. The timing option value remains quite high when the transaction cost is around 0.5%. For brevity, the results with transaction costs are not reported here.
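The tax-timing option metric used throughout these tables is straightforward to compute once the two strategy prices are available. A minimal sketch follows; the prices in the example are hypothetical, not outputs of the chapter's lattice model:

```python
def tax_timing_option_value(v_op: float, v_bh: float) -> float:
    """Tax-timing option value TO(t0, T) = (v_OP - v_BH) / v_OP, as a fraction.

    v_op -- bond price under the optimal trading strategy
    v_bh -- bond price under the buy-and-hold strategy
    """
    if v_op <= 0.0:
        raise ValueError("optimal-strategy price must be positive")
    return (v_op - v_bh) / v_op

# Hypothetical prices for illustration only:
to = tax_timing_option_value(v_op=105.0, v_bh=98.7)
print(f"{to:.1%}")  # 6.0%
```

The ratio is taken relative to the optimal-strategy price, so the option value is bounded above by 100% of the traded bond's price.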
Table 128.10: Tax-timing option value with different recovery rates.

[Table values are not reliably recoverable from the extracted text. The table reports tax-timing option values, in percent, for maturities of 5, 10, 15, 20, 25 and 30 years, coupon rates of 2%, 4%, 6% and 8%, tax scenarios I–IV, and recovery rates of 50%, 40%, 30% and 20%.]

Notes: The initial short-term interest rate is 6%. σr = 1% is the annual standard deviation of changes in the short-term interest rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. Four tax scenarios are defined as in Table 128.8. The tax-timing option is defined as TO(t0, T) = [vOP(t0, T) − vBH(t0, T)]/vOP(t0, T) in percentage, where vBH(t0, T) and vOP(t0, T) are bond prices under the buy-and-hold and optimal trading strategies, respectively. The default process is exogenously specified, with default probability λ equal to 1% and the recovery rate δ equal to 50%, 40%, 30%, and 20%, respectively.
Table 128.11: Tax-timing option value with two trading intervals per year.

                  High variance process               Low variance process
                  (σr = 0.01 per year)                (σr = 0.005 per year)
Maturity     I      II      III     IV           I      II      III     IV
Coupon = 0.02
5           0.1%   1.8%    0.1%    0.3%        0.0%   1.7%    0.0%    0.0%
10          0.4%   4.1%    0.1%    0.9%        0.1%   3.2%    0.0%    0.3%
15          0.7%   6.2%    0.1%    1.7%        0.3%   4.8%    0.0%    0.8%
Coupon = 0.04
5           0.2%   1.1%    0.1%    0.3%        0.0%   0.7%    0.0%    0.1%
10          0.5%   3.1%    0.8%    1.1%        0.2%   1.8%    0.1%    0.4%
15          0.8%   4.8%    1.2%    2.0%        0.5%   3.5%    0.2%    1.1%
Coupon = 0.06
5           0.5%   1.2%    0.7%    0.3%        0.1%   0.2%    0.1%    0.1%
10          2.3%   5.6%    3.8%    2.2%        0.5%   3.7%    1.7%    0.6%
15          3.8%   8.3%    6.4%    3.2%        1.1%   5.7%    2.5%    1.3%
Coupon = 0.08
5           1.2%   1.8%    2.4%    0.3%        0.6%   1.7%    1.2%    0.1%
10          2.3%   5.4%    5.8%    2.1%        2.4%   4.5%    3.1%    0.5%
15          2.9%   8.6%    8.0%    3.0%        3.1%   6.7%    4.7%    1.2%

Notes: We set two trading intervals each year. The initial short-term interest rate is 6%. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. Tax scenarios are described by the marginal income tax rate τ = 0.40 and the capital gains tax rates, τs short-term and τl long-term: I. τs = τl = 0.2; II. τs = 0.40, τl = 0.2; III. τs = τl = 0; IV. τs = τl = 0.4. The tax-timing option is defined as TO(t0, T) = [vOP(t0, T) − vBH(t0, T)]/vOP(t0, T) in percentage, where vBH(t0, T) and vOP(t0, T) are bond prices under the buy-and-hold and optimal trading strategies, respectively. The default process is exogenously specified, with default probability λ equal to 1% and recovery rate δ equal to 50%.
8%. The timing option value almost doubles for bonds of all coupon rates. The increase in timing option values in percentage terms is even higher for shorter-term bonds. The increase in timing option value is not as high for the other tax scenarios. For 15-year 6% coupon bonds, the timing option value increases from 2.3%, 3.9% and 2.0% for Scenarios I, III and IV to 3.8%, 6.4% and 3.2%, respectively. Similar patterns are found for bonds with
lower coupon rates. Thus, the timing option value can increase substantially when trading frequency increases.

128.5 Implications for Empirical Estimation

128.5.1 Effects of ignoring tax-timing options on estimation of default probability

One of the important uses of the term structure model is to retrieve the risk-neutral default probability from observed bond yields. Most studies have assumed away tax effects in inferring the default probability from bond yields (e.g., Jarrow, Lando and Turnbull, 1997; Duffee, 1999). If the tax effect is not trivial, estimates of the default parameter will be biased when the model ignores taxes. Although some studies have attempted to capture the tax effect, they have typically assumed a buy-and-hold strategy (see, for example, Yawitz et al., 1985; Elton et al., 2001). This assumption ignores the tax-timing option value associated with the optimal trading strategy.

To assess the effect of the tax-timing option value on the estimation of default probability, we conduct a simulation analysis. We first simulate the prices of defaultable bonds with known default probability, recovery rate, tax rate, and interest rate under the optimal trading strategy. Given these bond prices, we solve for default probabilities by assuming a buy-and-hold strategy. We then compare the estimated default probability with the true probability to determine the estimation bias when the tax-timing option value is ignored.

Table 128.12 reports the estimates of default probabilities for bonds with different maturities and ex ante (true) default probabilities. The coupon rate and the short-term interest rate are both set equal to 6%. The standard deviation of annualized interest rates is 1% and 0.5% for the high and low volatility cases, respectively. Equilibrium bond prices are calculated based on the tax regime under Scenario II, and the trading process is set up as in Table 128.9.
The recovery rate is fixed at 50% and the true default probabilities are set at 2%, 4% and 6%, respectively. The results show that the estimation error for the default probability is quite substantial when the tax-timing option value is ignored, particularly for long-maturity bonds. The tax-timing option increases the bond value and reduces the yield to maturity. When this tax-timing option value is ignored, its effect is factored into the estimated default parameter. To fit the higher bond price (or lower yield) induced by the timing option, the default rate estimate is forced lower to bring the model price up. The longer the maturity,
Table 128.12: Estimates of default probabilities when tax-timing options are ignored.

             High variance process            Low variance process
             (σr = 0.01 per year)             (σr = 0.005 per year)
Maturity   λ = 0.02  λ = 0.04  λ = 0.06    λ = 0.02  λ = 0.04  λ = 0.06
5           0.012     0.025     0.038       0.014     0.031     0.047
10          0.010     0.020     0.033       0.011     0.027     0.043
15          0.008     0.017     0.029       0.009     0.024     0.040
20          0.005     0.012     0.025       0.007     0.022     0.037
25          0.003     0.009     0.022       0.004     0.020     0.035
30          0.001     0.007     0.019       0.003     0.018     0.033

Notes: The initial short-term interest rate and coupon rate are both equal to 6%. σr is the annual standard deviation of changes in the short-term interest rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. The tax scenario is described by the marginal income tax rate τ = 0.40 and the capital gains tax rates τs = 0.40 (short-term) and τl = 0.2 (long-term). The default process is exogenously specified, with default probability λ equal to 2%, 4%, and 6%, respectively, and the recovery rate δ equal to 50%.
the higher the tax-timing option value and, therefore, the lower the estimated default probability. In addition, the results show that the bias increases with the default probability, suggesting that the estimation error will be more serious for speculative-grade bonds.
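The bias mechanism can be reproduced with a stylized pricer. The sketch below is a hypothetical stand-in for the chapter's lattice model: a flat-rate, constant-intensity buy-and-hold pricer inverted by bisection. Pricing a bond that trades at a timing-option premium with the buy-and-hold model pushes the fitted default probability below its true value; all parameter values are illustrative.

```python
def bh_bond_price(lam, coupon=6.0, r=0.06, recovery=50.0, T=15, par=100.0):
    """Stylized buy-and-hold price of a defaultable annual-coupon bond.

    Survival to year t is (1 - lam)**t; if default occurs during year t the
    holder receives `recovery` at the end of that year.  This flat-rate toy
    pricer is a hypothetical stand-in for the chapter's lattice model.
    """
    price = 0.0
    for t in range(1, T + 1):
        df = (1.0 + r) ** -t
        surv_prev = (1.0 - lam) ** (t - 1)
        surv = (1.0 - lam) ** t
        price += df * (coupon * surv + recovery * (surv_prev - surv))
    price += (1.0 + r) ** -T * par * (1.0 - lam) ** T
    return price

def implied_default_prob(target_price, lo=0.0, hi=0.5):
    """Invert bh_bond_price for lambda by bisection (price falls as lambda rises)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bh_bond_price(mid) > target_price:
            lo = mid   # model price too high -> need more default risk
        else:
            hi = mid
    return 0.5 * (lo + hi)

# A bond carrying a tax-timing premium trades above its buy-and-hold value,
# so fitting the buy-and-hold model to that price understates lambda:
p_bh = bh_bond_price(0.04)                  # "true" default probability 4%
p_with_option = p_bh * 1.05                 # hypothetical 5% timing-option premium
print(implied_default_prob(p_bh))           # recovers roughly 0.04
print(implied_default_prob(p_with_option))  # biased below 0.04
```

The longer the maturity, the larger the premium in the chapter's simulations, so in this setting the downward bias in the fitted λ grows with maturity as well.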
128.5.2 Effects of ignoring tax-timing option on estimation of implied tax rates

Ignoring the tax-timing option value can also cause underestimation of the marginal investor's income tax rates. If bond prices reflect the investor's optimal trading strategy, estimates of marginal tax rates based on the assumption of a buy-and-hold strategy will be biased downward. The tax-timing option increases the bond price and decreases the bond yield. When the value of the timing option is ignored, this higher price (or lower yield) must be accommodated by a lower tax rate in empirical fitting.

To explore the effect of the timing option on the implied marginal tax rate, we conduct simulations by setting the default probability to 1% (low) and 4% (high), respectively. We focus on the scenario of asymmetric taxation (II); that is, the ordinary income tax rate is set equal to 40% and the long-term capital gains tax rate to 20%. Again, we first simulate bond prices under the optimal trading strategy and then use these (true) prices to estimate the implied
Table 128.13: Estimates of implied tax rates when tax-timing options are ignored.

             High variance process            Low variance process
             (σr = 0.01 per year)             (σr = 0.005 per year)
Maturity   λ = 0    λ = 0.01  λ = 0.04     λ = 0    λ = 0.01  λ = 0.04
5           0.362    0.360     0.328        0.388    0.375     0.346
10          0.347    0.316     0.305        0.366    0.355     0.322
15          0.315    0.283     0.278        0.346    0.334     0.304
20          0.271    0.252     0.247        0.326    0.313     0.288
25          0.254    0.227     0.224        0.304    0.293     0.272
30          0.233    0.215     0.207        0.285    0.276     0.261

Notes: The initial short-term interest rate and coupon rate are set equal to 6%. σr is the annual standard deviation of changes in the short-term interest rate. The difference between purchase price and par value is amortized linearly to the maturity date and the basis is increased by the amount amortized. The tax scenario is described by the marginal income tax rate τ = 0.40 and the capital gains tax rates τs = 0.40 (short-term) and τl = 0.2 (long-term). The default process is exogenously specified, with default probability λ equal to zero, 1%, and 4%, respectively, and the recovery rate δ is 50%.
marginal tax rates by ignoring the tax-timing option value or, equivalently, by assuming that investors follow a buy-and-hold strategy. Table 128.13 reports the estimates of implicit marginal income tax rates for bonds of different maturities and default probabilities.9 For comparison, we also report the results without default risk (λ = 0). As expected, ignoring the tax-timing option value results in underestimation of the true marginal income tax rate. The bias in estimation increases with maturity and default risk. For example, when the default probability is close to zero, the estimated marginal tax rates are 36.2%, 34.7%, 27.1% and 23.3% for bonds with maturities of 5, 10, 20 and 30 years. By contrast, the estimated marginal tax rates are 32.8%, 30.5%, 24.7% and 20.7%, respectively, when the default probability is increased to 4%. The true marginal income tax rate is 40% for

9 These estimates are close to previous estimates. McCulloch's (1975) estimates of the marginal tax rate on Treasury securities range between 22% and 33%. His finding is similar to our numerical results under the high variance process with zero probability of default. As shown in the first column of the high-variance case, when the default probability λ equals 0, the implied tax rates range from 23.3% for 30-year Treasury bonds to 36.2% for 5-year Treasury notes. Pye (1969) finds that the effective tax rate of the marginal bondholder ranges between 10% and 36%, and Litzenberger and Rolfo's (1984) estimates have an average of 28%. All these estimates fall within the range of our numerical estimates.
all maturities. The results show that the bias in the marginal income tax rate can be quite substantial when the maturity is long. Green (1993) shows that estimated implicit marginal tax rates decline with maturity.10 Although he studied the marginal tax rate for municipal bonds, it is still interesting to compare our results with his. Green's estimates of the marginal income tax rate are 36.2% for five-year bonds, 30.8% for ten-year bonds, and 23.2% for 20-year bonds. These estimates are remarkably similar to the pattern shown in Table 128.13 for the case of λ = 1% and σr = 0.01. As indicated earlier, the tax-timing option value increases with maturity. When the value of the timing option is ignored, the higher price (or lower yield) induced by optimal trading must be accommodated by a lower tax rate in empirical fitting. Our results show that neglecting the tax-timing option value leads to an estimation bias in implicit marginal tax rates that increases with maturity. Thus, the empirical puzzle of marginal investors' income tax rates declining with bond maturity may well be due to the omission of the timing option value in the pricing model.
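The same inversion logic applies to implied tax rates. The sketch below is again a hypothetical stand-in for the chapter's model: a toy after-tax buy-and-hold pricer for a premium bond (no default risk, no capital-gains taxes, no amortization), inverted by bisection. Feeding it a price that embeds a timing-option premium yields a fitted tax rate below the true 40%; all parameter values are illustrative.

```python
def after_tax_bh_price(tau, coupon=8.0, r=0.06, T=30, par=100.0):
    """Toy after-tax buy-and-hold price: coupons are taxed at rate tau and
    cash flows are discounted at the after-tax rate r*(1 - tau).  A
    hypothetical stand-in for the chapter's model (no default risk, no
    capital-gains tax, no amortization)."""
    y = r * (1.0 - tau)
    pv_coupons = sum(coupon * (1.0 - tau) / (1.0 + y) ** t for t in range(1, T + 1))
    return pv_coupons + par / (1.0 + y) ** T

def implied_tax_rate(target_price, lo=0.0, hi=0.5):
    """Bisection; for this premium bond the model price falls as tau rises."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if after_tax_bh_price(mid) > target_price:
            lo = mid   # model price too high -> need a higher tax rate
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_bh = after_tax_bh_price(0.40)             # "true" marginal tax rate 40%
p_with_option = p_bh * 1.03                 # hypothetical 3% timing-option premium
print(implied_tax_rate(p_bh))               # recovers roughly 0.40
print(implied_tax_rate(p_with_option))      # biased below 0.40
```

Because the timing-option premium grows with maturity, applying a fixed premium schedule that rises in T to this fitting procedure reproduces the declining pattern of implied tax rates in Table 128.13.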
128.6 Conclusion

Personal taxes have been shown to be an important determinant of the cost of debt and capital structure (see Elton et al., 2001; Graham, 1999, 2000, 2003). However, there has been no study of the tax-timing option for corporate bonds. As a consequence, the size of the tax-timing option value embedded in corporate bonds is unknown. This chapter attempts to fill this gap. We find that the tax-timing option value accounts for a sizable portion of the corporate bond price. The tax-timing option value ranges from 15% to 24% for bonds with maturities longer than 20 years when the level of interest rates is high. The timing option value remains sizable, ranging from 10% to 16%, even when both the level and volatility of interest rates are low. Ignoring the tax-timing option value leads to biased estimates of default probabilities, implicit income tax rates and corporate bond spreads. Thus, the tax-timing option should be considered in pricing corporate bonds and estimating spreads.
10 See Table 3 of Green (1993, p. 239).
Bibliography

Altman, E.I. and Kishore, V.M. (1998). Defaults and Returns on High Yield Bonds: Analysis Through 1997, NYU Salomon Center Working Paper.
Ball, C., Creedy, J. and Scobie, G. (2018). The Timing of Income Tax Changes in the Face of Projected Debt Increases. The Australian Economic Review 51, 191–210.
Black, F. and Cox, J.C. (1976). Valuing Corporate Securities: Some Effects of Bond Indenture Provisions. Journal of Finance 31, 351–367.
Collin-Dufresne, P. and Goldstein, R. (2001). Do Credit Spreads Reflect Stationary Leverage Ratios? The Journal of Finance 56, 1926–1957.
Collin-Dufresne, P., Goldstein, R. and Martin, S. (2001). The Determinants of Credit Spread Changes. Journal of Finance 56, 2177–2207.
Constantinides, G.M. (1984). Optimal Stock Trading with Personal Taxes: Implications for Prices and the Abnormal January Returns. Journal of Financial Economics 13, 65–89.
Constantinides, G.M. and Ingersoll, J.E. Jr. (1984). Optimal Bond Trading with Personal Taxes. Journal of Financial Economics 13, 299–335.
Chay, J.B., Choi, D. and Pontiff, J. (2006). Market Valuation of Tax-Timing Options: Evidence from Capital Gains Distributions. Journal of Finance 61, 837–865.
Dai, M., Liu, H., Yang, C. and Zhong, Y. (2015). Optimal Tax Timing with Asymmetric Long-Term/Short-Term Capital Gains Tax. Review of Financial Studies 28, 2687–2721.
Dammon, R.M., Dunn, K.B. and Spatt, C.S. (1989). A Reexamination of the Value of Tax Options. Review of Financial Studies 2, 341–372.
Dammon, R.M. and Spatt, C.S. (1996). The Optimal Trading and Pricing of Securities with Asymmetric Capital Gains Taxes and Transaction Costs. Review of Financial Studies 9, 921–952.
Duffee, G. (1999). Estimating the Price of Default Risk. Review of Financial Studies 12, 197–226.
Duffie, D. and Singleton, K. (1997). An Econometric Model of the Term Structure of Interest Rate Swap Yields. Journal of Finance 52, 1287–1321.
Duffie, D. and Singleton, K. (1999). Modeling Term Structures of Defaultable Bonds. Review of Financial Studies 12, 687–720.
Elton, E.J., Gruber, M.J., Agrawal, D. and Mann, C. (2001). Explaining the Rate Spread on Corporate Bonds. Journal of Finance 56, 247–277.
Edwards, A., Harris, L. and Piwowar, M. (2004). Corporate Bond Market Transparency and Transaction Costs, Working Paper, Securities and Exchange Commission.
Fabozzi, F.J. and Nirenberg, D.Z. (1991). Federal Income Tax Treatment of Fixed Income Securities, in Frank J. Fabozzi, ed., Handbook of Fixed Income Securities (Dow Jones-Irwin, Homewood, Illinois).
Graham, J.R. (1999). Do Personal Taxes Affect Corporate Financing Decisions? Journal of Public Economics 73, 147–185.
Graham, J.R. (2000). How Big are the Tax Benefits of Debt? Journal of Finance 55, 1901–1941.
Graham, J.R. (2003). Taxes and Corporate Finance: A Review. Review of Financial Studies 16, 1075–1129.
Green, R.C. (1993). A Simple Model of the Taxable and Tax-Exempt Yield Curves. Review of Financial Studies 6, 233–264.
Green, R.C. and Odegaard, B.A. (1997). Are There Tax Effects in the Relative Pricing of U.S. Government Bonds? Journal of Finance 52, 609–633.
Hong, G. and Warga, A. (1998). An Empirical Study of Bond Market Transactions, Working Paper, University of Houston.
Internal Revenue Code (2002). Income, Estate, Gift, Employment and Excise Taxes §861–End, CCH Editorial Staff Publication (CCH Incorporated, Chicago).
Jarrow, R.A. and Turnbull, S.M. (1995). Pricing Derivatives on Financial Securities Subject to Credit Risk. Journal of Finance 50, 53–85.
Jarrow, R.A., Lando, D. and Turnbull, S.M. (1997). A Markov Model for the Term Structure of Credit Risk Spreads. Review of Financial Studies 10, 481–523.
Leland, H. (1994). Corporate Debt Value, Bond Covenants, and Optimal Capital Structure. Journal of Finance 49, 1213–1252.
Leland, H. and Toft, K. (1996). Optimal Capital Structure, Endogenous Bankruptcy, and the Term Structure of Credit Spreads. Journal of Finance 51, 987–1019.
Lin, H., Liu, S. and Wu, C. (2011). Dissecting Corporate Bond and CDS Spreads. Journal of Fixed Income 20, 7–39.
Litzenberger, R.H. and Rolfo, J. (1984). An International Study of Tax Effects on Government Bonds. The Journal of Finance 39, 1–22.
Liu, S. and Wu, C. (2004). Taxes, Default Risk and Credit Spreads. Journal of Fixed Income 14, 71–85.
Liu, S., Shi, J., Wang, J. and Wu, C. (2007). How Much of the Corporate Bond Spread is Due to Personal Taxes? Journal of Financial Economics 85, 599–636.
Longstaff, F. and Schwartz, E. (1995). A Simple Approach to Valuing Risky Fixed and Floating Debt. Journal of Finance 50, 789–819.
McCulloch, J.H. (1975). The Tax-Adjusted Yield Curve. Journal of Finance 30, 811–830.
Merton, R.C. (1974). On the Pricing of Corporate Debt: The Risk Structure of Interest Rates. Journal of Finance 29, 449–470.
Miller, M.H. (1977). Debt and Taxes. Journal of Finance 32, 261–275.
Pye, G. (1969). On the Tax Structure of Interest Rates. The Quarterly Journal of Economics 83, 562–579.
Schultz, P. (2001). Corporate Bond Trading Costs: A Peek Behind the Curtain. Journal of Finance 56, 677–698.
Yawitz, J.B., Maloney, K.J. and Ederington, L.H. (1985). Default Risk and Yield Spreads. Journal of Finance 40, 1127–1140.
Chapter 129
DCC-GARCH Model for Market and Firm-Level Dynamic Correlation in S&P 500

Peimin Chen, Chunchi Wu and Ying Zhang

Contents
129.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 4422
129.2 Model and Estimation . . . . . . . . . . . . . . . . . . . . . . . 4424
129.3 The Test of the Difference in Different Stages for Simple Average Correlations and Variance . . . . . . . . . . . . . . . . . . . . . 4426
129.4 Data and Empirical Results . . . . . . . . . . . . . . . . . . . . 4429
129.5 Application in β-function . . . . . . . . . . . . . . . . . . . . . 4437
129.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4439
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4439
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4439
Abstract

Understanding the dynamic correlations among asset returns is essential for ascertaining the behavior of asset prices and their comovements. It also has important implications for portfolio diversification and risk management. In this chapter, we apply the DCC-GARCH model pioneered by Engle (2002) and Engle and Sheppard (2001) to investigate the dynamics of correlations among S&P 500 stocks during the sub-prime crisis. Using

Peimin Chen, Southwestern University of Finance and Economics, email: [email protected]
Chunchi Wu, State University of New York at Buffalo, email: chunchiw@Buffalo.edu
Ying Zhang, Southwestern University of Finance and Economics, email: [email protected]
the daily data of stocks in the S&P 500 index, we document strong evidence of persistent dynamic correlations among the returns of the index component stocks. Conditional correlations between the S&P 500 index and its component stocks increase substantially during the sub-prime crisis, showing strong evidence of contagion. In addition, stock return variance is time-varying and peaks at the crest of the financial crisis. The results show that the DCC-GARCH model is a powerful tool for forecasting return correlations and performing value-at-risk portfolio analysis.

Keywords: Dynamic conditional correlation • Multivariate GARCH • DCC-MVGARCH • Contagion • Risk management.
129.1 Introduction

In portfolio and risk management, it is common practice to pick stocks in different categories (styles) and evaluate their correlation with a benchmark. Portfolio managers need to divide their portfolios among a number of different industries, such as banking and real estate, to manage risk. This provides asset diversity within each industry and diversification across industries. To manage portfolio risk, managers need to know the relations among the assets in the portfolio. Thus, it is imperative for them to estimate the correlations of a selection of firms within each industry and of different industry indexes. Although the prevailing diversification strategies typically assume that correlations are constant, it has been known for some time that correlations between individual stocks and overall market returns do not remain constant (see, for example, Giorgio, 2016; Mollah, Quoreshi, and Zafirov, 2016; Horváth, Lyócsa, and Baumöhl, 2018). Moreover, it is widely recognized that the correlations among equity returns tend to increase during bear markets and decrease during periods of strong stock market growth (see Santis and Gerard, 1997; Bekaert and Ang, 1999). In particular, in times of financial crisis, such as the stock market crash in October 1987, the Asian crisis in the late 1990s, and the sub-prime crisis in 2007–2009, stock market correlations increased. In this chapter, to assess the contagion effect during the sub-prime crisis, we employ a dynamic variance–covariance model to ascertain time-varying correlations and volatility of stock returns. We first examine the statistical properties of daily stock returns. The daily stock returns over the period from January 2006 to October 2008 are approximately normally distributed according to the Jarque–Bera test. This finding justifies the use of the DCC-GARCH model introduced by Engle
(2002), which is based on the multivariate normal distribution, to calculate the correlations of stock returns. Univariate GARCH models have widespread applications in estimating and forecasting equity return variances. A major challenge in estimating multivariate GARCH models with time-varying correlations, however, is the substantial computing requirement, which constrains researchers to estimating models with either a limited scope or considerable restrictions (see Engle and Sheppard, 2001). For example, the full unrestricted multivariate GARCH model, originally proposed by Bollerslev, Engle and Wooldridge (1988), requires O(k^4) parameters to be estimated by maximum likelihood, where k is the number of time series being modeled. Even for a simpler model, the diagonal vech, which allows for non-zero coefficients only on own lagged effects and cross products, the number of parameters to be estimated is O(k^2). To overcome this computational burden, Engle (2002) and Engle and Sheppard (2001) proposed a more efficient model, the DCC-GARCH model, which handles the time-series behavior of large correlation matrices through a two-step procedure. In the first step, a univariate GARCH model is estimated for each asset return. In the second step, the residuals, standardized by the conditional standard deviations estimated in the first step, are used to estimate the parameters of the dynamic correlation. In the past two decades, there has been considerable interest in the issue of return correlations between different markets. For instance, Breen, Glosten and Jagannathan (1989) document a negative correlation between the short-term interest rate and stock index futures returns in the United States. As mentioned by Kearney and Poti (2003), an extensive analysis of long-term trends in firm-level and market volatility in United States stock markets from 1962 to 1997 is provided by Campbell, Lettau, Malkiel and Xu (CLMX, 2001).
Using daily data on all stocks traded on three stock markets (AMEX, NASDAQ and NYSE) over their sample period, they document evidence that a decline in overall market correlations has been accompanied by a parallel increase in average firm-level volatility. In this chapter, we apply the dynamic conditional correlation GARCH model of Engle (2002) and Engle and Sheppard (2001) to capture the behavior of three kinds of firm-level correlations with the S&P 500 index and three kinds of intercorrelations within the same index over the period from January 2006 to October 2008. Our major findings can be summarized as follows. First, from the simple average correlations, we find that firm-level correlations trend upward from February 2007 to October 2008, which covers
page 4423
July 6, 2020
16:8
4424
Handbook of Financial Econometrics,. . . (Vol. 4)
9.61in x 6.69in
b3568-v4-ch129
P. Chen, C. Wu & Y. Zhang
part of the subprime crisis period. Second, the variance of each return series becomes much larger starting from July 2007. Moreover, we find that the variance passes through three distinct stages for each return series. In the first stage, before July 2007, the variance is very flat. In the second stage, from July 2007 to September 2008, it becomes much larger. In the third stage, from September 2008 to October 2008, the variance reaches its peak. The results show that both the correlations and the variances of returns increase substantially over our sample period.

This chapter is structured as follows. In Section 129.2, we discuss the DCC-MVGARCH model and its implementation. In Section 129.3, we introduce the multiple structural change model proposed by Bai and Perron (1998), which is used to test for structural change in the estimated correlations. In Section 129.4, we present the estimated coefficients and test results, together with time-series graphs of three simple average correlations of firms in the same industry, correlations among firms and with the S&P 500 index, and variances for selected return series. In Section 129.5, we apply the estimates to compute time-varying betas. Finally, Section 129.6 summarizes the results and concludes the chapter.

129.2 Model and Estimation

The dynamic conditional correlation (DCC) GARCH model proposed by Engle (2002) is used in the empirical estimation. The DCC-GARCH model can be estimated in two stages. In the first stage, a univariate GARCH model is estimated for the zero-mean return of each stock in the sample, and we use the variance estimates to standardize the returns. In the second stage, the parameters of the dynamic correlation are estimated using the residuals from the first stage. Assume that the returns rt from k assets are either the residuals from a filtered time series or conditionally multivariate normal with zero expected value and covariance matrix Ht.
Let Dt be the k × k diagonal matrix of time varying standard deviations from univariate GARCH models, and Rt be the time varying correlation matrix. Then, the generalized DCC-MVGARCH model can be written as follows: rt |Ft−1 ∼ N (0, Ht ),
(129.1)
Ht = Dt Rt Dt ,
(129.2)
Dt = diag{√hit},
(129.3)
and
R_t = Q_t^{*-1} Q_t Q_t^{*-1} = {q_{ij,t} / √(q_{ii,t} q_{jj,t})},   (129.4)

where {ρ_{ij,t}} = {q_{ij,t} / √(q_{ii,t} q_{jj,t})} is a correlation matrix containing the conditional correlation coefficients. From above, h_{it} represents the conditional variance of r_{it}, which follows a univariate GARCH(P_i, Q_i) process:

h_{it} = ω_i + Σ_{p=1}^{P_i} α_{ip} r_{i,t-p}^2 + Σ_{q=1}^{Q_i} β_{iq} h_{i,t-q}.   (129.5)
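As an illustration, the GARCH(1,1) special case of the variance recursion in (129.5) can be filtered directly from a return series. This is a minimal sketch rather than the chapter's estimation code; the parameter values in the usage line are arbitrary choices of similar magnitude to the estimates reported later, and h_0 is initialized at the unconditional variance.

```python
import numpy as np

def garch11_variance(r, omega, alpha, beta):
    """Filter conditional variances under GARCH(1,1):
        h_t = omega + alpha * r_{t-1}^2 + beta * h_{t-1},
    with h_0 set to the unconditional variance omega / (1 - alpha - beta)."""
    h = np.empty(len(r))
    h[0] = omega / (1.0 - alpha - beta)
    for t in range(1, len(r)):
        h[t] = omega + alpha * r[t - 1] ** 2 + beta * h[t - 1]
    return h

# Illustrative parameters only (not estimates from the chapter)
h = garch11_variance(np.random.default_rng(0).standard_normal(250) * 0.02,
                     omega=2e-6, alpha=0.1, beta=0.89)
```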
In Engle (2002), the DCC(M, N) structure is given as

Q_t = (1 − Σ_{m=1}^{M} α_m − Σ_{n=1}^{N} β_n) Q̄ + Σ_{m=1}^{M} α_m (ε_{t-m} ε'_{t-m}) + Σ_{n=1}^{N} β_n Q_{t-n},   (129.6)

Q_t^* = diag{√q_{ii,t}},   (129.7)
where Q_t = {q_{ij,t}} is the conditional variance–covariance matrix of the standardized errors, ε_t = D_t^{-1} r_t = {r_{it} / √h_{it}} is the vector of standardized errors, and Q̄ is the unconditional covariance of the standardized residuals resulting from the first-stage estimation.

The DCC-GARCH model can be estimated using the maximum likelihood method, in which the log-likelihood function for the model can be written as

L = −(1/2) Σ_{t=1}^{T} [k log(2π) + log|H_t| + r'_t H_t^{-1} r_t]
  = −(1/2) Σ_{t=1}^{T} [k log(2π) + 2 log|D_t| + log|R_t| + ε'_t R_t^{-1} ε_t].   (129.8)
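The correlation recursion in (129.6)-(129.7) is straightforward to implement once standardized residuals are available. The following sketch is our own illustration (with Q̄ set by correlation targeting to the sample covariance of the standardized residuals, and the function name our own choice); it returns the sequence of conditional correlation matrices R_t for the DCC(1,1) case:

```python
import numpy as np

def dcc_correlations(eps, a, b):
    """DCC(1,1): given T x k standardized residuals eps and scalars a, b,
    iterate
        Q_t = (1 - a - b) * Qbar + a * eps_{t-1} eps_{t-1}' + b * Q_{t-1},
        R_t = diag(Q_t)^{-1/2} Q_t diag(Q_t)^{-1/2},
    with Qbar the sample covariance of eps and Q_0 = Qbar."""
    T, k = eps.shape
    Qbar = (eps.T @ eps) / T
    Q = Qbar.copy()
    R = np.empty((T, k, k))
    for t in range(T):
        if t > 0:
            e = eps[t - 1][:, None]
            Q = (1 - a - b) * Qbar + a * (e @ e.T) + b * Q
        d = 1.0 / np.sqrt(np.diag(Q))
        R[t] = Q * np.outer(d, d)   # rescale to unit diagonal
    return R
```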
This estimation process includes two stages. In the first stage, R_t is replaced by a k × k identity matrix, which reduces the equation above to the sum of the log-likelihoods of the univariate GARCH equations in (129.5). Let φ_i = (ω_i, α_{i1}, . . . , α_{iP_i}, β_{i1}, . . . , β_{iQ_i}) be the parameters of the univariate GARCH
model for h_{it}. Then, the first-stage quasi-likelihood function is

L_1(φ|r_t) = −(1/2) Σ_{t=1}^{T} [k log(2π) + log|I_k| + 2 log|D_t| + r'_t D_t^{-1} I_k D_t^{-1} r_t]
           = −(1/2) Σ_{i=1}^{k} [T log(2π) + Σ_{t=1}^{T} (log(h_{it}) + r_{it}^2 / h_{it})].

Let φ̂ = arg max L_1(φ|r_t). Then, in the second stage, the DCC parameters in equation (129.6) are estimated using the original likelihood in (129.8), conditional on the first-stage GARCH parameter estimates φ̂. The quasi-likelihood function is

L_2(ψ|φ̂, r_t) = −(1/2) Σ_{t=1}^{T} [k log(2π) + 2 log|D_t| + log|R_t| + r'_t D_t^{-1} R_t^{-1} D_t^{-1} r_t]
             = −(1/2) Σ_{t=1}^{T} [k log(2π) + 2 log|D_t| + log|R_t| + ε'_t R_t^{-1} ε_t].
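The two-stage decomposition rests on the identity log|H_t| + r'_t H_t^{-1} r_t = 2 log|D_t| + log|R_t| + ε'_t R_t^{-1} ε_t used in (129.8) and in L_2 above. A quick numerical check, using an arbitrary valid correlation matrix and variances of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3
r = rng.standard_normal(k)                     # a return vector r_t
h = np.array([0.5, 1.0, 2.0])                  # conditional variances h_it
D = np.diag(np.sqrt(h))                        # D_t = diag{sqrt(h_it)}
R = np.corrcoef(rng.standard_normal((k, 50)))  # an arbitrary valid R_t
H = D @ R @ D                                  # H_t = D_t R_t D_t

lhs = np.log(np.linalg.det(H)) + r @ np.linalg.solve(H, r)
eps = np.linalg.solve(D, r)                    # eps_t = D_t^{-1} r_t
rhs = (2.0 * np.log(np.linalg.det(D)) + np.log(np.linalg.det(R))
       + eps @ np.linalg.solve(R, eps))
# lhs and rhs agree up to floating-point rounding
```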
The estimated parameter is ψ̂ = arg max L_2(ψ|φ̂, r_t). From Engle (2002), we know that θ̂_T = (φ̂_T, ψ̂_T) is asymptotically normal. Thus, the parameter estimates can easily be tested for significance.

129.3 The Test of the Difference in Different Stages for Simple Average Correlations and Variance

From the graphs of the correlations, especially the simple average correlations in Figure 129.1, we notice some sharp spikes in the correlations. Moreover, before the first spike and between spikes, the values are relatively stable. This phenomenon is even more obvious for the simple average variances, which motivates us to test whether the mean correlations and variances in different stages are the same. In this section, we divide the correlation data into two groups and express them as two variables X1 and Y1 with unknown distributions. We can then test whether their means are the same using an efficient test proposed by Brunner and Munzel (2000).

In their paper, Brunner and Munzel (2000) discuss how to test the difference between two separate random variables and how to compute the
probability p = P(X_1 < Y_1) + (1/2) P(X_1 = Y_1). Similarly, for the variance data, we express them as two random variables X_2 and Y_2 and calculate p = P(X_2 < Y_2) + (1/2) P(X_2 = Y_2). We use this test to obtain the probability of a difference separately for correlations and for variances.

We next discuss the Brunner–Munzel test briefly. Assume a general non-parametric model, as in Brunner and Munzel (2000), with N = n_1 + n_2 independent random variables,

X_{k_1} ~ F_1(x), k_1 = 1, . . . , n_1;   Y_{k_2} ~ F_2(x), k_2 = 1, . . . , n_2.   (129.9)
The distribution functions F_1(x) and F_2(x) are arbitrary (except for the trivial case of one-point distributions). Moreover, the hypothesis of no treatment effect is commonly formulated as H_0^F: F_1 = F_2, which implies homoscedasticity under the hypothesis. To formulate a nonparametric hypothesis of no treatment effect, the relative treatment effect p = P(X_1 < Y_1) + (1/2) P(X_1 = Y_1) is considered. It is known that F_i(x) = (1/2)[F_i^-(x) + F_i^+(x)], where F_i^-(x) is the left-continuous version and F_i^+(x) is the right-continuous version. Thus, p can be written as p = ∫ F_1 dF_2, and the hypothesis of no treatment effect can be written as H_0^p: p = ∫ F_1 dF_2 = 1/2, since H_0^F: F_1 = F_2 = F implies H_0^p: p = 1/2 by ∫ F dF = 1/2. To estimate p, the distribution functions F_1 and F_2 are replaced by their empirical counterparts F̂_i(x) = (1/2)[F̂_i^-(x) + F̂_i^+(x)]. Let

Ĥ(x) = Σ_{i=1}^{2} (n_i / N) F̂_i(x).
Then R_{1k_1} = N · Ĥ(X_{k_1}) + 1/2 and R_{2k_2} = N · Ĥ(Y_{k_2}) + 1/2 are the (mid)ranks of X_{k_1} and Y_{k_2}. Let R̄_{i·} = n_i^{-1} Σ_{k=1}^{n_i} R_{ik}, i = 1, 2, denote the mean of the ranks R_{ik}. It follows that

p̂ = ∫ F̂_1 dF̂_2 = (1/n_1) [R̄_{2·} − (n_2 + 1)/2]

is an unbiased and consistent estimator of p.

For any correlation or variance data, it is known that the observations are autocorrelated, whereas in Brunner and Munzel (2000) the random variables X_{k_1} and Y_{k_2} are assumed to be N independent variables. In order to approximate independence, the sample sizes of the two separated groups need to be large enough that the dependencies among observations can be ignored. In this chapter, the
sample size is more than 700, which is large enough to satisfy this independence requirement.

In conducting the Brunner–Munzel test, since we do not know the optimal breaking points at which the sample should be separated into two groups, we need to make a choice. The basic idea is to select a starting point x_s and an ending point x_e as bounds for the candidate break points, keeping the sample sizes of the two groups large enough for approximate independence. We first pick any point x_a between x_s and x_e as a candidate breaking point, and then perform the Brunner–Munzel test to obtain an estimate p̂. Over all candidate points, the local maxima of p̂ can be considered the best breaking points. If the distance between two neighboring breaking points is not large enough to ensure independence for the sample between them, we keep the breaking point with the larger value of p̂ and drop the other. In practice, we may obtain several breaking points, which is consistent with the several spikes observed in the data, such as the correlations and variances in this study.

We further check these breaking points by performing the Brunner–Munzel test again on each sub-sample and its neighboring sub-sample to see whether the means of the two samples are the same. If we accept the hypothesis, we give up the breaking point between the two samples; otherwise, the breaking point is kept. The procedure stops once all remaining breaking points are such that the mean of each sub-sample is significantly different from that of its neighboring sub-samples. This allows us to obtain precise dates for the structural changes. For brevity, we only provide two graphs, for the simple average correlations and variances; for the other series, the same test yields similar results.
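The relative-effect estimator p̂ and the break-point search described above can be sketched as follows. This is our own illustration: `relative_effect` computes p̂ from midranks exactly as in the formula above, and `best_break` scans candidate split points subject to a minimum segment size; the function names and the minimum-size rule are our choices, not the chapter's code.

```python
import numpy as np

def midranks(z):
    """Midranks of the entries of z (ties get the average of their ranks)."""
    order = np.argsort(z, kind="mergesort")
    ranks = np.empty(len(z))
    i = 0
    while i < len(z):
        j = i
        while j + 1 < len(z) and z[order[j + 1]] == z[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def relative_effect(x, y):
    """p_hat = (mean rank of y in the pooled sample - (n2 + 1)/2) / n1,
    estimating p = P(X < Y) + 0.5 * P(X = Y)."""
    n1, n2 = len(x), len(y)
    r = midranks(np.concatenate([x, y]))
    return (r[n1:].mean() - (n2 + 1) / 2.0) / n1

def best_break(series, min_size):
    """Scan split points keeping both segments at least min_size long and
    return the index with the largest estimated relative effect."""
    best_i, best_p = None, -1.0
    for i in range(min_size, len(series) - min_size):
        p = relative_effect(series[:i], series[i:])
        if p > best_p:
            best_i, best_p = i, p
    return best_i, best_p
```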
In Figure 129.3, we can see that the probability of a difference reaches two local maxima, around February 2007 and September 2008, respectively, both greater than 90%. This result is consistent with the patterns we observed earlier. These two local maxima are treated as the stage breaking points. In the graphs of the simple average correlations and variances, we use dashed vertical lines to denote their dates. These two points are very important and useful in the empirical implementation, as they provide a method for dividing the correlations and variances into different stages. In addition, when the probability of a difference in variances is very high, greater than 90%, we need to consider the risk in the different stages separately. This is another reason why it is important to determine the stage breaks.
129.4 Data and Empirical Results

The data consist of the daily returns of S&P 500 companies. The sample covers the period from January 3, 2006 to October 31, 2008. The stock returns are calculated from the daily closing prices from the website http://finance.yahoo.com. In this chapter, we consider the correlations of returns on firms in the banking, insurance and trading industries to see whether the correlations of returns have been affected by the subprime crisis. For the banking industry, ten banks are selected, with tickers: AXP, BBT, BK, C, CIT, COF, KEY, NCC, PNC, USB. For the insurance industry, we also select ten large insurance companies, with tickers: ABK, AET, AIG, ALL, HIG, MET, PRU, ANAT, ASI, and HCC. Moreover, we choose eight companies in the trading industry, with tickers: AMTD, BPSG, CS, GS, JPM, NMR, OPY, and AMP. Tables 129.1 and 129.2 list the names of the companies and their tickers. Table 129.3 reports means, standard deviations, skewness and kurtosis of

Table 129.1: The names and tickers of banks and insurance companies.

Banks:
American Express Co | AXP
BB&T Corp | BBT
Bank of NY Mellon Corp | BK
Citigroup Inc | C
CIT Group Inc | CIT
Capital One Financial Corp | COF
KEY Corp | KEY
PNC Financial Svcs Group Inc | PNC
U S Bancorp | USB

Insurance companies:
Ambac Finl Grp Inc | ABK
Aetna Inc | AET
American International Group | AIG
Allstate Corp | ALL
Hartford Fin Services Group | HIG
MetLife Inc | MET
Prudential Financial Inc | PRU
American National Insurance Co | ANAT
American Safety Insur Holdings Ltd | ASI
HCC Insurance Holdings Inc | HCC

Table 129.2: The names and tickers of trading firms.

TD Ameritrade Holding Corp | AMTD
Broadpoint Securities Group Inc | BPSG
Credit Suisse Group | CS
Goldman Sachs Group Inc | GS
JPMorgan Chase & Co | JPM
Nomura Holdings Inc | NMR
Oppenheimer Holdings Inc | OPY
Ameriprise Financial Inc | AMP
Table 129.3: Jarque-Bera test for normality.

Assets | Mean | Std dev | Skewness | Kurtosis | Statistic | p-value | H
S&P500 | −0.000429 | 0.016464 | −0.285630 | 13.777247 | −4.853141 | 1 | 0
AXP | −0.001351 | 0.029311 | −0.509194 | 11.138161 | −2.802782 | 1 | 0
BBT | −0.000560 | 0.031079 | −0.093561 | 18.219038 | −9.652256 | 1 | 0
BK | −0.000155 | 0.034913 | −0.419872 | 19.864031 | −11.879196 | 1 | 0
C | −0.002621 | 0.042888 | 0.675627 | 30.205752 | −30.915784 | 1 | 0
CIT | −0.003224 | 0.067472 | −0.044457 | 16.219780 | −7.282104 | 1 | 0
COF | −0.001320 | 0.035913 | −0.240988 | 10.853615 | −2.579649 | 1 | 0
KEY | −0.001791 | 0.045170 | −0.461031 | 31.891542 | −34.815475 | 1 | 0
PNC | −0.000308 | 0.026757 | 0.045883 | 11.224494 | −2.818780 | 1 | 0
USB | −0.000236 | 0.023641 | 0.347941 | 11.542357 | −3.060671 | 1 | 0

ABK | −0.001393 | 0.090075 | 1.448587 | 20.966657 | −13.799766 | 1 | 0
AET | −0.000931 | 0.033215 | −5.044123 | 79.491825 | −248.032167 | 1 | 0
AIG | −0.003098 | 0.057157 | −1.616136 | 36.920501 | −48.376998 | 1 | 0
ALL | −0.000284 | 0.027538 | 0.506195 | 27.042648 | −24.128078 | 1 | 0
HIG | −0.000337 | 0.064006 | 5.556966 | 106.780550 | −453.919416 | 1 | 0
MET | −0.000229 | 0.037010 | 0.834155 | 23.290485 | −17.270293 | 1 | 0
PRU | −0.000377 | 0.040257 | 1.539098 | 27.823306 | −26.069659 | 1 | 0
ANAT | −0.000136 | 0.030992 | 0.807506 | 18.694670 | −10.372122 | 1 | 0
ASI | −0.000074 | 0.027997 | 0.806063 | 13.696452 | −4.875543 | 1 | 0
HCC | −0.000133 | 0.023402 | 1.093076 | 21.169073 | −13.953936 | 1 | 0

AMTD | −0.000089 | 0.034293 | −0.680521 | 12.308101 | −3.687216 | 1 | 0
BPSG | −0.000223 | 0.052961 | 1.615877 | 18.261864 | −10.140364 | 1 | 0
CS | −0.000243 | 0.032726 | 0.191806 | 15.456702 | −6.471524 | 1 | 0
GS | −0.000025 | 0.032718 | 1.291092 | 17.676195 | −9.252433 | 1 | 0
JPM | −0.000229 | 0.032800 | 0.574288 | 12.185031 | −3.570167 | 1 | 0
NMR | −0.000654 | 0.030091 | 0.468393 | 10.735553 | −2.529848 | 1 | 0
OPY | −0.000033 | 0.034964 | 0.014809 | 12.871358 | −4.060191 | 1 | 0
AMP | −0.000070 | 0.037024 | 0.983505 | 15.998946 | −7.201739 | 1 | 0

Note: The significance level is 0.05. H = 0 denotes that the test fails to reject the null of normality.
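The Jarque-Bera statistic underlying Table 129.3 combines the sample skewness S and (raw) kurtosis K as JB = (n/6) · (S^2 + (K − 3)^2/4), asymptotically χ²(2) under normality. A minimal sketch of the computation (our own illustration, not the chapter's code):

```python
import numpy as np

def jarque_bera(r):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with S the sample skewness and
    K the (raw, non-excess) sample kurtosis."""
    r = np.asarray(r, dtype=float)
    n = len(r)
    z = r - r.mean()
    s2 = (z ** 2).mean()
    S = (z ** 3).mean() / s2 ** 1.5   # skewness
    K = (z ** 4).mean() / s2 ** 2     # kurtosis
    return n / 6.0 * (S ** 2 + (K - 3.0) ** 2 / 4.0)
```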
the return data of the three industries. The Jarque-Bera test is used to check the normality of the distribution of returns using skewness and kurtosis. We use the 5% significance level to see whether the normality condition for these return data is satisfied. The results show that we fail to reject the null of normality for all series. The normality of these data implies that we can use the DCC-MVGARCH model, which is based on a zero-mean normal distribution, to estimate their time-varying correlations. Table 129.4 reports all estimated parameters of the GARCH(1, 1) and DCC(1, 1) models used to calculate the correlations among the S&P500 index and the
Table 129.4: Parameter estimates for bank, insurance and trading firms.

Banks (DCC(1, 1): α = 0.011 (2.09), β = 0.904 (12.08), p-value = 0.76, χ2(6) = 7.99):
Assets | ω | ARCH(1) | GARCH(1)
S&P500 | 0.000002 (1.52) | 0.099 (6.76) | 0.893 (55.62)
AXP | 0.000003 (1.49) | 0.113 (4.20) | 0.887 (29.61)
BBT | 0.000002 (1.34) | 0.117 (5.71) | 0.883 (37.97)
BK | 0.000020 (1.78) | 0.157 (4.85) | 0.824 (35.64)
C | 0.000006 (1.79) | 0.164 (5.33) | 0.836 (24.93)
CIT | 0.000019 (1.87) | 0.193 (4.91) | 0.807 (19.20)
COF | 0.000015 (1.74) | 0.179 (3.20) | 0.821 (15.23)
KEY | 0.000004 (1.07) | 0.124 (3.36) | 0.876 (19.80)
PNC | 0.000007 (1.75) | 0.168 (3.09) | 0.828 (16.44)
USB | 0.000002 (1.56) | 0.147 (4.25) | 0.853 (23.39)

Insurance (DCC(1, 1): α = 0.008 (3.41), β = 0.968 (111.71), p-value = 0.98, χ2(6) = 15.29):
Assets | ω | ARCH(1) | GARCH(1)
S&P500 | 0.000002 (1.52) | 0.099 (6.76) | 0.893 (55.62)
ABK | 0.000003 (0.81) | 0.130 (2.51) | 0.870 (12.63)
AET | 0.000215 (1.18) | 0.105 (1.47) | 0.727 (10.92)
AIG | 0.000005 (2.43) | 0.216 (6.22) | 0.784 (23.00)
ALL | 0.000008 (1.82) | 0.220 (3.40) | 0.773 (13.26)
HIG | 0.000006 (1.10) | 0.135 (5.38) | 0.865 (20.22)
MET | 0.000004 (1.54) | 0.124 (4.45) | 0.876 (30.09)
PRU | 0.000005 (1.99) | 0.147 (6.15) | 0.853 (36.20)
ANAT | 0.000006 (1.41) | 0.147 (3.63) | 0.853 (19.72)
ASI | 0.000013 (1.84) | 0.140 (3.25) | 0.836 (15.81)
HCC | 0.000003 (0.99) | 0.081 (2.88) | 0.917 (27.15)

Trading (DCC(1, 1): α = 0.009 (2.24), β = 0.958 (64.12), p-value = 0.86, χ2(6) = 9.76):
Assets | ω | ARCH(1) | GARCH(1)
S&P500 | 0.000002 (1.52) | 0.099 (6.76) | 0.893 (55.62)
AMTD | 0.000143 (2.19) | 0.170 (2.28) | 0.700 (8.81)
BPSG | 0.000130 (0.91) | 0.190 (1.48) | 0.779 (4.95)
CS | 0.000007 (0.98) | 0.122 (2.01) | 0.875 (13.08)
GS | 0.000018 (1.78) | 0.135 (3.16) | 0.851 (20.76)
JPM | 0.000005 (1.76) | 0.153 (4.59) | 0.847 (24.44)
NMR | 0.000012 (1.62) | 0.084 (3.81) | 0.901 (33.09)
OPY | 0.000023 (1.57) | 0.170 (3.52) | 0.830 (21.24)
AMP | 0.000006 (1.58) | 0.116 (5.25) | 0.884 (39.29)

Note: The values in parentheses are t-statistics.
firms in the three industries, respectively. We test whether the DCC-MVGARCH model holds. In this table, the p-value and χ2(6) entries refer to the test of the null hypothesis of constant conditional correlation against the alternative of dynamic conditional correlation. All reported values exceed 0.5, indicating that the dynamic specification is favored over the constant-correlation alternative. Therefore, it is suitable for us to use the DCC-GARCH model to model the correlations among the S&P500 index and the companies.
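For six degrees of freedom the χ² distribution function has a closed form, which makes it easy to check how the χ2(6) statistics and "p-value" entries in Table 129.4 relate to each other. Note that the reading checked below, that each tabulated value equals P(χ²(6) ≤ statistic), is our own interpretation of the table, not stated in the original:

```python
import math

def chi2_cdf_even_df(x, df):
    """Chi-squared CDF for even df via the closed form
    P(X <= x) = 1 - exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!."""
    assert df % 2 == 0 and df > 0
    s = sum((x / 2.0) ** k / math.factorial(k) for k in range(df // 2))
    return 1.0 - math.exp(-x / 2.0) * s

# chi2(6) statistics and the tabulated values from Table 129.4
for stat, reported in [(7.99, 0.76), (15.29, 0.98), (9.76, 0.86)]:
    assert round(chi2_cdf_even_df(stat, 6), 2) == reported
```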
Figure 129.1: (a) The simple average correlations of three industries with two structural change points. (b) The simple average variances of three industries with two structural change points.
Note: In the two graphs above, the dotted and dashed vertical lines mark the two structural change points: the dotted lines denote the largest probability of structural change, and the dashed lines the second largest.
In Figure 129.1, we present the daily simple average conditional correlations of the ten banking stocks, the ten insurance stocks and the eight trading stocks with the S&P 500 index from the DCC(1, 1)-MVGARCH(1, 1) model. This simple average correlation is defined as follows:

ρ̂_t = (2 / (n(n − 1))) Σ_{i=1}^{n} Σ_{j=i+1}^{n} ρ_{ij,t},
Figure 129.2: (a) The correlations among three banks. (b) The variance of three banks. (c) The correlations among three insurance companies. (d) The variance of three insurance companies. (e) The correlations among three trading companies. (f) The variance of three trading companies.
where n is the number of selected assets and ρ_{ij,t} is the correlation between the ith and jth selected assets at time t. The correlations range from 0.3 to 0.7 over the sample period. We also notice that after February 2007 the correlations become much larger compared with the previous months, which coincides with the onset of the subprime crisis. In these figures, we use the multiple structural change model to identify the points of structural change.
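The simple average correlation ρ̂_t is just the mean of the above-diagonal entries of the conditional correlation matrix at time t. A one-function sketch (our own illustration):

```python
import numpy as np

def simple_average_correlation(R):
    """rho_hat_t = 2 / (n (n - 1)) * sum over i < j of rho_ij,t:
    the mean of the above-diagonal entries of a correlation matrix R."""
    n = R.shape[0]
    i, j = np.triu_indices(n, k=1)
    return 2.0 / (n * (n - 1)) * R[i, j].sum()
```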
Figure 129.3: (a) Brunner–Munzel test; the X's are the left average correlations, the Y's the right ones. (b) Brunner–Munzel test; the X's are the left average variances, the Y's the right ones.
Note: In the two figures above, the time axis begins in Aug. 2006 and ends in Dec. 2008. We keep the time-axis notation of the original graphs, plotted in R.
In Figures 129.2 and 129.4, the correlations among three banks and between the banks and the S&P 500 index are given. From these figures, we observe patterns similar to those in Figure 129.1. Figure 129.2(b) contains the variances of the three banks over the sample period. The variances for each series are simply the results obtained from the univariate GARCH specifications. As shown, the variances have three
Figure 129.4: (a) The correlations between the S&P500 index and three banks. (b) The correlations between the S&P500 index and three insurance companies. (c) The correlations between the S&P500 index and three trading companies.
distinct stages and become much larger after July 2007. The results show that risk becomes much higher after July 2007. For the remaining figures, we first select three firms from the insurance and trading industries, respectively, and then show their correlations with S&P 500 index returns and their variances. These figures also show that
Figure 129.5: (a) The betas of three banks. (b) The betas of three insurance companies. (c) The betas of three trading companies.
correlations for all companies become larger almost at the same time, in February 2007.

129.5 Application in β-function

Consider the market model

R_{jt} = α_{jt} + β_{jt} (R_{mt} − R_{ft}) + ε_{jt},   (129.10)
where R_{mt} is the return of the market, R_{jt} is the return of asset j, ε_{jt} is the error term for asset j, and β_{jt} is given by

β_{jt} = cov(R_{jt}, R_{mt}) / var(R_{mt}).   (129.11)
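Combined with the DCC output, (129.11) reduces to β_{jt} = ρ_{jm,t} √(h_{jt} / h_{mt}), since cov_t(R_{jt}, R_{mt}) = ρ_{jm,t} √(h_{jt} h_{mt}). A minimal sketch (the function name is our own):

```python
import numpy as np

def conditional_beta(rho_jm, h_j, h_m):
    """beta_jt = cov_t / var_t = rho_jm,t * sqrt(h_jt / h_mt), where
    rho_jm,t is the conditional correlation from the DCC model and
    h_jt, h_mt are the univariate GARCH conditional variances."""
    return np.asarray(rho_jm) * np.sqrt(np.asarray(h_j) / np.asarray(h_m))
```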
We use the return of the S&P 500 index as the market return, and R_{jt} is the return of firm j. Based on the results obtained from the DCC-GARCH model, we can calculate β_{jt}. Table 129.5 shows the estimates of the betas for the full sample. To show the dynamics of the betas, we plot their time profiles for representative firms in the three groups in Figures 129.5(a)–129.5(c). The results show that the betas of AXP, C and COF are clearly not constant over the sample period. In particular, the beta of COF increases substantially over the sample period. Betas are

Table 129.5: Summary statistics of firm betas.

Assets | Mean | Std dev | Minimum | Median | Maximum
AXP | 0.774576 | 0.017771 | 0.657184 | 0.774713 | 0.831706
BBT | 0.759970 | 0.019023 | 0.695198 | 0.759587 | 0.826726
BK | 0.698397 | 0.024750 | 0.624380 | 0.699600 | 0.798889
C | 0.757977 | 0.018291 | 0.686371 | 0.758776 | 0.812801
CIT | 0.615511 | 0.026805 | 0.468577 | 0.616009 | 0.756277
COF | 0.668649 | 0.025476 | 0.588691 | 0.668506 | 0.780523
KEY | 0.667823 | 0.024017 | 0.579304 | 0.668836 | 0.805315
PNC | 0.672023 | 0.019314 | 0.603645 | 0.671799 | 0.737400
USB | 0.748600 | 0.020392 | 0.695163 | 0.747979 | 0.838112

ABK | 0.487303 | 0.035240 | 0.412432 | 0.485916 | 0.587695
AET | 0.329644 | 0.048610 | 0.239767 | 0.317324 | 0.506515
AIG | 0.726657 | 0.027027 | 0.651050 | 0.722799 | 0.794640
ALL | 0.635026 | 0.037474 | 0.504951 | 0.636328 | 0.714187
HIG | 0.684690 | 0.026688 | 0.600814 | 0.683021 | 0.761323
MET | 0.727350 | 0.026556 | 0.636424 | 0.728505 | 0.789067
PRU | 0.691769 | 0.028836 | 0.603120 | 0.696111 | 0.759239
ANAT | 0.379200 | 0.075712 | 0.246716 | 0.364297 | 0.553404
ASI | 0.262032 | 0.051818 | 0.184171 | 0.249752 | 0.432307
HCC | 0.554722 | 0.032549 | 0.494780 | 0.555040 | 0.644845

AMTD | 0.584126 | 0.043873 | 0.445862 | 0.578323 | 0.714631
BPSG | 0.270771 | 0.050237 | 0.180629 | 0.257859 | 0.473930
CS | 0.732483 | 0.022959 | 0.663923 | 0.732514 | 0.815650
GS | 0.754478 | 0.016571 | 0.716582 | 0.752800 | 0.805941
JPM | 0.762027 | 0.021679 | 0.691440 | 0.762947 | 0.831118
NMR | 0.542124 | 0.037146 | 0.384066 | 0.542137 | 0.636012
OPY | 0.347709 | 0.040794 | 0.236693 | 0.338427 | 0.457764
AMP | 0.756311 | 0.019535 | 0.694443 | 0.759826 | 0.803560
used extensively in asset pricing and portfolio management. The results show that systematic risk is time varying. Thus, asset pricing tests and risk management models must account for time variation in systematic risk in order to obtain optimal results.

129.6 Conclusions

In this chapter, we employ the dynamic conditional correlation generalized autoregressive conditional heteroscedasticity (DCC-GARCH) model of Engle (2002) and Engle and Sheppard (2001) to capture the market-level and firm-level correlations for firms included in the S&P 500 index. We document evidence of significant persistence in the correlations between the firms and the S&P 500 index. Both simple correlations and conditional correlations among assets increase substantially over the period from February 2007 to October 31, 2008, which overlaps with the subprime crisis. In addition, the variance of each stock becomes much larger after July 2007. Conditional correlations display a clear increasing trend over the sample period. The results show that the subprime crisis caused significant contagion in financial markets.

Our findings have important implications for public policy, portfolio management and asset allocation. Heightened correlations make it difficult for portfolio managers to diversify risk during such a period. Moreover, betas are time-varying and also increase during the subprime crisis. The results suggest that the DCC-GARCH model provides important information for asset allocation and risk management.

Acknowledgments

We thank Professor Junbo Wang for providing the data and helpful comments and suggestions. We also thank Dr. Jilin Wu for his assistance in providing references and programs related to this chapter and Dr. Hai Lin for sharing his DCC-MVGARCH programs.

Bibliography

Baele, L. (2005). Volatility Spillover Effects in European Equity Markets.
Journal of Financial and Quantitative Analysis 40(2), 373–401.
Bai, J. and Perron, P. (1998). Estimating and Testing Linear Models with Multiple Structural Changes. Econometrica 66(1), 47–78.
Bai, J. and Perron, P. (2003). Computation and Analysis of Multiple Structural Change Models. Journal of Applied Econometrics 18(1), 1–22.
Bekaert, G. and Ang, A. (1999). International Asset Allocation with Time-Varying Correlations. SSRN Electronic Journal 15(3), 243–259.
Bollerslev, T., Engle, R.F. and Wooldridge, J.M. (1988). A Capital Asset Pricing Model with Time-Varying Covariances. Journal of Political Economy 96(1), 116–131.
Brunner, E. and Munzel, U. (2000). The Nonparametric Behrens-Fisher Problem: Asymptotic Theory and a Small Sample Approximation. Biometrical Journal 42(1), 17–25.
Engle, R. (2002). Dynamic Conditional Correlation: A Simple Class of Multivariate Generalized Autoregressive Conditional Heteroskedasticity Models. Journal of Business & Economic Statistics 20(3), 339–350.
Engle, R.F. and Sheppard, K. (2001). Theoretical and Empirical Properties of Dynamic Conditional Correlation Multivariate GARCH (No. w8554). National Bureau of Economic Research.
Giorgio, C. (2016). Business Cycle Synchronization of CEECs with the Euro Area: A Regime Switching Approach. Journal of Common Markets Studies 54(2), 284–300.
Horváth, R., Lyócsa, Š. and Baumöhl, E. (2018). Stock Market Contagion in Central and Eastern Europe: Unexpected Volatility and Extreme Co-exceedance. European Journal of Finance 24, 391–412.
Kearney, C. and Poti, V. (2003). DCC-GARCH Modelling of Market and Firm-Level Correlation Dynamics in the Dow Jones Eurostoxx50 Index. In Paper Submitted to the European Finance Association Conference, Edinburgh.
Mollah, S., Quoreshi, S. and Zafirov, G. (2016). Equity Market Contagion during Global and Eurozone Crises. Journal of International Financial Markets, Institutions, and Money 41, 151–167.
Reiczigel, J., Zakariás, I. and Rózsa, L. (2005). A Bootstrap Test of Stochastic Equality of Two Populations. The American Statistician 59(2), 156–161.
Santis, G.D. and Gerard, B. (1997). International Asset Pricing and Portfolio Diversification with Time-Varying Risk. The Journal of Finance 52(5), 32.
Chapter 130
Using Path Analysis to Integrate Accounting and Non-Financial Information: The Case for Revenue Drivers of Internet Stocks∗

Anthony Kozberg

Contents
130.1 Introduction . . . 4442
130.2 Literature Review . . . 4446
130.3 Data Collection . . . 4447
130.4 Methodology . . . 4450
130.5 Results . . . 4455
130.6 Expanded Testing . . . 4463
130.7 Conclusions and Suggestions for Further Research . . . 4468
Bibliography . . . 4470
Appendix 130A Using Path Analysis . . . 4470
Appendix 130B Variable Definitions . . . 4472
Abstract

This chapter utilizes path analysis, an approach common in the behavioral and natural science literatures but relatively unseen in finance and accounting, to improve inferences drawn from a combined database of financial and non-financial information. Focusing on the revenue-generating activities of internet firms, this paper extends the literature on internet
Anthony Kozberg, Hunter College, e-mail: [email protected]
∗
This chapter is a reprint of “Using path analysis to integrate accounting and non-financial information: The case for revenue drivers of internet stocks,” which was published in 2004 in Advances in Quantitative Analysis of Finance and Accounting, pp. 33–63. 4441
valuation while addressing the potentially endogenous and multicollinear nature of the internet activity measures applied in their tests. Results suggest that both SG&A and R&D have significant explanatory power over the web activity measures, suggesting that these expenditures represent investments in product quality. Evidence from the path analysis also indicates that both accounting and non-financial measures, in particular SG&A and pageviews, are significantly associated with firm revenues. Finally, this paper suggests other areas of accounting research that could benefit from a path analysis approach.

Keywords: Direct effect • Indirect effect • Path analysis • Internet stock • Non-financial information.
130.1 Introduction

Prior academic literature on the relevance of accounting and non-financial statement measures for internet firms has generally focused on explaining their stock valuations. In the absence of clear relationships between earnings and these valuations, analysts, corporate insiders and researchers have concentrated their attention on other measures. These include earnings components, such as revenues and gross margin, and non-financial proxies for market share and potential future growth opportunities, such as unique audience and pageviews. With the exception of an examination of revenue forecast errors by Trueman, Wong and Zhang (2001), however, there has been little research attempting to explain how these activity measures are generated or to examine their effect on firm revenues, which is addressed in this paper. Kozberg (2001) discusses how a better understanding of the relationships among accounting and non-financial measures for internet firms should help improve the identification of value drivers and the means by which they are specified. Figure 130.1 (replicated herein) provides a conceptual path diagram from initial management decisions on the levels of SG&A and R&D expenditures through to revenue realization for firms which rely upon website activity. This paper refines the path diagram and uses it to test whether firm expenditures on SG&A and R&D translate into measures reflecting increased consumer activity and whether said activity results in improved revenue opportunities for the firm. In addition, Kozberg (2001) illustrates the hazards of testing a sample of heterogeneous firms involved in the internet (distinguished by their business models) as one collective sample. Heterogeneity is only one of several statistical issues that can arise regarding current methodologies for testing these or other developing firms, however.
For instance, little attention has been paid by the existing literature to the likely relationships among the accounting
[Figure 130.1: Conceptual path analysis diagram. The diagram links SG&A and R&D, via improved quality, to unique audience, time per person, and visits per person; these drive pageviews, and other revenues plus advertising revenues sum to total revenues.]
and non-financial variables used to explain firm valuations. Finally, Kozberg (2001) shows evidence of high multicollinearity among the internet activity measures for internet firms in general and for distinct types of internet firms. One method employed in that paper and in Demers and Lev (2001) is factor analysis, which replaces raw or deflated internet usage measures with a smaller set of orthogonal factors. This approach, however, allows the data to determine the factors and is inevitably followed by a researcher’s ad hoc attempt to interpret the factors. In addition, the choice of factors is highly sensitive to the combination of variables chosen and the approach taken in calculating them.1 While high degrees of correlation and endogeneity are not the same thing, this relationship suggests that some or all of these variables could be endogenous, violating an assumption made in OLS estimation. Treating these variables as exogenous when they are in fact endogenous could result in a number of statistical problems including measurement error and bias. Ideally, these factors should be specified ex ante, while still providing the researcher with the ability to control for variable endogeneity. The methodology employed in this paper is based upon a path analysis estimation technique first used by Wright (1921). Commonly employed in the behavioral and natural sciences literatures, this approach allows a

1. For instance, Demers and Lev (2001) choose the almost perfectly correlated reach and unique audience as factor components in their model. This choice influences their first factor to load predominately on these two variables.
researcher to address issues of factor identification and endogeneity simultaneously. In addition, it permits separate testing of the direct and indirect (through intermediate variables) effects of the selected independent variables on the dependent(s). Path analysis is based upon a diagram of the hypothesized relationships among the independent and dependent variables. In the analysis, the variables examined are classified into two types, exogenous or endogenous, based upon whether or not they appear as dependent variables in any of the system of equations. Among the variables employed in this study, expenditures on R&D and SG&A are treated as exogenous while website activity measures and revenues are endogenous.2 The path diagram is presented in Figure 130.2, an expanded version of Figure 130.1, which specifies empirically testable relationships among the data. In Figure 130.2, single arrows indicate the predicted direction of causation from the exogenous to the endogenous variables. Empirical testing of this path diagram provides several interesting results regarding the use of non-financial data in the analyses of internet firms. Consistent with findings in Kozberg (2001), accounting data on firm expenditures in SG&A and R&D have explanatory power over both website activity measures and firm revenues. R&D, a proxy for investments made to develop website quality, reduces the amount of time an individual needs to spend visiting a firm’s website. SG&A, which should proxy for efforts to increase website activity levels, is positively and significantly related to the average time spent and number of visits per person for financial services and online retailing firms. It is also positively and significantly related to time spent per person for portal and content-community firms.3 Consistent with expectations, both SG&A and R&D are positively and significantly related to the number of unique audience members visiting the site within a month. 
Finally, SG&A is positively and R&D is negatively and significantly associated with firm revenues, with the latter relationship appearing to be driven by financial services and online retailing firms. These results indicate that

2. The path analysis methodology presented in this paper could be easily adapted to other areas of accounting research. In particular, it could be used to improve measurement of other variables by decomposing components or effects of accounting and non-financial data. For instance, evidence from this and other papers suggests that expenditures on SG&A and R&D might be regarded as investments and should therefore be capitalized. Path analysis could help address issues like how best to capitalize these investments.
3. Portals and content-community firms, often regarded as only one segment of the internet, are those sites which focus on providing news and other information, searching services, and/or a place to interact with others online. For a more detailed explanation of the types of firms involved in the internet, I refer the reader to Kozberg (2001).
[Figure 130.2: Path analysis diagram. The diagram links RND, SGA, SGAPP and RNDPP to unique audience, time per person and visits per person; these drive pageviews, which drive ads served and click-throughs, with all paths terminating in sales.]
Notes: Solid and dashed arrows both indicate the predicted direction of causality between any two variables. Please see Appendix 130B for an explanation of these variables.
at least some portion of firm expenditures on SG&A and R&D are directed towards improving website quality and visitor activity. Internet activity measures are systematically related to firm revenues as well. As unique audience and time spent per person increase, so do pageviews. Pageviews have the direct effect of increasing firm revenues in addition to increasing the amount of advertising shown. This direct effect on revenues is most likely the result of the ability of pageviews to proxy for other, non-advertising, revenue opportunities which are associated with greater site activity (e.g., the use of mailing lists and user profiling for portal and content-community firms and increased transactions for financial services or online retailing firms). Finally, while initial results for advertising data do not show explanatory power over revenues, alternative tests provide evidence that click-through rates on advertisements shown are positively and significantly associated with firm revenues.
This paper includes seven sections. Section 130.2 provides a brief review of the relevant literature. Section 130.3 details the data collection process and provides summary statistics for the variables. Section 130.4 describes the path analysis methodology employed. Sections 130.5 and 130.6 give the initial and expanded results from empirical testing, respectively. Section 130.7 summarizes the findings and provides suggestions for future testing.
130.2 Literature Review

A number of recent papers have attempted to value internet firms using a combination of accounting and non-financial measures. Hand (2001, 2003), Trueman, Wong and Zhang (TWZ, 2000), Rajgopal, Kotha and Venkatachalam (RKV, 2000) and Demers and Lev (2001) provide evidence that internet firms’ earnings are generally not priced (or in some cases negatively priced). In the absence of positive and significant results for net income, several of these earlier papers attempt to use earnings components such as revenues to explain firm valuations. The evidence from those studies is generally mixed, with revenues, marketing expenses (a component of SG&A) and R&D all showing some signs of being positively and significantly valued. Results from Kozberg (2001), which includes more recent data than prior studies, provide evidence that net income has become positively priced for internet firms in general and for most business models over time. In addition, SG&A and R&D both show stronger evidence of being positively and significantly priced for the overall sample as well as most individual business models. Finally, non-financial measures such as reach, pageviews and advertisements are shown to be priced for internet firms in general. None of these papers, however, makes any attempt at directly examining the determinants of activity and the ability of firms to convert that activity into revenues.

Trueman, Wong and Zhang (TWZ, 2001) utilize current financial and non-financial data in the prediction of internet firm revenues, which it suggests are a key driver in the valuation of these firms.4 It focuses on the types of firms for which one would ex ante expect web activity measures to have relevance: portal, content-community and online retailing. TWZ (2001) examines how well different accounting and internet usage variables correlate with

4. Justification for their usage of audience measurement data comes from the suppositions that: (1) higher usage reflects greater demand for products and services; (2) increased traffic leads to greater advertising revenues; and (3) higher usage brings in more advertisers and, at least indirectly, higher advertising rates.
analysts’ forecast errors (measured in percentages). It finds that analysts systematically underestimate revenue growth from 1999 to early 2000. Growth rates in historical revenues and internet usage seem to have power in explaining these errors for portal and content-community firms, while growth in internet usage is significant in explaining errors for online retailers. While TWZ (2001) examines the relationship between revenue estimates and their realized values, it does not examine the usefulness of accounting or non-financial information in explaining either analysts’ forecasts or realized revenues directly. If the influences of the web activity measures are already accurately impounded into the revenue estimates made by analysts, then these measures should have little or no ability to explain errors. Given the availability of internet activity data from several sources (Nielsen//NetRatings, Media Metrix and PC Data) on a monthly or even weekly basis, it is not surprising that the explanatory ability of the tests conducted in TWZ (2001) is somewhat low (R2s of 0.15 or less). In addition, given the emphasis placed on the importance of revenue growth for internet firms, these firms may attempt to influence their reported numbers through such activities as the inclusion of “grossed-up” and/or barter revenues as discussed in Bowen, Davis and Rajgopal (2001). Over a long enough time horizon, such adjustments would naturally reverse and/or lead to a higher denominator used for the calculation of revenue growth (implying a negative correlation between past growth and the error). However, over the shorter time horizon examined in TWZ (2001), it may be possible for management to continue to manipulate revenues in this fashion. These management actions could result in the systematic underestimating of revenues that TWZ (2001) documents.
With the exception of TWZ (2001), no previous research has examined the ability of either financial or non-financial data to explain fundamental economic data other than internet firm valuations. This paper extends the previous literature by examining the financial and non-financial determinants of firm revenue, while addressing the endogenous and multicollinear nature of these measures.
130.3 Data Collection

Table 130.1 provides a breakdown of the number of firms and observations in the samples studied in this paper. Unlike Kozberg (2001) but consistent with most other papers in the internet literature, this paper restricts its focus to firms with positive levels of internet activity. This is done in order to restrict
Table 130.1: Sample breakdown.

Firms in initial sample:                                                     332
Firms (observations) with complete accounting data:                          317 (2049)
Firms (observations) also with data reported in the NNR audience database:   129 (583)
Firms (observations) with advertising data as well:                          86 (373)
the sample to firms that are dependent on web activity for revenues, for which the hypothesized path diagram is more likely to be a reasonable description. Accounting data for these firms comes from Compustat for quarters ending in 1999 through March 2001. The top rows of Table 130.2 provide descriptive financial statistics for these internet firms. The average (median) market value of these companies is $3.21 billion ($464 million) and average (median) revenues are $80.0 million ($17.1 million). Mean (median) net income is −$66.9 million (−$14.9 million) and the market-to-book ratio is 8.48 (2.99).5 These descriptive statistics are consistent with the larger sample examined in Kozberg (2001). The internet activity data for this study are taken from Nielsen//NetRatings “Audience Measurement” and BannertrackTM databases from February 1999 through May 2001. The data employed include:6

Unique Audience (UNQAUD) — Defined as the number of different individuals visiting a website within the month. In practice, this measure can only detect the number of unique web browsers rather than unique visitors.
Reach (REACH) — This figure represents the percentage of internet users that visit a particular web property within a month.
5. Market values and net income are presented for descriptive purposes only and are not used in any tests in this paper. Similarly, book value is not used, therefore the constraint that firms have a book value over 0 is not necessary (leading the market-to-book ratio to be negative for some observations and biasing the ratio lower relative to the full sample in Kozberg, 2001a).
6. In tests conducted using advertising data, the time period examined begins in May 1999 rather than February 1999. For a more detailed explanation of the databases and a longer description of terms, I refer the reader to Kozberg (2001a).
Table 130.2: Descriptive statistics.

Variable                 N     Mean      Median   Std dev    Min.      Max.
Market value             583   3215.90   464.38   12651.94   0.40      17140.2
Market-book              582   8.48      2.99     41.36      −45.64    900.01
Net income               583   −66.86    −14.90   330.49     −5426.3   1178.0
Sales                    583   80.01     17.10    406.06     0.00      6830.0
SG&A                     583   38.06     21.38    54.63      0.00      425.00
R&D                      583   4.86      1.50     12.06      0.00      159.72
Unique audience          583   3.03      0.96     5.14       0.10      44.56
Reach                    583   2.36      0.78     4.24       0.07      37.38
Pageviews                583   69.89     13.87    177.91     0.27      1698.13
Time spent per person    583   0.19      0.15     0.13       0.02      0.86
Visits per person        516   2.04      1.75     0.99       1.03      6.24
Ad impressions           377   85.71     16.52    191.37     0.14      1821.05
Click-throughs           377   0.15      0.02     0.47       0.00      7.12
Pageviews (PAGEVIEW) — In the NNR database, pageview refers to the total number of pages seen by all users in the sample, regardless of the means by which they are viewed.
Visits per person (VISITSPP) — Indicates the number of different times an average audience member visits a particular property within a month. NNR does not begin reporting this statistic until August 1999.
Time spent per person (TIMEPP) — Indicates the total amount of time an audience member spends at a property over the month.
Advertisements served (ADSEEN) — The total number of delivered ad impressions each month across all reported domains for a given property. NNR does not begin reporting this statistic until May 1999.
Click-throughs (CLICKS) — The number of advertisements shown that are clicked upon by the browser. NNR does not begin reporting this statistic until May 1999.

Descriptive audience statistics for these variables are provided in the lower rows of Table 130.2.7 The average firm reaches about 2.36% of the estimated
population of internet users in the US while the median firm enjoys an audience only one-third as large. These data suggest that there are a small number of firms which dominate the internet in terms of their market share of unique browsers. The average (median) user makes 2.04 (1.75) trips to a given property each month spending a total of 0.19 (0.15) hours.8 These firms show an average (median) of 69.9 (13.9) million pages carrying 85.7 (16.5) million ads but only 0.15 (0.02) million of these ads were clicked upon. As a result, firms that are able to deliver a high volume of click-throughs could command a premium in the marketplace. On the other hand, if advertising dollars on the net are more focused upon enhancing brand value (similar to more traditional media), click-throughs may have a negligible impact on firm revenues.

7. The differences in the number of observations in this sample and those in the “web sample” in Kozberg (2001a) result from slight differences in the matching and truncation criterion employed in this study. Observations are matched based upon the final month of the firm quarter in question rather than the month a firm announces earnings. Observations more than 3 standard deviations from the mean are removed.
8. Kozberg (2001a) showed an almost order of magnitude difference between the means and medians for time spent online as well as considerably larger means than medians for other activity measures as well. Due to the greater need to control for outliers using a path analysis framework this relationship has been considerably mitigated.

130.4 Methodology

This section presents an alternative approach for examining the interrelated nature of the accounting and non-financial variables used in the valuation of internet firms called Path Analysis (Appendix 130A). Figure 130.1, recreated from Kozberg (2001), specifies a hypothetical path for web-activity-dependent firms from start-up to revenue generation. This paper expands upon Figure 130.1 to develop a more detailed, empirically testable, path diagram. Conceptually, management initiates expenditures on R&D, intending to establish (or enhance) a website’s quality. The potential effects of this spending may offset one another, however. Increased site quality should improve a firm’s ability to retain viewers, which can be proxied for by the amount of time spent and the number of visits made per person to its websites. On the other hand, website R&D expenditures could be focused upon aspects of quality such as improved delivery times (lowering the average time spent online) rather than on adding further content (potentially increasing time online). Regardless of the means by which quality improves, however, the websites should generate larger audiences as the result of improved brand recognition and from reputation effects.
In addition to spending on R&D, firms may choose to engage in major advertising campaigns and other promotions (SG&A) designed to attract new visitors to their websites. These increases in audience should improve the quantity of user generated content. It should also allow more opportunities for members to develop into communities with those possessing similar interests. As a result, increased SG&A could have the secondary effect of encouraging existing members to use their websites more frequently. Overall, expenditures on SG&A should enhance the “network effects” from having more users online with whom to interact and share information.9 As audience increases so does the total number of pages viewed, increasing advertising revenue opportunities for the firms. In addition, pageviews should increase as individual audience members visit and/or spend more time at a website. Increased pageviews translates into more opportunities for firms to deliver advertisements or other forms of sponsored content to their viewers. Naturally, increases in the number of delivered advertisements leads to additional chances for browsers to click-through to the website of an advertiser. On the other hand, as time spent per person increases, browsers are more likely to have seen the same advertisements previously or already viewed those advertised sites reducing their likelihood of clicking-through. Apart from their impact on the quantity of advertisements shown, increased audience and pageviews could also generate an improved ability to target content and promotions to their viewers which could further increase advertising revenues. Additionally, audience, pageviews, SG&A and R&D could all influence firm revenues directly, proxying for other revenue opportunities such as: (1) online or offline sales of goods and services; (2) the creation and use of mailing lists; (3) alliances; and/or (4) services rendered and content delivered for other sites. 
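The hypothesized chain from expenditures through activity to revenues can be made concrete as a small directed graph. The sketch below is illustrative only (it is not code from the original paper); the edge set paraphrases the relationships described in this section and drawn in Figure 130.2, and the helper enumerates every direct and indirect pathway from an expenditure variable to sales.

```python
# Hypothesized causal structure of Figure 130.2 as a directed acyclic graph.
# Variable names follow Appendix 130B; edges paraphrase the text above.
edges = {
    "SGA": ["UNQAUD", "SALES"],
    "RND": ["UNQAUD", "SALES"],
    "SGAPP": ["TIMEPP", "VISITSPP"],
    "RNDPP": ["TIMEPP", "VISITSPP"],
    "UNQAUD": ["PAGEVIEW", "SALES"],
    "TIMEPP": ["PAGEVIEW", "CLICKS", "SALES"],
    "VISITSPP": ["PAGEVIEW", "SALES"],
    "PAGEVIEW": ["ADSEEN", "SALES"],
    "ADSEEN": ["CLICKS", "SALES"],
    "CLICKS": ["SALES"],
    "SALES": [],
}

def all_paths(src, dst, prefix=()):
    """Yield every directed path from src to dst via depth-first search."""
    prefix = prefix + (src,)
    if src == dst:
        yield prefix
        return
    for nxt in edges[src]:
        yield from all_paths(nxt, dst, prefix)

# Enumerate the direct and indirect pathways from SG&A to sales.
paths = list(all_paths("SGA", "SALES"))
for p in paths:
    print(" -> ".join(p))
```

With this edge set, SG&A reaches sales directly and along four indirect routes (through unique audience, pageviews, ads served, and click-throughs), which is exactly the decomposition the path analysis is designed to measure.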
Building upon the logic contained in Figure 130.1, the methodology used for estimation in this paper focuses on path analysis, a statistical technique based upon a linear equation system that was first developed by Sewall Wright (1921). While uncommon in the financial accounting literature,10
9. Noe and Parker (2000) show analytically that two internet firms, competing in a two-period, winner-take-all model, will advertise aggressively and make large investments in site quality in order to capture market share. Under this model, any variables that are (linearly) related to pageviews should be explained, although not necessarily in a linear fashion.
10. An example of the application of path analysis in the accounting literature is Amit and Livnat (1988), which examines the direct and indirect effects of diversification, operating risk and leverage on a firm’s systematic risk.
it has been utilized frequently in the behavioral and natural sciences literatures. Path analysis’ popularity in those literatures results from its explicit recognition of possible causal relationships among variables. In so doing, it enables the researcher to decompose the correlations between each pair of variables into the different effects that flow from the causal variable(s) to the dependent variable. These effects may be either direct (e.g., increased audience should lead directly to more individuals seeing a site’s webpages) or channeled indirectly through other variables (increased audience directly leads to increased pageviews and indirectly causes more advertisements to be seen). Thus one may examine both the direct and various indirect effects of firm expenditures and activity generation measures and assess the impact of each. This focus on intermediate pathways along which these effects travel makes the application of this technique particularly appealing for internet firms. As discussed previously, understanding the path from firm expenditures to revenue creation provides a clearer understanding of what may be driving the value of internet firms. The analysis begins with a path model that diagrams the expected relationships among the independent and dependent variables. It should be noted, however, that the pathways in these models represent the hypotheses of researchers, and cannot be statistically tested for the direction of causality. Figure 130.2 provides a more developed version of Figure 130.1 expressed as a path diagram. In path analysis, the variables examined are broken into two types, exogenous or endogenous, based upon whether or not they appear as dependent variables in any of the system of equations. Among the variables employed in this study, expenditures on R&D and SG&A are treated as exogenous while site activity and revenues are endogenous. 
In the main model tested there are four exogenous variables, SG&A and R&D deflated by both total firm assets and unique audience (per-person). In any particular equation tested, however, only one of the two deflated sets of variables is used. The decision as to which set to use is based primarily, but not exclusively, upon which deflator is employed for the dependent variable. The choice of this specification is also intended to avoid unnecessary transformation of the data from its reported format, to allow easier interpretability of the results and to avoid introducing competing effects into the data. In Figure 130.2, single arrows indicate the predicted direction of causation from the exogenous to the endogenous variables that is suggested from the earlier discussion in this section. The coefficients generated in a path analysis are standardized regression coefficients (betas), showing the direct effect of an independent variable on
its dependent variable in the path diagram. Thus, when the model has two or more causal variables, path coefficients are partial regression coefficients that measure the extent of the effect of a causal variable and its dependent in the path model controlling for other prior variables. The path analysis typically uses standardized data or a correlation matrix as an input. In terms of its practical application, the path analysis amounts to the following system of simultaneous equations, processed iteratively.11

UNQAUD = β11 SGA + β13 RND + ε1,   (130.1a)
TIMEPP = β22 SGAPP + β24 RNDPP + ε2,   (130.1b)
VISITSPP = β32 SGAPP + β34 RNDPP + ε3,   (130.1c)
PAGEVIEW = β45 TIMEPP + β46 VISITSPP + β47 UNQAUD + ε4,   (130.1d)
ADSEEN = β58 PAGEVIEW + ε5,   (130.1e)
CLICKS = β65 TIMEPP + β69 ADSEEN + ε6,   (130.1f)
SALES = β71 SGA + β73 RND + β75 TIMEPP + β76 VISITSPP + β77 UNQAUD + β78 PAGEVIEW + β79 ADSEEN + β710 CLICKS + ε7.   (130.1g)
Variables ending in “PP” are deflated by unique audience. All other measures are deflated by the total assets of the firm. Per-person measures are used for time spent online and visits as these are the variables reported on NNR and are more descriptive of the characteristics of a website’s audience than total hours spent or visits would be. A summary of the predictions for the signs of these coefficients is given in Table 130.3. As is the case with other statistical techniques, path analysis suffers from a number of limitations related to model specification. As mentioned previously, the most important among these is the fact that it cannot explicitly test for directionality in the relationships. The directions of the arrows in a path diagram represent the researcher’s hypotheses regarding causality; however, the actual direction could be the reverse or the correlation could be spurious. In particular, if a variable specified as prior to another given variable is really consequent to it, it should be estimated to have no path effect. However, when it is included as a prior variable in the model, it 11
11. The subscripts are written here in a manner consistent with other statistical tests. The standard convention for path analyses is for the first number to indicate the causal variable and the latter the dependent variable.
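As a rough illustration of this estimation logic (a sketch on simulated data, not the paper's actual procedure or sample), each equation can be fit by OLS on z-scored variables so that the slopes are standardized path coefficients, and an indirect effect is the product of the coefficients along a pathway. The simulated model below uses a deliberately cut-down version of the system in eqs. (130.1a)–(130.1g), with coefficient values chosen arbitrarily:

```python
# Illustrative sketch: equation-by-equation OLS on standardized variables.
# Variable names follow Appendix 130B; the data-generating values are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 583  # sample size matching Table 130.2

# Exogenous expenditure measures
SGA = rng.normal(size=n)
RND = rng.normal(size=n)

# Endogenous variables generated along the hypothesized paths
UNQAUD = 0.4 * SGA + 0.3 * RND + rng.normal(scale=0.8, size=n)
PAGEVIEW = 0.7 * UNQAUD + rng.normal(scale=0.6, size=n)
SALES = 0.3 * SGA + 0.5 * PAGEVIEW + rng.normal(scale=0.5, size=n)

def zscore(x):
    return (x - x.mean()) / x.std()

def path_coefs(y, *xs):
    """OLS of standardized y on standardized regressors -> path coefficients."""
    X = np.column_stack([zscore(x) for x in xs])
    beta, *_ = np.linalg.lstsq(X, zscore(y), rcond=None)
    return beta

b_unq = path_coefs(UNQAUD, SGA, RND)        # cf. eq. (130.1a)
b_page = path_coefs(PAGEVIEW, UNQAUD)       # cut-down eq. (130.1d)
b_sales = path_coefs(SALES, SGA, PAGEVIEW)  # cut-down eq. (130.1g)

# Direct effect of SGA on SALES is b_sales[0]; one indirect effect runs
# SGA -> UNQAUD -> PAGEVIEW -> SALES, the product of the path coefficients.
indirect = b_unq[0] * b_page[0] * b_sales[1]
print("direct SGA->SALES:", round(b_sales[0], 3))
print("indirect via UNQAUD and PAGEVIEW:", round(indirect, 3))
```

The same decomposition extends to the full system: each arrow in Figure 130.2 contributes one standardized coefficient, and any indirect effect is the product of the coefficients along that pathway.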
Table 130.3: Predictions for direct effects.

            SGA   RND   SGAPP   RNDPP   TIMEPP   VISITSPP   UNQAUD   PAGEVIEW   ADSEEN   CLICKS
TIMEPP                    +       ?
VISITSPP                  +       ?
UNQAUD       +     +
VIEWS                                      +         +         +
ADSEEN                                                                   +
CLICKS                                     −                                        +
SALES        +     +                       +         +         +         +          0        +

Notes: This table summarizes the predictions made in Section 130.4 for the direct effects of each accounting or internet-activity measure shown in Figure 130.2. Explanatory variables are given in the columns with the rows belonging to the relevant dependent variables. Variables ending in “PP” are deflated by unique audience. All other variables are deflated by total assets. See Appendix 130B for further explanations of each term. A + (−) indicates an expected positive (negative) coefficient. A “0” indicates a variable that is being tested for which no prediction was made, while a “?” indicates a variable for which multiple, conflicting predictions are made.
could erroneously lead to changes in the coefficients for other variables in the model. Another important limitation is that techniques such as these often require substantially more data than single equation regressions in order to assess significance. The conventional wisdom in the literature is that the total number of observations should exceed the number of parameters tested by at least 10–20 times. In addition, the coefficients in path analyses are sensitive to specification error when a significant causal variable is left out of the model. When this happens, the path coefficients will reflect their shared covariance with such unmeasured variables and will not be accurately interpretable in terms of their direct and indirect effects. Finally, the researcher’s choice of variables and pathways represented will limit the model’s ability to recreate the sample covariance and variance patterns that are observed in the data. Because of this, there may be several models that fit the data equally well. Nonetheless, the path analysis approach remains useful in structuring relational data, which is a good first step in understanding the intricate nature of the data involved.

130.5 Results

The description of the path analysis above focuses on the actions of web-activity-dependent firms. While the sample studied here includes a small number of observations for business models in which activity is not ex ante expected to be a substantial source of long-term revenues (Kozberg, 2001), these firms are likely to prove exceptions to the rule. If firms are attempting to maximize revenue streams from multiple sources, primary or not, then website activity should translate into increased revenues for these companies as well.

Due to the use of partial regression coefficients in the path analysis, it would first be helpful to examine the overall correlations among the variables tested.12 The correlations in Table 130.4 are sorted from left-to-right (top-to-bottom) based upon the particular variables’ position in Figure 130.2. From Table 130.4, it can be seen that a number of pairs of variables are highly correlated, such as pageviews and advertisements shown (0.74). This result would seem to support the need for a mechanism to control for possible endogeneity problems suggested by high multicollinearity in the data. From the organization of the data, it can be seen that these high correlations among the variables tend to fall as the number of hypothesized steps between

12. In a perfectly specified model the sum of the effects from the direct and indirect pathways between any two variables would equal the correlation for those two variables.
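The property stated in footnote 12 can be checked numerically in a toy three-variable recursive model: with standardized variables and a correctly specified model, the correlation between two variables equals the direct effect plus the product of coefficients along each indirect path. The sketch below uses simulated data with invented coefficients, not the paper's sample.

```python
# Illustrative check of footnote 12: corr(x, y) = direct effect + indirect
# effect, for a recursive model with paths x -> m, m -> y and x -> y.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000  # large n so sampling noise is negligible

x = rng.normal(size=n)
m = 0.6 * x + rng.normal(scale=0.8, size=n)             # path x -> m
y = 0.3 * x + 0.5 * m + rng.normal(scale=0.7, size=n)   # paths x -> y, m -> y

def std(v):
    return (v - v.mean()) / v.std()

# Standardized path coefficients from equation-by-equation OLS
p_xm = np.linalg.lstsq(std(x)[:, None], std(m), rcond=None)[0][0]
B = np.column_stack([std(x), std(m)])
p_xy, p_my = np.linalg.lstsq(B, std(y), rcond=None)[0]

corr_xy = np.corrcoef(x, y)[0, 1]
decomposed = p_xy + p_xm * p_my   # direct + indirect
print(round(corr_xy, 3), round(decomposed, 3))  # the two should agree
```

In a misspecified model the two quantities diverge, which is one practical diagnostic for the specification-error limitation discussed above.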
Table 130.4: Correlations.

           SGA     RND     SGAPP   RNDPP   UNQAUD  TIMEPP  VISITSPP  PAGEVIEW  ADSEEN
SGA        1
RND        −0.07   1
SGAPP      0.24    −0.03   1
RNDPP      −0.08   0.41    0.26    1
UNQAUD     0.43    −0.29   0.13    −0.21   1
TIMEPP     −0.14   −0.03   −0.13   −0.13   −0.03   1
VISITSPP   −0.19   −0.02   −0.12   −0.15   0.05    0.63    1
PAGEVIEW   0.28    −0.26   0.05    −0.18   0.76    0.28    0.19      1
ADSEEN     0.20    −0.21   0.21    −0.13   0.65    0.25    0.31      0.74      1
CLICKS     0.50    −0.24