Springer Series in Operations Research and Financial Engineering
John P. Nolan
Univariate Stable Distributions Models for Heavy Tailed Data
Springer Series in Operations Research and Financial Engineering

Series Editors: Thomas V. Mikosch, Københavns Universitet, Copenhagen, Denmark; Sidney I. Resnick, Cornell University, Ithaca, USA; Stephen M. Robinson, University of Wisconsin-Madison, Madison, USA

Editorial Board: Torben G. Andersen, Northwestern University, Evanston, USA; Dmitriy Drusvyatskiy, University of Washington, Seattle, USA; Avishai Mandelbaum, Technion - Israel Institute of Technology, Haifa, Israel; Jack Muckstadt, Cornell University, Ithaca, USA; Per Mykland, University of Chicago, Chicago, USA; Philip E. Protter, Columbia University, New York, USA; Claudia Sagastizábal, IMPA – Instituto Nacional de Matemáti, Rio de Janeiro, Brazil; David B. Shmoys, Cornell University, Ithaca, USA; David Glavind Skovmand, Københavns Universitet, Copenhagen, Denmark; Josef Teichmann, ETH Zürich, Zürich, Switzerland
The Springer Series in Operations Research and Financial Engineering publishes monographs and textbooks on important topics in theory and practice of Operations Research, Management Science, and Financial Engineering. The Series is distinguished by high standards in content and exposition, and special attention to timely or emerging practice in industry, business, and government. Subject areas include: Linear, integer and non-linear programming including applications; dynamic programming and stochastic control; interior point methods; multi-objective optimization; Supply chain management, including inventory control, logistics, planning and scheduling; Game theory Risk management and risk analysis, including actuarial science and insurance mathematics; Queuing models, point processes, extreme value theory, and heavy-tailed phenomena; Networked systems, including telecommunication, transportation, and many others; Quantitative finance: portfolio modeling, options, and derivative securities; Revenue management and quantitative marketing Innovative statistical applications such as detection and inference in very large and/or high dimensional data streams; Computational economics
More information about this series at http://www.springer.com/series/3182
John P. Nolan Department of Mathematics and Statistics American University Washington, DC, USA
ISSN 1431-8598    ISSN 2197-1773 (electronic)
Springer Series in Operations Research and Financial Engineering
ISBN 978-3-030-52914-7    ISBN 978-3-030-52915-4 (eBook)
https://doi.org/10.1007/978-3-030-52915-4
Mathematics Subject Classification: 60E07, 60F05, 62F12, 62F30, 62F35

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Dedicated to Martha, Julia and Erin and the memory of William J. and Anne Z. Nolan
Preface
Almost a century ago, Lévy (1925) first explored stable distributions in his study of sums of random terms. The basic mathematical facts about stable laws were described in the influential book Gnedenko and Kolmogorov (1954). On page 7 of that work, the authors state that stable laws "deserve the most serious attention. It is probable that the scope of applied problems in which they play an essential role will become in due course rather wide." It is hoped that this book, coming many years later, convinces more people of the significance and practical relevance of stable distributions. The books of Zolotarev (1986), Samorodnitsky and Taqqu (1994), Janicki and Weron (1994), Nikias and Shao (1995), Uchaikin and Zolotarev (1999), Rachev and Mittnik (2000), and Meerschaert and Scheffler (2001) provide much of the theory on stable distributions, though little on numerics or statistical issues. There has been increasing interest in stable models in the fields of finance, economics, engineering, and physics. Their use in practical problems has been hampered by the lack of explicit formulas for stable densities and distribution functions. There are now efficient, reliable computer programs to calculate these quantities, making it feasible to use stable models in practical problems. The program STABLE has been used throughout this text to compute stable densities, distribution functions, and quantiles, and to simulate stable random variables, generate graphs, and fit data sets. A free version of this program is available at the website below. The structure of this book is nonstandard. On the one hand, I wanted to teach stable distributions to non-mathematicians who may want to use stable distributions in applications. On the other hand, I wanted to collect together facts about stable distributions with their proofs for use by researchers.
My solution is to have Chapter 1 give a non-technical introduction to stable distributions in one dimension and Chapter 2 give an overview of applications using stable laws. For those interested in more theory, Chapter 3 gives a mathematical exposition of stable laws and can be skipped by those more interested in applications. Chapter 4 discusses parameter estimation. The rest of the book contains related topics: regression with stable errors, signal processing with stable noise/clutter, and facts about related
distributions. A second volume which focuses on multivariate stable distributions and processes is in progress. I have not tried to prove every fact used, e.g. the proof that stable densities are unimodal takes many pages and I did not feel it would add much to this book. In some cases, I have used special cases to motivate the general results, e.g. the characterization of stable domains of attraction, and given references to full proofs. I have chosen to use the symbols α, β, γ, and δ for the parameters of a general stable distribution. The first two symbols are in general use for the index of stability and skewness, respectively, but most sources do not use the latter two symbols. Instead of using γ for the scale parameter, most references use σ. This causes confusion among most non-mathematicians, who expect σ to be the standard deviation, which generally does not exist for stable laws. The problem is further compounded in the Gaussian case (α = 2), when the standard deviation does exist, but does not equal the usual scale parameter. Likewise, the use of δ as a location parameter, instead of μ, is meant to avoid confusion as most non-mathematicians use μ for the mean. In many cases, i.e. α ≤ 1, the mean does not exist for a stable law. And when it does, the relationship between the mean and the stable location parameter depends on the parameterization used. To avoid some of this confusion, it seemed simpler to use the parameters α, β, γ, and δ and describe the mean and other quantities in terms of them. The multiple parameterizations for stable distributions are described. The most common parameterization is useful for theoretical reasons, but it is not ideal for practical applications. For that purpose, a variation of Zolotarev's (M) parameterization, which we call the S(α, β, γ, δ; 0) parameterization, seems most useful. The appendices contain tables of stable quantiles and other useful information in this continuous parameterization.
I have tried to give some history of this subject, but my limited knowledge and energy have no doubt left out many references. I apologize to those left out—please send me corrections or additions. Novel features of the book are the careful derivation of computational formulas for simulating stable variates and accurately computing stable densities, distribution functions, and quantiles. This enabled numeric exploration of the actual tail behavior of densities and distribution functions. There is also a chapter on estimation of stable parameters with examples and some diagnostics, as well as chapters on linear regression when the error terms are stable and signal processing when there is impulsive noise. Parts of this book were written while I was on sabbatical. During that period, I received financial support and time to work from the University of Lyon 1 (Anne-Laure Fougères and Cécile Mercadier), the Technical University of Munich (Claudia Klüppelberg), Eidgenössische Technische Hochschule, Zürich (Paul Embrechts), and Chalmers University of Technology, Gothenburg (Holger Rootzén). Parts of the work were supported by an agreement with Cornell University, Operations Research & Information Engineering under contract W911NF-12-10385 from the Army Research Office.
There are many people who contributed in different ways to this book. Many years ago, Sarah Jane, Guy Weber, and George Griffin made me see mathematics as an exciting and useful field. The late Bill Sacco's enthusiasm and energy inspired me before I knew calculus, and they still inspire me. I would like to express my thanks to my dissertation advisor Loren Pitt, who started me on the study of stable processes. The late Stamatis Cambanis enthusiastically nurtured that interest and enabled me to continue my research through summer support at the Center for Stochastic Processes at the University of North Carolina, Chapel Hill. The late Peter Hall kindly gave me encouragement at one stressful time in the academic tenure and promotion process. I would like to thank the following colleagues: Tuncay Alparslan, Gonzalo Arce, Roger J. Brown, Tomasz Byczkowski, Stephen Casey, Larry Crone, Anne-Laure Fougères, Richard Davis, Juan Gonzalez, Richard Holzsager, Robert Jernigan, Zbigniew Jurek, Dan Kalman, Tomasz Kozubowski, Annika Krutto, Roger Lee, Wenbo Li, Hugh McCulloch, Mark Meerschaert, Cécile Mercadier, Reza Modarres, Rafal Núñez, Anna Panorska, Krzysztof Podgórski, Balram Rajput, Nalini Ravishankar, Sid Resnick, Bob Rimmer, Jan Rosinski, Holger Rootzén, Gennady Samorodnitsky, Murad Taqqu, Chuck Voas, Jeff Weeks, Alex White, and Ryszard Zieliński for conversations, arguments, and sometimes actually finishing a paper together. I apologize to anyone I have inadvertently left out. Several American University students helped on different parts of this book. The students in a section of Stat 601 at American University suffered through an early draft of parts of this book. Students I've worked with while exploring stable laws include Husein Abdul-Hamid, Greg Alexander, Jen Dumiak, Bashir Dweik, Hippolyte Fofack, Hasan Hamdan, Fotios Kokkotos, Neil Kpamegan, Fairouz Hilal Makhlouf, Diana Ojeda-Revah, Shireen Rishmawi, and Xiaonan Zhang.
Alyssa Cuyjet, Valbona Bejleri, Cindy Cook, Aaron Rothman, Natalie Konerth, Yanfeng Chen, and Xiaonan Zhang performed a variety of simulations and computations as student research assistants. Sid Resnick has been an exceptionally patient editor, and Donna Chernyk and Christopher Tominich at Springer have also been patient and gracious. The anonymous reviewers of the manuscript provided numerous corrections and helpful suggestions. I have tried to check formulas for accuracy, but surely some mistakes remain. Any mistakes that remain are mine; please let me know of any that you find. Errata will be made available online at the website below. There is also free software for working with stable laws and an extensive bibliography on stable distributions there. The availability of good systems for LaTeX has greatly simplified the job of producing this book. The free MiKTeX by Christian Schenk and the inexpensive program WinEdt by Aleksander Simonic are great tools!
Finally, I wish to thank my family for helping along the way. My parents and siblings gave me a stable home while growing up. My wife Martha and my daughters Julia and Erin sometimes tolerated me, sometimes dragged me away from my desk, and sometimes rolled their eyes at whether this book would ever be finished.

Takoma Park, Maryland
February 2020
John P. Nolan https://edspace.american.edu
Contents

1 Basic Properties of Univariate Stable Distributions  1
  1.1 Definition of stable random variables  1
  1.2 Other definitions of stability  4
  1.3 Parameterizations of stable laws  5
  1.4 Densities and distribution functions  10
  1.5 Tail probabilities, moments, and quantiles  13
  1.6 Sums of stable random variables  17
  1.7 Simulation  19
  1.8 Generalized Central Limit Theorem and Domains of Attraction  20
  1.9 Multivariate stable  22
  1.10 Problems  22

2 Modeling with Stable Distributions  25
  2.1 Lighthouse problem  26
  2.2 Distribution of masses in space  28
  2.3 Random walks  29
  2.4 Hitting time for Brownian motion  32
  2.5 Differential equations and fractional diffusions  33
  2.6 Financial applications  35
    2.6.1 Stock returns  35
    2.6.2 Value-at-risk and expected shortfall  35
    2.6.3 Other financial applications  36
    2.6.4 Multiple assets  38
  2.7 Signal processing  39
  2.8 Miscellaneous applications  40
    2.8.1 Stochastic resonance  41
    2.8.2 Network traffic and queues  41
    2.8.3 Earth Sciences  42
    2.8.4 Physics  42
    2.8.5 Embedding of Banach spaces  43
    2.8.6 Hazard function, survival analysis, and reliability  43
    2.8.7 Biology and medicine  45
    2.8.8 Discrepancies  45
    2.8.9 Computer Science  46
    2.8.10 Long tails in business, political science, and medicine  46
    2.8.11 Extreme values models  47
  2.9 Behavior of the sample mean and variance  47
  2.10 Appropriateness of infinite variance models  49
  2.11 Problems  51

3 Technical Results for Univariate Stable Distributions  53
  3.1 Proofs of Basic Theorems of Chapter 1  53
    3.1.1 Stable distributions as infinitely divisible distributions  64
  3.2 Densities and distribution functions  65
    3.2.1 Series expansions  75
    3.2.2 Modes  76
    3.2.3 Duality  80
  3.3 Numerical algorithms  83
    3.3.1 Computation of distribution functions and densities  83
    3.3.2 Spline approximation of densities  84
    3.3.3 Simulation  84
  3.4 Functions g_d, g̃_d, h_d and h̃_d  87
    3.4.1 Score functions  89
  3.5 More on parameterizations  91
  3.6 Tail behavior  96
  3.7 Moments and other transforms  107
  3.8 Convergence of stable laws in terms of (α, β, γ, δ)  119
  3.9 Combinations of stable random variables  122
  3.10 Distributions derived from stable distributions  130
    3.10.1 Log-stable  130
    3.10.2 Exponential stable  131
    3.10.3 Amplitude of a stable random variable  132
    3.10.4 Ratios and products of stable terms  132
    3.10.5 Wrapped stable distribution  134
    3.10.6 Discretized stable distributions  135
  3.11 Stable distributions arising as functions of other distributions  135
    3.11.1 Exponential power distributions  138
    3.11.2 Stable mixtures of extreme value distributions  139
  3.12 Stochastic series representations  140
  3.13 Generalized Central Limit Theorem and Domains of Attraction  141
  3.14 Entropy  148
  3.15 Differential equations and stable semi-groups  149
  3.16 Problems  152

4 Univariate Estimation  159
  4.1 Order statistics  160
  4.2 Tail-based estimation  162
    4.2.1 Hill estimator  162
  4.3 Extreme value estimate of α  165
  4.4 Quantile-based estimation  166
  4.5 Characteristic function-based estimation  170
    4.5.1 Choosing values of u  171
  4.6 Moment-based estimation  174
  4.7 Maximum likelihood estimation  176
    4.7.1 Asymptotic normality and Fisher information matrix  178
    4.7.2 The score function for δ  182
  4.8 Other methods of estimation  184
    4.8.1 Log absolute value estimation  184
    4.8.2 U statistic-based estimation  184
    4.8.3 Miscellaneous methods  185
  4.9 Comparisons of estimators  186
    4.9.1 Using x̄ and s to estimate location and scale  186
    4.9.2 Statistical efficiency and execution time  186
  4.10 Assessing a stable fit  190
    4.10.1 Graphical diagnostics  195
    4.10.2 Likelihood ratio tests and goodness-of-fit tests  202
  4.11 Applications  205
  4.12 Estimation when in the domain of attraction  213
  4.13 Fitting stable distributions to concentration data  215
  4.14 Estimation for discretized stable distributions  218
  4.15 Problems  221

5 Stable Regression  223
  5.1 Maximum likelihood estimation of the regression coefficients  224
    5.1.1 Parameter confidence intervals: linear case  226
    5.1.2 Linear examples  229
  5.2 Nonlinear regression  234
    5.2.1 Parameter confidence intervals: nonlinear case  235
    5.2.2 Nonlinear example  236
  5.3 Problems  238

6 Signal Processing with Stable Distributions  239
  6.1 Unweighted stable filters  239
  6.2 Weighted and matched stable filters  244
  6.3 Calibration and numerical issues  247
    6.3.1 Calibration  247
    6.3.2 Evaluating and minimizing the cost function  249
  6.4 Evaluation of stable filters  250
  6.5 Problems  252

7 Related Distributions  255
  7.1 Pareto distributions  255
  7.2 t distributions  259
  7.3 Other types of stability  259
    7.3.1 Max-stable and min-stable  260
    7.3.2 Multiplication-stable  262
    7.3.3 Geometric-stable distributions and Linnik distributions  262
    7.3.4 Discrete stable  263
    7.3.5 Generalized convolutions and generalized stability  263
  7.4 Mixtures of stable distributions: scale, sum, and convolutions  263
  7.5 Infinitely divisible distributions  265
  7.6 Problems  266

A Mathematical Facts  267
  A.1 Sums of random variables  267
  A.2 Symmetric random variables  267
  A.3 Moments  268
  A.4 Characteristic functions  268
  A.5 Laplace transforms  269
  A.6 Mellin transforms  270
  A.7 Gamma and related functions  271

B Stable Quantiles  273

C Stable Modes  295

D Asymptotic standard deviations and correlation coefficients for ML estimators  299

References  305
Index  325
Author Index  327
Symbol Index  333
Chapter 1
Basic Properties of Univariate Stable Distributions
Stable distributions are a rich class of probability distributions that allow skewness and heavy tails and have many intriguing mathematical properties. The class was characterized by Paul Lévy in his study of sums of independent identically distributed terms in the 1920s. The lack of closed formulas for densities and distribution functions for all but a few stable distributions (Gaussian, Cauchy and Lévy, see Figure 1.1) has been a major drawback to the use of stable distributions by practitioners. There are now reliable computer programs to compute stable densities, distribution functions, and quantiles. With these programs, it is possible to use stable models in a variety of practical problems. This book describes the basic facts about univariate stable distributions, with an emphasis on practical applications. This chapter describes basic properties of univariate stable distributions. Chapter 2 gives examples of stable laws arising in different problems. Chapter 3 gives proofs of the results in this chapter, as well as more technical details about stable distributions. Chapter 4 describes methods of fitting stable models to data. The remaining chapters are concerned with stable regression, some results on signal processing in the presence of heavy tailed noise, and related distributions.
1.1 Definition of stable random variables

An important property of normal or Gaussian random variables is that the sum of two of them is itself a normal random variable. More precisely, if X is normal, then for X1 and X2 independent copies of X and any positive constants a and b,

  aX1 + bX2 =d cX + d,    (1.1)

for some positive c and some d ∈ R. (The symbol =d means equality in distribution, i.e. both expressions have the same probability law.) In words, equation (1.1) says
that the shape of X is preserved (up to scale and shift) under addition. This book is about the class of distributions with this property.

Definition 1.1 A random variable X is stable or stable in the broad sense if for X1 and X2 independent copies of X and any positive constants a and b, (1.1) holds for some positive c and some d ∈ R. The random variable is strictly stable or stable in the narrow sense if (1.1) holds with d = 0 for all choices of a and b. A random variable is symmetric stable if it is stable and symmetrically distributed around 0, i.e. X =d −X.

The addition rule for independent normal random variables says that the mean of the sum is the sum of the means and the variance of the sum is the sum of the variances. Suppose X ∼ N(μ, σ²); then the terms on the left-hand side of (1.1) are N(aμ, (aσ)²) and N(bμ, (bσ)²) respectively, while the right-hand side is N(cμ + d, (cσ)²). By the addition rule one must have c² = a² + b² and d = (a + b − c)μ. Expressions for c and d in the general stable case are given below.

The word stable is used because the shape is stable or unchanged under sums of the type (1.1). Some authors use the phrase sum stable to emphasize the fact that (1.1) is about a sum and to distinguish between these distributions and max-stable, min-stable, multiplication-stable, and geometric-stable distributions (see Chapter 7). Also, some older literature used slightly different terms: stable was originally used for what is now called strictly stable, and quasi-stable was reserved for what is now called stable.

Two random variables X and Y are said to be of the same type if there exist constants A > 0 and B ∈ R with X =d AY + B. The definition of stability can be restated as: aX1 + bX2 has the same type as X.

There are three cases where one can write down closed form expressions for the density and verify directly that they are stable: normal, Cauchy, and Lévy distributions. The parameters α and β mentioned below are defined in Section 1.3.

Example 1.1 Normal or Gaussian distributions. X ∼ N(μ, σ²) if it has density

  f(x) = (1/(√(2π) σ)) exp(−(x − μ)²/(2σ²)),   −∞ < x < ∞.

The cumulative distribution function, for which there is no closed form expression, is F(x) = P(X ≤ x) = Φ((x − μ)/σ), where Φ(z) is the probability that a standard normal r.v. is less than or equal to z. Problem 1.1 shows a Gaussian distribution is stable with parameters α = 2, β = 0.

Example 1.2 Cauchy distributions. X ∼ Cauchy(γ, δ) if it has density

  f(x) = (1/π) γ/(γ² + (x − δ)²),   −∞ < x < ∞.

These are also called Lorentz distributions in physics. Problem 1.2 shows a Cauchy distribution is stable with parameters α = 1, β = 0 and Problem 1.3 gives the d.f. of a Cauchy distribution.
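Since characteristic functions uniquely determine distributions and multiply under independent sums, the normal addition rule (c² = a² + b², d = (a + b − c)μ) can be checked numerically. The following sketch uses only the standard library; the particular values of μ, σ, a, and b are arbitrary illustrative choices, not from the text:

```python
import cmath
import math

def cf_normal(u, mu, sigma):
    """Characteristic function of N(mu, sigma^2): E[exp(iuX)]."""
    return cmath.exp(1j * mu * u - 0.5 * sigma**2 * u**2)

mu, sigma = 1.5, 2.0   # illustrative parameters
a, b = 1.0, 2.5        # illustrative positive constants in (1.1)
c = math.hypot(a, b)   # c^2 = a^2 + b^2
d = (a + b - c) * mu   # shift from the addition rule

for u in [-2.0, -0.3, 0.7, 1.9]:
    # cf of aX1 + bX2 is the product of the individual cfs
    lhs = cf_normal(a * u, mu, sigma) * cf_normal(b * u, mu, sigma)
    # cf of cX + d picks up a factor exp(i d u)
    rhs = cmath.exp(1j * d * u) * cf_normal(c * u, mu, sigma)
    assert abs(lhs - rhs) < 1e-12
```

The two characteristic functions agree at every u, which by uniqueness means the two sides of (1.1) have the same distribution.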
[Figure: three density curves f(x) plotted for −4 ≤ x ≤ 4.]

Fig. 1.1 Graphs of standardized normal N(0, 1), Cauchy(1, 0), and Lévy(1, 0) densities.
Example 1.3 Lévy distributions. X ∼ Lévy(γ, δ) if it has density 1 γ γ exp − f (x) = , δ < x < ∞. 2π (x − δ)3/2 2(x − δ) Note that some authors use the term Lévy distribution for all sum stable laws; here it is only used for this particular distribution. Problem 1.4 shows a Lévy distribution is stable with parameters α = 1/2, β = 1 and Problem 1.5 gives the d.f. of a Lévy distribution. Figure 1.1 shows a plot of these three densities. Both normal distributions and Cauchy distributions are symmetric, bell-shaped curves. The main qualitative distinction between them is that the Cauchy distribution has much heavier tails, see Table 1.1. In particular, there is a tiny amount of probability above 3 for the normal distribution, but a significant amount above 3 for a Cauchy. In a sample of data from these two distributions, there will be (on average) approximately 100 times more values above 3 in the Cauchy case than in the normal case. This is the reason stable distributions are called heavy tailed. In contrast to the normal and Cauchy distributions, the Lévy distribution is highly skewed, with all of the probability concentrated on x > 0, and it has even heavier tails than the Cauchy. General stable distributions allow for varying degrees of tail heaviness and varying degrees of skewness. Other than the normal distribution, the Cauchy distribution, the Lévy distribution, and the reflection of the Lévy distribution, there are no known closed form expressions for general stable densities and it is unlikely that any other stable distributions
1 Basic Properties of Univariate Stable Distributions

c      Normal          Cauchy    Lévy
0      0.5000          0.5000    1.0000
1      0.1587          0.2500    0.6827
2      0.0228          0.1476    0.5205
3      0.001347        0.1024    0.4363
4      0.00003167      0.0780    0.3829
5      0.0000002866    0.0628    0.3453

Table 1.1 Comparison of tail probabilities P(X > c) for standard normal, Cauchy and Lévy distributions.
have closed forms for their densities. Zolotarev (1986) (pp. 155–158) shows that in a few cases stable densities or distribution functions are expressible in terms of certain special functions. This may seem to doom the use of stable models in practice, but recall that there is no closed formula for the normal cumulative distribution function either. There are tables and accurate computer algorithms for the standard normal distribution function, and people routinely use those values in normal models. We now have computer programs to compute quantities of interest for stable distributions, so it is possible to use them in practical problems.
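The closed form distribution functions of the three examples make Table 1.1 easy to reproduce. A short Python check using only the standard library; the Lévy tail uses the representation of a Lévy(1, 0) variable as 1/Z² with Z ∼ N(0, 1) (see Problem 1.17), which gives P(X > c) = erf(1/√(2c)):

```python
from math import erf, erfc, atan, pi, sqrt

def normal_tail(c):
    # P(X > c) for X ~ N(0, 1): 1 - Phi(c) = erfc(c / sqrt(2)) / 2
    return 0.5 * erfc(c / sqrt(2))

def cauchy_tail(c):
    # P(X > c) for X ~ Cauchy(1, 0): 1/2 - arctan(c)/pi
    return 0.5 - atan(c) / pi

def levy_tail(c):
    # P(X > c) for X ~ Levy(1, 0); since X =d 1/Z^2 with Z ~ N(0, 1),
    # F(c) = P(|Z| >= 1/sqrt(c)) = erfc(1/sqrt(2c)), so the tail is erf(1/sqrt(2c))
    return erf(1.0 / sqrt(2.0 * c)) if c > 0 else 1.0

# reproduce the rows of Table 1.1
for c in range(6):
    print(c, round(normal_tail(c), 10), round(cauchy_tail(c), 4), round(levy_tail(c), 4))
```

The printed values match the table entries; note how slowly the Cauchy and Lévy columns decay compared with the normal column.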
1.2 Other definitions of stability

There are other equivalent definitions of stable random variables. Two are stated here; the proofs of the equivalence of these definitions are given in Section 3.1.

Definition 1.2 Non-degenerate X is stable if and only if for all n > 1, there exist constants cn > 0 and dn ∈ R such that

X1 + · · · + Xn =d cn X + dn,

where X1, . . . , Xn are independent, identical copies of X. X is strictly stable if and only if dn = 0 for all n.

Section 3.1 shows that the only possible choice for the scaling constants is cn = n^{1/α} for some α ∈ (0, 2]. Both the original definition of stable and the one above use distributional properties of X; yet another distributional characterization is given by the Generalized Central Limit Theorem, Theorem 1.4. While useful, these conditions do not give a concrete way of parameterizing stable distributions. The most concrete way to describe all possible stable distributions is through the characteristic function or Fourier transform. (For a random variable X with distribution function F(x), the characteristic function is defined by φ(u) = E exp(iuX) = ∫_{−∞}^{∞} exp(iux) dF(x). The function φ(u) completely determines the distribution of X and has many useful mathematical properties; see Appendix A.) The sign function is used below; it is defined as
sign u = −1 if u < 0, 0 if u = 0, and 1 if u > 0.

In the expression below for the α = 1 case, 0 · log 0 is always interpreted as lim_{x↓0} x log x = 0. Other than this case, log x will be used only for positive values of x in this chapter.

Definition 1.3 A random variable X is stable if and only if X =d aZ + b, where 0 < α ≤ 2, −1 ≤ β ≤ 1, a ≠ 0, b ∈ R, and Z is a random variable with characteristic function

E exp(iuZ) = exp(−|u|^α [1 − iβ tan(πα/2)(sign u)])       α ≠ 1
E exp(iuZ) = exp(−|u| [1 + iβ(2/π)(sign u) log |u|])      α = 1,        (1.2)

where u ∈ R and i = √−1.

These distributions are symmetric around zero when β = 0 and b = 0, in which case the characteristic function of aZ has the simpler form φ(u) = e^{−a^α |u|^α}. Problems 1.1, 1.2 and 1.4 show that a N(μ, σ²) distribution is stable with (α = 2, β = 0, a = σ/√2, b = μ), a Cauchy(γ, δ) distribution is stable with (α = 1, β = 0, a = γ, b = δ), and a Lévy(γ, δ) distribution is stable with (α = 1/2, β = 1, a = γ, b = δ). The equivalence of Definitions 1.1, 1.2 and 1.3 is proved in Section 3.1.
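As a sanity check on (1.2), the characteristic function can be coded directly and tested against the special cases identified above. The Python sketch below (function names are mine) also verifies Definition 1.2's scaling constant c_n = n^{1/α} for the standardized Z when α ≠ 1, where the identity φ(u)^n = φ(c_n u) holds with no extra shift:

```python
import cmath, math

def sign(u):
    return (u > 0) - (u < 0)

def cf_Z(u, alpha, beta):
    """Characteristic function (1.2) of the standardized stable Z(alpha, beta)."""
    if u == 0:
        return 1.0 + 0.0j
    if alpha != 1:
        return cmath.exp(-abs(u) ** alpha *
                         (1 - 1j * beta * math.tan(math.pi * alpha / 2) * sign(u)))
    return cmath.exp(-abs(u) * (1 + 1j * beta * (2 / math.pi) * sign(u) * math.log(abs(u))))

# special cases: alpha = 1, beta = 0 is Cauchy (phi(u) = e^{-|u|}),
# and alpha = 2 is N(0, 2) (phi(u) = e^{-u^2}), for any beta
assert abs(cf_Z(1.5, 1, 0) - math.exp(-1.5)) < 1e-12
assert abs(cf_Z(0.7, 2, 0.5) - math.exp(-0.49)) < 1e-12

# Definition 1.2 for standardized Z, alpha != 1: phi(u)^n = phi(c_n u), c_n = n^{1/alpha}
for alpha, beta in ((0.5, 1.0), (1.5, -0.7), (1.9, 0.2)):
    for n in (2, 5):
        c_n = n ** (1 / alpha)
        assert abs(cf_Z(0.8, alpha, beta) ** n - cf_Z(c_n * 0.8, alpha, beta)) < 1e-12
print("checks passed")
```

For α = 1 with β ≠ 0, the log |u| term spoils this identity unless a shift d_n is allowed, which is exactly the strict stability issue discussed in Section 1.6.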
1.3 Parameterizations of stable laws

Definition 1.3 shows that a general stable distribution requires four parameters to describe: an index of stability or characteristic exponent α ∈ (0, 2], a skewness parameter β ∈ [−1, 1], a scale parameter, and a location parameter. We will use γ for the scale parameter and δ for the location parameter to avoid confusion with the symbols σ and μ, which will be used exclusively for the standard deviation and mean. The parameters are restricted to the range α ∈ (0, 2], β ∈ [−1, 1], γ ≥ 0 and δ ∈ R. Generally γ > 0, although γ = 0 will sometimes be used to denote a degenerate distribution concentrated at δ when it simplifies the statement of a result. Since α and β determine the form of the distribution, they may be considered shape parameters.

There are multiple parameterizations for stable laws, and much confusion has been caused by the differences among them. The variety of parameterizations is a result of historical evolution and of the numerous problems that have been analyzed using specialized forms of the stable distributions. There are good reasons to use different parameterizations in different situations. If numerical work or fitting data is required, then one parameterization is preferable. If simple algebraic properties of the distribution are desired, then another is preferred. If one
wants to study the analytic properties of strictly stable laws, then yet another is useful. This section will describe three parameterizations; in Section 3.5 eight others are described.

In most of the recent literature, the notation S_α(σ, β, μ) is used for the class of stable laws. We will use a modified notation of the form S(α, β, γ, δ; k) for three reasons. First, the usual notation singles out α as different and fixed. In statistical applications, all four parameters (α, β, γ, δ) are unknown and need to be estimated; the new notation emphasizes this. Second, the scale parameter is not the standard deviation (even in the Gaussian case), and the location parameter is not generally the mean. So the neutral symbols γ for the scale (not σ) and δ for the location (not μ) are used. And third, there should be a clear distinction between the different parameterizations; the integer k does that. Users of stable distributions need to state clearly what parameterization they are using; this notation makes that explicit.

Definition 1.4 A random variable X is S(α, β, γ, δ; 0) if
X =d γ(Z − β tan(πα/2)) + δ      α ≠ 1
X =d γZ + δ                      α = 1,           (1.3)

where Z = Z(α, β) is given by (1.2). X has characteristic function

E exp(iuX) = exp(−γ^α|u|^α [1 + iβ tan(πα/2)(sign u)(|γu|^{1−α} − 1)] + iδu)     α ≠ 1
E exp(iuX) = exp(−γ|u| [1 + iβ(2/π)(sign u) log(γ|u|)] + iδu)                    α = 1.    (1.4)
When the distribution is standardized, i.e. scale γ = 1, and location δ = 0, the symbol S (α, β; 0) will be used as an abbreviation for S (α, β, 1, 0; 0). Definition 1.5 A random variable X is S (α, β, γ, δ; 1) if
X =d γZ + δ                          α ≠ 1
X =d γZ + δ + β(2/π)γ log γ          α = 1,        (1.5)

where Z = Z(α, β) is given by (1.2). X has characteristic function

E exp(iuX) = exp(−γ^α|u|^α [1 − iβ tan(πα/2)(sign u)] + iδu)      α ≠ 1
E exp(iuX) = exp(−γ|u| [1 + iβ(2/π)(sign u) log |u|] + iδu)       α = 1.    (1.6)
When the distribution is standardized, i.e. scale γ = 1 and location δ = 0, the symbol S(α, β; 1) will be used as an abbreviation for S(α, β, 1, 0; 1). Above, the general stable laws in the 0-parameterization and the 1-parameterization were defined in terms of a standardized Z ∼ S(α, β; 1). Alternatively, we could start with Z0 ∼ S(α, β; 0), in which case γZ0 + δ ∼ S(α, β, γ, δ; 0)
and

γZ0 + δ + βγ tan(πα/2) ∼ S(α, β, γ, δ; 1)      α ≠ 1
γZ0 + δ + β(2/π)γ log γ ∼ S(α, β, γ, δ; 1)     α = 1.
Since the density of Z0 is continuous with respect to x, α, and β, this makes it clear why the 1-parameterization is not continuous as α → 1 (because of the tan(πα/2) term) and not a scale-location family when α = 1 (because of the γ log γ term). Note that if β = 0, then the 0- and 1-parameterizations are identical, but when β ≠ 0 the asymmetry factor (the imaginary term in the characteristic function) becomes an issue.

The symbol SαS is used as an abbreviation for symmetric α-stable. When a scale parameter is used, SαS(γ) = S(α, 0, γ, 0; 0) = S(α, 0, γ, 0; 1).

The different parameterizations have caused repeated misunderstandings. Hall (1981a) describes a "comedy of errors" caused by parameterization choices. The most common mistake concerns the sign of the skewness parameter when α = 1. Zolotarev (1986) briskly switches between half a dozen parameterizations. Another example is the stable random number generator of Chambers et al. (1976), which has two arguments: α and β. Most users expect to get a S(α, β; 1) result; however, the routine actually returns random variates with a S(α, β; 0) distribution. One book even excludes the cases β ≠ 0 when α = 1.

In principle, any choice of scale and location is as good as any other choice. We recommend using the S(α, β, γ, δ; 0) parameterization for numerical work and statistical inference with stable distributions: it has the simplest form for the characteristic function that is continuous in all parameters. See Figure 1.2 for plots of stable densities in the 0-parameterization. It lets α and β determine the shape of the distribution, while γ and δ determine scale and location in the standard way: if X ∼ S(α, β, γ, δ; 0), then (X − δ)/γ ∼ S(α, β, 1, 0; 0). This is not true for the S(α, β, γ, δ; 1) parameterization when α = 1. On the other hand, if one is primarily interested in a simple form for the characteristic function and nice algebraic properties, the S(α, β, γ, δ; 1) parameterization is favored. Because of these properties, it is the most common parameterization in use, and we will generally use it when we are proving facts about stable distributions.

The main practical disadvantage of the S(α, β, γ, δ; 1) parameterization is that the location of the mode is unbounded in any neighborhood of α = 1: if X ∼ S(α, β, γ, δ; 1) and β > 0, then the mode of X tends to +∞ as α ↑ 1 and tends to −∞ as α ↓ 1. Moreover, the S(α, β, γ, δ; 1) parameterization does not have the intuitive properties desirable in applications (continuity of the distributions as the parameters vary, a location and scale family, etc.). See Figure 1.3 for densities in the 1-parameterization and Section 3.2.2 for more information on modes.

When α = 2, a S(2, 0, γ, δ; 0) = S(2, 0, γ, δ; 1) distribution is normal with mean δ, but the standard deviation is not γ. Because of the way the characteristic function is defined above, S(2, 0, γ, δ; 0) = N(δ, 2γ²), so the normal standard deviation is σ = √2 γ. This fact is a frequent source of confusion when one tries to compare stable quantiles at α = 2 to normal quantiles. This complication is not inherent in the properties of stable laws; it is a consequence of the way the parameterization has been chosen. The 2-parameterization mentioned below rescales to avoid this
Fig. 1.2 Stable densities in the S(α, 0.5, 1, 0; 0) parameterization, α = 0.5, 0.75, 1, 1.25, 1.5.

Fig. 1.3 Stable densities in the S(α, 0.5, 1, 0; 1) parameterization, α = 0.5, 0.75, 1, 1.25, 1.5.
problem, but the above scaling is standard in the literature. Also, when α = 2, β is irrelevant because then the factor tan(πα/2) = 0. While one can allow any β ∈ [−1, 1], it is customary to take β = 0 when α = 2; this emphasizes that the normal distribution is always symmetric.

Since multiple parameterizations are used for stable distributions, it is perhaps worthwhile to ask if there is another parameterization where the scale and location parameters have a more intuitive meaning. Section 3.5 defines the S(α, β, γ, δ; 2) parameterization so that the location parameter is the mode and the scale parameter agrees with the standard scale parameters in the Gaussian and Cauchy cases. While technically more cumbersome, this parameterization may be the most intuitive for applications. In particular, it is useful in signal processing and in linear regression problems when there is skewness. Figure 1.4 shows plots of the densities in this parameterization.
Fig. 1.4 Stable densities in the S(α, 0.5; 2) parameterization, α = 0.5, 0.75, 1, 1.25, 1.5.
A stable distribution can be represented in any one of these or other parameterizations. For completeness, Section 3.5 lists eleven different parameterizations that can be used, and the relationships of these to each other. We will generally use the S (α, β, γ, δ; 0) and S (α, β, γ, δ; 1) parameterizations in what follows to avoid (or at least limit) confusion. In these two parameterizations, α, β, and the scale γ are always the same, but the location parameters will have different values. The notation X ∼ S (α, β, γ, δk ; k) for k = 0, 1 will be shorthand for X ∼ S (α, β, γ, δ0 ; 0) and X ∼ S (α, β, γ, δ1 ; 1) simultaneously. In this case, the parameters are related by (see Problem 1.9)
δ0 = δ1 + βγ tan(πα/2)      α ≠ 1
δ0 = δ1 + β(2/π)γ log γ     α = 1

δ1 = δ0 − βγ tan(πα/2)      α ≠ 1
δ1 = δ0 − β(2/π)γ log γ     α = 1.        (1.7)

In particular, note that in (1.2), Z(α, β) ∼ S(α, β, 1, β tan(πα/2); 0) = S(α, β, 1, 0; 1) when α ≠ 1, and Z(1, β) ∼ S(1, β, 1, 0; 0) = S(1, β, 1, 0; 1) when α = 1.
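These relations are easy to mechanize. The following Python sketch (helper names are mine, not the book's) implements both directions of the conversion and checks that they are inverse to each other; for example, a S(1.3, 0.5, 2, 7; 1) law has 0-parameterization location δ0 ≈ 5.037:

```python
from math import tan, log, pi, isclose

def delta0_from_delta1(alpha, beta, gamma, delta1):
    # hypothetical helper implementing the displayed relations
    if alpha != 1:
        return delta1 + beta * gamma * tan(pi * alpha / 2)
    return delta1 + beta * (2 / pi) * gamma * log(gamma)

def delta1_from_delta0(alpha, beta, gamma, delta0):
    if alpha != 1:
        return delta0 - beta * gamma * tan(pi * alpha / 2)
    return delta0 - beta * (2 / pi) * gamma * log(gamma)

# the two maps are inverses, in both the alpha != 1 and alpha = 1 cases
for alpha in (0.8, 1.0, 1.3):
    d0 = delta0_from_delta1(alpha, 0.5, 2.0, 7.0)
    assert isclose(delta1_from_delta0(alpha, 0.5, 2.0, d0), 7.0)

# S(1.3, 0.5, 2, 7; 1) corresponds to S(1.3, 0.5, 2, 5.037; 0)
print(round(delta0_from_delta1(1.3, 0.5, 2.0, 7.0), 3))  # 5.037
```

Note that for 1 < α < 2 the factor tan(πα/2) is negative, so δ0 < δ1 when β > 0.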
1.4 Densities and distribution functions

While there are no explicit formulas for general stable densities, a lot is known about their theoretical properties. The most basic fact is the following.

Theorem 1.1 All (non-degenerate) stable distributions are continuous distributions with an infinitely differentiable density.

To distinguish between the densities and cumulative distribution functions in different parameterizations, f(x|α, β, γ, δ; k) will denote the density and F(x|α, β, γ, δ; k) will denote the d.f. of a S(α, β, γ, δ; k) distribution. When the distribution is standardized, i.e. scale γ = 1 and location δ = 0, f(x|α, β; k) will be used for the density and F(x|α, β; k) will be used for the d.f. Since all stable distributions are shifts and scales of some Z ∼ S(α, β; 0), we will focus on those distributions here. The computer program STABLE, using algorithms described in Section 3.3, was used to compute the probability density functions (pdf) and (cumulative) distribution functions (d.f.) below to illustrate the range of shapes of these distributions.

Stable densities are supported on either the whole real line or a half line. The latter situation can only occur when α < 1 and (β = +1 or β = −1). Precise limits are given by the following lemma.

Lemma 1.1 The support of a stable distribution in the different parameterizations is

support f(x|α, β, γ, δ; 0) = [δ − γ tan(πα/2), ∞)     α < 1 and β = 1
                           = (−∞, δ + γ tan(πα/2)]    α < 1 and β = −1
                           = (−∞, +∞)                 otherwise

support f(x|α, β, γ, δ; 1) = [δ, ∞)      α < 1 and β = 1
                           = (−∞, δ]     α < 1 and β = −1
                           = (−∞, +∞)    otherwise.

The constant tan(πα/2) appears frequently when working with stable distributions, so it is worth recording its behavior: as α ↑ 1, tan(πα/2) ↑ +∞; the expression is undefined at α = 1; and as α ↓ 1, tan(πα/2) ↓ −∞.
This essential discontinuity at α = 1 is sometimes a nuisance when working with stable distributions, but here it is natural: if | β| = 1 then as α ↑ 1, the support in Lemma 1.1 grows to R in a natural way. Another basic fact about stable distributions is the reflection property.
Fig. 1.5 Symmetric stable densities and cumulative distribution functions for Z ∼ S(α, 0; 0), α = 0.7, 1.3, 1.9.
Proposition 1.1 Reflection Property. For any α and β, and Z ∼ S(α, β; k), k = 0, 1, 2,

Z(α, −β) =d −Z(α, β).

Thus the density and distribution function of a Z(α, β) random variable satisfy f(x|α, β; k) = f(−x|α, −β; k) and F(x|α, β; k) = 1 − F(−x|α, −β; k). More generally, if X ∼ S(α, β, γ, δ; k), then −X ∼ S(α, −β, γ, −δ; k), so f(x|α, β, γ, δ; k) = f(−x|α, −β, γ, −δ; k) and F(x|α, β, γ, δ; k) = 1 − F(−x|α, −β, γ, −δ; k).

First consider the case when β = 0. In this case, the reflection property says f(x|α, 0; k) = f(−x|α, 0; k), so the density and d.f. are symmetric around 0. Figure 1.5 shows the bell-shaped density of symmetric stable distributions. As α decreases, three things occur to the density: the peak gets higher, the region flanking the peak gets lower, and the tails get heavier. The d.f. plot shows how the tail probabilities increase as α decreases.

If β > 0, then the distribution is skewed, with the right tail of the distribution heavier than the left tail: P(X > x) > P(X < −x) for large x > 0. (Here and later, statements about the tail of a distribution will always refer to large |x|; nothing is implied about small |x|.) When β = 1, we say the stable distribution is totally skewed to the right. By the reflection property, the behavior of the β < 0 cases is the reflection of the β > 0 ones, with the left tail being heavier. When β = −1, the distribution is totally skewed to the left.
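The symmetric densities shown in Figure 1.5 can be approximated by direct numerical inversion of the characteristic function, f(x) = (1/π) ∫_0^∞ cos(ux) e^{−u^α} du. The Python sketch below is deliberately naive (the STABLE program uses much better algorithms, described in Section 3.3), but it reproduces known values:

```python
from math import cos, exp, pi

def sym_stable_pdf(x, alpha, n=200000, umax=50.0):
    """Naive trapezoid-rule inversion for the S(alpha, 0; 0) density; sketch only."""
    h = umax / n
    # integrand is cos(u x) * exp(-u^alpha); its value at u = 0 is 1
    total = 0.5 * (1.0 + cos(umax * x) * exp(-umax ** alpha))
    for k in range(1, n):
        u = k * h
        total += cos(u * x) * exp(-u ** alpha)
    return h * total / pi

# alpha = 1 is Cauchy(1, 0): f(0) = 1/pi
assert abs(sym_stable_pdf(0.0, 1.0) - 1 / pi) < 1e-6
# alpha = 2 is N(0, 2): f(0) = 1/sqrt(4 pi)
assert abs(sym_stable_pdf(0.0, 2.0) - 1 / (4 * pi) ** 0.5) < 1e-6
print("pdf checks passed")
```

Evaluating this on a grid of x values with α = 0.7, 1.3, 1.9 reproduces the curves of Figure 1.5; the fixed truncation at umax and the uniform grid are crude, which is why production code relies on the integral representations of Section 3.3 instead.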
Fig. 1.6 Stable densities and cumulative distribution functions for Z ∼ S(1.9, β; 0), β = 0, 0.5, 1.
When α = 2, the distribution is a (non-standardized) normal distribution. Note that tan(πα/2) = 0 in (1.2), so the characteristic function is real and hence the distribution is always symmetric, no matter what the value of β. In symbols, Z(2, β) =d Z(2, 0). In general, as α ↑ 2, all stable distributions get closer and closer to being symmetric, and β becomes less meaningful in applications (and harder to estimate accurately). Figure 1.6 shows the density and d.f. when α = 1.9 with varying β, and there is little visible difference as β varies. As α decreases, the effect of β becomes more pronounced: the left tail gets lighter and lighter for β > 0, see Figure 1.7 (α = 1.3), Figure 1.8 (α = 0.7), and Figure 1.9 (α = 0.2). The last figure shows that when α approaches 0, the density gets extremely high at the peak, and the d.f. gets closer and closer to an improper distribution (see Section 3.2 for more information on this topic). As Lemma 1.1 shows, the light tail is 0 after some point when α < 1 and |β| = 1.

Finally, all stable densities are unimodal, but there is no known formula for the location of the mode. However, the mode of a Z ∼ S(α, β; 0) distribution, denoted by m(α, β), has been numerically computed. The values of m(α, β) are shown for β ≥ 0 in Figure 1.10, and a table of modes is given in Appendix C. By the reflection property, m(α, −β) = −m(α, β). Numerically, it is also observed that P(Z > m(α, β)) > P(Z < m(α, β)) (more mass to the right of the mode) when β > 0, P(Z > m(α, β)) = P(Z < m(α, β)) = 1/2 when β = 0, and by reflection P(Z > m(α, β)) < P(Z < m(α, β)) when β < 0 (more mass to the left of the mode). Note that these statements are all in the 0-parameterization, not the 1-parameterization. See Section 3.2.2 for more information about modes.
Fig. 1.7 Stable densities and cumulative distribution functions for Z ∼ S(1.3, β; 0), β = 0, 0.5, 1.
1.5 Tail probabilities, moments, and quantiles

When α = 2, the normal distribution has well understood asymptotic tail properties. Here we give a brief discussion of the tails of non-Gaussian (α < 2) stable laws; see Section 3.6 for more information. For α < 2, stable distributions have one tail (when α < 1 and β = ±1) or both tails (all other cases) that are asymptotically power laws with heavy tails. The statement h(x) ∼ g(x) as x → a means lim_{x→a} h(x)/g(x) = 1.

Theorem 1.2 Tail approximation. Let X ∼ S(α, β, γ, δ; 0) with 0 < α < 2, −1 < β ≤ 1. Then as x → ∞,

P(X > x) ∼ γ^α c_α (1 + β) x^{−α}
f(x|α, β, γ, δ; 0) ∼ α γ^α c_α (1 + β) x^{−(α+1)},

where c_α = sin(πα/2) Γ(α)/π. Using the reflection property, the lower tail properties are similar: for −1 ≤ β < 1, as x → ∞,

P(X < −x) ∼ γ^α c_α (1 − β) x^{−α}
f(−x|α, β, γ, δ; 0) ∼ α γ^α c_α (1 − β) x^{−(α+1)}.

For all α < 2 and −1 < β < 1, both tail probabilities and densities are asymptotically power laws. When β = −1, the right tail of the distribution is not asymptotically a power law; likewise, when β = 1, the left tail of the distribution is not asymptotically a power law. The point at which the tail approximation becomes useful is
Fig. 1.8 Stable densities and cumulative distribution functions for Z ∼ S(0.7, β; 0), β = 0, 0.5, 1.
a complicated issue; it depends on both the parameterization and the parameters (α, β, γ, δ). See Section 3.6 for more information on both of these issues.

Pareto distributions, see Section 7.1, are a class of probability laws with upper tail probabilities given exactly by the right-hand side of Theorem 1.2. The term stable Paretian laws is used to distinguish between the fast decay of the Gaussian law and the Pareto-like tail behavior in the α < 2 case.

One consequence of heavy tails is that not all moments exist. In most statistical problems, the first moment EX and variance Var(X) = E(X²) − (EX)² are routinely used to describe a distribution. However, these are not generally useful for heavy tailed distributions, because the integral expressions for these expectations may diverge. In their place, it is sometimes useful to use fractional absolute moments: E|X|^p = ∫_{−∞}^{∞} |x|^p f(x) dx, where p is any real number. Some review of moments and fractional moments is given in Appendix A. Problem 1.11 shows that for 0 < α < 2, E|X|^p is finite for 0 < p < α, and that E|X|^p = +∞ for p ≥ α. Formulas for moments of arbitrary stable laws are given in Section 3.7.

Thus, when 0 < α < 2, E|X|² = EX² = +∞ and stable distributions do not have finite second moments or variances. This fact causes some to immediately dismiss stable distributions as being irrelevant to any practical problem. Section 2.10 discusses this in more detail. When 1 < α ≤ 2, E|X| < ∞ and the mean of X is given below. On the other hand, when α ≤ 1, E|X| = +∞, so means are undefined.

Proposition 1.2 When 1 < α ≤ 2, the mean of X ∼ S(α, β, γ, δk; k) for k = 0, 1 is

μ = EX = δ1 = δ0 − βγ tan(πα/2).
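Returning to Theorem 1.2, the tail approximation can be checked explicitly in the Cauchy case (α = 1, β = 0, γ = 1), where the exact tail probability 1/2 − arctan(x)/π is available in closed form. A short Python sketch:

```python
from math import atan, gamma, sin, pi

def c_alpha(alpha):
    # c_alpha = sin(pi alpha / 2) * Gamma(alpha) / pi, from Theorem 1.2
    return sin(pi * alpha / 2) * gamma(alpha) / pi

def cauchy_tail(x):
    # exact P(X > x) for X ~ Cauchy(1, 0)
    return 0.5 - atan(x) / pi

# compare the exact tail with the approximation c_1 (1 + beta) x^{-1}, beta = 0
for x in (10.0, 100.0, 1000.0):
    approx = c_alpha(1.0) / x
    print(x, cauchy_tail(x), approx, cauchy_tail(x) / approx)

# the ratio tends to 1 as x grows
assert abs(cauchy_tail(1000.0) / (c_alpha(1.0) / 1000.0) - 1) < 1e-3
```

Here c_1 = sin(π/2)Γ(1)/π = 1/π, so the approximation is simply 1/(πx), and the printed ratios approach 1 from below.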
Fig. 1.9 Stable densities and cumulative distribution functions for Z ∼ S(0.2, β; 0), β = 0, 0.5, 1. Note that both the horizontal and vertical scales are very different from Figures 1.6–1.8.
Consider what happens to the mean of X ∼ S(α, β; 0) as α ↓ 1. Even though the mode of the distribution stays close to 0, it has a mean of μ = −β tan(πα/2). When β = 0, the distribution is symmetric and the mean is always 0. When β > 0, the mean tends to +∞ because, while both tails are getting heavier, the right tail is heavier than the left. By reflection, the β < 0 case has μ ↓ −∞. Finally, when α reaches 1, the tails are too heavy for the integral EX = ∫_{−∞}^{∞} x f(x) dx to converge. In contrast, a S(α, β; 1) distribution keeps the mean at 0 by shifting the whole distribution by an increasing amount as α ↓ 1. For example, when 1 < α < 2, Theorem 3.4 shows that F(0|α, 1; 1) = 1/α, which converges up to 1 as α ↓ 1. In these cases, most of the probability is to the left of zero and only a tiny amount is to the right of zero, yet the mean is still zero because of the very slow decay of the right tail. The behavior is essentially the same for any β > 0. A S(α, β; 2) distribution keeps the mode exactly at 0, and the mean as a function of (α, β) is continuous, like the mean of a S(α, β; 0) distribution.

Note that the stable skewness parameter β is not the same thing as the classical skewness parameter. The latter is undefined for every non-Gaussian stable distribution because neither the third moment nor the variance exists. Likewise, the kurtosis is undefined, because the fourth moment is undefined for every non-Gaussian stable distribution.

It is sometimes useful to consider non-integer moments of stable distributions. Such moments are sometimes called fractional lower order moments (FLOM). When X is strictly stable there is an explicit form for such moments. Such moments can
Fig. 1.10 The location of the mode m(α, β) of a S(α, β; 0) density, plotted against α for β = 0, 0.25, 0.5, 0.75, 1.
be used as a measure of dispersion of a stable distribution and are used in some estimation schemes.

Tables of standard normal quantiles or percentiles are given in most basic probability and statistics books. Let zλ be the λth quantile, i.e. the z value for which the standard normal distribution has lower tail probability λ: P(Z < zλ) = λ. The value z0.975 ≈ 1.96 is commonly used: for X ∼ N(μ, σ²), the 0.025th quantile is μ − 1.96σ and the 0.975th quantile is μ + 1.96σ. Quantiles are used to quantify risk. For example, in a Gaussian/normal model for the price of an asset, the interval from μ − 1.96σ to μ + 1.96σ contains 95% of the distribution of the asset price.

Quantiles of the standard stable distributions are used in the same way. The difficulty is that there are different quantiles for every value of α and β. The symbol zλ(α, β) will be used for the λth quantile of a S(α, β; 0) distribution: P(Z < zλ(α, β)) = λ. The most accurate way to find these values is to use the program STABLE. If one only needs a few digits of accuracy, Appendix B tabulates values and one can interpolate on the α and β values. Those tables show selected quantiles for α = 0.1, 0.2, . . . , 1.9, 1.95, 1.99, 2.0 and β = 0, 0.1, 0.2, . . . , 0.9, 1. (Reflection can be used for negative β: by Proposition 1.1, zλ(α, β) = −z1−λ(α, −β).)

We caution the reader about two ways that stable quantiles are different from normal quantiles. First, if the distribution is not symmetric, i.e. β ≠ 0, then the quantiles are not symmetric. Second, the way the quantiles scale depends on what parameterization is being used. In the S(α, β, γ, δ; 0) parameterization, it is straightforward; in other parameterizations one has to either convert to the S(α, β, γ, δ; 0)
parameterization using (1.7), or scale and shift according to the definition of each parameterization. These issues are illustrated in the following examples.

Example 1.4 Find the 5th and 95th quantiles for X ∼ S(1.3, 0.5, 2, 7; 0). From Appendix B, the 5th quantile is z0.05(1.3, 0.5) = −2.355 and the 95th quantile is z0.95(1.3, 0.5) = +5.333 for a standardized S(1.3, 0.5, 1, 0; 0) distribution. So the corresponding quantiles for X are δ − 2.355γ = 2.289 and δ + 5.333γ = 17.666.

Example 1.5 If X ∼ S(1.3, 0.5, 2, 7; 1), then using the previous example, the 5th and 95th quantiles are γ(−2.355) + (δ + βγ tan(πα/2)) = 0.327 and γ(5.333) + (δ + βγ tan(πα/2)) = 15.704. Alternatively, S(1.3, 0.5, 2, 7; 1) = S(1.3, 0.5, 2, 5.037; 0) by (1.7), so the 5th and 95th quantiles are 2(−2.355) + 5.037 = 0.327 and 2(5.333) + 5.037 = 15.704.
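The arithmetic in Examples 1.4 and 1.5 is easy to script. The sketch below takes the tabulated standardized quantiles as given; small discrepancies in the last digit relative to the examples come from rounding of the tabulated z values:

```python
from math import tan, pi

# standardized quantiles of S(1.3, 0.5; 0), as read from Appendix B
z05, z95 = -2.355, 5.333
alpha, beta, gam, delta = 1.3, 0.5, 2.0, 7.0

# Example 1.4: X ~ S(1.3, 0.5, 2, 7; 0), quantiles are delta + gam * z
print(delta + gam * z05, delta + gam * z95)        # about 2.29 and 17.666

# Example 1.5: X ~ S(1.3, 0.5, 2, 7; 1); convert delta to the 0-parameterization first
delta0 = delta + beta * gam * tan(pi * alpha / 2)  # about 5.037
print(delta0 + gam * z05, delta0 + gam * z95)      # about 0.327 and 15.703
```

Note the asymmetry: the 95th quantile is much farther from the location parameter than the 5th quantile, reflecting the right skewness of a β = 0.5 law.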
1.6 Sums of stable random variables

A basic property of stable laws, both univariate and multivariate, is that sums of α-stable random variables are α-stable. In the independent case, the exact parameters of the sums are given below. As always, the results depend on the parameterization used. In these results it is essential that the summands all have the same α; Problem 1.12 shows that otherwise the sum will not be stable. Section 7.4 discusses this issue briefly. When the summands are dependent, the sum is stable, but the precise statement is more difficult and depends on the exact dependence structure, which is beyond the scope of this volume.

Proposition 1.3 The S(α, β, γ, δ; 0) parameterization has the following properties.
(a) If X ∼ S(α, β, γ, δ; 0), then for any a ≠ 0, b ∈ R, aX + b ∼ S(α, (sign a)β, |a|γ, aδ + b; 0).
(b) The characteristic functions, densities, and distribution functions are jointly continuous in all four parameters (α, β, γ, δ) and in x.
(c) If X1 ∼ S(α, β1, γ1, δ1; 0) and X2 ∼ S(α, β2, γ2, δ2; 0) are independent, then X1 + X2 ∼ S(α, β, γ, δ; 0), where

β = (β1γ1^α + β2γ2^α)/(γ1^α + γ2^α),     γ^α = γ1^α + γ2^α,

δ = δ1 + δ2 + tan(πα/2)[βγ − β1γ1 − β2γ2]                       α ≠ 1
δ = δ1 + δ2 + (2/π)[βγ log γ − β1γ1 log γ1 − β2γ2 log γ2]       α = 1.
The formula γ^α = γ1^α + γ2^α in (c) is the generalization of the rule for adding variances of independent random variables: σ² = σ1² + σ2². It holds for both parameterizations. Note that one adds the αth power of the scale parameters, not the scale parameters themselves.
Proposition 1.4 The S(α, β, γ, δ; 1) parameterization has the following properties.
(a) If X ∼ S(α, β, γ, δ; 1), then for any a ≠ 0, b ∈ R,

aX + b ∼ S(α, (sign a)β, |a|γ, aδ + b; 1)                           α ≠ 1
aX + b ∼ S(1, (sign a)β, |a|γ, aδ + b − (2/π)βγa log |a|; 1)        α = 1.

(b) The characteristic functions, densities, and distribution functions are continuous away from α = 1, but discontinuous in any neighborhood of α = 1.
(c) If X1 ∼ S(α, β1, γ1, δ1; 1) and X2 ∼ S(α, β2, γ2, δ2; 1) are independent, then X1 + X2 ∼ S(α, β, γ, δ; 1), where

β = (β1γ1^α + β2γ2^α)/(γ1^α + γ2^α),     γ^α = γ1^α + γ2^α,     δ = δ1 + δ2.
The corresponding results for the S(α, β, γ, δ; 2) parameterization are given in Proposition 3.3.

Part (a) of the above results shows that γ and δ are standard scale and location parameters in the S(α, β, γ, δ; 0) parameterization, but not in the S(α, β, γ, δ; 1) parameterization when α = 1. In contrast, part (c) shows that the location parameter δ of a sum is the sum of the location parameters δ1 + δ2 only in the S(α, β, γ, δ; 1) parameterization. Unfortunately there is no parameterization that has both properties.

In the symmetric case, i.e. β1 = β2 = 0, both of the previous propositions are simpler to state: if X1 ∼ S(α, 0, γ1, δ1; k) and X2 ∼ S(α, 0, γ2, δ2; k) are independent (with k = 0 or k = 1), then X1 + X2 ∼ S(α, 0, γ, δ; k) with γ^α = γ1^α + γ2^α and δ = δ1 + δ2. This is exactly like the normal case: if X1 ∼ N(μ1, σ1²) and X2 ∼ N(μ2, σ2²) are independent, then X1 + X2 ∼ N(μ, σ²), where σ² = σ1² + σ2² and μ = μ1 + μ2.

By induction (see Problem 1.13), one gets formulas for sums of n stable random variables: for X_j ∼ S(α, β_j, γ_j, δ_j; k), j = 1, 2, . . . , n, independent and arbitrary w1, . . . , wn, the sum w1X1 + w2X2 + · · · + wnXn ∼ S(α, β, γ, δ; k), where

γ^α = Σ_{j=1}^n |w_j γ_j|^α

β = Σ_{j=1}^n β_j (sign w_j) |w_j γ_j|^α / γ^α

δ = Σ_j w_j δ_j + tan(πα/2)[βγ − Σ_j β_j w_j γ_j]                       k = 0, α ≠ 1
δ = Σ_j w_j δ_j + (2/π)[βγ log γ − Σ_j β_j w_j γ_j log |w_j γ_j|]       k = 0, α = 1
δ = Σ_j w_j δ_j                                                         k = 1, α ≠ 1
δ = Σ_j w_j δ_j − (2/π) Σ_j β_j w_j γ_j log |w_j|                       k = 1, α = 1.
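Since these rules come from multiplying characteristic functions, combining summands in any grouping must give the same parameters. The Python sketch below (the helper combine is mine, implementing only the k = 0, α ≠ 1 case) checks this associativity numerically:

```python
from math import tan, pi, isclose

def combine(alpha, terms, weights):
    """Parameters of sum_j w_j X_j for independent X_j ~ S(alpha, b_j, g_j, d_j; 0).
    Implements the displayed formulas for k = 0 and alpha != 1; sketch only."""
    ga = sum(abs(w * g) ** alpha for (b, g, d), w in zip(terms, weights))
    beta = sum(b * (1 if w > 0 else -1) * abs(w * g) ** alpha
               for (b, g, d), w in zip(terms, weights)) / ga
    gamma_ = ga ** (1 / alpha)
    delta = (sum(w * d for (b, g, d), w in zip(terms, weights))
             + tan(pi * alpha / 2)
             * (beta * gamma_ - sum(b * w * g for (b, g, d), w in zip(terms, weights))))
    return beta, gamma_, delta

terms = [(0.5, 1.0, 0.0), (-0.3, 2.0, 1.0), (1.0, 0.5, -2.0)]
# combine all three at once, or the first two and then the third
pair = combine(1.5, terms[:2], [1, 1])
direct = combine(1.5, terms, [1, 1, 1])
nested = combine(1.5, [pair, terms[2]], [1, 1])
assert all(isclose(a, b) for a, b in zip(direct, nested))
print("associativity holds:", direct)
```

With all β_j = 0 the helper reduces to the simple symmetric rule γ^α = Σ|w_j γ_j|^α, δ = Σ w_j δ_j.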
Note that if β_j = 0 for all j, then β = 0 and δ = Σ_j w_j δ_j. An important case is the scaling property for stable random variables: when the terms are independent and identically distributed, say X_j ∼ S(α, β, γ, δ; k), then

X1 + · · · + Xn ∼ S(α, β, n^{1/α}γ, δn; k),        (1.8)

where

δn = nδ + γβ tan(πα/2)(n^{1/α} − n)      k = 0, α ≠ 1
δn = nδ + γβ(2/π) n log n                k = 0, α = 1
δn = nδ                                  k = 1.
This is a restatement of Definition 1.2: the shape of the sum of n terms is the same as the original shape. We stress that no other distribution has this property.

With the above properties of linear combinations of stable random variables, we can characterize strict stability.

Proposition 1.5 Let X ∼ S(α, β, γ, δk; k) for k = 0, 1.
(a) If α ≠ 1, then X is strictly stable if and only if δ1 = δ0 − βγ tan(πα/2) = 0.
(b) If α = 1, then X is strictly stable if and only if β = 0.

Here there is an essential difference between the α = 1 case and all other cases. When α = 1, only the symmetric case is strictly stable, and in that case the location parameter δ can be anything. In contrast, when α ≠ 1, any β can be strictly stable, as long as the location parameter is chosen correctly. This can be rephrased as follows: any stable distribution with α ≠ 1 can be made strictly stable by shifting; when α = 1, a symmetric stable distribution with any shift is strictly stable, and no shift can make a nonsymmetric 1-stable distribution strictly stable. In addition to the basic properties described above, there are other linear and nonlinear properties of stable random variables given in Section 3.9.
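Proposition 1.5(a) can be illustrated with the 1-parameterization characteristic function (1.6): for α ≠ 1, raising φ to the nth power introduces the shift d_n = δ1(n − n^{1/α}), which vanishes exactly when δ1 = 0. A Python sketch (function name mine):

```python
import cmath, math

def cf_S1(u, alpha, beta, gam=1.0, delta=0.0):
    # characteristic function (1.6), alpha != 1 branch only
    s = (u > 0) - (u < 0)
    w = 1 - 1j * beta * math.tan(math.pi * alpha / 2) * s
    return cmath.exp(-(gam * abs(u)) ** alpha * w + 1j * delta * u)

alpha, beta, n, u = 1.5, 0.7, 4, 0.9
c_n = n ** (1 / alpha)

# delta_1 = 0: strictly stable even for beta != 0, since phi(u)^n = phi(c_n u)
assert abs(cf_S1(u, alpha, beta) ** n - cf_S1(c_n * u, alpha, beta)) < 1e-12

# delta_1 != 0: a nonzero shift d_n = delta_1 (n - n^{1/alpha}) appears
delta1 = 2.0
d_n = delta1 * (n - c_n)
lhs = cf_S1(u, alpha, beta, delta=delta1) ** n
rhs = cf_S1(c_n * u, alpha, beta, delta=delta1) * cmath.exp(1j * d_n * u)
assert abs(lhs - rhs) < 1e-12
print("strict stability checks passed")
```

For α = 1 with β ≠ 0, no choice of δ removes the analogous shift, because the log |u| term in (1.6) produces a u log n contribution; this is Proposition 1.5(b).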
1.7 Simulation

In this section U, U1, U2 will be used to denote independent Uniform(0,1) random variables. For a few special cases, there are simple ways to generate stable random variables. For the normal case, Problem 1.15 shows that

X1 = μ + σ √(−2 log U1) cos(2πU2)
X2 = μ + σ √(−2 log U1) sin(2πU2)   (1.9)

give two independent N(μ, σ²) random variables. This is known as the Box-Muller algorithm. For the Cauchy case, Problem 1.16 shows that
1 Basic Properties of Univariate Stable Distributions
X = γ tan(π(U − 1/2)) + δ   (1.10)
is Cauchy(γ, δ). For the Lévy case, Problem 1.17 shows that

X = γ/Z² + δ   (1.11)

is Lévy(γ, δ) if Z ∼ N(0, 1). In the general case, the following result of Chambers et al. (1976) gives a method for simulating any stable random variate.

Theorem 1.3 Simulating stable random variables Let Θ and W be independent with Θ uniformly distributed on (−π/2, π/2), W exponentially distributed with mean 1, and 0 < α ≤ 2.
(a) The symmetric random variable

Z = { [sin(αΘ)/(cos Θ)^{1/α}] [cos((α − 1)Θ)/W]^{(1−α)/α}   α ≠ 1
    { tan Θ                                                  α = 1

has a S(α, 0; 0) = S(α, 0; 1) distribution.
(b) In the nonsymmetric case, for any −1 ≤ β ≤ 1, define θ0 = arctan(β tan(πα/2))/α when α ≠ 1. Then

Z = { [sin(α(θ0 + Θ))/(cos(αθ0) cos Θ)^{1/α}] [cos(αθ0 + (α − 1)Θ)/W]^{(1−α)/α}   α ≠ 1
    { (2/π) [(π/2 + βΘ) tan Θ − β log( ((π/2) W cos Θ)/(π/2 + βΘ) )]              α = 1

has a S(α, β; 1) distribution.

It is easy to get Θ and W from independent Uniform(0,1) random variables U1 and U2: set Θ = π(U1 − 1/2) and W = −log U2. To simulate stable random variables with arbitrary shift and scale, (1.3) is used for the 0-parameterization and (1.5) is used for the 1-parameterization. Since there are numerical problems evaluating the expressions involved when α is near 1, the STABLE program uses an algebraic rearrangement of the formula. Section 3.3.3 gives a proof of this formula and a discussion of the numerical implementation of Chambers et al. (1976).
1.8 Generalized Central Limit Theorem and Domains of Attraction

The classical Central Limit Theorem says that the normalized sum of independent, identical terms with a finite variance converges to a normal distribution. To be more
precise, let X1, X2, X3, ... be independent identically distributed random variables with mean μ and variance σ². The classical Central Limit Theorem states that the sample mean X̄n = (X1 + ··· + Xn)/n satisfies

(X̄n − μ)/(σ/√n) →^d Z ∼ N(0, 1) as n → ∞.

(The notation Yn →^d Y means the sequence of r.v.s Yn converges in distribution, i.e. the corresponding distribution functions satisfy Fn(y) → F(y) at all continuity points of F.) To match the notation in what follows, this can be rewritten as

an (X1 + ··· + Xn) − bn →^d Z ∼ N(0, 1) as n → ∞,   (1.12)

where an = 1/(σ√n) and bn = nμ/(σ√n). The Generalized Central Limit Theorem states that if the finite variance assumption is dropped, the only possible resulting limits are stable.

Theorem 1.4 Generalized Central Limit Theorem A non-degenerate random variable Z is α-stable for some 0 < α ≤ 2 if and only if there is an independent, identically distributed sequence of random variables X1, X2, X3, ... and constants an > 0, bn ∈ ℝ with

an (X1 + ··· + Xn) − bn →^d Z.

The following definition is useful in discussing convergence of normalized sums.

Definition 1.6 A random variable X is in the domain of attraction of Z if there exist constants an > 0, bn ∈ ℝ with

an (X1 + ··· + Xn) − bn →^d Z,

where X1, X2, X3, ... are independent identically distributed copies of X. DA(Z) is the set of all random variables that are in the domain of attraction of Z.

Theorem 1.4 says that the only possible non-degenerate distributions with a domain of attraction are stable. Section 3.13 proves the Generalized Central Limit Theorem, characterizes the distributions in DA(Z) in terms of their tail probabilities, and gives information about the norming constants an and bn. For example, suppose X is a random variable with tail probabilities that satisfy x^α P(X > x) → c⁺ and x^α P(X < −x) → c⁻ as x → ∞, with c⁺ + c⁻ > 0 and 1 < α < 2. Then μ = E X must be finite and Theorem 3.12 shows that the analog of (1.12) is

an (X1 + ··· + Xn) − bn →^d Z ∼ S(α, β, 1, 0; 1) as n → ∞,

where an = [(2Γ(α) sin(πα/2))/(π(c⁺ + c⁻))]^{1/α} n^{−1/α}, bn = n an μ, and β = (c⁺ − c⁻)/(c⁺ + c⁻). In this case, the rate at which the tail probabilities of X decay determines the index α, and the relative weights of the right and left tails determine the skewness β.
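The norming constants of Theorem 3.12 are fully explicit. The short sketch below is ours (the function name and argument order are our own choices) and simply evaluates an, bn, and β for given tail constants:

```python
import math

def gclt_norming(alpha, c_plus, c_minus, n, mu=0.0):
    """Norming constants a_n, b_n and skewness beta of Theorem 3.12 for a
    distribution with x^alpha P(X>x) -> c+ and x^alpha P(X<-x) -> c-,
    where 1 < alpha < 2 and mu is the (finite) mean."""
    a_n = ((2.0 * math.gamma(alpha) * math.sin(math.pi * alpha / 2))
           / (math.pi * (c_plus + c_minus))) ** (1.0 / alpha) * n ** (-1.0 / alpha)
    b_n = n * a_n * mu
    beta = (c_plus - c_minus) / (c_plus + c_minus)
    return a_n, b_n, beta
```

Note that an ∝ n^{−1/α}, replacing the n^{−1/2} of the classical theorem, and that β depends only on the ratio of the tail constants.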
1.9 Multivariate stable

This volume focuses on univariate stable laws; a second volume is in progress that focuses on the multivariate stable laws. For the curious, we state the multivariate equivalent of Definition 1.1 for stable distributions.

Definition 1.7 A non-degenerate d-dimensional random vector X = (X1, ..., Xd)^T is stable or stable in the broad sense if for X^(1) and X^(2) independent copies of X and any positive constants a and b,

a X^(1) + b X^(2) =^d c X + d,   (1.13)

for some positive c = c(a, b) and some d = d(a, b) ∈ ℝ^d. The random vector is strictly stable or stable in the narrow sense if (1.13) holds with d = 0 for all choices of a and b. A random vector is symmetric stable if it is stable and symmetrically distributed around 0, i.e. X =^d −X.

The main complication of multivariate stable laws is that a very wide range of dependence structures is possible. The support of the distribution can be the whole space ℝ^d, or a subspace, or a cone (when α < 1). The cases where the components are independent result in level sets of the density which are cross shaped. But there are also circularly and elliptically contoured stable laws, as well as many unexpected cases, e.g. triangular- and star-shaped level sets. The full description of the possibilities requires a spectral measure, a finite Borel measure that is supported on the unit sphere in ℝ^d. See Chapter 2 of Samorodnitsky and Taqqu (1994) for definitions and properties.
1.10 Problems

Problem 1.1 Show directly using the convolution formula (Appendix A.1) that the normal distributions are stable. Show that a² + b² = c² in (1.1), so α = 2. Conclude that N(μ, σ²) = S(2, 0, σ/√2, μ; 0) = S(2, 0, σ/√2, μ; 1) and S(2, 0, γ, δ; 0) = S(2, 0, γ, δ; 1) = N(δ, 2γ²).

Problem 1.2 Show directly using the convolution formula that the Cauchy distributions are stable. Show that a + b = c in (1.1), so α = 1, and conclude that Cauchy(γ, δ) = S(1, 0, γ, δ; 0) = S(1, 0, γ, δ; 1).

Problem 1.3 Show that the cumulative distribution function of a Cauchy distribution is F(x|1, 0, γ, δ; 0) = F(x|1, 0, γ, δ; 1) = (1/2) + arctan((x − δ)/γ)/π.

Problem 1.4 Show directly using the convolution formula that Lévy distributions are stable. Show that a^{1/2} + b^{1/2} = c^{1/2} in (1.1), so α = 1/2, and conclude that Lévy(γ, δ) = S(1/2, 1, γ, δ; 1) = S(1/2, 1, γ, δ + γ; 0).
Problem 1.5 Show that the cumulative distribution function of a Lévy distribution X ∼ S(1/2, 1, γ, δ; 1) is, for x > δ,

F(x|1/2, 1, γ, δ; 1) = 2(1 − Φ(√(γ/(x − δ)))),

where Φ(x) is the d.f. of a standard normal distribution.

Problem 1.6 What is wrong with the following argument? If X1, ..., Xn ∼ Gamma(α, β) are independent, then X1 + ··· + Xn ∼ Gamma(nα, β), so gamma distributions must be stable distributions.

Problem 1.7 Use the characteristic function (1.2) to show that Z(α, −β) =^d −Z(α, β). This proves Proposition 1.1.

Problem 1.8 Use the definitions of the different parameterizations and the characteristic function (1.2) to show that the characteristic functions in (1.4) and (1.6) are correct.

Problem 1.9 Show that the conversions between the parameterizations in (1.7) are correct. (Use either the characteristic functions in (1.4) and (1.6) or the definitions of the parameterizations in terms of Z(α, β).)

Problem 1.10 A Pareto(α, c) distribution is defined in Section 7.1. Show that if p < α, then E X^p exists and find its value, but if p ≥ α, then E X^p = ∞.

Problem 1.11 Extend the previous problem to show that if X is any random variable with a bounded density for which both the left and right tail densities are asymptotically equivalent to Pareto(α, c), then E|X|^p is finite if p < α and infinite if p ≥ α.

Problem 1.12 Show that the sum of two independent stable random variables with different αs is not stable. Section 7.4 gives a brief discussion of what happens when you combine different indices of stability.

Problem 1.13 Derive (1.7) and (1.8) for sums of independent α-stable r.v.

Problem 1.14 Simulate n = 1,000 uniform random variables and let s_k² be the sample variance of the first k values. A "running sample variance" plot is a graph of (k, s_k²), k = 2, 3, 4, ..., n. Repeat the process with normal random variables, Cauchy random variables, and Pareto random variables (see Section 7.1 for a method of simulating Pareto distributions) with α = 0.5, 1, 1.5. Contrast the behavior of s_k² for the different distributions.

Problem 1.15 Show directly that (1.9) gives independent N(μ, σ²) terms. Theorem 1.3 also works when α = 2 to generate normal random variates, but it requires two uniforms to generate one normal, whereas (1.9) generates two normals from two uniforms.
Problem 1.16 Use the cumulative distribution function for a Cauchy(γ, δ) distribution from Problem 1.3 to prove (1.10).

Problem 1.17 Use Problem 1.5 to prove (1.11).
Chapter 2
Modeling with Stable Distributions
Stable distributions have been proposed as a model for many types of physical and economic systems. There are several reasons for using a stable distribution to describe a system. The first is where there are solid theoretical reasons for expecting a non-Gaussian stable model, e.g. reflection off a rotating mirror yielding a Cauchy distribution, hitting times for a Brownian motion yielding a Lévy distribution, the gravitational field of stars yielding the Holtsmark distribution; see below for these and other examples. The second reason is the Generalized Central Limit Theorem which states that the only possible nontrivial limit of normalized sums of independent identically distributed terms is stable. It is argued that some observed quantities are the sum of many small terms—the price of a stock, the noise in a communication system, etc., and hence a stable model should be used to describe such systems. The third argument for modeling with stable distributions is empirical: many large data sets exhibit heavy tails and skewness. The strong empirical evidence for these features combined with the Generalized Central Limit Theorem is used to justify the use of stable models. Several monographs focus on stable distributions: Zolotarev (1986), Samorodnitsky and Taqqu (1994), Janicki and Weron (1994), Nikias and Shao (1995), Uchaikin and Zolotarev (1999). The last book contains over 200 pages of applications of stable distributions in probabilistic models, correlated systems and fractals, anomalous diffusion and chaos, physics, radiophysics, astrophysics, stochastic algorithms, financial applications, biology, and geology. In this chapter, we give an overview of some applications of stable laws. Initially, we will discuss applications where there is an underlying physical model that leads to a stable distribution. Then examples where heavy tails and skewness appear empirically will be discussed. 
Table 2.1 shows a simple summary of different possible tail behaviors and models that can be used for such problems. This table is not exhaustive; it is meant solely to highlight when a stable model is appropriate. Since the focus of this book is on stable models, we will concentrate on the last row of the table.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4_2

While infinite variance may be difficult to accept in practical problems, the goal in
modeling is to develop a good model for the problem, not a perfect one. Section 2.10 discusses the appropriateness of using infinite variance models.

                                           symmetric                     non-symmetric
no tails                                   Uniform(−1,1)                 Uniform(0,1)
light tails,                               normal,                       exponential,
all moments exist                          finite mixtures of normals,   skewed normal,
                                           symmetric Laplace             skewed Laplace
limited moments exist, finite variance     t (d.f. > 2)                  Pareto, α > 2
limited moments exist, infinite variance   sym. stable (α < 2),          non-sym. stable (α < 2),
                                           t (d.f. < 2)                  Pareto, α < 2

Table 2.1 Various types of tail behavior and possible distributions used to model them.
Fig. 2.1 Aerial view of a lighthouse at coordinates (a, b). It is rotating at a uniform rate and flashes randomly at angle θ.
2.1 Lighthouse problem

Consider a lighthouse located off-shore from a straight coastline. We take the coastline as the x-axis and assume the lighthouse is located at the point (a, b), see Figure 2.1. The light is always on and rotating at a uniform rate. We will only consider the light that hits the coastline, ignoring the light headed away from the coast, so the angle θ is uniform on (−π/2, π/2). What is the distribution of the intensity X along the coastline? From the figure, the light shines to the left of x precisely when the tangent of the angle θ is less than (x − a)/b. Hence the probability that the light hits the shore to the left of x is

F(x) = P(X ≤ x) = P(tan θ ≤ (x − a)/b) = P(θ ≤ arctan((x − a)/b)) = (π/2 + arctan((x − a)/b))/π.

Differentiation shows that X has density f(x) = b/(π(b² + (x − a)²)), i.e. the intensity distribution is Cauchy(b, a). A variation of this problem illustrates how sample means can be misleading. This example is adapted from Sivia (1996). Assume that the top of the lighthouse is still
spinning at a fixed rate, but the circuit controlling the lamp is broken, with the result that the light flashes momentarily at random times (uniformly spread over all possible angles) as the light rotates. The same argument as above shows that the distribution of the location of the place where the light hits the coastline is Cauchy(b, a). Now, suppose we record the locations X1, X2, ..., Xn where a number of flashes hit the shore. An intuitive estimate of the horizontal location a is the sample mean X̄ = (X1 + ··· + Xn)/n. But by (1.8), the sum Sn := X1 + ··· + Xn is S(1, 0, nγ, na; 0), and then by Proposition 1.3, X̄ = Sn/n ∼ S(1, 0, γ, a; 0). Thus the mean X̄ has a Cauchy distribution with fixed scale, i.e. the same distribution as an individual observation Xj. A simulation of the sample behavior of X̄ as a function of n is illustrated in Figure 2.2. The point is that the sample mean does not help us estimate the position of the lighthouse, even with a large sample, because the sample mean does not converge to the location as n → ∞. (One can estimate the location parameter of a stable distribution using one of the techniques of Chapter 4.) This problem gets worse when the summands have even heavier tails; see Section 2.9, Problems 2.1 and 2.2.
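A quick simulation makes the failure of the sample mean concrete. The sketch below is our own (not the code behind Figure 2.2; the helper name is hypothetical): the flash locations are Cauchy, the mean never settles down, but the sample median is a consistent estimate of a.

```python
import math
import random

def lighthouse_flashes(a, b, n, rng):
    """Landing points of n flashes from a lighthouse at (a, b):
    X = a + b*tan(theta), theta ~ Uniform(-pi/2, pi/2), so X ~ Cauchy(b, a)."""
    return [a + b * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

rng = random.Random(7)
xs = lighthouse_flashes(a=2.0, b=1.0, n=100000, rng=rng)
sample_mean = sum(xs) / len(xs)           # itself Cauchy(1, 2): useless estimate
sample_median = sorted(xs)[len(xs) // 2]  # converges to the location a = 2
```

Rerunning with different seeds, the median stays near 2 while the mean jumps around arbitrarily, exactly as the stability argument predicts.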
Fig. 2.2 Simulated sample of size n = 1000 from a standard Cauchy distribution centered at 0 (top) and sample behavior of X n (bottom).
The argument above applies in more general situations. Anytime a uniformly rotating object is radiating or reflecting something (photons, radioactive particles, etc.) that strikes a flat surface, the observed intensity on the surface will be Cauchy. More generally, for any bivariate random vector (X1, X2 ) with a radially symmetric distribution, the ratio Z = X1 /X2 will have a Cauchy distribution. This result does
not depend on what the marginal distributions of X1 and X2 are; only the radial symmetry is used. See Lemma 3.20 for the formal statement of this property.
2.2 Distribution of masses in space

If particles of fixed mass are randomly spread around in space, what is the distribution of the net gravitational field at a fixed point? In 1920, the physicist Holtsmark showed that the answer is a stable distribution. Start with the one-dimensional case: masses of unit weight are randomly distributed along a line. Without loss of generality, we may take the point to be the origin. Approximate the distribution of particles by using a uniform distribution X on [−a, a] and then taking a → ∞. The inverse square law shows that if Xi is the location of the i-th particle, then the net gravitational field at the origin is

Y = Σ_{i=1}^n (sign Xi)/Xi²,   (2.1)

where n = n(a) is the number of particles in the interval [−a, a]. If λ is the average density of particles, then n ≈ 2aλ. By Problem 2.4, the characteristic function of Y is

E exp(iuY) = [E exp(iu (sign X1)/X1²)]^{2aλ} ≈ (1 − c|u|^{1/2}/a + o(1/a))^{2aλ} → exp(−2λc|u|^{1/2}), a → ∞.

Thus, the gravitational field at a fixed point is a symmetric stable distribution with α = 1/2. Problem 2.5 shows that if the inverse square law in (2.1) is replaced by the inverse p-th power with any p > 1/2, the result is a symmetric stable law with α = 1/p. The argument generalizes to multiple dimensions: the masses can be spread around in dimension d = 2 or 3, in which case Xi is the vector location of the i-th object. Applying the above argument to any component of the field shows that each component is stable. The radial symmetry in three-dimensional space can be used to show that the resulting Holtsmark distribution is a three-dimensional isotropic α = 3/2 stable law, see Feller (1971), page 173. The argument also applies to other situations, e.g. the net electrical field at a point from randomly scattered charges in space.
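The one-dimensional calculation is easy to simulate. The sketch below is our own illustration (function name ours): it places int(2aλ) unit masses uniformly on [−a, a] and sums the inverse-square contributions as in (2.1). The resulting sample is symmetric and very heavy tailed, as an α = 1/2 stable law must be.

```python
import random

def gravity_field(a, lam, rng):
    """Net 1-d inverse-square field at the origin from int(2*a*lam) unit
    masses placed uniformly on [-a, a], as in equation (2.1)."""
    y = 0.0
    for _ in range(int(2 * a * lam)):
        x = a * (2.0 * rng.random() - 1.0)
        y += (1.0 if x > 0 else -1.0) / (x * x)
    return y

rng = random.Random(1)
sample = [gravity_field(a=1000.0, lam=0.5, rng=rng) for _ in range(2000)]
```

In a sample like this, the largest |Y| values dwarf the typical ones by orders of magnitude: with α = 1/2 even the mean is infinite, driven by masses that happen to fall very close to the origin.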
2.3 Random walks

The standard random walk {Xn : n = 0, 1, 2, ...} starts at X0 = 0 and randomly moves up 1 step or down 1 step with equal probability; Figure 2.3(a) shows a realization of a path for a standard walk. Such random walks are used as a model for diffusion, where a particle moving through a fluid is bumped up or down. They are also used to model the fluctuation in the price of a stock, where the price at each time period can move up or down. To describe the model, let Zj, j = 1, 2, 3, ..., be independent random variables taking values +1 or −1, each with probability 1/2. Then the standard walk can be defined by X0 = 0 and, for n ≥ 1,

Xn = Z1 + Z2 + ··· + Zn.   (2.2)
From this point of view, one can think of Xn as the net movement up or down from the sum of the Zj's. At any step n > 0, Xn has a (shifted and scaled) binomial distribution. In particular, for n large, Xn has approximately a normal distribution (this is the de Moivre-Laplace Theorem, a special case of the Central Limit Theorem). One limitation of the standard random walk is the assumption that the step is always up or down exactly one unit. When an arbitrary distribution is allowed for the (independent) summands Zj in (2.2), a general random walk is described. For example, letting Zj take on values −1, 0, and +1 allows a random walk which can either move down, move up, or stay put at each step; see Figure 2.3(b). If the Zj's are N(0,1), then the sum Xn is the sum of n independent Gaussian terms and thus has an N(0,n) distribution, see Figure 2.3(c). Taking the Zj's to be Uniform(−1,1) gives a different random walk, with behavior illustrated by Figure 2.3(d). In all four of these cases, indeed for any step distribution Z with mean 0 and finite variance, Xn is a sum of n independent, identical terms, and so for large n the distribution of Xn is approximately normal. Now suppose the summands Zj are exactly stable, say Zj ∼ S(α, 0, γ, 0; 0); then the sum Xn in (2.2) is stable: Xn ∼ S(α, 0, n^{1/α} γ, 0; 0) by (1.8). Figures 2.4(a) and (b) show random walks with step sizes having a standardized symmetric stable distribution with α = 1.5 and α = 0.7 respectively. Note the qualitative difference between these graphs and those in Figure 2.3. The vertical scale in these heavy tailed random walks is much larger and the paths are visually dominated by a few large steps, especially when α is small. More generally, if the summands have any heavy tailed distribution, the paths of the random walk will be dominated by large jumps. Figures 2.4(c) and (d) show examples with step sizes having a Pareto distribution with tail index α = 0.7, i.e. P(Zj > z) = z^{−0.7} (density 0.7/z^{1.7}) for z > 1, and a symmetrized Pareto distribution with the same α.
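The dominance of a few large steps when α < 1 is easy to see in simulation. The sketch below is our own (hypothetical helper names) and generates a symmetrized Pareto(0.7, 1) walk by inversion:

```python
import random

def pareto_step(alpha, rng):
    """Pareto(alpha, 1) variate by inversion: P(Z > z) = z**(-alpha), z > 1."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def sym_pareto_walk(alpha, n, rng):
    """Random walk whose steps are Pareto(alpha, 1) with a random sign."""
    x, path = 0.0, [0.0]
    for _ in range(n):
        x += rng.choice((-1.0, 1.0)) * pareto_step(alpha, rng)
        path.append(x)
    return path

rng = random.Random(3)
path = sym_pareto_walk(alpha=0.7, n=40, rng=rng)
steps = [abs(b - a) for a, b in zip(path, path[1:])]
```

With α = 0.7 the largest of the 40 steps is typically a substantial fraction of the sum of all the step sizes, which is why the plots in Figure 2.4 are visually dominated by a few jumps.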
The one-sided step distributions result in an increasing random walk. The Generalized Central Limit Theorem shows that for large n, Xn = Z1 + Z2 + · · · + Zn is approximately α-stable in these two cases. Moreover, no other limiting distribution is possible. In dimension greater than 1, a random walk is defined in the same way as in (2.2), with the terms all being vectors. Figure 2.5 shows some simulated stable paths with n = 500 isotropic steps in dimension two. (Note: in the Gaussian case, isotropic
step is the same as independent components, but when α < 2, isotropic steps have dependent components.) The heavier tails of the steps imply a larger range. An important qualitative difference between the Gaussian case and the stable case is in the geometry of the resulting shapes. When α = 2, the paths tend to be tangled and tightly compacted; as α decreases, there are large jumps, and islands appear. Such paths are called Lévy flights, a term made popular in Mandelbrot (1982). The book by Schlesinger et al. (1995) gives physical examples of Lévy flights.

Fig. 2.3 Light tail random walks with 40 steps. In the upper left, (a) shows a standard random walk with ±1 steps. In the upper right, (b) shows a random walk taking values −1, 0, +1, each with probability 1/3. In the lower left, (c) shows N(0,1) steps. In the lower right, (d) shows Uniform(−1,1) steps.

Radio waves traveling through interstellar space take paths that are random. Boldyrev and Gwinn (2003) have used stable distributions to explain observations of interstellar scintillations; the classical Gaussian theory fails to produce the anomalous scaling of signal widths with pulsar distances. Stable distributions have been
used by Freeman and Chisham (2005) to model the Doppler spectral width of backscatter from radar measurements of the magnetosphere. Stable random walks have been proposed to model movements of organisms. Viswanathan et al. (1999) look at movements of foraging animals, where there may be many small movements around an area and then jumps to a distant area. Reynolds and Frye (2007) describe an experiment where the paths of Drosophila melanogaster fruit flies are recorded and fit with a stable law. Brockmann et al. (2006) examine human travel. These Lévy flights have also been proposed to model the spread of an epidemic, where infection initially spreads locally, and then a large jump is made, e.g. an infected person takes an international trip. See Brockmann and Hufnagel (2007), Linder et al. (2008), Boto and Stollenwerk (2009), Machado and Lopes (2020). Stable laws also occur in limiting laws for random walks in random environments, see Kesten et al. (1975), Hughes (1995), and Mayer-Wolf et al. (2004). Finally, stable walks can be used in metrology to classify shapes of complicated objects, see Verdi (2014) and Audus et al. (2020).

Fig. 2.4 Heavy tailed random walks with 40 steps. The upper left plot (a) shows S(1.5, 0; 0) steps, the upper right (b) shows S(0.7, 0; 0) steps. In the lower left plot (c), the increasing random walk is generated by Pareto(0.7,1) summands, while the lower right (d) uses symmetrized Pareto(0.7,1) summands. Note the varying vertical scales and contrast the range with the previous figure.
Fig. 2.5 Bivariate stable random walks with 500 isotropic steps and varying α. In (a) α = 2, in (b) α = 1.6, in (c) α = 1.2, and in (d) α = 0.8. Note the increasing scales as α decreases.
2.4 Hitting time for Brownian motion

Let X(t), t ≥ 0, be a standard one-dimensional Brownian motion, i.e. a random process with X(0) = 0 and stationary independent increments that are Gaussian with mean zero and variance Var(X(t) − X(s)) = |t − s|. Pick a level c > 0 and define the hitting time T = inf{t > 0 : X(t) ≥ c}. We will show that the random time T has a Lévy stable distribution. Conditioning on whether X(t) > c or X(t) < c,

P(T < t) = P(T < t, X(t) > c) + P(T < t, X(t) < c).
We claim that these two terms are equal: if T < t, then the Brownian motion hit the level c at some time before t, so by symmetry, it is equally likely to be below c as above c at time t. (We are using the strong Markov property of Brownian motion here.) So P(T < t) = 2P(T < t, X(t) > c). But the event {X(t) > c} is a subset of the event {T < t}, so P(T < t, X(t) > c) = P(X(t) > c). Now X(t) ∼ N(0, t), so letting Φ be the standard normal d.f.,

P(T < t) = 2P(X(t) > c) = 2(1 − Φ(c/√t)),   t > 0.

The density of T is

f(t) = (d/dt) 2(1 − Φ(c/√t)) = √(c²/(2π)) exp(−c²/(2t))/t^{3/2},

the Lévy(c², 0) = S(1/2, 1, c², 0; 1) density. This result generalizes. Let 1 < α < 2 and let X(t) be a completely skewed to the right α-stable process. Lévy processes are stochastic processes that are continuous in probability with stationary, independent increments. The latter condition means that for any h > 0, {X(t + h) − X(h), t ≥ 0} and {X(t), t ≥ 0} have the same finite-dimensional distributions. The hitting time for (−∞, −c] is T = T(α, c) = inf{t > 0 : X(t) ≤ −c}. Sato (1999) shows that T is a positive (1/α)-stable r.v.
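The identity P(T < t) = 2(1 − Φ(c/√t)) can be checked by a crude Euler simulation of the Brownian path. This is our own sketch, not from the book; discrete monitoring slightly underestimates hitting, so the agreement is only approximate.

```python
import math
import random

def hit_time(c, dt, t_max, rng):
    """First time a simulated Brownian path (Euler scheme, step dt) reaches
    level c; returns t_max if the level is not reached by then."""
    x, t, s = 0.0, 0.0, math.sqrt(dt)
    while t < t_max:
        x += s * rng.gauss(0.0, 1.0)
        t += dt
        if x >= c:
            return t
    return t_max

rng = random.Random(5)
c, t0 = 1.0, 2.0
times = [hit_time(c, dt=0.002, t_max=8.0, rng=rng) for _ in range(500)]
empirical = sum(t < t0 for t in times) / len(times)
# Levy CDF: P(T < t0) = 2*(1 - Phi(c/sqrt(t0)))
phi = 0.5 * (1.0 + math.erf((c / math.sqrt(t0)) / math.sqrt(2.0)))
theoretical = 2.0 * (1.0 - phi)   # about 0.48 for c = 1, t0 = 2
```

Shrinking dt and increasing the number of paths brings the empirical fraction closer to the Lévy CDF value.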
2.5 Differential equations and fractional diffusions

In addition to their uses in probability and statistics, stable densities solve certain integro-differential equations. One example is the fractional heat equation:

∂u/∂t = −(−Δ)^{α/2} u.

When α = 2, the Green's functions of the standard Laplacian Δ are the normal densities f(·|2, 0, ct^{1/2}, 0; 0). When α ∈ (0, 2), the Green's functions for the fractional differential operator −(−Δ)^{α/2} are the symmetric stable densities f(·|α, 0, ct^{1/α}, 0; 0), see Nolan (2019c). More information on this topic can be found in the book by Meerschaert and Sikorskii (2012). We illustrate the difference between a normal diffusion and a fractional diffusion with an example, using the initial condition u(x, 0) = 1 if 0.5 < |x| < 1.5 and u(x, 0) = 0 otherwise. Figure 2.6 shows numerically derived solutions for α = 1.2 and α = 2 with c = 1. There are two qualitative differences between the solutions. The first is the fast diffusion exhibited when α < 2: for large |x|, the solutions u(x, t) are higher for the fractional diffusion than the normal one when
t > 0. This observed behavior in applications is one of the reasons these anomalous diffusions are of interest. The second is that the higher values of the stable densities near x = 0 lead to a slow diffusion around the peaks, i.e. the solution u(x, t) remembers the peaks of u(x, 0) longer than the standard heat equation does.
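For α = 1 the Green's function is available in closed form (a Cauchy density with scale ct), so the solution can be sketched by direct numerical convolution with the initial condition. This is our own rough illustration by Riemann sum; the α = 1.2 curves in Figure 2.6 require numerically computed stable densities instead.

```python
import math

def cauchy_green(x, t, c=1.0):
    """Green's function of the fractional heat equation for alpha = 1:
    the Cauchy density with scale c*t."""
    g = c * t
    return g / (math.pi * (g * g + x * x))

def u(x, t, dx=0.005):
    """u(x, t) for the initial condition u(x,0) = 1 on 0.5 < |y| < 1.5,
    by convolving with the alpha = 1 Green's function (Riemann sum)."""
    total, y = 0.0, -1.5
    while y < 1.5:
        if abs(y) > 0.5:
            total += cauchy_green(x - y, t) * dx
        y += dx
    return total
```

For small t the solution is still close to the initial indicator: u(1, 0.01) ≈ 1 inside the slab while u(0, 0.01) ≈ 0 in the gap; the heavy Cauchy tails then spread mass out much faster than the heat kernel would.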
Fig. 2.6 Solution u(x, t) for the fractional heat equation. The top plot shows u(x, 0), the middle plot shows u(x, 1/2), the bottom shows u(x, 1). The solid curve corresponds to a fractional diffusion with α = 1.2, the dashed line corresponds to the standard heat equation with α = 2.
These kinds of models are used to describe fluid movement in porous media, e.g. Cushman and Moroni (2001); geological processes, e.g. Cushman et al. (2005); and turbulence, e.g. Ditlevsen (2004). West (2016) argues that fractional calculus is a necessary tool in modern science. Section 3.15 summarizes how stable distributions give solutions to certain ordinary and fractional differential equations, and gives a brief introduction to stable semigroups. Experiments on anomalous diffusion in geology lead to measurements of concentration values of a tracer at different locations. Section 4.13 discusses fitting stable distributions to such concentration data.
2.6 Financial applications

Examples of stable laws in finance and economics are given in Mandelbrot (1963), Fama (1965), Samuelson (1967), Roll (1970), McCulloch (1996), Rachev and Mittnik (2000), Rachev (2003), Peters (1994), and Nolan (2014).
2.6.1 Stock returns

Rachev (2003), McCulloch (2003), and Robinson (2003b) discuss using log-stable models for pricing options and evaluating portfolio risk. Chapter 18 of Kaplan (2012) discusses the use of stable models for returns. The scaling property (1.8) has an important implication for diversifying risk. In the Gaussian case, the risk is measured by the standard deviation, which is a multiple of the stable scale parameter γ, and for n assets with independent, identical risk, the diversified risk scales as n^{−1/2} γ. However, in the case where the returns are modeled by an α-stable distribution with α < 2, the diversified risk scales as n^{(1/α)−1} γ. Thus to reduce risk by a given amount, a larger number of assets is needed, see Figure 2.7. For example, to reduce the risk to 25% of the original risk with a Gaussian model requires 16 assets, whereas a stable model with α = 1.6 requires 40 assets. Note that when α = 1, there is no reduction in risk, no matter how many assets are used. When α < 1, the risk increases as the number of assets increases! These issues are discussed in Ibragimov (2005), where their consequences for economic models are examined.
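The scaling claim is a one-liner in code; the sketch below (function name ours) reproduces the numbers quoted above.

```python
def diversified_risk(alpha, n):
    """Scale of the average of n iid stable returns relative to a single
    asset: by (1.8), the risk scales as n**(1/alpha - 1)."""
    return n ** (1.0 / alpha - 1.0)
```

Gaussian (α = 2): 16 assets cut the risk to 25%; with α = 1.6 about 40 assets are needed; at α = 1 diversification does nothing; and for α < 1 adding assets actually increases the risk.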
2.6.2 Value-at-risk and expected shortfall

Value-at-risk (VaR) is a quantile of a return distribution F; here, we use VaR(ε) = F^{−1}(ε). So VaR(0.05) is the value with 5% of the returns below this level and 95% above. Typically, one uses ε = 0.05 or 0.01, i.e. the focus is on large losses. It is a measure of risk, giving a sense of how large losses can be. When the returns are modeled by a stable distribution, VaR values can be obtained from the STABLE program or by shifting and scaling the values in the tables in Appendix B. VaR is defined for all α ∈ (0, 2] and all β ∈ [−1, 1]. Stable VaR behaves differently than the traditional Gaussian model. First, the VaR quantities are generally larger because the stable model has a higher tail probability. Second, it scales differently over time. In particular, the n day VaR scales like n^{1/α}, not n^{1/2}. Figure 2.8 shows plots of common VaR values for a range of α and β. Some references are Lamantia et al. (2006), Khindanova and Atakhanova (2002), Khindanova et al. (2001), and Martin et al. (2006). Expected shortfall is the negative of the conditional mean E(X|X < a), where a is a lower bound of interest. It will be defined if α > 1, and undefined when α ≤ 1.
Fig. 2.7 The reduction in risk by diversification for different values of α (curves for α = 2, 1.8, 1.6, 1.4, 1.2, 1, plotted as % of initial risk against the number of assets).
It is common to use a = VaR(ε), in which case the expected shortfall is the expected loss given that the return is below the VaR(ε) threshold. It is another measure of financial risk, in some ways better than VaR because it takes into account the spread below the VaR level. Expected shortfall is a coherent risk measure, whereas VaR is not. Figure 2.9 shows some sample values for standardized stable r.v. in the 0-parameterization. As expected, the expected shortfall is bigger when α decreases.
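With a stable sampler in hand (here the symmetric case of Theorem 1.3(a)), empirical VaR and expected shortfall are a sort-and-average away. This is our own sketch with α = 1.7, not the numerical quantile computation behind Figures 2.8 and 2.9.

```python
import math
import random

def sym_stable(alpha, rng):
    """Standardized symmetric stable variate via Theorem 1.3(a), alpha != 1."""
    th = math.pi * (rng.random() - 0.5)
    w = -math.log(1.0 - rng.random())
    return (math.sin(alpha * th) / math.cos(th) ** (1.0 / alpha)
            * (math.cos((alpha - 1.0) * th) / w) ** ((1.0 - alpha) / alpha))

rng = random.Random(11)
returns = sorted(sym_stable(1.7, rng) for _ in range(50000))
eps = 0.05
k = int(eps * len(returns))
var = returns[k]              # empirical VaR(eps)
es = -sum(returns[:k]) / k    # empirical expected shortfall (needs alpha > 1)
```

Since the tail observations lie below the VaR level, the expected shortfall −E(X|X < VaR(ε)) always exceeds −VaR(ε), which is what makes it a more conservative risk measure.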
Fig. 2.8 Numerically computed VaR for a S(α, β; 0) distribution as a function of 1 < α ≤ 2. The top plot shows VaR level 0.05, the bottom shows VaR level 0.01. The different curves correspond to different β: β = −1, −0.5, 0, 0.5, 1 (lowest to highest).

2.6.3 Other financial applications

Foreign exchange rates can be very volatile, and stable models have been used by Basterfield et al. (2003), Basterfield et al. (2005a), and Basterfield et al. (2005b). An early look at parallel market exchange rates is Fofack and Nolan (2001). Real estate prices can also be very volatile. Several authors use stable models to describe prices: Young and Graff (1995), Graff et al. (1997), Brown (2000), Brown (2004), Brown (2005), Young et al. (2006), Young (2008), and King and Young (1994). De Vany (2003) and De Vany and Walls (1999) argue that revenue from commercial movies is heavy tailed and skewed and use stable laws. Stable distributions have been used to model commodity prices. Electricity prices can spike on the spot market, e.g. Weron (2005). Jin (2005) examines prices of agricultural commodities. Some insurance claim data sets exhibit heavy tails. The increasing number of large forest fires and devastating floods provides further examples of extreme values. A standard reference for such applications is Embrechts et al. (1997).
Fig. 2.9 Numerically computed expected shortfall −E(X | X < VaR(ε)) for a S(α, β; 0) distribution as a function of α. The top plot shows level ε = 0.05, the bottom ε = 0.01. The expected shortfall is only defined for α > 1, unlike in Figure 2.8.
2.6.4 Multiple assets

So far, we have discussed modeling univariate financial data. Now suppose X = (X1, X2, . . . , Xd) is a vector of d components. For example, they may be the returns from d individual stocks in a portfolio. If they are independent, then any model can be chosen for the marginals and the joint distribution is given by the product of the marginals. When there is dependency among the assets, the problems are more complicated. We briefly mention three possible approaches: multivariate stable models, operator stable models, and "coupling" univariate marginals to model the joint dependence.
Multivariate stable models are defined in Chapter 2 of Samorodnitsky and Taqqu (1994). The technicalities of working with general multivariate stable distributions are significant. It is possible to have a very wide range of dependence models for the joint distribution, with all marginals (and all one-dimensional projections) univariate stable with the same index α ∈ (0, 2]. This requires that all the components have the same asymptotic Pareto tail behavior. The general dependence structure is probably much more than is needed in practical problems, and simpler subclasses are useful. The elliptically contoured stable distributions are probably the most accessible subclass. They are described by a shape matrix of coefficients that describes the joint dependence structure. Many economic data sets are approximately elliptically contoured, giving empirical support to this subclass of stable laws. This approach is used in Nolan (2013). Marginal stable or operator stable distributions are a broader and more complicated class that allows components to have different indices of stability, and in particular, different tail behavior. This theory is described in Meerschaert and Scheffler (2001). The last approach uses copulas. In the simplest form, a copula is a multivariate d.f. C(u1, . . . , ud) on [0, 1]^d. Let X1, . . . , Xd be random variables with any distributions (not necessarily stable) having strictly increasing distribution functions Fj(x). Then

F(x1, . . . , xd) = C(F1(x1), F2(x2), . . . , Fd(xd))

is a multivariate distribution function on R^d with the marginal distribution of the j-th component exactly Fj(x). The components will be independent when F(x1, . . . , xd) factors, but they are generally dependent, e.g. see Nelsen (1999).
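As a toy illustration of the copula approach (ours, not from the book), the sketch below couples two standard Cauchy marginals (stable with α = 1, with quantile function tan(π(u − 1/2))) through a Gaussian copula; the correlation ρ = 0.8 and sample size are arbitrary choices.

```python
import math, random

def cauchy_quantile(u):
    """Quantile function of the standard Cauchy law (stable with alpha = 1)."""
    return math.tan(math.pi * (u - 0.5))

def normal_cdf(x):
    """Standard normal d.f., via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def copula_pair(rho, rng):
    """(X1, X2) with standard Cauchy marginals, coupled by a Gaussian copula:
    dependent uniforms (Phi(Z1), Phi(Z2)) are pushed through the Cauchy
    quantile function, so each marginal is exactly Cauchy."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    return cauchy_quantile(normal_cdf(z1)), cauchy_quantile(normal_cdf(z2))

rng = random.Random(7)
pairs = [copula_pair(0.8, rng) for _ in range(50000)]
xs = sorted(x for x, _ in pairs)
# First marginal is exactly Cauchy: median 0, quartiles at -1 and 1.
print(round(xs[len(xs) // 2], 2))
same_sign = sum(1 for a, b in pairs if (a > 0) == (b > 0)) / len(pairs)
print(round(same_sign, 2))  # well above 1/2, showing the copula dependence
```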
2.7 Signal processing

In certain engineering problems, impulsive noise can corrupt a signal. Standard signal processing algorithms, notably linear filters, perform poorly in such situations. Impulsive noise or heavy tailed clutter can appear in telephone wires Stuck and Kleiner (1974), in radar systems Nikias and Shao (1995) and Kapoor et al. (1999), in image processing Carasso (2002) and Arce (2005), in underwater acoustics Chitre et al. (2006), in the analysis of bearing noise Xu and Liu (2019), and in blind source separation Kidmose (2001) and Zha and Qiu (2006). The approach below is based on Nolan et al. (2010). In the standard additive noise model, the observed signal x_i is the sum of the signal s_i and a noise term ε_i: x_i = s_i + ε_i. When ε_i is heavy tailed, the noise is impulsive and conventional moving average filters do a poor job of detecting the underlying signal. Chapter 6 describes nonlinear methods that are better at filtering out the impulsive noise. Figure 2.10 (left plot) shows a simulation of a signal with additive stable noise. Comparing the output of the linear (top right plot) and stable (bottom right plot) filter shows the amount of improvement that can be achieved with a stable filter. More information is contained in Chapter 6, including weighted and matched filters using stable laws.
Fig. 2.10 Left figure shows a sine wave with additive symmetric stable noise. The top right figure shows the output of a linear filter, the bottom right shows the output of a stable filter. The parameters used in this simulation were α = 1.3, γ = 2, n = 10000, and window width m = 50. Note the difference in vertical scales for the input and output signals.
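The qualitative improvement in Figure 2.10 can be reproduced in a few lines. The sketch below is our illustration, not the book's STABLE-based filter: it simulates symmetric stable noise with the Chambers-Mallows-Stuck method and compares a moving-average filter with a simple nonlinear alternative, a moving median, using parameters that loosely follow the figure.

```python
import math, random

def sym_stable(alpha, rng):
    """Chambers-Mallows-Stuck draw from a standard symmetric
    alpha-stable distribution (beta = 0, unit scale), alpha != 1."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    return (math.sin(alpha * u) / math.cos(u) ** (1 / alpha)
            * (math.cos(u - alpha * u) / w) ** ((1 - alpha) / alpha))

def sliding(x, m, stat):
    """Apply stat over a sliding window of width about m (truncated at edges)."""
    return [stat(x[max(0, i - m // 2):i + m // 2 + 1]) for i in range(len(x))]

rng = random.Random(12345)
n, m, alpha, gamma = 10000, 50, 1.3, 2.0
signal = [100 * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
noisy = [s + gamma * sym_stable(alpha, rng) for s in signal]

linear = sliding(noisy, m, lambda w: sum(w) / len(w))         # moving average
robust = sliding(noisy, m, lambda w: sorted(w)[len(w) // 2])  # moving median

def rmse(est):
    return math.sqrt(sum((e - s) ** 2 for e, s in zip(est, signal)) / len(signal))

print(rmse(robust) < rmse(linear))  # the nonlinear filter tracks the sine better
```

The moving average inherits the impulses (an average of heavy tailed terms is still heavy tailed), while the median discards them, which is the point of the nonlinear filters of Chapter 6.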
Carasso (2002) uses stable distributions to deblur images. Figure 2.11 shows an example of a scanning electron microscope image of a mosquito's head and a sharpened image using the Apex deconvolution method. Other examples in that paper show significant sharpening of a wide variety of image types: faces, towns, medical images, satellite images of the earth and galaxies.
2.8 Miscellaneous applications

This section contains a partial list of applications of stable laws, emphasizing newer areas.
1 This image was provided by A. Carasso.
Fig. 2.11 Left figure shows scanning electron microscope image of a mosquito head. The right figure shows a sharpened image using an α = 0.3138 stable law.
2.8.1 Stochastic resonance

Stochastic resonance occurs in sensory systems of some organisms. A signal too weak to be detected directly can be observable when noise is added to the signal. It appears such biological systems take advantage of the noise in the environment to boost sensitivity. Kosko and Mitaim (2001) show that robust stochastic resonance can improve system performance when the noise is heavy tailed.
2.8.2 Network traffic and queues

Heavy tailed distributions have been observed in network traffic. For example, when files are being transferred over a network, the presence of a large file, say a video, can have a significant impact on the performance of the network. If the sizes of individual files are given by X1, X2, . . . with tail index α, then the cumulative size of n files, X1 + · · · + Xn, is approximately α-stable. The large files can clog the system, resulting in long-range dependence in traffic flow. Some references on this topic are Crovella et al. (1998), Willinger et al. (1998), and Roughan and Kalmanek (2003). Sigman (1999) gives a primer on heavy tailed distributions in queues.
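The dominance of a single large file can be seen in a small simulation (ours, with arbitrary parameters): for tail index α < 1, the share of the largest of n files in the cumulative size does not vanish as n grows; its expected value tends to 1 − α.

```python
import random

def pareto(alpha, rng):
    """Pareto r.v. with P(X > x) = x**(-alpha) for x >= 1 (tail index alpha)."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

rng = random.Random(2020)
alpha, n, reps = 0.5, 1000, 2000
shares = []
for _ in range(reps):
    sizes = [pareto(alpha, rng) for _ in range(n)]
    shares.append(max(sizes) / sum(sizes))  # share of the single largest file
print(round(sum(shares) / reps, 2))  # a substantial fraction, near 1 - alpha
```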
2.8.3 Earth Sciences

An early use of stable laws in geology was by Marcus (1970), where a theoretical analysis of crater heights on a planetary surface involved a stable law. Fractal methods have been used to model reservoir mechanisms, rock structure, and seismic reflectivity: Gaynor et al. (2000), Gunning (2002), Meerschaert et al. (2004), Molz et al. (2004), Painter (2001), Painter et al. (1995), Sahimi and Tajer (2005), Velis (2003), Li and Mustard (2000), Li and Mustard (2005), and Zaliapin et al. (2005). Lavallée and Archuleta (2003) modeled earthquake strength and Wolpert et al. (2016) described lava flows using heavy tailed laws. There is a long history of considering stable models for rainfall amounts, where the data is skewed right and can have extremes: Lovejoy (1982), Lovejoy and Mandelbrot (1985), Lovejoy and Schertzer (1986), Gupta and Waymire (1990), Menabde and Sivapalan (2000), Millán et al. (2011), and Gomi and Kuzuha (2013). Water flow in porous soil can lead to fractional diffusion, see Guadagnini et al. (2013), Guadagnini et al. (2014), Guadagnini et al. (2015), Nan (2014), Nan et al. (2016), and Zhang et al. (2018). Benson et al. (2001) and Rishmawi (2005) fit concentration data to analyze underground water flow, see Section 4.13. Climate variability models with heavy tails are considered in Lavallée and Beltrami (2004).
2.8.4 Physics

The Kohlrausch-Williams-Watts or stretched exponential function is

K(t) = K_{τ,α}(t) = exp(−(t/τ)^α),  t ≥ 0,  (2.3)

where 0 < α < 1 and τ > 0. This expression provides a better match to observed data in relaxation phenomena. It is shown in Proposition 3.2 that this is the Laplace transform of a S(α, 1, (cos(πα/2))^{1/α}/τ, 0; 1) stable density. For a linear viscoelastic material, the shear storage and shear loss moduli are

G_storage(ω) = ω ∫₀^∞ K(t) sin(ωt) dt  and  G_loss(ω) = ω ∫₀^∞ K(t) cos(ωt) dt.

When K(t) is given by (2.3), these expressions can be written as

G_storage(ω) = ω g₁(ω),  G_loss(ω) = π ω f(ω),

where f(ω) is the density of a symmetric α-stable distribution and g₁(ω) is defined in Section 3.11.1. More information on this can be found in Kohlrausch (1847), Williams and Watts (1970), Anderssen et al. (2004), and Elton (2018). Physical examples of Lévy flights are given in the book by Schlesinger et al. (1995), and West (1999) gives a large number of physical examples where heavy tails occur. Barthelemy et al. (2008) have manufactured a special glass that diffuses light according to a Lévy distribution; they have named this material Lévy glass. The Landau distribution is used in physics to describe the fluctuations in the energy loss of a charged particle passing through a thin layer of matter. This distribution is a special case of the stable distribution with parameters α = 1 and β = 1. It was originally discussed in Landau (1944); more information is in Leo (1994). Csörgő et al. (2004) model the source distribution for Bose-Einstein correlations with stable distributions. In a theoretical paper on the broadening and shift of spectral lines in plasma, Peach (1981) uses a stable distribution and obtains information about the characteristics of the plasma from it.
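The Laplace-transform claim of Proposition 3.2 can be checked directly in the one case where the positive stable density is elementary: for α = 1/2 and τ = 1 it is the Lévy density f(x) = (4π)^{−1/2} x^{−3/2} e^{−1/(4x)}, whose Laplace transform is exp(−√s). The sketch below (our numerical check; the integration grid limits are chosen by hand) compares a trapezoidal approximation of the transform with the stretched exponential.

```python
import math

def levy_density(x, c=0.5):
    """Density of the Levy (one-sided 1/2-stable) law with scale c:
    sqrt(c/(2*pi)) * x**(-3/2) * exp(-c/(2*x)), x > 0."""
    return math.sqrt(c / (2 * math.pi)) * x ** -1.5 * math.exp(-c / (2 * x))

def laplace_transform(s, c=0.5, lo=1e-4, hi=80.0, steps=200000):
    """Trapezoidal approximation of E exp(-s*T) for T ~ Levy(c)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * math.exp(-s * x) * levy_density(x, c)
    return total * h

# With c = 1/2 the Laplace transform is exp(-sqrt(s)), i.e. the stretched
# exponential K(t) of (2.3) with alpha = 1/2 and tau = 1.
for s in (0.5, 1.0, 2.0):
    print(round(laplace_transform(s), 4), round(math.exp(-math.sqrt(s)), 4))
```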
2.8.5 Embedding of Banach spaces

Stable laws are used in embedding one Banach space in another, see Ledoux and Talagrand (1991) and Friedland and Guédon (2011).
2.8.6 Hazard function, survival analysis, and reliability

The hazard function of a nonnegative random variable T with density f(t) and d.f. F(t) is h(t) = f(t)/(1 − F(t)). This quantity measures the instantaneous hazard or risk at time t for a component (in a reliability setting) or a patient (in a biomedical setting) with lifetime T. Figure 2.12 shows a plot of numerically derived hazard functions for some positive stable random variables, where necessarily 0 < α < 1 and β = 1. For simplicity of certain formulas, a different scaling is often used when working with positive stable terms, i.e. T ∼ S(α, 1, (cos(πα/2))^{1/α}, 0; 1). (See (3.51) for a reason why this scaling is used.) If an observed hazard function has a shape like one of these plots, a positive stable term may provide a model for the lifetime. Restricted maximum likelihood can be used to estimate the parameters. Stable models are also used in survival analysis and reliability to model population heterogeneity. The starting point is Cox's proportional hazard model: λ(t) = h(t) exp(βᵀθ). Here, h(t) is the baseline hazard function and the exponential term models the effect of the covariates θ = (θ1, . . . , θm) on the hazard function. For example, in a medical study, the covariates may be age, sex, weight, etc., that affect the risk of a patient for a particular disease, and the coefficients β = (β1, . . . , βm) determine the way in which each covariate affects the risk. Then the survival function gives the probability of surviving longer than time t:

S(t) = exp(−∫₀ᵗ λ(s) ds) = exp(−(∫₀ᵗ h(s) ds) exp(βᵀθ)).
Fig. 2.12 The hazard function for positive S(α, 1, (cos(πα/2))^{1/α}, 0; 1) distributions, for α = 0.5, 0.6, 0.7, 0.8, 0.9.
This well-known model assumes that the population is homogeneous: any individual with the same covariate values has the same hazard function. In many cases, there may be population heterogeneity: individuals have other factors that influence the risk that may not be observed or measurable, e.g. unknown genetic predisposition, unknown environmental exposure, etc. To model this, Vaupel et al. (1979) introduced a multiplicative nonnegative random term X that models the frailty of an individual: λ(t, X) = X h(t) exp(βᵀθ). If X < 1, the risk is lowered; if X > 1, there is higher risk. Now the hazard function is random, incorporating the individual frailty, the baseline hazard, and the individual covariate effects. The population survival function is the expected value over the frailty:

S(t) = E exp(−∫₀ᵗ λ(s, X) ds) = E exp(−X (∫₀ᵗ h(s) ds) exp(βᵀθ)).

The original choice of X was a gamma r.v., but Hougaard (1986) proposed using a positive stable r.v. The advantage is that when X ∼ S(α, 1, (cos(πα/2))^{1/α}, 0; 1) with α < 1, the survival function can be evaluated explicitly using Proposition 3.2:

S(t) = exp(−[(∫₀ᵗ h(s) ds) exp(βᵀθ)]^α).
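For α = 1/2 this survival function can be verified by simulation: if Z is standard normal, then X = 1/(2Z²) has exactly the positive stable law S(1/2, 1, (cos(π/4))², 0; 1), with Laplace transform exp(−√u). The sketch below is our illustration; the baseline cumulative hazard Λ(t) = t and covariate factor exp(βᵀθ) = 1 are arbitrary choices.

```python
import math, random

rng = random.Random(42)

def half_stable(rng):
    """If Z is standard normal, X = 1/(2 Z**2) is the positive stable law
    with alpha = 1/2 and Laplace transform exp(-sqrt(u))."""
    z = rng.gauss(0.0, 1.0)
    return 1.0 / (2.0 * z * z)

n = 200000
frailties = [half_stable(rng) for _ in range(n)]

# Population survival S(t) = E exp(-X * Lambda(t)), taking baseline cumulative
# hazard Lambda(t) = t and covariate factor exp(b'theta) = 1:
for t in (0.5, 1.0, 2.0):
    mc = sum(math.exp(-x * t) for x in frailties) / n
    print(round(mc, 3), round(math.exp(-math.sqrt(t)), 3))  # MC vs exact
```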
Generalizations using group frailty, multivariate frailty using Lévy processes, and other possibilities can be found in Hougaard (2000) and Hanagal (2011). Applications of this concept can be found in Wassell et al. (1999), Qiou et al. (1999), Ravishanker and Dey (2000), Mallick and Ravishanker (2004), and Gaver et al. (2004).
2.8.7 Biology and medicine

Chapter 18 of Uchaikin and Zolotarev (1999) describes a variety of applications of stable laws in biology, genetics, physiology, and ecology, and West (1999) contains multiple examples where stable laws occur in biological systems. Kotulska (2007) models the opening and closing of nanopores and observes open times that appear to be stable. Farsad et al. (2015) use stable distributions to model the noise in molecular communication. Alpheidae are snapping shrimp that live in the waters of the Indian Ocean. They have an oversized claw that they snap shut to stun prey. When there are many of these snapping shrimp in an area, they create a noisy acoustic environment underwater that is highly impulsive and interferes with sonar systems. In Section 4.11, a stable distribution is shown to accurately model this noise. Van den Heuvel et al. (2015) and Van den Heuvel et al. (2018) use stable distributions to characterize proton pencil beams, narrowly focused beams used in cancer treatment. They estimated α in the range 1.84 to 1.93 from data.
2.8.8 Discrepancies

When printing tables of probabilities, one frequently has to round some quantity to a specified accuracy. In general, the results will not be exact, and the difference between the exact value and the rounded value is called the discrepancy. This issue comes up when apportioning votes based on population, or legislative seats based on proportional representation. In some cases, it is required by law that the discrepancy be zero, and that one round in such a way as to minimize an overall measure of the difference between the exact and rounded values. One such measure is the Sainte-Laguë chi-square divergence. Heinrich et al. (2004) show that the Sainte-Laguë chi-square divergence is a sum of heavy tailed terms and is approximately stable with α = 1 and β = 1 when the number of categories is large.
2.8.9 Computer Science

Indyk (2000) constructs sketches using stable laws to reduce massive data streams. Related work is by Cormode et al. (2002), Cormode and Muthukrishnan (2003), Cormode (2003), and Cormode and Indyk (2006). When execution times are heavy tailed, novel load balancing/task assignment methods are described in Harchol-Balter (2013), which discusses how high job size variability and heavy tailed workloads affect the choice of scheduling policy. Solving hard computational problems is explored in Gomes and Selman (1999). They study the probability distributions of the run times of such computational processes and show that these distributions often exhibit heavy tailed behavior. They discuss a general strategy based on random restarts to improve the performance of such algorithmic methods. Crovella and Lipsky (1997) look for methods to ensure that simulations involving heavy tailed waiting times reach equilibrium. Caron and Fox (2017) use stable random variables to model large networks with a range of sparsity.
2.8.10 Long tails in business, political science, and medicine

These references are about extreme events, generally not situations where there is a numeric value being measured. Observations may be ranked, e.g. from most common sales item to least common, or just classified as extreme in some sense. There is not a direct probability distribution involved, but the idea of unusual/atypical occurrences can be important. Anderson (2006) discusses the "Long Tail" occurring in sales, where many low volume items can account for significant revenue. The best-known example of this is Amazon.com, where the lack of brick-and-mortar stores makes it feasible to sell low volume goods on a large scale. Brynjolfsson et al. (2006) also discusses this. King and Zeng (2001) discuss measuring rare events in international relations. The first author has a webpage² on rare events in this field. Bremmer and Keat (2009) also write about the fat tail in political and economic events. Neither of these groups measures a quantitative variable; rather they write about typical events that cluster around some center, with occasional extreme events, like a financial crisis or the September 11, 2001, terrorist attacks on the U.S. These are bulges/bumps far from the normal events that happen. They argue that one should be thinking about these possible risky events. Sorace (2012) describes a long-tailed distribution of disease combinations in the U.S. Medicare system. In this setting, the idea is more about sparsity than about extremes: rather than prominent clusters of combinations of diseases, the paper argues that there are many, many combinations of different illnesses that occur and
2 http://gking.harvard.edu/category/research-interests/methods/rare-events
argues that the costs of medical care cannot be significantly lessened by focusing on a few common clusters of diseases.
2.8.11 Extreme values models

Stable distributions are useful for building models with classes of extreme value closure properties. For example, if S ∼ S(α, 1, (cos(πα/2))^{1/α}, 0; 1) with 0 < α < 1 is positive stable and G is a Gumbel distribution, then Section 3.11.2 shows that G + log S is also Gumbel. This allows one to build a one-way random effects model where extremes are being observed for m groups, with each group having a random shift. Let μ ∈ R, σ > 0, τ_i = σ log S_i, where the S_i are positive stable as above, the G_{ij} are Gumbel with scale σ, and all variables are independent. Then

X_{ij} = μ + τ_i + G_{ij},  1 ≤ i ≤ m, 1 ≤ j ≤ n_i

gives a family of r.v. that have Gumbel marginal distributions and a joint multivariate Gumbel distribution. All these distributions are explicitly computable and the model can be fit numerically. For other applications to MA and AR time series models for extremes, spatial models for extremes, and extensions to the general extreme value distributions, see Fougères et al. (2009) and Fougères et al. (2013).
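The Gumbel closure property can be checked in the explicit case α = 1/2, where S = 1/(2Z²) for standard normal Z has Laplace transform exp(−√u). Conditioning on S gives P(G + log S ≤ x) = E exp(−S e^{−x}) = exp(−e^{−x/2}) for standard Gumbel G, i.e. Gumbel again with scale 1/α = 2. The sketch below (our illustration; seed and sample size arbitrary) compares the empirical d.f. with this closed form.

```python
import math, random

rng = random.Random(99)
n = 200000

def std_gumbel(rng):
    """Standard Gumbel draw: -log(-log U) has d.f. exp(-exp(-x))."""
    return -math.log(-math.log(rng.random()))

def half_stable(rng):
    """1/(2 Z**2), Z standard normal: positive 1/2-stable law with
    Laplace transform exp(-sqrt(u))."""
    z = rng.gauss(0.0, 1.0)
    return 1.0 / (2.0 * z * z)

# P(G + log S <= x) = E exp(-S e^{-x}) = exp(-(e^{-x})**(1/2)) = exp(-e^{-x/2}):
samples = [std_gumbel(rng) + math.log(half_stable(rng)) for _ in range(n)]
for x in (0.0, 2.0):
    emp = sum(1 for s in samples if s <= x) / n
    print(round(emp, 3), round(math.exp(-math.exp(-x / 2)), 3))  # MC vs exact
```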
2.9 Behavior of the sample mean and variance

When a population has heavy tails, standard statistical techniques can fail. The simplest examples of this arise when computing the sample mean and sample variance based on n observations:

X̄_n = (Σ_{i=1}^n X_i)/n  and  s_n² = Σ_{i=1}^n (X_i − X̄_n)²/(n − 1).

These are finite sums and always exist. On the other hand, the population mean and variance of a continuous r.v. X with density f(x) are

μ = EX = ∫_{−∞}^{∞} x f(x) dx  and  σ² = E(X − μ)² = ∫_{−∞}^{∞} (x − μ)² f(x) dx.

Both are integrals over an infinite range and are finite only when the tails of the distribution are not too heavy. We will first discuss the sample variance, and then the sample mean. When the tails of the underlying distribution are not too heavy, the population variance σ² exists and s_n² converges in probability to σ². However, when the population variance is not finite, s_n² does not converge to anything. Figure 2.13 shows simulations that demonstrate what happens when the sample is S(α, 0), standardized stable. When α = 2, the population variance is σ² = 2, and s_n converges to σ = √2 as n → ∞. In contrast, when α < 2, the population variance is not finite and the sample standard deviation diverges as n → ∞. Having a large sample does not give a better estimate of the scale. In fact, as n increases, the sample variance will generally get worse. As α decreases, the situation deteriorates.
Fig. 2.13 Boxplots showing the behavior of the sample standard deviation s when sampling from a S(α, 0; 0) distribution for α = 2, 1.75, 1.5, 1.25, 1, 0.75 and sample sizes n = 25, 100, 250, 500, 1000, 10000. For each sample size n, 100 samples were simulated and the resulting sample standard deviations were plotted. Note the different vertical scales as α decreases.
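The contrast in Figure 2.13 can be reproduced with two closed-form special cases (a sketch of ours, not the book's simulation code): the Gaussian case α = 2 and the Cauchy case α = 1, the latter simulated as tan(π(U − 1/2)) for uniform U.

```python
import math, random

rng = random.Random(0)

def sample_std(xs):
    """Sample standard deviation s_n with the (n-1) divisor."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

normal = [rng.gauss(0.0, 1.0) for _ in range(10000)]
cauchy = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(10000)]

for n in (100, 1000, 10000):
    print(n, round(sample_std(normal[:n]), 2), round(sample_std(cauchy[:n]), 2))
# The normal column settles near sigma = 1; the Cauchy column does not
# converge and is typically dominated by the few largest observations.
```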
Infinite variance is not something restricted to stable laws, see Problem 1.11. If a distribution has asymptotic power decay on the tails, then the number of moments it has is limited. If the exponent of the power decay is less than 3, then the distribution will have infinite variance. The same comments hold for the mean. Figure 2.14 generalizes Figure 2.2 to other values of α. In all cases, the distributions are symmetric and centered on the origin. When α ≤ 1, the population mean does not exist and the sample mean is heavily influenced by extreme values. Note that when 1 < α < 2, the population mean exists, but the sample mean is highly variable and may be very slow to converge. The closer α is to 1, the slower the convergence will be. Similar things happen when the covariance of two stable random variables is computed: the sample covariance exists, but it does not generally converge to anything. This can lead to spurious correlation when there is independence, or claimed independence when the terms are dependent. The work of Davis and Resnick (1985) and Meerschaert and Scheffler (1999) gives more information on this. A brief mention of general moments seems appropriate here. For the normal distribution, the first and second moments completely specify the distribution; for most distributions, they do not. Fractional moments, e.g. E|X|^p for non-integer p < α, can be used to specify stable distributions, see Section 3.7, but these methods are not very familiar. Section 4.6 shows that estimation based on sample fractional moments is possible, at least in the strictly stable case, but is not as efficient as other methods.

Fig. 2.14 Behavior of the sample mean X̄ when sampling from a S(α, 0; 0) distribution, for α = 2, 1.75, 1.5, 1.25, 1, 0.75 and sample sizes n = 25, 100, 250, 500, 1000, 10000. For each sample size n, 100 samples were simulated and the resulting sample means were plotted. Note the different vertical scales as α decreases.
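Fractional moments are easy to illustrate in the Cauchy case α = 1, where E|X|^p = 1/cos(πp/2) for 0 < p < 1 is a standard closed form. The sketch below (ours, not the estimator of Section 4.6; p = 0.3 is chosen so that the estimator itself has finite variance) shows the sample fractional moment settling near its target, in contrast to the sample variance.

```python
import math, random

rng = random.Random(11)
n, p = 200000, 0.3

# Standard Cauchy draws via the quantile function tan(pi*(u - 1/2)).
cauchy = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
sample_moment = sum(abs(x) ** p for x in cauchy) / n
exact = 1.0 / math.cos(math.pi * p / 2)  # E|X|^p for standard Cauchy, p < 1
print(round(sample_moment, 3), round(exact, 3))  # close, since p < alpha = 1
```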
2.10 Appropriateness of infinite variance models

Distributions with heavy tails are regularly seen in applications in economics, finance, insurance, telecommunication, and physics. Even if we accept that large data sets have heavy tails, is it ever reasonable to use a stable model? One of the arguments against using stable models is that they have infinite variance, which is inappropriate for real data that have a bounded range. While some data may have bounded range or light tails, it is possible for ratios or other simple transformations of such quantities to have infinite variance, e.g. Section 2.1 and Section 3.11. Moreover, bounded data are routinely modeled by normal distributions, which have infinite support. The only justification for this is that the normal distribution gives a useful description of the shape of the distribution, even though it is clearly wrong on the tails for any problem with naturally bounded data. The same justification can be used for stable models: does a stable fit give an accurate description of the shape of the distribution? The variance is one measure of spread; the scale γ in a stable model is another. Perhaps practitioners are so used to using the variance as the measure of spread that they automatically retreat from models without a variance. The parameters δ and γ generalize the notion of location and scale usually played by the mean and standard deviation. We propose that the practitioner approach this dispute as an agnostic. The fact is that until recently we have not really been able to compare data sets to a proposed stable model. Chapter 4 shows that estimation of all four stable parameters is feasible, even for large data sets. And it is now feasible to use diagnostics to assess whether a stable model accurately describes the data. In some cases, there are solid theoretical reasons for believing that a stable model is appropriate; in other cases, we will be pragmatic: if a stable distribution describes the data accurately and parsimoniously with four parameters, then we will use it as a model for the observed data. What we expect from a fit to a data set may depend on the particular application and on the size of the data set. For discussion's sake, suppose we are interested in the middle 98% of the distribution and are willing to tolerate an inappropriate model on the upper and lower 1%. We will call the interval (x_{0.01}, x_{0.99}) the "approximate range" of the distribution in what follows. Table 2.2 shows the approximate range for a standardized stable distribution S(α, β; 0) for β = 0 and β = 1.
When α < 2, the distribution is naturally more spread out than the normal (α = 2) case, but the range is not outrageous, even for α as small as 1. With a large data set, one should demand a close match between data and model for a larger percentage of the data, but there will always be a point at which the bounded and discrete nature of the data will make it difficult to justify any model on the tails. In Chapter 4, diagnostics are suggested for assessing the stability of a data set. Finally, there is a range of problems where quantities are heavy tailed, but not well described by a stable model. One example is insurance claims, where natural disasters may cause huge claims. While the individual claims X_i may not be well described by a stable distribution, what matters for the solvency of an insurance company is the cumulative claims X1 + · · · + Xn. The Generalized Central Limit Theorem says that this sum is approximately stable. Similar arguments are appropriate in other fields, say geology, where earthquake strength or lava flows from individual events are heavy tailed and not stable, but the cumulative shock or flow may be well described by a stable model.
α    β = 0                                     β = 1
0.1  (−5.1424 × 10^16, 5.1424 × 10^16)        (−0.16, 5.548 × 10^19)
0.2  (−1.7837 × 10^8, 1.7837 × 10^8)          (−0.32, 5.8728 × 10^9)
0.3  (−273949.61, 273949.61)                  (−0.49, 2822032.3)
0.4  (−10812.94, 10812.94)                    (−0.67, 62385.20)
0.5  (−1559.73, 1559.73)                      (−0.84, 6364.87)
0.6  (−429.22, 429.22)                        (−1.01, 1392.84)
0.7  (−170.56, 170.56)                        (−1.17, 470.72)
0.8  (−85.14, 85.14)                          (−1.33, 208.42)
0.9  (−49.41, 49.41)                          (−1.48, 110.30)
1.0  (−31.82, 31.82)                          (−1.62, 66.02)
1.1  (−22.07, 22.07)                          (−1.77, 43.12)
1.2  (−16.16, 16.16)                          (−1.91, 30.01)
1.3  (−12.31, 12.31)                          (−2.06, 21.81)
1.4  (−9.66, 9.66)                            (−2.21, 16.45)
1.5  (−7.74, 7.74)                            (−2.37, 12.65)
1.6  (−6.28, 6.28)                            (−2.53, 9.84)
1.7  (−5.15, 5.15)                            (−2.70, 7.66)
1.8  (−4.28, 4.28)                            (−2.88, 5.86)
1.9  (−3.67, 3.67)                            (−3.07, 4.36)
2.0  (−3.28, 3.28)                            (−3.28, 3.28)

Table 2.2 The interval between the 1st and 99th quantiles for S(α, 0; 0) and S(α, 1; 0) distributions.
2.11 Problems

Problem 2.1 Let X1, X2, . . . be i.i.d. S(α, β, γ, δ; 1) with α < 1. Show that the sample mean of n of these terms is S(α, β, n^{(1/α)−1}γ, δ; 1). Since n^{(1/α)−1}γ > γ, there is more information about the location δ in one term X1 than in the sample mean, and as more terms are taken, the sample mean gives less and less information about the location parameter.

Problem 2.2 The result in the preceding problem is not restricted to stable laws. If X is any random variable with P(|X| > x) ∼ cx^{−α} with α < 1, then the sample mean gives less information about the location of the distribution than a single term. (Brown and Tukey (1946).)

Problem 2.3 A distribution with no fractional moments. Show that the r.v. X with density

f(x) = 1 / ((e² + |x|)(log(e² + |x|))²)

is a symmetric r.v. with no moments: E|X|^p = +∞ for all p > 0.
Problem 2.4 Show that if X is Uniform(−a, a) and p ≥ 1/2, then Y = (sign X)/|X|^p has characteristic function

φ(u) = E exp(iuY) = (1/a) ∫₀^a cos(u/x^p) dx.

And for large a, φ(u) = 1 − c u^{1/p}/a + o(1/a), with c = c(p) > 0.

Problem 2.5 Show that if the inverse square law in (2.1) is replaced by Y = Σ_{i=1}^n (sign X_i)|X_i|^{−p} with p > 1/2, then the net field at the origin is symmetric stable with α = 1/p.
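The conclusion of Problem 2.1 can be seen numerically in the explicit case α = 1/2, β = 1: if Z is standard normal, 1/(2Z²) is positive 1/2-stable, and the sample mean of n = 10 such terms should be distributed as n^{(1/α)−1} = 10 times a single term. The sketch below (ours; seed and sample sizes arbitrary) compares medians of the two distributions.

```python
import random

rng = random.Random(3)

def half_stable(rng):
    """If Z is standard normal, 1/(2 Z**2) is positive stable with
    alpha = 1/2 (Laplace transform exp(-sqrt(u)))."""
    z = rng.gauss(0.0, 1.0)
    return 1.0 / (2.0 * z * z)

def median(xs):
    return sorted(xs)[len(xs) // 2]

n, reps = 10, 20000
single = [half_stable(rng) for _ in range(reps)]
means = [sum(half_stable(rng) for _ in range(n)) / n for _ in range(reps)]

# For alpha = 1/2 the sample mean of n terms has the law of
# n**(1/alpha - 1) = n times one term: averaging inflates the scale.
print(round(median(means) / median(single), 1))  # close to n = 10
```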
Chapter 3
Technical Results for Univariate Stable Distributions
This chapter contains proofs of the results stated in Chapter 1, as well as mathematical results about stable distributions. Both theoretical and computational expressions are derived for stable densities and distribution functions. Readers who are primarily interested in using stable distributions in applications may want to skip this chapter and return to it later for specific facts they need.
3.1 Proofs of Basic Theorems of Chapter 1

In this section we prove the basic results of Chapter 1, and some technical results needed for those proofs and later. Throughout these proofs, X1, X2, . . . will always denote i.i.d. copies of X. Recall our first definition of stable says that for any a, b > 0,

aX1 + bX2 =d cX + d,  (3.1)

where c > 0 and d ∈ R. Our first goal is to relate this to the second definition:

X1 + · · · + Xn =d c_n X + d_n for all n > 1.  (3.2)

The standard way to prove the basic results about stable laws is to first show that X is infinitely divisible (see Section 3.1.1) and derive stable distributions as a special class of infinitely divisible distributions. Below we provide a direct proof of these properties, without invoking the theory of infinitely divisible laws. We do this to show how the algebraic nature of equations (3.1) and (3.2) leads to the form of the stable characteristic functions. We start with four lemmas which provide most of the work for the basic results. For algebraic simplicity, it is easier to work with the S(α, β, γ, δ; 1) parameterization. After that, the next lemmas show that there actually are nondegenerate random variables that are stable.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4_3
Lemma 3.1 If (3.1) holds, then (3.2) holds for some cn > 0 and dn ∈ R. If X is symmetric or strictly stable, then dn = 0 for all n > 1.

Proof We will write c = c(a, b) and d = d(a, b) to show the explicit dependence of c, d on the right side of (3.1) on a, b on the left side. If X is stable, then (3.1) shows X1 + X2 =ᵈ c(1, 1)X + d(1, 1), so the result is true with c2 = c(1, 1) and d2 = d(1, 1). Use induction: assume the result is true for n − 1; then

X1 + X2 + · · · + Xn−1 + Xn =ᵈ (cn−1 X1 + dn−1) + Xn = (cn−1 X1 + Xn) + dn−1 =ᵈ c(cn−1, 1)X + (d(cn−1, 1) + dn−1),

so the result is true with cn = c(cn−1, 1) and dn = d(cn−1, 1) + dn−1. Since c(a, b) > 0, cn > 0 for all n, proving (3.2). If X is symmetric stable, then it is necessarily strictly stable: if d(a, b) ≠ 0 for some a, b, then (3.1) has the right-hand side symmetric but the left-hand side nonsymmetric. If X is strictly stable, then d(a, b) = 0 for all a, b by definition, so dn = 0 for all n.

Properties of the characteristic function φ(u) = E exp(iuX) are used in the proofs below; see Appendix A for a summary of these properties.

Lemma 3.2 If X is a non-degenerate symmetric r.v. that satisfies

X1 + · · · + Xn =ᵈ cn X,    (3.3)
then there is some 0 < α ≤ 2 and k > 0 with cn = n^{1/α} for all n > 1, and the characteristic function of X is φ(u) = exp(−k|u|^α).

Proof Equation (3.3) is equivalent to the characteristic function satisfying

φ(u)^n = φ(cn u).    (3.4)

Clearly φ(u) = exp(−k|u|^α) satisfies this with cn = n^{1/α}; it will be shown that this is the only solution that is a characteristic function. Since X is symmetric, φ(u) is real with φ(u) ≤ φ(0) = 1. Replacing u by u/cn in (3.4) shows

φ(u/cn)^n = φ(u).    (3.5)

Together (3.4) and (3.5) imply that φ(u) is positive for all u ∈ R: suppose φ(u0) = 0 for some u0. Then (3.4) and (3.5) show that both c2u0 and u0/c2 are also zeros of φ(u). If c2 > 1, set u1 = u0/c2; if c2 < 1, set u1 = c2u0 (c2 = 1 is impossible: it would imply φ(u)² = φ(u) for all u by (3.4), and since φ(u) is continuous and φ(0) = 1, this would imply φ(u) = 1 identically, which means X is degenerate). Continuing this way gives a sequence un with un → 0 and φ(un) = 0. This contradicts the fact that any characteristic function φ(u) is continuous with φ(0) = 1. Hence φ(u) > 0 for all u.

Since φ(u) is positive, ω(u) := − log φ(u) is a well-defined, continuous function. Taking logarithms of (3.4) and (3.5) shows for all n > 1, nω(u) = ω(cn u), and
ω(u)/n = ω(u/cn). Thus for any integers m, j > 1 and any u ∈ R,

ω((cj/cm)u) = jω(u/cm) = (j/m)ω(u).    (3.6)
Applying (3.3) to a sum with nm terms in blocks of size n shows

cnm X =ᵈ X1 + X2 + · · · + Xnm = (X1 + X2 + · · · + Xn) + · · · + (X(m−1)n+1 + · · · + Xnm)
=ᵈ cn X1 + cn X2 + · · · + cn Xm =ᵈ cn(X1 + · · · + Xm) =ᵈ cn cm X,

hence cnm = cn cm. Applying this repeatedly to m^k = m · · · m shows

c_{m^k} = (cm)^k.    (3.7)
Since cn > 0 for all n, we can define rn by cn = n^{rn}. We will show that all the rn are the same. Suppose by way of contradiction that there exist some p, q > 1 with rp < rq. Then there exist positive integers a, b with

rp/rq < (a/b) log_p q < 1  ⇒  p^{rp/rq} < q^{a/b} < p  ⇒  p^{b rp} < q^{a rq} < p^{b rq}
⇒  q^a < p^b and c_{q^a} = (cq)^a = q^{a rq} > p^{b rp} = (cp)^b = c_{p^b},

where (3.7) is used in the last part. Set mn = (q^a)^n, jn = (p^b)^n to get integer sequences with mn < jn and c_{mn}/c_{jn} = (c_{q^a}/c_{p^b})^n → ∞. Since every symmetric characteristic function satisfies φ(u) ≤ 1 and X is not degenerate, there must be some place u∗ > 0 where 0 < φ(u∗) < 1, and thus k∗ := ω(u∗) > 0. Substituting u∗ into (3.6) shows

ω((c_{jn}/c_{mn})u∗) = (jn/mn)k∗.

This is impossible for the sequences constructed: ω(u) is continuous and as n → ∞, the left-hand side is tending toward ω(0) = −log φ(0) = 0, whereas the right-hand side is tending toward +∞. Hence all the rn are the same, say rn = r, and cn = n^r for all n.

Equation (3.6) now implies ω((j/m)^r u∗) = (j/m)k∗ for all m, j > 1. Define ω∗(u) = k∗(u/u∗)^{1/r} on u ≥ 0. By definition, ω∗(u) is continuous and ω∗((j/m)^r u∗) = k∗(j/m) for all m, j > 1. Thus ω(u) and ω∗(u) are continuous functions that agree on a dense set, and therefore they must be equal. Hence ω(u) = ku^α, where α = 1/r and k = k∗/(u∗)^α > 0. By symmetry, ω(−u) = ω(|u|) = k|u|^α, showing the form of the ch. f.

To finish the proof, we need to show 0 < α ≤ 2. If α ≤ 0, then φ(u) = exp(−k|u|^α) does not tend to 1 as u → 0, so φ is not a ch. f. If α > 2, then φ(u) is twice differentiable at 0, with φ′′(0) = 0. This implies Var(X) = 0, contradicting the nondegeneracy of X.
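The scaling relation (3.4) at the heart of this proof is easy to check numerically for the claimed solution. The following Python sketch (an illustration, not part of the original text; the function name phi is ours) verifies that φ(u) = exp(−k|u|^α) satisfies φ(u)^n = φ(n^{1/α}u) for several α and n:

```python
import numpy as np

def phi(u, alpha, k=1.0):
    # Symmetric stable characteristic function from Lemma 3.2.
    return np.exp(-k * np.abs(u) ** alpha)

# Check phi(u)**n == phi(n**(1/alpha) * u), i.e. equation (3.4) with c_n = n**(1/alpha).
u = np.linspace(-5, 5, 101)
for alpha in (0.5, 1.0, 1.7, 2.0):
    for n in (2, 3, 10):
        lhs = phi(u, alpha) ** n
        rhs = phi(n ** (1.0 / alpha) * u, alpha)
        assert np.allclose(lhs, rhs)
print("scaling relation (3.4) verified")
```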
Most of the proof above was algebraic and the probability component of the argument is minor: the fact that φ is a characteristic function of a non-degenerate distribution is used only to show rp = rq and the fact that 0 < α ≤ 2. Problem 3.1 gives an alternate proof that α ≤ 2 by showing directly that exp(−|u|^α) is not positive definite when α > 2. Lemma 3.2 is not true if (3.2) only holds for n = 2, see Problem 3.2. However, it is true if (3.2) holds for n = 2 and n = 3, see Problem 3.3.

Lemma 3.3 If non-degenerate X satisfies (3.2), then there exists 0 < α ≤ 2 such that cn = n^{1/α} for all n > 1. If α ≠ 1, then there exist some k1 > 0 and k2, k3 ∈ R such that dn = (n − n^{1/α})k3 and X has characteristic function

φ(u) = exp(−[k1 + i(sign u)k2]|u|^α + ik3u).    (3.8)
Proof To find the expression for cn, reduce to X symmetric: if X is not symmetric, then Y = X1 − X2 is symmetric and for i.i.d. copies Y1, Y2, . . . , Yn of Y, (3.2) shows

Y1 + · · · + Yn =ᵈ (X1 + · · · + Xn) − (Xn+1 + · · · + X2n) =ᵈ (cn X1 + dn) − (cn X2 + dn) = cn(X1 − X2) =ᵈ cn Y.

That is, the symmetrization of X satisfies (3.2) with the same cn. The previous lemma now shows cn = n^{1/α} for some 0 < α ≤ 2.

Next the expression for dn is derived. Condition (3.2) is now X1 + · · · + Xn =ᵈ n^{1/α}X + dn. Splitting a sum with nm terms into blocks of size n shows

(nm)^{1/α}X + dnm =ᵈ (X1 + · · · + Xn) + · · · + (X(m−1)n+1 + · · · + Xnm)
=ᵈ (n^{1/α}X1 + dn) + · · · + (n^{1/α}Xm + dn)
= n^{1/α}(X1 + · · · + Xm) + mdn =ᵈ n^{1/α}(m^{1/α}X + dm) + mdn = (nm)^{1/α}X + (n^{1/α}dm + mdn).

So

dnm = n^{1/α}dm + mdn.    (3.9)
Since dnm = dmn, n^{1/α}dm + mdn = m^{1/α}dn + ndm, which implies that when α ≠ 1, dn/(n − n^{1/α}) = dm/(m − m^{1/α}), i.e. dn = (n − n^{1/α})k3 for some k3 ∈ R. Given cn = n^{1/α}, condition (3.2) is equivalent to

(φ(u))^n = φ(n^{1/α}u) exp(iudn).    (3.10)

We will show that for u ≥ 0, φ(u) must be of the form

log φ(u) = −(k1 + ik2)u^α + ik3u,    (3.11)
where k1 > 0 and k2, k3 ∈ R. First, as in the symmetric case, φ(u) is never zero: if there existed a u0 ∈ R with φ(u0) = 0, then the symmetrization Y := X1 − X2 has ch.f. φY(u) = φ(u)φ(−u) = |φ(u)|². But
Y is non-degenerate, symmetric, and satisfies (3.3), with φY(u0) = |φ(u0)|² = 0, contradicting the proof of Lemma 3.2.

In what follows ω(u) := − log φ(u) will denote the principal value. Since φ(u) ≠ 0 and the real part of the exponent is positive, this is a well-defined, continuous function. In general it is complex, say ω(u) = ω1(u) + iω2(u), where ω1(u) and ω2(u) are real. Equating real parts of the logarithm of both sides of (3.10) shows nω1(u) = ω1(n^{1/α}u) for all n. The proof of Lemma 3.2 shows ω1(u) = k1|u|^α for some 0 < α ≤ 2 and k1 > 0. If ω2(u) = 0 for all u, then (3.11) holds with k2 = 0, k3 = 0. For the next step, assume there exists some u∗ > 0 with ω2(u∗) ≠ 0. Equating the imaginary parts of the logarithm of both sides of (3.10) shows nω2(u) = ω2(n^{1/α}u) + dnu for all n. Replacing u by u/n^{1/α} and dividing by n shows ω2(u/n^{1/α}) = (1/n)ω2(u) + (dn/n^{1+1/α})u. Hence

ω2((j/m)^{1/α}u) = jω2(u/m^{1/α}) − dj u/m^{1/α} = j[(1/m)ω2(u) + (dm/m^{1+1/α})u] − dj u/m^{1/α}
= (j/m)ω2(u) + (j dm/m^{1+1/α} − dj/m^{1/α})u.    (3.12)

Substitute dn = (n − n^{1/α})k3 to get

ω2((j/m)^{1/α}u) = (j/m)(ω2(u) − k3u) + (j/m)^{1/α}k3u.

Define the constant k2 = −(ω2(u∗) − k3u∗)/(u∗)^α and the function ω2∗(u) = −k2u^α + k3u, u ≥ 0. Straightforward computations show ω2((j/m)^{1/α}u∗) = ω2∗((j/m)^{1/α}u∗), i.e. the continuous functions ω2 and ω2∗ agree on a dense set. Thus ω2(u) = −k2u^α + k3u and (3.11) is established. For any ch. f., φ(−u) is the complex conjugate of φ(u), which implies ω1(−u) = ω1(u) but ω2(−u) = −ω2(u), so for all u ∈ R, (3.8) is the form of the characteristic function.

The α = 1 case requires a different argument, because the solutions to (3.10) are of a different form. We will see that this case frequently has to be treated separately.

Lemma 3.4 If non-degenerate X satisfies (3.2) with cn = n, then there exist k1 > 0 and k2, k3 ∈ R such that dn = k2 n log n and the characteristic function of X is

φ(u) = exp(−k1|u| − ik2u log |u| + ik3u)
(3.13)
Proof When cn = n, the previous lemma shows α = 1 and (3.2) is equivalent to

(φ(u))^n = φ(nu) exp(iudn).
(3.14)
Also, as in the previous lemma, φ(u) is never zero, so ω(u) := − log φ(u) = ω1 (u) + iω2 (u) is a well-defined, continuous function. Equating real parts of the
logarithm of both sides of (3.14) shows nω1(u) = ω1(nu) for all n. The proof of Lemma 3.2 shows that ω1(u) = k1|u| for some k1 > 0. When α = 1, equation (3.9) is dnm = ndm + mdn, or dividing by mn, dmn/(mn) = dm/m + dn/n. Defining en := dn/n yields

emn = em + en.    (3.15)

Also, when α = 1, (3.12) reduces to

ω2((j/m)u) = (j/m)ω2(u) + (j/m)(em − ej)u.    (3.16)
Define the function g(u) = ω2(u)/u − ω2(1) on u > 0. Using (3.16) with u = 1 shows g(j/m) = ω2((j/m) · 1)/(j/m) − ω2(1) = em − ej. Using this and (3.15),

g((j/m)(n/p)) = g((jn)/(mp)) = emp − ejn = (em + ep) − (ej + en) = (em − ej) + (ep − en) = g(j/m) + g(n/p).

By definition, g is a continuous function and it satisfies g(rs) = g(r) + g(s) on a dense set of rationals. The only continuous function with this property is g(u) = k2 log u. The definition of g(u) now shows that on u > 0, ω2(u) = ug(u) + ω2(1)u = k2u log u + k3u, where k3 = ω2(1). As in the previous lemma, ω2(−u) = −ω2(u) implies that (3.13) holds for all u ∈ R. To finish the proof, we need to show dn = k2 n log n. Equating the imaginary parts of the negative log of (3.14) shows nω2(u) = ω2(nu) − udn. Using the expression for φ, and solving for dn, yields the result.

Proof of equivalence of (3.1) and (3.2) Lemma 3.1 shows that (3.1) implies (3.2), with dn = 0 if X is strictly stable. Lemma 3.3 shows that cn = n^{1/α} for some 0 < α ≤ 2. If we assume (3.2), then the form of the ch. f. derived in Lemma 3.3 and Lemma 3.4 makes it routine to show that X is stable: for any a, b > 0, aX1 + bX2 has ch. f. φ(au)φ(bu). We will show this equals φ(cu) exp(iud) for some c > 0, d ∈ R. When α ≠ 1,

ω(au) + ω(bu) = (k1 + i(sign au)k2)|au|^α + ik3(au) + (k1 + i(sign bu)k2)|bu|^α + ik3(bu)
= (a^α + b^α)(k1 + i(sign u)k2)|u|^α + i((a + b)k3)u.

On the other hand,

ω(cu) + idu = c^α(k1 + i(sign u)k2)|u|^α + i(ck3 + d)u.

Taking c^α = a^α + b^α and d = ((a + b) − c)k3 = [(a + b) − (a^α + b^α)^{1/α}]k3 shows the result. When α = 1,
ω(au) + ω(bu) = k1|au| − ik2(au) log |au| + ik3(au) + k1|bu| − ik2(bu) log |bu| + ik3(bu)
= (a + b)(k1|u| − ik2u log |u|) + i(k3(a + b) − k2(a log a + b log b))u.

On the other hand,

ω(cu) + idu = k1|cu| − ik2(cu) log |cu| + ik3(cu) + idu = c(k1|u| − ik2u log |u|) + i(k3c + d − k2c log c)u.

Taking c = a + b and d = ((a + b) log(a + b) − (a log a + b log b))k2 shows the result.

To finish the proof, we must show dn = 0 implies strict stability. If α ≠ 1 and dn = 0 for all n, then Lemma 3.3 shows k3 = 0 and the above expression for d shows d(a, b) = 0 for all a, b > 0, i.e. X is strictly stable. If α = 1, then Lemma 3.4 implies k2 = 0 and the above expression for d shows d(a, b) = 0.

Next it is shown that there are nontrivial random variables that satisfy (3.1) and (3.2). The next result uses Pareto r.v.s, see Problem 1.10, to establish the β = 1 cases; the following result shows the general −1 ≤ β ≤ 1 cases.

Lemma 3.5 Let X1, X2, . . . be i.i.d. Pareto(α, 1). For α ∈ (0, 1) ∪ (1, 2), set Yj = Xj − α/(α − 1); then (Y1 + · · · + Yn)/n^{1/α} converges in distribution to a random variable Z with characteristic function
E exp(iuZ) = exp(−Γ(1 − α)[cos(πα/2) − i(sign u) sin(πα/2)]|u|^α).

When α = 1, set Yj = (2/π)[Xj + log(2/π) − (1 − γEuler)], where γEuler = −Γ′(1) ≈ 0.57721 is Euler's constant. Then (Y1 + · · · + Yn − (2/π)n log n)/n converges in distribution to a random variable Z with characteristic function

E exp(iuZ) = exp(−[|u| + i(2/π)u log |u|]).

Proof By Lemma 7.1, the ch. f. of Yj near the origin is

φYj(u) = φXj(u) exp(−iuα/(α − 1))
= [1 − Γ(1 − α)(cos(πα/2) − i(sign u) sin(πα/2))|u|^α + iuα/(α − 1) + O(u²)] × [1 − iuα/(α − 1) + O(u²)]
= 1 − Γ(1 − α)(cos(πα/2) − i(sign u) sin(πα/2))|u|^α + O(|u|^{min(α+1,2)}).

So Sn = (Y1 + · · · + Yn)/n^{1/α} has characteristic function

φSn(u) = [φY(n^{−1/α}u)]^n = [1 − Γ(1 − α)(cos(πα/2) − i(sign u) sin(πα/2))|u|^α/n + o(1/n)]^n
→ exp(−Γ(1 − α)(cos(πα/2) − i(sign u) sin(πα/2))|u|^α) as n → ∞.
Since this is continuous at the origin, it corresponds to the ch. f. of some r.v. Z. When α = 1, the details are slightly different. Near the origin, Lemma 7.1 shows
φYj(u) = φXj((2/π)u) exp(iu(2/π)[log(2/π) − (1 − γEuler)])
= [1 − (|u| + i(2/π)u log |u|) − iu(2/π)(log(2/π) − (1 − γEuler)) + O(u²)] × [1 + iu(2/π)(log(2/π) − (1 − γEuler)) + O(u²)]
= 1 − (|u| + i(2/π)u log |u|) + O(u² log |u|).

Hence the characteristic function of Sn = (Y1 + · · · + Yn − (2/π)n log n)/n is

φSn(u) = [φY1(u/n)]^n exp(−iu(2/π) log n)
= [(1 − (|u| + i(2/π)u log |u| − i(2/π)u log n)/n + O(u²/n²))(1 − i(2/π)u(log n)/n + O((log n)²/n²))]^n
= [1 − (|u| + i(2/π)u log |u|)/n + o(1/n)]^n
→ exp(−[|u| + i(2/π)u log |u|]) as n → ∞.

Note that the shift by α/(α − 1) is not strictly necessary when α < 1: in this case, n^{1/α} will dominate the shift. However, shifting by an appropriate amount will hasten the convergence.

Proof of Definition 1.3 (characteristic functions of stable laws) We must show that, subject to conditions on the constant k2, the functions φ(u) in (3.8) and (3.13) are actually characteristic functions. The α = 2 case corresponds to the well-known Gaussian characteristic function. Note that in this case tan(πα/2) = tan π = 0, so the value of β is irrelevant; all Gaussian laws are symmetric around the mean. For α ∈ (0, 1) ∪ (1, 2), let c = (Γ(1 − α) cos(πα/2))^{−1/α} > 0 and let Z be the limit in the previous lemma. Then X = cZ has characteristic function
φX(u) = φZ(cu) = exp(−Γ(1 − α)[cos(πα/2) − i(sign u) sin(πα/2)]|cu|^α)
= exp(−(Γ(1 − α)[cos(πα/2) − i(sign u) sin(πα/2)]/(Γ(1 − α) cos(πα/2)))|u|^α)
= exp(−[1 − i(sign u) tan(πα/2)]|u|^α),

since sign(cu) = sign u when c > 0.
This is the standardized α-stable r.v. with β = 1 in the 1-parameterization. For any β ∈ [−1, 1], let X1, X2 be i.i.d. copies of X; then W := −((1 − β)/2)^{1/α}X1 + ((1 + β)/2)^{1/α}X2 has ch. f.

exp(−[1 − i(sign(−u)) tan(πα/2)]((1 − β)/2)|u|^α) × exp(−[1 − i(sign u) tan(πα/2)]((1 + β)/2)|u|^α)
= exp(−[1 − i(sign u)β tan(πα/2)]|u|^α).

This is a S(α, β; 1) distribution. Finally, for any k1 > 0, k3 ∈ R, k1^{1/α}W + k3 is a r.v. with characteristic function (3.8).

When α = 1, the details are slightly different. In this case, the Z of the previous lemma is exactly a S(1, 1; 1) r.v. For any β ∈ [−1, 1], let Z1 and Z2 be i.i.d. copies of Z and define W := −((1 − β)/2)Z1 + ((1 + β)/2)Z2 + (2/π)[((1 − β)/2) log((1 − β)/2) − ((1 + β)/2) log((1 + β)/2)]. Similar calculations as above show that

E exp(iuW) = exp(−[|u| + iβ(2/π)u log |u|]).

Finally, for any k1 > 0, k3 ∈ R, k1W + k3 − (2/π)βk1 log k1 is a r.v. with characteristic function (3.13). We have shown that as long as β = k2/k1 ∈ [−1, 1], there is a r.v. having the form claimed in the theorem. We will show that |β| > 1 does not give a valid probability distribution using the Lévy-Khintchine representation in Section 3.1.1 below. Other approaches include Dharmadhikari and Sreehari (1976) and Pitman and Pitman (2016). See the discussion after Theorem 3.2 for a short discussion of trans-stable functions (not probability densities) when α > 2 or |β| > 1.

Many of the results about stable distributions are derived from algebraic facts about the characteristic function of Z(α, β; k) ∼ S(α, β; k). Since we will use these facts repeatedly, we record them here for later use. E exp(iuZ(α, β; k)) = exp(−ω(u|α, β; k)), where
ω(u|α, β; 0) = |u|^α[1 + iβ(tan(πα/2))(sign u)(|u|^{1−α} − 1)]   α ≠ 1
        |u|[1 + iβ(2/π)(sign u) log |u|]           α = 1,    (3.17)

ω(u|α, β; 1) = |u|^α[1 − iβ(tan(πα/2))(sign u)]   α ≠ 1
        |u|[1 + iβ(2/π)(sign u) log |u|]    α = 1.    (3.18)

The real part of these expressions is simple: Re ω(u|α, β; k) = |u|^α, so much of the technical difficulty of working with stable distributions comes from the imaginary part. To isolate this part, define for u ∈ R, 0 < α ≤ 2, the function
η(u|α; k) = Im ω(u|α, 1; k) = tan(πα/2)(sign u)(|u| − |u|^α)   α ≠ 1, k = 0
               −tan(πα/2)(sign u)|u|^α       α ≠ 1, k = 1
               (2/π)u log |u|           α = 1,    (3.19)

so that ω(u|α, β; k) = |u|^α + iβη(u|α; k). When α = 1 and u = 0, 0 · log 0 will be interpreted as 0, so ω(0|α, β; k) = 0 and η(0|α; k) = 0 for all α and β. Here are basic algebraic properties of η(u|α; k) and ω(u|α, β; k).

Lemma 3.6 (a) η(−u|α; k) = −η(u|α; k).
(b) For any r ≥ 0,

η(ru|α; 0) = r^α η(u|α; 0) + tan(πα/2)(r − r^α)u   α ≠ 1
η(ru|1; 0) = rη(u|1; 0) + (2/π)(r log r)u      α = 1,

η(ru|α; 1) = r^α η(u|α; 1)          α ≠ 1
η(ru|1; 1) = rη(u|1; 1) + (2/π)(r log r)u   α = 1.
(c) limα→1 η(u|α; 0) = η(u|1; 0) for all u ∈ R.

Proof (a) is straightforward. For the rest of this proof it suffices to assume u > 0.
(b) For α ≠ 1, r ≥ 0, u > 0,

η(ru|α; 0) = tan(πα/2)(ru)(1 − (ru)^{α−1}) = tan(πα/2)u(r^α − r^α u^{α−1} + r − r^α)
= r^α η(u|α; 0) + tan(πα/2)(r − r^α)u,

η(ru|α; 1) = −|ru|^α tan(πα/2) sign(ru) = r^α η(u|α; 1).

For α = 1, u > 0,

η(ru|1; k) = (2/π)(ru) log(ru) = r(2/π)u(log u + log r) = rη(u|1; k) + ((2/π)r log r)u.

(c) The proof depends on two calculus facts: (π/2 − x) tan x = sin x · (π/2 − x)/cos x → 1 as x → π/2, and for fixed u > 0, (d/dα)u^{α−1}|α=1 = log u. Then for α ≠ 1, u > 0,

η(u|α; 0) = tan(πα/2)u(1 − u^{α−1}) = [tan(πα/2)(π/2 − πα/2)] · (2/π)u · (u^{α−1} − 1)/(α − 1)
→ 1 · (2/π)u log u = η(u|1; 0) as α → 1.
−1 u −→ 1 · π2 u log u as α → 1. = tan π2α π2 − π2α π2 u α−1 When α 1, the algebraic properties of η(·|α; 1) are simpler that those of η(·|α; 0) and this makes some proofs simpler in the 1-parameterization. However, there is no result like (c) above when k = 1 because of the essential discontinuity of η(u|α; 1) near α = 1. The behavior of η(u|α; 0) and η(u|α; 1) are shown in Figure 3.1, note the discontinuity in α on the right-hand figure. For some purposes, it is worth the price of using the algebraically more complicated k = 0 parameterization. The S (α, β; 0) parameterization has the side effect of making the α 1 case and the
α = 1 case more similar: in both cases property (b) has a shift on the right-hand side and furthermore, (c) shows that shift is continuous as α → 1.
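The identities of Lemma 3.6, together with the relation ω(u|α, β; 0) = ω(u|α, β; 1) + iβ tan(πα/2)u connecting the two parameterizations (which follows directly from (3.17) and (3.18)), can be checked numerically. A minimal Python sketch (an illustration, not part of the original text; the function names eta and omega are ours):

```python
import numpy as np

def eta(u, alpha, k):
    # Imaginary part of omega(u | alpha, 1; k), equation (3.19).
    if alpha == 1.0:
        return (2 / np.pi) * u * np.log(np.abs(u))
    t = np.tan(np.pi * alpha / 2)
    if k == 0:
        return t * np.sign(u) * (np.abs(u) - np.abs(u) ** alpha)
    return -t * np.sign(u) * np.abs(u) ** alpha

def omega(u, alpha, beta, k):
    # omega(u | alpha, beta; k) = |u|^alpha + i * beta * eta(u | alpha; k).
    return np.abs(u) ** alpha + 1j * beta * eta(u, alpha, k)

u, r, alpha = 0.7, 2.3, 1.6
t = np.tan(np.pi * alpha / 2)
# Lemma 3.6(b): scaling of eta in the 0- and 1-parameterizations.
assert np.isclose(eta(r * u, alpha, 0), r ** alpha * eta(u, alpha, 0) + t * (r - r ** alpha) * u)
assert np.isclose(eta(r * u, alpha, 1), r ** alpha * eta(u, alpha, 1))
# Lemma 3.6(c): eta(u | alpha; 0) is continuous in alpha at alpha = 1...
assert abs(eta(2.0, 1.0001, 0) - eta(2.0, 1.0, 0)) < 1e-3
# ...but eta(u | alpha; 1) is not (essential discontinuity near alpha = 1).
assert abs(eta(2.0, 1.0001, 1) - eta(2.0, 1.0, 1)) > 100
# The two parameterizations differ by a term linear in u.
assert np.isclose(omega(u, alpha, 0.5, 0), omega(u, alpha, 0.5, 1) + 1j * 0.5 * t * u)
```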
Fig. 3.1 η(u|α; 0) = Im ω(u|α, 1; 0) on the left and η(u|α; 1) = Im ω(u|α, 1; 1) on the right, for α = 0.5, 0.75, 1, 1.25, 1.5.
Problem 3.4 establishes the following properties of ω(u|α, β; k).

Lemma 3.7 (a) ω(−u|α, β; k) = ω(u|α, β; k)* = ω(u|α, −β; k), where * denotes complex conjugation.
(b) For any r ∈ R,

ω(ru|α, β; 0) = |r|^α ω(u|α, (sign r)β; 0) + iβ tan(πα/2)(r − (sign r)|r|^α)u   α ≠ 1
ω(ru|1, β; 0) = |r|ω(u|1, (sign r)β; 0) + iβ(2/π)(r log |r|)u          α = 1,

ω(ru|α, β; 1) = |r|^α ω(u|α, (sign r)β; 1)            α ≠ 1
ω(ru|1, β; 1) = |r|ω(u|1, (sign r)β; 1) + iβ(2/π)(r log |r|)u   α = 1.

(c) ω(u|α, β; 0) is jointly continuous in u ∈ R, 0 < α ≤ 2, and −1 ≤ β ≤ 1.
(c) ω(u|α, β; 0) is jointly continuous in u ∈ R, 0 < α ≤ 2, and −1 ≤ β ≤ 1. The reflection property is now straightforward using the preceding lemma. Proof of Proposition 1.1 (Reflection property) Using the ch. f. of Z(α, β) ∼ S (α, β; k), E exp (iu(−Z(α, β))) = exp (−ω(−u|α, β; k)) = exp (−ω(u|α, −β; k)) = E exp (iuZ(α, −β)), so −Z(α, β) and Z(α, −β) have the same distribution.
3.1.1 Stable distributions as infinitely divisible distributions

Definition 3.1 A random variable X is infinitely divisible if for every n > 1, there are independent and identically distributed random variables X1(n), X2(n), . . . , Xn(n) such that X =ᵈ X1(n) + X2(n) + · · · + Xn(n).

Note that this definition does not require that the summands be of the same type as X, whereas Definition 1.2 for stability does. It is immediate from Definition 1.2 that stable laws are infinitely divisible. More information on infinitely divisible laws is given in Section 7.5. The Lévy-Khintchine representation, Theorem 7.1, shows that the characteristic function has a certain form given by a shift parameter and a measure. When X is stable, the scaling property shows that the measure must be of the following form.

Theorem 3.1 (Lévy-Khintchine Representation for Stable Laws) When X is α-stable, 0 < α < 2, the log characteristic function is

log φ(u) = iδ∗u + γ₊ ∫₀^∞ (e^{iux} − 1 − iux/(1 + x²)) dx/x^{α+1} + γ₋ ∫_{−∞}^0 (e^{iux} − 1 − iux/(1 + x²)) dx/|x|^{α+1},

where γ₊ ≥ 0, γ₋ ≥ 0, γ₊ + γ₋ > 0, and δ∗ ∈ R.

A proof of this theorem may be found in Breiman (1968), Chapter 9, where it is shown that the above integrals can be evaluated explicitly to get the log characteristic function (in the 1-parameterization)

log φ(u|α, β; 1) = −γ^α|u|^α[1 − iβ tan(πα/2) sign u] + iδu   α ≠ 1
          −γ|u|[1 + iβ(2/π)(log |u|) sign u] + iδu    α = 1,

where

β = (γ₊ − γ₋)/(γ₊ + γ₋),
γ^α = (γ₊ + γ₋) cos(πα/2)Γ(1 − α)   α ≠ 1
γ = (γ₊ + γ₋)(π/2)          α = 1.

See Table 3.2 for an expression for δ as a function of δ∗, α, γ₊, and γ₋. It is immediate that β ∈ [−1, 1] from the expression for β above; this shows that |β| ≤ 1 is necessary for the function φ(u) above to be a characteristic function. Recall that the proof of Definition 1.3 on page 60 showed |β| ≤ 1 was sufficient.
3.2 Densities and distribution functions

All non-degenerate stable distributions have smooth densities. Since we have the characteristic function already, the quickest way to prove this fact is to use it. Feller (1971), VI.13.2 outlines a direct proof from condition (3.2) that stable distributions are continuous.

Proof of Theorem 1.1 (Densities exist and are differentiable) The characteristic function is integrable: ∫|φ(u)|du = ∫exp(−γ^α|u|^α)du < ∞, so a density exists. Likewise, for all n = 1, 2, 3, . . ., ∫|u^n φ(u)|du < ∞, so the density is n times differentiable.

As stated in Chapter 1, except for the normal (α = 2), Cauchy (α = 1, β = 0), and Lévy distributions (α = 1/2, β = ±1), there are no known closed-form expressions for stable densities. When α and β are particular rational numbers, the corresponding stable densities can be expressed in terms of certain special functions (Fresnel integrals when α = 1/2 and β = 0, MacDonald functions when α = 1/3 and β = 1, Whittaker functions when α = 2/3, 3/2 and certain β); see Section 2.8 of Zolotarev (1986) and Sections 6.5-6.8 and Appendix A.7 of Uchaikin and Zolotarev (1999). When α and β are rational numbers, Hoffmann-Jørgensen (1994) expressed the density in terms of "incomplete hypergeometric" functions and Zolotarev (1995) expressed the densities in terms of Meijer G-functions. Finally, it is possible to express stable densities in terms of a very general family of special functions, the Fox H functions, see Fox (1961), Schneider (1986), Penson and Górska (2010), and Górska and Penson (2011). Unfortunately, these representations all involve evaluating some special function and are not practical for evaluating stable densities at general values of α, β, and x.

In this section we develop expressions for stable densities, distribution functions, and related functions that are used for multivariate stable laws. We start with expressions for stable densities that follow directly from inverting the characteristic function.
Because these formulas involve oscillating integrals over infinite regions, they are not practical to evaluate numerically. Next, more involved, but better computational formulas are given for stable densities. They turn out to have a computable antiderivative, so computational formulas for the distribution functions are obtained. These latter formulas are Zolotarev's integral formulas, see §2.2 of Zolotarev (1986). To simplify formulas, define

ζ = ζ(α, β) = −β tan(πα/2)   α ≠ 1
       0          α = 1.

Zolotarev gives these results in a different parameterization, where there are slight technical advantages. To minimize confusion over parameterizations, we will develop formulas for f(x|α, β; 1) and F(x|α, β; 1). Recall that

f(x|α, β; 0) = f(x − ζ|α, β; 1)
and
F(x|α, β; 0) = F(x − ζ |α, β; 1),
so that S(α, β; 0) formulas are a simple shift of the S(α, β; 1) ones. Recall that η(t|α; 1) is given in (3.19).

Theorem 3.2 For any 0 < α < 2 and any −1 ≤ β ≤ 1, standardized stable densities in the 1-parameterization are given by

f(x|α, β; 1) = (1/π) ∫₀^∞ cos(xt + βη(t|α; 1)) e^{−t^α} dt.

Proof When α ≠ 1, the inversion formula for characteristic functions shows that

f(x|α, β; 1) = (1/2π) ∫_{−∞}^∞ e^{−ixt} e^{−|t|^α[1 − iβ(sign t) tan(πα/2)]} dt
= (1/2π) ∫_{−∞}^∞ e^{−|t|^α − i[xt + ζ(sign t)|t|^α]} dt
= (1/2π) ∫_{−∞}^∞ e^{−|t|^α} cos[xt + ζ(sign t)|t|^α] dt
= (1/π) ∫₀^∞ e^{−t^α} cos[xt + ζt^α] dt.
When α = 1, we likewise have

f(x|1, β; 1) = (1/2π) ∫_{−∞}^∞ e^{−ixt} e^{−|t|[1 + iβ(sign t)(2/π) log |t|]} dt
= (1/2π) ∫_{−∞}^∞ e^{−|t| − i[xt + β(2/π)t log |t|]} dt
= (1/2π) ∫_{−∞}^∞ e^{−|t|} cos[xt + β(2/π)t log |t|] dt
= (1/π) ∫₀^∞ e^{−t} cos[xt + β(2/π)t log t] dt.
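As a numerical sanity check, the inversion formula of Theorem 3.2 can be evaluated with adaptive quadrature and compared against the two classical closed forms it covers: the Cauchy density (α = 1, β = 0) and the Lévy density (α = 1/2, β = 1). A Python sketch (illustrative only; the function name stable_pdf_s1 is ours, and tolerances are loose because the integrand oscillates):

```python
import numpy as np
from scipy.integrate import quad

def stable_pdf_s1(x, alpha, beta):
    # Density via the inversion formula of Theorem 3.2, 1-parameterization.
    def eta(t):
        if alpha == 1.0:
            return (2 / np.pi) * t * np.log(t) if t > 0 else 0.0
        return -np.tan(np.pi * alpha / 2) * t ** alpha
    integrand = lambda t: np.cos(x * t + beta * eta(t)) * np.exp(-t ** alpha)
    val, _ = quad(integrand, 0, np.inf, limit=200)
    return val / np.pi

# alpha = 1, beta = 0 must reproduce the Cauchy density 1/(pi*(1+x^2)).
for x in (0.0, 0.5, 2.0):
    assert abs(stable_pdf_s1(x, 1.0, 0.0) - 1 / (np.pi * (1 + x * x))) < 1e-6
# alpha = 1/2, beta = 1 is the Levy density x^(-3/2) exp(-1/(2x)) / sqrt(2 pi).
x = 1.5
levy = x ** -1.5 * np.exp(-1 / (2 * x)) / np.sqrt(2 * np.pi)
assert abs(stable_pdf_s1(x, 0.5, 1.0) - levy) < 1e-4
```

As the text notes, this formula is poorly suited to numerical work; the Zolotarev integrals below behave much better.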
One can allow α > 2 or |β| > 1 in the definition of f(x|α, β; 1) above and still get a continuous function that satisfies a convolution equation, although in these cases f(x|α, β; 1) < 0 for some x, so it is no longer a probability density. These are the "trans-stable" functions of Section 2.11 of Zolotarev (1986). Several times integrals of the following form are needed. They are evaluated by substituting u = t^α and using a table of integrals, e.g. pg. 492 of Gradshteyn and Ryzhik (2000): for α > 0,
∫₀^∞ t^{p−1} cos(ct^α) e^{−t^α} dt = (1/α) ∫₀^∞ u^{p/α−1} cos(cu) e^{−u} du
= cos((p/α) arctan c)Γ(p/α) / [α(1 + c²)^{p/(2α)}],   p > 0,    (3.20)

∫₀^∞ t^{p−1} sin(ct^α) e^{−t^α} dt = (1/α) ∫₀^∞ u^{p/α−1} sin(cu) e^{−u} du
= sin((p/α) arctan c)Γ(p/α) / [α(1 + c²)^{p/(2α)}],   p > −α.    (3.21)
When α ≠ 1, explicit values are known for the densities at the origin. To state this, define

θ0 = θ0(α, β) = (1/α) arctan(β tan(πα/2))   α ≠ 1
        π/2               α = 1.    (3.22)

Note that when α ≠ 1, αθ0 = −arctan ζ, and for the allowable values of α and θ0,

cos αθ0 = |cos αθ0| = (1 + tan² αθ0)^{−1/2} = (1 + ζ²)^{−1/2}.    (3.23)

Corollary 3.1 When α ≠ 1,

f(0|α, β; 1) = f(−ζ|α, β; 0) = Γ(1/α) cos(θ0)(cos αθ0)^{1/α}/(πα).

Proof f(0|α, β; 1) = (1/π) ∫₀^∞ e^{−t^α} cos(ζt^α) dt; use (3.20) with p = 1 and (3.23).
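A numerical check of Corollary 3.1: the closed form can be compared with direct quadrature of (1/π)∫₀^∞ e^{−t^α} cos(ζt^α) dt, and, for α = 2, with the N(0, 2) density at zero, which is 1/(2√π). A Python sketch (illustrative; function names are ours):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def f0_formula(alpha, beta):
    # Corollary 3.1: f(0 | alpha, beta; 1) for alpha != 1, using (3.22)-(3.23).
    theta0 = np.arctan(beta * np.tan(np.pi * alpha / 2)) / alpha
    return (gamma(1 / alpha) * np.cos(theta0)
            * np.cos(alpha * theta0) ** (1 / alpha) / (np.pi * alpha))

def f0_integral(alpha, beta):
    # Direct evaluation of (1/pi) * int_0^inf exp(-t^alpha) cos(zeta t^alpha) dt.
    zeta = -beta * np.tan(np.pi * alpha / 2)
    f = lambda t: np.exp(-t ** alpha) * np.cos(zeta * t ** alpha)
    return quad(f, 0, np.inf, limit=200)[0] / np.pi

for alpha, beta in [(0.6, 0.3), (1.5, -0.8), (1.9, 1.0)]:
    assert abs(f0_formula(alpha, beta) - f0_integral(alpha, beta)) < 1e-7
# alpha = 2 is N(0, 2): density at 0 is 1/(2 sqrt(pi)); beta is irrelevant there.
assert abs(f0_formula(2.0, 0.5) - 1 / (2 * np.sqrt(np.pi))) < 1e-10
```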
When α = 1, f(0|1, 0; 1) = f(0|1, 0; 0) = 1/π for the Cauchy case, but no formulas are known for f(0|1, β; 1) when β ≠ 0. The next result gives expressions for derivatives of stable densities.

Corollary 3.2 The derivatives (with respect to x) of standardized stable densities are given by

f^(n)(x|α, β; 1) = ((−1)^{n/2}/π) ∫₀^∞ cos(xt + βη(t|α; 1)) t^n e^{−t^α} dt     n even
          ((−1)^{(n+1)/2}/π) ∫₀^∞ sin(xt + βη(t|α; 1)) t^n e^{−t^α} dt   n odd,

f^(n)(x|α, β; 0) = f^(n)(x − ζ|α, β; 1).

In particular, for k = 0 or k = 1,

|f^(n)(x|α, β; k)| ≤ Γ((n + 1)/α)/(απ).

Proof The formulas follow by differentiating the formulas in Theorem 3.2. The bounds on the derivatives come from the straightforward estimate

|f^(n)(x|α, β; k)| ≤ (1/π) ∫₀^∞ t^n e^{−t^α} dt = Γ((n + 1)/α)/(απ).
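For the Cauchy case the n = 1 formula of Corollary 3.2 can be compared with the closed form f′(x) = −2x/(π(1 + x²)²), and the uniform bound Γ(2)/π = 1/π can be checked on a grid. A Python sketch (illustrative; the function name is ours):

```python
import numpy as np
from scipy.integrate import quad

# Corollary 3.2 with n = 1, alpha = 1, beta = 0 (so the eta term vanishes):
# f'(x) = -(1/pi) * int_0^inf sin(x t) t e^{-t} dt.
def fprime_integral(x):
    val, _ = quad(lambda t: np.sin(x * t) * t * np.exp(-t), 0, np.inf, limit=200)
    return -val / np.pi

for x in (0.3, 1.0, 4.0):
    exact = -2 * x / (np.pi * (1 + x * x) ** 2)   # derivative of Cauchy density
    assert abs(fprime_integral(x) - exact) < 1e-8
# Uniform bound of Corollary 3.2 with n = 1, alpha = 1: |f'| <= Gamma(2)/pi = 1/pi.
assert all(abs(fprime_integral(x)) <= 1 / np.pi + 1e-12 for x in np.linspace(-3, 3, 25))
```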
The numerical evaluation of stable densities is difficult using the above formulas because the interval of integration is infinite and the integrands oscillate an infinite number of times. When |x| is large the number of oscillations is large, and when α is small the region of integration required for accurate results grows; both cases make the integral hard to evaluate accurately. Furthermore, it is difficult to get a small relative error in such calculations. The results below give integral expressions without these problems. For presentation purposes, the derivation of formulas for stable densities and distribution functions is split up into lemmas.

The upcoming results use the following functions; recall θ0 from (3.22). For θ ∈ (−θ0, π/2) define

V(θ|α, β) = (cos αθ0)^{1/(α−1)} [cos θ / sin α(θ0 + θ)]^{α/(α−1)} · cos(αθ0 + (α − 1)θ)/cos θ   α ≠ 1
      (2/π)[(π/2 + βθ)/cos θ] exp[(1/β)(π/2 + βθ) tan θ]               α = 1, β ≠ 0.    (3.24)

Note that when α = 1 and β = 0, V(θ|1, 0) is undefined; this is the Cauchy case, where there is a closed-form expression for the pdf and distribution function. When α < 1 and β = −1, θ0 = −π/2 and the interval (−θ0, π/2) is empty; V(θ|α, β) is undefined in these cases. For the other values of α and β, the quantities θ0 and V(θ|α, β) are used frequently, so it is worth understanding their behavior. Figure 3.2 shows how the values of θ0(α, β) vary for different α and β. Next are some useful properties of the V(θ|α, β) functions.
Fig. 3.2 θ0(α, β) as a function of α for β = 1, 2/3, 1/3, 0, −1/3, −2/3, −1.
Lemma 3.8 V(·|α, β) is continuous and positive on (−θ0, π/2) with the following properties.
(a) If α < 1, or α = 1 and β > 0, then V(·|α, β) is strictly increasing.
(b) If α > 1, or α = 1 and β < 0, then V(·|α, β) is strictly decreasing.
(c) The range of V(θ|α, β) is (c(α, β), +∞), where c(α, β) ≥ 0 is given by

c(α, β) = |1 − α|(α^{−α}|cos(πα/2)|)^{1/(α−1)}   (α > 1, β = −1) or (α < 1, β = 1)
     2/(πe)                   α = 1, |β| = 1
     0                      otherwise.

Proof First we show that for α ∈ (0, 1) ∪ (1, 2) and β ∈ [−1, 1], the function h(θ) := V(θ|α, β)^{(1−α)/α} is strictly increasing on the interval (−π/2, π/2). Fix α and β and let k = (cos αθ0)^{−1/α} > 0. Then

h(θ) = k sin α(θ + θ0)(cos(αθ0 + (α − 1)θ))^{(1−α)/α}(cos θ)^{−1/α} =: kAB^{(1−α)/α}C^{−1/α}.

Routine calculations show B > 0 and C > 0 for θ ∈ (−π/2, π/2) and

h′(θ) = k[A′B^{(1−α)/α}C^{−1/α} + ((1 − α)/α)AB^{(1−2α)/α}B′C^{−1/α} − (1/α)AB^{(1−α)/α}C^{−(1+α)/α}C′]
= (k/α)B^{(1−2α)/α}C^{−(1+α)/α}[αA′BC + (1 − α)AB′C − ABC′].

The terms in front of the brackets are positive since B and C are positive. We will show that the term inside the brackets is positive. Using a trig identity, rewrite B = cos α(θ + θ0) cos θ + sin α(θ + θ0) sin θ, compute A′ and B′, and substitute to show that the term in brackets is equal to

{α cos θ cos α(θ + θ0) + sin θ sin α(θ + θ0)}² + (α − 1)² cos² θ sin² α(θ + θ0) > 0.

Next we show that when α = 1 and β > 0, the function V(θ|1, β) is strictly increasing on the interval (−π/2, π/2). Set D = (π/(2β) + θ) tan θ; then D′ = tan θ + (π/(2β) + θ) sec² θ = tan θ + (π/(2β) + θ)(1 + tan² θ), so

V(θ|1, β) = (2β/π)(π/(2β) + θ) sec θ exp(D),
V′(θ|1, β) = (2β/π) sec θ exp(D)[1 + (π/(2β) + θ) tan θ + (π/(2β) + θ)D′]
= (2β/π) sec θ exp(D)[(1 + (π/(2β) + θ) tan θ)² + (π/(2β) + θ)²] > 0.

The remainder of the proof is similar, see Problem 3.5.
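The monotonicity claims (a) and (b) of Lemma 3.8 can be probed numerically by evaluating V on a grid. A Python sketch of the α ≠ 1 branch of (3.24) (illustrative only; the function name V is ours, and small margins keep the grid away from the endpoints of (−θ0, π/2)):

```python
import numpy as np

def V(theta, alpha, beta):
    # The alpha != 1 branch of V(theta | alpha, beta) in (3.24).
    theta0 = np.arctan(beta * np.tan(np.pi * alpha / 2)) / alpha
    a0 = alpha * theta0
    return (np.cos(a0) ** (1 / (alpha - 1))
            * (np.cos(theta) / np.sin(alpha * (theta0 + theta))) ** (alpha / (alpha - 1))
            * np.cos(a0 + (alpha - 1) * theta) / np.cos(theta))

# Lemma 3.8: V positive; increasing when alpha < 1, decreasing when alpha > 1.
for alpha, beta, increasing in [(0.7, 0.5, True), (0.7, -0.3, True),
                                (1.5, 0.5, False), (1.8, -0.9, False)]:
    theta0 = np.arctan(beta * np.tan(np.pi * alpha / 2)) / alpha
    grid = np.linspace(-theta0 + 0.05, np.pi / 2 - 0.05, 200)
    v = V(grid, alpha, beta)
    assert np.all(v > 0)
    assert np.all(np.diff(v) > 0) if increasing else np.all(np.diff(v) < 0)
```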
Next we define a related function V(θ|α, β); it has similar properties to V(θ|α, β) which we record here for later use.
V(θ|α, β) = tan θ + ((1 − α)/α) · sin α(θ0 + θ)/[cos θ cos(αθ0 + (α − 1)θ)]   α ≠ 1
      tan θ + β/(π/2 + βθ)                         α = 1.    (3.25)

In both cases, any value of β ∈ [−1, 1] is admissible.

Lemma 3.9 (a) V(·|α, β) is continuous and strictly increasing for all α and β on the interval (−π/2, π/2).
(b) The range of V(θ|α, β) is (−∞, +∞) when α = 1 or (α ≠ 1 and −1 < β < 1). When α ≠ 1 and β = −1, the range is (−∞, 0); when α ≠ 1 and β = 1, the range is (0, ∞).

Proof When α = 1,

V′(θ|1, β) = sec² θ − 1/(π/(2β) + θ)² = [(π/(2β) + θ)² − cos² θ]/[(π/(2β) + θ)² cos² θ].

Since |π/(2β) + θ| > |cos θ| on (−π/2, π/2), V′(θ|1, β) > 0, so V(·|1, β) is increasing. When α ≠ 1, using the notation for A, B, and C from the previous proof,

V(θ|α, β) = tan θ + ((1 − α)/α)(A/(BC)),
V′(θ|α, β) = sec² θ + ((1 − α)/α) · (A′BC − AB′C − ABC′)/(BC)²
= [αB² + (1 − α)(A′BC − AB′C − ABC′)]/(α(BC)²).

Since the denominator is positive, it remains to be shown that the numerator is positive. Taking the derivatives and using several trig identities, it can be shown that the numerator can be simplified to B² − (1 − α)²C². Problem 3.5 is to show that this is positive and to finish the remainder of the proof.

We now state the computational formulas for stable densities. While the functions are complicated, the preceding lemma shows that the integrands below are continuous and non-oscillatory over the finite region (−θ0, π/2).

Theorem 3.3 When α ∈ (0, 1) ∪ (1, 2), the density of X ∼ S(α, β; 1) is given by
f(x|α, β; 1) = { (α x^{1/(α−1)}/(π|α − 1|)) ∫_{−θ0}^{π/2} V(θ|α, β) exp(−x^{α/(α−1)} V(θ|α, β)) dθ   x > 0
              { Γ(1 + 1/α) cos θ0 (cos αθ0)^{1/α}/π                                                  x = 0
              { f(−x|α, −β; 1)                                                                       x < 0.

When α = 1,
3.2 Densities and distribution functions
f(x|1, β; 1) = { (1/(2|β|)) e^{−πx/(2β)} ∫_{−π/2}^{π/2} V(θ|1, β) exp(−e^{−πx/(2β)} V(θ|1, β)) dθ   β ≠ 0
              { 1/(π(1 + x²))                                                                        β = 0.
Proof First assume α ≠ 1. The x = 0 case is Corollary 3.1. When x < 0, this is the reflection property, Proposition 1.1. The work is in establishing the x > 0 case, which we do in a sequence of steps. So assume x > 0 and define h on the complex plane (cut along, say, the negative real axis) by h(z) = h(z; x, α, ζ) = z^α + i(xz + ζz^α). Note that f(x|α, β; 1) = (1/π) Re ∫_0^∞ exp(−h(z)) dz as in Theorem 3.2. We will find a path C in the complex plane that avoids the cut and connects 0 to ∞ with Im exp(−h(z)) = 0 along C. To find such a path, set z = re^{iθ}; then

h(re^{iθ}) = r^α e^{iαθ} + i(xre^{iθ} + ζr^α e^{iαθ})
= r^α(1 + iζ)e^{iαθ} + ixre^{iθ}
= r^α(1 + iζ)(cos αθ + i sin αθ) + ixr(cos θ + i sin θ)
= [r^α(cos αθ − ζ sin αθ) − xr sin θ] + i[r^α(ζ cos αθ + sin αθ) + xr cos θ].

Substituting ζ = −tan αθ0 = −sin αθ0/cos αθ0 yields cos αθ − ζ sin αθ = cos α(θ − θ0)/k and ζ cos αθ + sin αθ = sin α(θ − θ0)/k, where k = cos αθ0. Hence

h(re^{iθ}) = [r^α cos α(θ − θ0)/k − xr sin θ] + i[r^α sin α(θ − θ0)/k + xr cos θ].   (3.26)

To have Im h(re^{iθ}) = 0 requires that r^α sin α(θ − θ0)/k = −xr cos θ, or r = [xρ(θ)]^{1/(α−1)}, where

ρ(θ) = ρ(θ|α, β) = −k cos θ/sin α(θ − θ0) = k cos θ/sin α(θ0 − θ).

The desired contour is (see Figure 3.3)

C = {z(θ) = [xρ(θ)]^{1/(α−1)} e^{iθ} : −π/2 < θ < θ0}.

When α < 1 and β < 1, this path starts at z(−π/2) = 0 and ends at z(θ0) = ∞; when α > 1 and β > −1, it starts at z(−π/2) = ∞ and ends at z(θ0) = 0. In the remaining cases, the path has one endpoint on the negative imaginary axis and the other at ∞: when (α < 1, β = 1), z(−π/2) is on the imaginary axis, and when (α > 1, β = −1), z(θ0) is. In these cases, extend C by adjoining the line segment along the imaginary axis from the origin to this point. This extra piece does not affect the argument below. Along the path C, (3.26) shows
h(z(θ)) = Re h(z(θ)) = [xρ(θ)]^{α/(α−1)} cos α(θ − θ0)/k − x[xρ(θ)]^{1/(α−1)} sin θ
= x^{α/(α−1)} ρ(θ)^{1/(α−1)} [ρ(θ) cos α(θ − θ0)/k − sin θ]
= x^{α/(α−1)} ρ(θ)^{1/(α−1)} [cos θ cos α(θ0 − θ) − sin θ sin α(θ0 − θ)]/sin α(θ0 − θ)
= x^{α/(α−1)} ρ(θ)^{1/(α−1)} cos[α(θ0 − θ) + θ]/sin α(θ0 − θ)
= x^{α/(α−1)} [k cos θ/sin α(θ0 − θ)]^{1/(α−1)} cos[α(θ0 − θ) + θ]/sin α(θ0 − θ)
= x^{α/(α−1)} k^{1/(α−1)} (cos θ/sin α(θ0 − θ))^{α/(α−1)} cos[α(θ0 − θ) + θ]/cos θ = x^{α/(α−1)} V(−θ|α, β).

Also,

Re dz(θ) = d(x^{1/(α−1)} ρ(θ)^{1/(α−1)} cos θ) = [xρ(θ)]^{1/(α−1)} [cos θ ρ′(θ)/((α − 1)ρ(θ)) − sin θ] dθ.   (3.27)

To evaluate this, calculate

ρ′(θ) = k [−sin θ sin α(θ0 − θ) + α cos θ cos α(θ0 − θ)]/sin²α(θ0 − θ)
= k [α(cos θ cos α(θ0 − θ) − sin θ sin α(θ0 − θ)) + (α − 1) sin θ sin α(θ0 − θ)]/sin²α(θ0 − θ)
= k [α cos[θ + α(θ0 − θ)] + (α − 1) sin θ sin α(θ0 − θ)]/sin²α(θ0 − θ).

So

ρ′(θ)/ρ(θ) = (α cos[θ + α(θ0 − θ)] + (α − 1) sin θ sin α(θ0 − θ))/(cos θ sin α(θ0 − θ))
= α cos[θ + α(θ0 − θ)]/(cos θ sin α(θ0 − θ)) + (α − 1) sin θ/cos θ.

And thus, continuing from (3.27),

Re dz(θ) = [xρ(θ)]^{1/(α−1)} [(cos θ/(α − 1))(α cos[θ + α(θ0 − θ)]/(cos θ sin α(θ0 − θ)) + (α − 1) sin θ/cos θ) − sin θ] dθ
= [xρ(θ)]^{1/(α−1)} (α/(α − 1)) cos[θ + α(θ0 − θ)]/sin α(θ0 − θ) dθ
= x^{1/(α−1)} (α/(α − 1)) V(−θ|α, β) dθ.
A standard monodromy theorem, e.g., Ahlfors (1979), shows that integrating over the contour C gives f:

f(x|α, β; 1) = Re (1/π) ∫_C e^{−h(z)} dz = (1/π) ∫_C e^{−h(z)} Re(dz)
= { (α/(π(α − 1))) x^{1/(α−1)} ∫_{−π/2}^{θ0} V(−θ|α, β) exp(−x^{α/(α−1)} V(−θ|α, β)) dθ   α < 1
  { (α/(π(α − 1))) x^{1/(α−1)} ∫_{θ0}^{−π/2} V(−θ|α, β) exp(−x^{α/(α−1)} V(−θ|α, β)) dθ   α > 1
= (α/(π|α − 1|)) x^{1/(α−1)} ∫_{−π/2}^{θ0} V(−θ|α, β) exp(−x^{α/(α−1)} V(−θ|α, β)) dθ.

Substituting φ = −θ finishes this part of the proof.

Now consider the α = 1 and β > 0 case. Here h(z) = z + i(xz + β(2/π) z log z), and

h(re^{iθ}) = re^{iθ} + ire^{iθ}[x + β(2/π) log(re^{iθ})]
= r[cos θ − x sin θ − β(2/π)(θ cos θ + (log r) sin θ)] + ir[sin θ + x cos θ + β(2/π)((log r) cos θ − θ sin θ)].   (3.28)
To have Im h(re^{iθ}) = 0 requires that β(2/π)((log r) cos θ − θ sin θ) = −[sin θ + x cos θ], or that r = e^{−πx/(2β)} ρ(θ), where

ρ(θ) = ρ(θ|1, β) = exp([θ − π/(2β)] tan θ).

The contour of integration in this case is C = {z(θ) = e^{−πx/(2β)} ρ(θ) e^{iθ} : −π/2 < θ < π/2}. For 0 < β < 1, this path starts at ∞ and ends at 0. When β = 1, the path does not reach the origin, but, as above, it can be connected to the origin by a line segment along the imaginary axis, which does not contribute to the integral. Along this path,

h(z(θ)) = Re h(z(θ)) = e^{−πx/(2β)} ρ(θ)[cos θ(1 − β(2/π)θ) − sin θ(x + β(2/π) log(e^{−πx/(2β)} ρ(θ)))]
= e^{−πx/(2β)} ρ(θ)[cos θ + sin θ tan θ](1 − β(2/π)θ)
= e^{−πx/(2β)} ρ(θ)(1 − β(2/π)θ)/cos θ = e^{−πx/(2β)} V(−θ|1, β).

Using ρ′(θ) = ρ(θ)[tan θ + (θ − π/(2β)) sec²θ],

Re dz(θ) = d(e^{−πx/(2β)} ρ(θ) cos θ) = e^{−πx/(2β)}(ρ′(θ) cos θ − ρ(θ) sin θ) dθ
= e^{−πx/(2β)} ρ(θ)(θ − π/(2β))/cos θ dθ = −(π/(2β)) e^{−πx/(2β)} V(−θ|1, β) dθ.
Hence, taking into account the −θ in the above formulas and the fact that C starts at ∞ and ends at the origin (or on the imaginary axis when β = 1),

f(x|1, β; 1) = Re (1/π) ∫_C e^{−h(z)} dz = (1/(2β)) ∫_{−π/2}^{π/2} e^{−πx/(2β)} V(θ|1, β) exp(−e^{−πx/(2β)} V(θ|1, β)) dθ.

When α = 1 and β = 0, the density is the well-known Cauchy density. When β < 0, the argument is similar to the one above; now the path C starts at 0 and ends at ∞.
Fig. 3.3 Contours of integration. The three panels show the cases α = 0.5, x = 1; α = 1, x = 0.1; and α = 1.5, x = 3, each with contours labeled by β = −1, −0.5, 0, 0.5, 1.
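The x > 0 integral in Theorem 3.3 can be evaluated with any adaptive quadrature routine. The following sketch (our own illustration in Python, not the STABLE program; the function names are ours) implements the formula using the standard expression for V(θ|α, β) from (3.24), and checks it against the closed-form Lévy density, since S(1/2, 1; 1) is the Lévy(0, 1) distribution.

```python
import numpy as np
from scipy.integrate import quad

def theta_0(alpha, beta):
    """theta_0 = arctan(beta * tan(pi*alpha/2)) / alpha."""
    return np.arctan(beta * np.tan(np.pi * alpha / 2.0)) / alpha

def V(theta, alpha, beta):
    """Integrand kernel V(theta|alpha, beta) from (3.24), alpha != 1."""
    t0 = theta_0(alpha, beta)
    return (np.cos(alpha * t0) ** (1.0 / (alpha - 1.0))
            * (np.cos(theta) / np.sin(alpha * (t0 + theta))) ** (alpha / (alpha - 1.0))
            * np.cos(alpha * t0 + (alpha - 1.0) * theta) / np.cos(theta))

def stable_pdf_S1(x, alpha, beta):
    """Theorem 3.3 density f(x|alpha, beta; 1), valid for x > 0 and alpha != 1."""
    t0 = theta_0(alpha, beta)
    xa = x ** (alpha / (alpha - 1.0))
    integrand = lambda th: V(th, alpha, beta) * np.exp(-xa * V(th, alpha, beta))
    integral, _ = quad(integrand, -t0, np.pi / 2.0)
    return alpha * x ** (1.0 / (alpha - 1.0)) / (np.pi * abs(alpha - 1.0)) * integral

# S(1/2, 1; 1) is Levy(0, 1): f(x) = exp(-1/(2x)) / (sqrt(2*pi) * x**1.5)
x = 2.0
levy = np.exp(-1.0 / (2.0 * x)) / (np.sqrt(2.0 * np.pi) * x ** 1.5)
print(stable_pdf_S1(x, 0.5, 1.0), levy)
```

The two printed values agree to quadrature accuracy, which is the kind of spot check used throughout this section.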
Theorem 3.4 When α ∈ (0, 1) ∪ (1, 2), the distribution function of X ∼ S(α, β; 1) is given by

F(x|α, β; 1) = { c(α, β) + (sign(1 − α)/π) ∫_{−θ0}^{π/2} exp(−x^{α/(α−1)} V(θ|α, β)) dθ   x > 0
              { (π/2 − θ0)/π                                                               x = 0
              { 1 − F(−x|α, −β; 1)                                                         x < 0,

where

c(α, β) = { (π/2 − θ0)/π   α < 1
          { 1              α > 1.
Proof Assume x > 0. The distribution function is the integral of the density formula in Theorem 3.3 with respect to x. The sign(1 − α) occurs because V(·|α, β) is increasing when α < 1 and decreasing when α > 1. The constant of integration c(α, β) is determined by the requirement that F(x|α, β; 1) → 1 as x → ∞: when α < 1, the integrand in the expression for F(x|α, β; 1) converges to 1, but when α > 1, the integrand converges to 0. For x = 0, the result follows by taking x ↓ 0: when α < 1, the integrand converges to 0 pointwise, so F(0|α, β; 1) = (1/π)(π/2 − θ0); when α > 1, the integrand converges to 1 pointwise and F(0|α, β; 1) = 1 − (π/2 − (−θ0))/π = (π/2 − θ0)/π. The case x < 0 is the reflection property, Proposition 1.1. The α = 1 case is similar.

It is now straightforward to prove what the support of a stable distribution is.

Proof of Lemma 1.1 (Support of stable distributions) If α < 1 and β = 1, then θ0 = π/2 and, by Theorem 3.4, P(X ≤ 0) = F(0|α, 1) = 0. For x > 0, the integrals for the density in Theorem 3.3 are over a nonempty interval with a strictly positive integrand, and hence f(x|α, β) > 0. Thus the support is exactly [0, ∞) in these cases. In all other cases, the integrals for x > 0 in Theorem 3.3 are integrals of a strictly positive function over a nonempty interval, and hence f(x|α, β) > 0. Using reflection for x < 0 shows that the density is strictly positive for all x.
3.2.1 Series expansions

There are series expansions for stable densities and distribution functions, due to Feller (1952) and Bergström (1952), when α ≠ 1. The results are stated below; see Chapter 2 of Zolotarev (1986) or Chapter 4 of Uchaikin and Zolotarev (1999) for proofs. When α = 1, there are formal series expansions, but the coefficients do not have an explicit form.

Theorem 3.5 (a) For 0 < α < 1, −1 ≤ β ≤ 1, set ζ = −β tan(πα/2), ρ = arctan(ζ) − πα/2 = −(αθ0 + πα/2), and c = (cos αθ0)^{1/α}. Then for x > 0,

f(x|α, β; 1) = (1/(πx)) Σ_{k=1}^∞ (Γ(kα + 1)/k!) sin(kρ) (−(cx)^{−α})^k
F(x|α, β; 1) = 1 − (1/(πα)) Σ_{k=1}^∞ (Γ(kα + 1)/(k! k)) sin(kρ) (−(cx)^{−α})^k.
(b) For α = 1, β > 0, and any x,

f(x|1, β; 1) = (1/2) Σ_{n=1}^∞ n bn (−(π/2)x + log(π/2))^{n−1}

F(x|1, β; 1) = 1 − (1/π) Σ_{n=0}^∞ bn (−(π/2)x + log(π/2))^n,
where

bn = bn(β) = (1/n!) ∫_0^∞ exp(−βt log t) t^{n−1} sin((1 + β)tπ/2) dt.
For β < 0, use the reflection property.
(c) For 1 < α < 2, −1 ≤ β ≤ 1, let ζ, ρ, and c be as in (a). Then for x > 0,

f(x|α, β; 1) = (1/(πx)) Σ_{k=1}^∞ (Γ(k/α + 1)/k!) sin(kρ/α) (−cx)^k

F(x|α, β; 1) = 1 + ρ/(πα) + (1/π) Σ_{k=1}^∞ (Γ(k/α + 1)/(k! k)) sin(kρ/α) (−cx)^k.
While theoretically convergent for all x > 0, these series are numerically well behaved only in certain regions. The truncated series based on (a) gives a good approximation on the tail for all α, even though the infinite series diverges when α > 1. Likewise, the truncated series based on (c) gives good results for small x for all α, even though the infinite series diverges when α < 1. Both are numerically delicate as α → 1. With careful evaluation of the integrals, the expressions for f and F given in the previous section are more useful computationally. We stated the result in the S(α, β; 1) parametrization, but note that these results are easier to state and prove for standardized distributions in the 7-parameterization when α ≠ 1 and in the 6-parameterization when α = 1. (See Section 3.5 for definitions of these parameterizations.) The translation to the S(α, β; 1) parameterization is by scaling x by c and restating the skewness parameter in terms of ρ when α ≠ 1, and by noting that if X ∼ S(1, β; 1), then (π/2)X + log(π/2) ∼ S(1, β; 6). Also, the value of the d.f. at the origin from Theorem 3.4 is used when α ≠ 1. The duality between α-stable and (1/α)-stable laws (see Section 3.2.3) can be established using these series; see Problem 3.26.
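As a small illustration of part (a), in the Lévy case α = 1/2, β = 1 one has ζ = −1, ρ = −π/2, and c = 1/2, and the truncated tail series can be compared directly with the exact density. A sketch (our code; the truncation length is arbitrary):

```python
import math

def levy_series_pdf(x, nterms=30):
    """Truncated series (a) for f(x|1/2, 1; 1); here rho = -pi/2 and c = 1/2."""
    rho, c, alpha = -math.pi / 2.0, 0.5, 0.5
    s = 0.0
    for k in range(1, nterms + 1):
        s += (math.gamma(k * alpha + 1.0) / math.factorial(k)
              * math.sin(k * rho) * (-(c * x) ** (-alpha)) ** k)
    return s / (math.pi * x)

x = 10.0
exact = math.exp(-1.0 / (2.0 * x)) / (math.sqrt(2.0 * math.pi) * x ** 1.5)
print(levy_series_pdf(x), exact)
```

For 0 < α < 1 the series converges, and the truncated sum matches the exact Lévy density closely on the tail.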
3.2.2 Modes

In spite of a considerable amount of effort, it was not proved until relatively recently that all stable densities are unimodal. Unimodality in the symmetric case was proven by Wintner (1936). A proof in the general case was claimed by Ibragimov and Chernin (1959), but Kanter (1976) found an error in their argument. Finally, Yamazato (1978) proved the general result; indeed, he showed that infinitely divisible distributions of class L, which include stable laws, are unimodal. Further facts about stable unimodality and modes can be found in Sato and Yamazato (1978), Hall (1984), Gawronski (1984), Gawronski and Wiessner (1992), and Simon (2011a). While these papers give some bounds, there were no known closed formulas or accurate numerical results for the location of the mode. Below we give numerical results for the location of the mode of a general stable density.
Gawronski (1984) claimed to show that all stable densities are bell-shaped. This means that the first derivative of the density has exactly one zero, being positive to the left of the mode and negative to the right of it; the second derivative has exactly two zeros and alternates in sign; and, for all n ≥ 1, the n-th derivative has exactly n zeros with alternating signs. However, Simon (2011b) found an error in that proof, leaving the problem unresolved. A few years later, Simon (2015) showed that stable densities with 0 < α < 1 and |β| = 1 are bell-shaped. Very recently, Kwaśnicki (2020) proved that all stable densities are indeed bell-shaped. Further results on bell-shaped densities are given in Kwaśnicki and Simon (2019).

Previous work on locating the mode has used parameterization (C) of Zolotarev (see Section 3.5), because the characteristic function is simpler. However, there are drawbacks to such an approach: in the (A) or (C) parameterization with β ≠ 0, the mode of a stable density tends to (sign β)∞ as α ↑ 1 and to −(sign β)∞ as α ↓ 1. The S(α, β; 0) parameterization does not have this problem because the densities are jointly continuous in α and β. In the S(α, β, γ, δ; 0) parameterization, modes are well behaved because of the joint continuity of the characteristic function. It suffices to calculate the mode of the standardized density, which we denote by m(α, β). The mode of a general distribution can be obtained from this by scaling and shifting: if X ∼ S(α, β, γ, δ; 0), then its mode is located at γm(α, β) + δ. In the S(α, β, γ, δ; 1) parameterization, the mode is given by γm(α, β) + δ + γβ tan(πα/2) when α ≠ 1, and by γm(1, β) + δ + β(2/π)γ log γ when α = 1.

For a symmetric stable distribution, it is known that m(α, 0) = 0, see Wintner (1936). There is also an exact value for the Lévy distributions. The standardized Lévy density is f(x|1/2, 1; 0) = (x + 1)^{−3/2} exp(−1/(2(x + 1)))/√(2π) for x > −1, and f′(x|1/2, 1; 0) = 0 precisely when x = −2/3. Hence m(1/2, 1) = −2/3 and m(1/2, −1) = 2/3. Modal values are not known exactly for other stable distributions. By reflection, m(α, −β) = −m(α, β), so it suffices to consider 0 ≤ β ≤ 1. Appendix C provides accurate calculations of the mode for general α and β, obtained by numerically maximizing f(x|α, β; 0) as computed by the program STABLE; see Figure 3.4. The estimated accuracy in the values of m(α, β) is ±0.0001.

Next we give an analytic result on modes that is a modification of Theorem (2.1) of Hall (1984) and Equation (87) of Gawronski and Wiessner (1992). Since m(α, 0) = 0 for all α, the rate at which m(α, β) goes to zero as β → 0 is given by κ(α) = lim_{β→0} m(α, β)/β = ∂m(α, 0)/∂β.

Lemma 3.10 If κ(α) is the rate at which the mode m(α, β) goes to zero as β → 0, then

κ(α) = { tan(πα/2) [Γ(1 + 2/α)/Γ(3/α) − 1]   α ≠ 1
       { (2γEuler − 3)/π                      α = 1.

Proof Let α ≠ 1 and c = c(α) = tan(πα/2). Using Theorem 3.2, the density of a standardized S(α, β; 0) random variable is

f(x|α, β; 0) = (1/π) ∫_0^∞ cos[xt + βc(t − t^α)] e^{−t^α} dt,
Fig. 3.4 Plot of stable modes m(α, β) in the 0-parameterization, with curves for β = 0, 0.2, 0.4, 0.6, 0.8, 1.
so

∂f(x|α, β; 0)/∂x = −(1/π) ∫_0^∞ sin[xt + βc(t − t^α)] t e^{−t^α} dt.
We want m(α, β) such that (∂f/∂x)(m(α, β)|α, β; 0) = 0. Fix α and define the following series for μ(·) = m(α, ·):

μ(β) = Σ_{j=0}^∞ μ_j(α) β^j = Σ_{j=0}^∞ μ_{2j+1}(α) β^{2j+1} = μ1 β + Σ_{j=1}^∞ μ_{2j+1} β^{2j+1},

where the even terms drop out because m(α, ·) is odd. At the mode,

0 = ∫_0^∞ sin[μ(β)t + βc(t − t^α)] t e^{−t^α} dt = ∫_0^∞ sin[βc((μ1/c + 1)t − t^α) + t Σ_{j=1}^∞ μ_{2j+1} β^{2j+1}] t e^{−t^α} dt.

As β ↓ 0, the above integral is
∫_0^∞ sin[βc((μ1/c + 1)t − t^α) + t O(β³)] t e^{−t^α} dt ≈ βc ∫_0^∞ ((μ1/c + 1)t² − t^{α+1}) e^{−t^α} dt,

which forces

(μ1/c + 1) ∫_0^∞ t² e^{−t^α} dt = ∫_0^∞ t^{α+1} e^{−t^α} dt,

so

(μ1/c + 1) Γ(3/α)/α = Γ(1 + 2/α)/α.

Solving for μ1 = κ(α) completes the α ≠ 1 case. The α = 1 case is given in Gawronski and Wiessner (1992).
In contrast to other parameterizations, κ(α) is well behaved around α = 1: lim_{α→1} κ(α) = κ(1) = (2γEuler − 3)/π. This is shown using properties of the gamma function: Γ(1 + p) = pΓ(p) and Γ(1 + x) ≈ 1 − γEuler x for small x. Two numerical observations about stable modes are: (i) m(α, β) is approximately linear in β, specifically |m(α, β) − βm(α, 1)| ≤ 0.06, and (ii) m(α, 1) and κ(α) have approximately the same shape, see Figure 3.5. This gives an approximation to the mode: m(α, β) ≈ βκ(α).
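The formula of Lemma 3.10 and its limit at α = 1 are easy to evaluate numerically; a small sketch (our code, with Euler's constant hardcoded):

```python
import math

def kappa(alpha):
    """kappa(alpha) from Lemma 3.10; the alpha = 1 value is the limit."""
    if alpha == 1.0:
        return (2.0 * 0.5772156649015329 - 3.0) / math.pi  # Euler's gamma
    return math.tan(math.pi * alpha / 2.0) * (
        math.gamma(1.0 + 2.0 / alpha) / math.gamma(3.0 / alpha) - 1.0)

print(kappa(0.5))   # tan(pi/4) * (Gamma(5)/Gamma(6) - 1), i.e. -0.8 up to rounding
print(kappa(1 - 1e-6), kappa(1.0), kappa(1 + 1e-6))
```

Despite the tan(πα/2) factor blowing up at α = 1, the bracketed gamma-ratio term vanishes at the same rate, and the two one-sided values straddle κ(1) ≈ −0.5875.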
Fig. 3.5 Plot of κ(α) and m(α, 1).
CONJECTURE: for fixed α, m(α, β) is decreasing and convex in β ∈ [0, 1]. If true, this conjecture implies that for β ≥ 0, βκ(α) ≤ m(α, β) ≤ βm(α, 1) ≤ 0. Since the outer terms above are known exactly, a proof of this conjecture would yield uniform bounds on the mode.

It is also interesting to know f(m(α, β)|α, β), the height of the density at the mode. For a standardized stable random variable, values of the density and the d.f. at the mode are known only in the symmetric stable and Lévy cases. Otherwise, it is not known how the density and d.f. at m(α, β) vary with α and β. In the symmetric case, m(α, 0) = 0 and Corollary 3.1 shows that f(0|α, 0; 0) = Γ(1 + 1/α)/π, which tends to infinity as α ↓ 0. For general β, f(m(α, β)|α, β; 0) also tends to infinity as α ↓ 0, so it is difficult to display the height of the mode for small α. A simple argument shows that f(x|α, β; 0) ≤ f(0|α, 0; 0), so the ratio f(m(α, β)|α, β; 0)/f(0|α, 0; 0) is bounded. This ratio is plotted in Figure 3.6 for varying α and β.

Lemma 3.11 For all x ∈ R and all β, f(x|α, β; 0) ≤ f(0|α, 0; 0) = Γ(1 + 1/α)/π, where f(0|α, 0; 0) is the height of a symmetric α-stable density at its mode.

Proof From Theorem 1.1,

f(x|α, β; 0) = { (1/π) ∫_0^∞ e^{−t^α} cos[(x − ζ)t + ζt^α] dt        α ≠ 1
              { (1/π) ∫_0^∞ e^{−t} cos[xt + β(2/π)t log|t|] dt       α = 1.

When α ≠ 1, for all x and β, f(x|α, β; 0) ≤ (1/π) ∫_0^∞ e^{−t^α} dt = f(0|α, 0; 0), since |cos[(x − ζ)t + ζt^α]| ≤ cos(0) = 1. Similarly, when α = 1, for all x and β, f(x|1, β; 0) = (1/π) ∫_0^∞ e^{−t} cos[xt + β(2/π)t log|t|] dt ≤ (1/π) ∫_0^∞ e^{−t} dt = f(0|1, 0; 0).

A measure of asymmetry in the distribution is the value of the d.f. at the mode. Figure 3.7 shows F(m(α, β)|α, β; 0), the percentile at the mode.
3.2.3 Duality

It is possible to express every stable distribution with α ∈ (1, 2] in terms of a stable distribution with index α1 = 1/α ∈ [1/2, 1). The precise statement of this "duality" property of stable distributions is given next. It is simpler to state and prove duality in the S(α, β; 7) parameterization (see Section 3.5), e.g., Zolotarev (1986), but for consistency we use the 1-parameterization. Recall that f(x|α, β; 1) and F(x|α, β; 1)
Fig. 3.6 Scaled value of the density at the mode of standardized stable random variables, f(m(α, β)|α, β; 0)/f(0|α, 0; 0), as a function of α for β = 0, 0.25, 0.5, 0.75, 1.

Fig. 3.7 Distribution function of standardized stable random variables at the mode, F(m(α, β)|α, β; 0), as a function of α for β = 0, 0.25, 0.5, 0.75, 1.
are known at x = 0 by Corollary 3.1 and Theorem 3.4, so only the cases x ≠ 0 are of interest.

Theorem 3.6 Let X ∼ S(α, β; 1) where α > 1, −1 ≤ β ≤ 1. Set

θ1 = π/2 − arctan(β tan(πα/2))
θ2 = π/2 + arctan(β tan(πα/2))
α1 = 1/α
β1 = cot(πα1/2) cot(α1 θ1)
β2 = cot(πα1/2) cot(α1 θ2)
A1 = (1 + (β tan(πα/2))²)^{1/2}/sin^α(α1 θ1)
A2 = (1 + (β tan(πα/2))²)^{1/2}/sin^α(α1 θ2).

Then

f(x|α, β; 1) = { A1 x^{−(α+1)} f(A1 x^{−α} | α1, β1; 1)      x > 0
              { A2 |x|^{−(α+1)} f(A2 |x|^{−α} | α1, β2; 1)    x < 0

and

F(x|α, β; 1) = { 1 − α1 [F(A1 x^{−α} | α1, β1; 1) − F(0 | α1, β1; 1)]    x > 0
              { α1 [F(A2 |x|^{−α} | α1, β2; 1) − F(0 | α1, β2; 1)]       x < 0.

Proof The claim for x > 0 can be rewritten as

α(1 − F(x|α, β; 1)) = F(A1 x^{−α} | α1, β1; 1) − F(0 | α1, β1; 1).

Apply Theorem 3.4 to both sides of this equation: the left-hand side is

(α/π) ∫_{−θ0}^{π/2} exp(−x^{α/(α−1)} V(θ|α, β)) dθ

and the right-hand side is

(1/π) ∫_{−φ0}^{π/2} exp(−(A1 x^{−α})^{α1/(α1−1)} V(φ|α1, β1)) dφ = (1/π) ∫_{−φ0}^{π/2} exp(−x^{α/(α−1)} A1^{1/(1−α)} V(φ|α1, β1)) dφ,

using φ0 = θ0(α1, β1) and α1/(α1 − 1) = −1/(α − 1). Substitute φ = π/2 − α(θ0 + θ). Problem 3.20 shows that V(θ|α, β) = A1^{1/(1−α)} V(φ|α1, β1), that φ = π/2 when θ = −θ0, and that φ = −φ0 when θ = π/2. Since dφ = −α dθ, this substitution shows the equality of the two integrals above. The formula for f(x|α, β; 1) follows by differentiating F(x|α, β; 1). The x < 0 cases are proved using the reflection formulas f(−x|α, β; 1) = f(x|α, −β; 1) and F(−x|α, β; 1) = 1 − F(x|α, −β; 1) and then applying the previous case. Problem 3.26 gives a different proof of this result.
Note that α1 is the same whether x > 0 or x < 0, but β2 is generally not equal to β1 or −β1. A curious consequence is that the right and left halves of a strictly stable density and distribution function are related to different dual functions. From a statistical point of view, there is nothing special about the strictly stable distributions. Indeed, they limit the possibilities for fitting data: they require the location parameter to take a specific value when α ≠ 1, and they prevent the occurrence of asymmetry when α = 1. However, from a mathematical point of view, the strictly stable laws are intrinsically interesting: they have special structure, and Zolotarev shows that the halves of strictly stable laws are rich in mathematical structure.

The theorem says that it suffices to calculate stable densities and distribution functions for all α ≤ 1, because those with α > 1 can be quickly computed from the result above. The result is not particularly useful for computational formulas, because the integrals for the density and d.f. are equally difficult when α < 1 and when α > 1. In some cases it is possible to reverse these equations and express a stable density or distribution function with 1/2 ≤ α1 < 1 in terms of those for a stable law with α > 1. This is not possible in all cases because not every β1 ∈ [−1, 1] can be achieved in the expressions above; see Problem 3.25.

The duality formulas for the density and d.f. in S(α, β; 0) form are shifts of the S(α, β; 1) formulas: for x > ζ = −β tan(πα/2),

f(x|α, β; 0) = A1 (x − ζ)^{−(α+1)} f(A1 (x − ζ)^{−α} + ζ1 | α1, β1; 0)
F(x|α, β; 0) = 1 − α1 [F(A1 (x − ζ)^{−α} + ζ1 | α1, β1; 0) − F(ζ1 | α1, β1; 0)],

where α1, β1, and A1 are as in Theorem 3.6 and ζ1 = −β1 tan(πα1/2).
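Theorem 3.6 can be sanity-checked in the one case where both sides are elementary: for α = 2, β = 0 one gets θ1 = π/2, α1 = 1/2, β1 = 1, and A1 = 1/sin²(π/4) = 2, so the duality asserts that the N(0, 2) density equals 2x^{−3} times the Lévy(0, 1) density evaluated at 2x^{−2}. A verification sketch (our code):

```python
import math

def normal_pdf(x):
    """S(2, 0; 1) is N(0, 2)."""
    return math.exp(-x * x / 4.0) / (2.0 * math.sqrt(math.pi))

def levy_pdf(x):
    """S(1/2, 1; 1) is Levy(0, 1)."""
    return math.exp(-1.0 / (2.0 * x)) / (math.sqrt(2.0 * math.pi) * x ** 1.5)

alpha, A1 = 2.0, 2.0   # theta1 = pi/2, alpha1 = 1/2, beta1 = 1, A1 = 2
for x in (0.5, 1.0, 2.0):
    lhs = normal_pdf(x)
    rhs = A1 * x ** -(alpha + 1.0) * levy_pdf(A1 * x ** -alpha)
    print(x, lhs, rhs)
```

The two columns agree to machine precision, since in this case the duality identity holds exactly in closed form.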
3.3 Numerical algorithms

3.3.1 Computation of distribution functions and densities

There have been several efforts to compute stable densities and quantiles: Holt and Crow (1973) tabulate densities for selected values of α and β; Worsdale (1975) and Panton (1992) tabulate symmetric stable distribution functions; Brothers et al. (1983) and Paulson and Delehanty (1993) tabulate fractiles of general stable distributions; and McCulloch and Panton (1998) tabulate densities and fractiles for the maximally skewed distributions.

The program STABLE computes stable densities and distribution functions by numerically evaluating the integrals in Theorems 3.3 and 3.4. This approach was described in Nolan (1997). Without loss of generality, we assume x > 0 when α ≠ 1 and β > 0 when α = 1. Lemma 3.8 shows that V(·|α, β) is continuous, positive, and strictly monotonic. Thus the integral expression for the density in Theorem 3.3 has an integrand that is unimodal, possibly with one end truncated. Thus the integrand is a continuous,
bounded, non-oscillating function, and the region of integration is a bounded interval. Numerical integration techniques are well developed for such functions. The program STABLE uses the adaptive quadrature routine DQAG from QUADPACK, Piessens et al. (1983), to evaluate these integrals. The distribution function is also easy to approximate because the integrands in Theorem 3.4 are well behaved. When α < 1, the integrand in the formula for F(x|α, β; 1) starts at a positive value ≤ 1 and decreases monotonically to 0. When α > 1, the integrand increases monotonically from 0 to a number ≤ 1 over the region of integration. Stable quantiles are computed by numerically solving the equation F(x|α, β; 0) = q for x, for q ∈ (0, 1). The method of Brent (1971) is used in the STABLE program to find this root. There has been other recent numerical work on these evaluations: Zieliński (2001) describes an algorithm for evaluating symmetric distribution functions; Robinson (2003a) describes algorithms using high precision (16-byte reals) to calculate stable densities and distribution functions; and Matsui and Takemura (2006) and Matsui (2005) have accurately calculated stable densities and their partial derivatives.
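The root-finding step is easy to demonstrate on a case with a closed-form d.f.; the sketch below (our code, standing in for the search STABLE performs on F(x|α, β; 0)) inverts the Lévy d.f. with Brent's method as implemented in scipy.optimize.brentq:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import erfc

def levy_cdf(x):
    """d.f. of S(1/2, 1; 1), the Levy distribution: F(x) = erfc(1/sqrt(2x))."""
    return erfc(1.0 / np.sqrt(2.0 * x))

def levy_quantile(q, lo=1e-8, hi=1e8):
    """Solve F(x) = q by Brent's method on a bracketing interval."""
    return brentq(lambda x: levy_cdf(x) - q, lo, hi)

for q in (0.1, 0.5, 0.9):
    x = levy_quantile(q)
    print(q, x, levy_cdf(x))
```

The bracketing interval is deliberately wide; Brent's method combines bisection with inverse quadratic interpolation, so the wide bracket costs only a few extra function evaluations.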
3.3.2 Spline approximation of densities

For maximum likelihood estimation, it is desirable to have a quick way to calculate stable densities. The formulas given above require a numerical evaluation of an integral for every value of x, and this can be slow when there are many x values and multiple values of the parameters. A spline approximation to stable densities has been developed for this purpose. Values of y_{ijk} = f(x_i|α_j, β_k; 0) have been precomputed on a grid of x, α, β values. (Here it is desirable to use the S(α, β, γ, δ; 0) parameterization or some variation that is jointly continuous.) These tabulated values were then used to compute a three-dimensional B-spline approximation to stable densities. The spline coefficients are stored in arrays inside the STABLE program and used to compute density values. While not as accurate as the quadrature routines at non-grid points, the interpolation routines require no integration, are over 300 times as fast as the integral formulas, and yield reliable values of the density. A different approach for quickly calculating the density and distribution function in the symmetric case when α > 0.85 has been developed by McCulloch (1998c).
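The precompute-then-interpolate idea can be sketched in one dimension with library tools (our code; scipy's levy_stable plays the role of the slow, accurate density, and a cubic spline plays the role of the stored B-spline coefficients):

```python
import numpy as np
from scipy.stats import levy_stable
from scipy.interpolate import CubicSpline

alpha, beta = 1.5, 0.5
xs = np.linspace(-4.0, 4.0, 81)           # precomputation grid
ys = levy_stable.pdf(xs, alpha, beta)     # slow, accurate density values
spline = CubicSpline(xs, ys)              # fast interpolant for repeated evaluation

x_new = np.array([-1.23, 0.37, 2.71])     # off-grid points
print(spline(x_new))
print(levy_stable.pdf(x_new, alpha, beta))
```

Once built, evaluating the spline is just polynomial arithmetic, which is why the full three-dimensional version in STABLE is orders of magnitude faster than quadrature.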
3.3.3 Simulation In this section the simulation formulas given in Chapter 1 are proved. The simulation formulas and computational algorithms are due to Chambers et al. (1976), with a minor correction noted in the bibliography. Unfortunately, that paper is difficult to
read: they focused on computational formulas, used three different parameterizations, and did not actually give a proof of the result. Weron (1996) gave a proof, but used two parameterizations, had some misprints, and did not relate them to the Chambers, Mallows, and Stuck computational formulas. Here we give a proof using the S(α, β; 1) parameterization directly, and then relate these formulas to the computational ones. There are other, less efficient ways to simulate stable random variables; they are discussed in Problems 3.42, 3.43, and 3.44.

Proof of Theorem 1.3 (Simulating stable random variables) Substituting β = 0 and θ0 = 0 in the general formula reduces it to the claimed one for the symmetric case. The rest of this proof focuses on the general β case, where the theorem states that

Z = Z(α, β) := (sin α(θ0 + Θ)/(cos αθ0 cos Θ)^{1/α}) (cos(αθ0 + (α − 1)Θ)/W)^{(1−α)/α}   (3.29)

has a S(α, β; 1) law when α ≠ 1. First consider the case 0 < α < 1. The sign of Z is determined by the term sin α(θ0 + Θ): Z > 0 if and only if Θ > −θ0. Hence P(Z ≤ 0) = P(Θ ≤ −θ0) = (1/π)(π/2 − θ0). Now if Θ > −θ0, then Z = [V(Θ)/W]^{(1−α)/α}, where V(θ) = V(θ|α, β) is given by (3.24). Since (1 − α)/α > 0, for z > 0,

P(0 < Z ≤ z) = P([V(Θ)/W]^{(1−α)/α} ≤ z, Θ > −θ0) = P(W ≥ z^{α/(α−1)} V(Θ), Θ > −θ0)
= ∫_{−θ0}^{π/2} P(W ≥ z^{α/(α−1)} V(θ)) dθ/π
= (1/π) ∫_{−θ0}^{π/2} exp(−z^{α/(α−1)} V(θ)) dθ.
Hence, by Theorem 3.4, P(Z ≤ z) = P(Z ≤ 0) + P(0 < Z ≤ z) = F(z|α, β; 1) for z > 0. The case z < 0 is handled by reflection: since θ0(α, −β) = −θ0(α, β), Θ and −Θ have the same distribution, sin x is odd, and cos x is even, Z(α, −β) and −Z(α, β) have the same distribution. Hence the d.f. of Z must agree with F(z|α, β; 1) everywhere.

Next consider (3.29) when 1 < α < 2. In this case, we again have Z > 0 if and only if Θ > −θ0. However, (1 − α)/α < 0, so a modification of the above argument is used. For z > 0,

P(Z > z) = P([V(Θ)/W]^{(1−α)/α} > z, Θ > −θ0) = P(W > z^{α/(α−1)} V(Θ), Θ > −θ0)
= (1/π) ∫_{−θ0}^{π/2} exp(−z^{α/(α−1)} V(θ)) dθ.
By Theorem 3.4, this is 1 − F(z|α, β; 1). Hence the upper tail probability of Z is the same as the upper tail probability of a S(α, β; 1) law. Using reflection as above, the lower tails are the same and Z ∼ S(α, β; 1).

Finally, consider the case α = 1. Here the theorem states that

Z = Z(1, β) = (2/π)[(π/2 + βΘ) tan Θ − β log((π/2)W cos Θ/(π/2 + βΘ))].   (3.30)

When β = 0, this expression reduces to Z = tan Θ, the same as (1.10). When β > 0, the expression can be rearranged as Z = (2β/π) log(V(Θ)/W), where V(θ) = V(θ|1, β) is given by (3.24). Thus

P(Z ≤ z) = P((2β/π) log(V(Θ)/W) ≤ z) = P(W ≥ exp(−πz/(2β)) V(Θ))
= (1/π) ∫_{−π/2}^{π/2} exp(−exp(−πz/(2β)) V(θ)) dθ.

By Theorem 3.4, this is F(z|1, β; 1). The β < 0 case is handled by reflection:
Z(1, −β) and −Z(1, β) have the same distribution because Θ and −Θ do, tan(−x) = −tan x, and cos(−x) = cos x.
We note that (3.29) and (3.30) are respectively (2.3) and (2.4) in Chambers et al. (1976). The above expressions are numerically unstable near α = 1. To deal with this, computational forms of the simulation formulas are derived in the 0-parameterization as follows. For α ≠ 1, the addition formulas for sin(A + B) and cos(A + B) show

sin α(θ + θ0) = sin αθ cos αθ0 + cos αθ sin αθ0 = cos αθ0 [sin αθ + cos αθ tan αθ0]
cos[αθ0 + (α − 1)θ] = cos αθ0 cos(α − 1)θ − sin αθ0 sin(α − 1)θ = cos αθ0 [cos(α − 1)θ − tan αθ0 sin(α − 1)θ].

Therefore (3.29) can be rewritten as

[(sin αΘ + tan αθ0 cos αΘ)/cos Θ] [(cos(α − 1)Θ − tan αθ0 sin(α − 1)Θ)/(W cos Θ)]^{(1−α)/α}.

To get a S(α, β; 0) r.v., shift by ζ = −tan αθ0:

−tan αθ0 + [(sin αΘ + tan αθ0 cos αΘ)/cos Θ] [(cos(α − 1)Θ − tan αθ0 sin(α − 1)Θ)/(W cos Θ)]^{(1−α)/α}.

This is equation (4.1) in Chambers et al. (1976), with our θ0 equal to their −φ0. To deal with discontinuities as α → 1, they let Y denote the term in brackets above and rewrite the expression as
[(sin αΘ/cos Θ) + tan αθ0 ((cos αΘ/cos Θ) − 1)] Y^{(1−α)/α} − tan αθ0 (1 − Y^{(1−α)/α}).

The Chambers, Mallows and Stuck algorithm evaluates this expression, which is well behaved near α = 1, using additional trigonometric identities. See Problem 3.45 for simulating when α is near 0.
3.4 Functions g_d, ḡ_d, h_d and h̄_d

Many quantities of interest for univariate and multivariate stable distributions can be expressed in terms of the following special functions: for x ∈ R and real d in the ranges below, define

g_d(x|α, β) = ∫_0^∞ cos(xr + βη(r, α; 1)) r^{d−1} e^{−r^α} dr
ḡ_d(x|α, β) = ∫_0^∞ sin(xr + βη(r, α; 1)) r^{d−1} e^{−r^α} dr
h_d(x|α, β) = ∫_0^∞ cos(xr + βη(r, α; 1)) (log r) r^{d−1} e^{−r^α} dr
h̄_d(x|α, β) = ∫_0^∞ sin(xr + βη(r, α; 1)) (log r) r^{d−1} e^{−r^α} dr

(for −α < d ≤ 0, the cosine in the integrand of g_d is replaced by cos(·) − 1 so that the integral converges at the origin).

Lemma 3.12 For 0 < α < 2 and −1 ≤ β ≤ 1,

g_d(0|α, β) = { (cos αθ0)^{d/α} cos(dθ0) Γ(1 + d/α)/d          d > 0
             { (log(cos αθ0))/α                                 d = 0
             { [(cos αθ0)^{d/α} cos(dθ0) − 1] Γ(1 + d/α)/d      −α < d < 0

ḡ_d(0|α, β) = { −(cos αθ0)^{d/α} sin(dθ0) Γ(1 + d/α)/d   d ∈ (−α, 0) ∪ (0, ∞)
             { −θ0                                         d = 0.

Proof Substitute u = r^α in the expressions for g_d(0|α, β) and ḡ_d(0|α, β). Then use respectively the integrals 3.944.6, 3.948.2, 3.945.1, 3.944.5, and 3.948.1 on pages 492–493 of Gradshteyn and Ryzhik (2000). Note that some of those printed formulas have mistyped exponents. Finally, when α ≠ 1, αθ0 = −arctan ζ, and for the allowable values of α and θ0, use (3.23).

The following results show a connection between fractional derivatives and g_d(·) and ḡ_d(·) for non-integer values of d. The Weyl fractional derivative of order λ with respect to x is denoted by W_x^λ.

Lemma 3.13 Let 0 < α < 2, −1 ≤ β ≤ 1, d > 0, −d < λ < ∞. Then

W_x^λ g_d(x|α, β) = cos(πλ/2) g_{d+λ}(x|α, β) − sin(πλ/2) ḡ_{d+λ}(x|α, β)
W_x^λ ḡ_d(x|α, β) = cos(πλ/2) ḡ_{d+λ}(x|α, β) + sin(πλ/2) g_{d+λ}(x|α, β).
Proof Section VII of Miller and Ross (1995) defines the Weyl fractional derivative and shows that W_x^λ cos(ux) = u^λ cos(ux + πλ/2) and W_x^λ sin(ux) = u^λ sin(ux + πλ/2). For notational convenience, set c1 = cos(xr), c2 = cos(πλ/2), c3 = cos(βη(r, α)) and s1 = sin(xr), s2 = sin(πλ/2), s3 = sin(βη(r, α)). Then cos(A + B) = cos A cos B − sin A sin B and sin(A + B) = sin A cos B + cos A sin B show

W_x^λ g_d(x) = W_x^λ ∫_0^∞ cos(xr + βη(r, α)) r^{d−1} exp(−r^α) dr
= ∫_0^∞ W_x^λ [cos(xr) cos(βη(r, α)) − sin(xr) sin(βη(r, α))] r^{d−1} exp(−r^α) dr
= ∫_0^∞ [cos(xr + πλ/2) cos(βη(r, α)) − sin(xr + πλ/2) sin(βη(r, α))] r^{d−1} exp(−r^α) dr
= ∫_0^∞ [(c1 c2 − s1 s2)c3 − (s1 c2 + c1 s2)s3] r^{d−1} exp(−r^α) dr
= ∫_0^∞ [c2(c1 c3 − s1 s3) − s2(s1 c3 + c1 s3)] r^{d−1} exp(−r^α) dr
= ∫_0^∞ [c2 cos(xr + βη(r, α)) − s2 sin(xr + βη(r, α))] r^{d−1} exp(−r^α) dr
= c2 g_{d+λ}(x) − s2 ḡ_{d+λ}(x).

The derivation for W_x^λ ḡ_d is similar.
3.4.1 Score functions

The above functions can be used to find expressions for the score functions of general univariate stable densities.

Theorem 3.7 (Stable score functions) Let α ≠ 1 and consider the univariate stable density in the 1-parameterization,

f(x|α, β, γ, δ; 1) = (1/(πγ)) g1((x − δ)/γ | α, β).

The score functions are given by
3 Technical Results for Univariate Stable Distributions
∂f/∂α (x|α, β, γ, δ; 1) = (1/(πγ)) [ (πβ/2) sec²(πα/2) ḡ_{1+α}((x − δ)/γ | α, β) − ζ h̄_{1+α}((x − δ)/γ | α, β) − h_{1+α}((x − δ)/γ | α, β) ]
∂f/∂β (x|α, β, γ, δ; 1) = (tan(πα/2)/(πγ)) ḡ_{1+α}((x − δ)/γ | α, β)
∂f/∂γ (x|α, β, γ, δ; 1) = −(1/(πγ²)) g₁((x − δ)/γ | α, β) + ((x − δ)/(πγ³)) ḡ₂((x − δ)/γ | α, β)
∂f/∂δ (x|α, β, γ, δ; 1) = (1/(πγ²)) ḡ₂((x − δ)/γ | α, β).

Proof For α ≠ 1 and r > 0,

∂η(r, α; 1)/∂α = ∂(−tan(πα/2) r^α)/∂α = −(sec²(πα/2))(π/2) r^α − tan(πα/2)(ln r) r^α = −(π/2)(1 + tan²(πα/2)) r^α − tan(πα/2)(ln r) r^α.

Let z = (x − δ)/γ; since z does not depend on α, straightforward calculations show

∂g₁(z|α, β)/∂α = (∂/∂α) ∫₀^∞ cos(zr + βη(r, α; 1)) e^{−r^α} dr
  = ∫₀^∞ [ −sin(zr + βη(r, α; 1)) β (−(π/2)(1 + tan²(πα/2)) r^α − tan(πα/2)(ln r) r^α) e^{−r^α} + cos(zr + βη(r, α; 1)) e^{−r^α} (−r^α ln r) ] dr
  = (πβ/2)(1 + tan²(πα/2)) ḡ_{1+α}(z|α, β) + β tan(πα/2) h̄_{1+α}(z|α, β) − h_{1+α}(z|α, β).

Dividing by πγ and recalling ζ = −β tan(πα/2) yields the first score function. In a similar way, z does not depend on β, so

∂g₁(z|α, β)/∂β = (∂/∂β) ∫₀^∞ cos(zr + βη(r, α; 1)) e^{−r^α} dr
  = ∫₀^∞ −sin(zr + βη(r, α; 1)) (−tan(πα/2)) r^α e^{−r^α} dr = tan(πα/2) ḡ_{1+α}(z|α, β),

and dividing by πγ yields the second score function. For the next step, ∂z/∂γ = −z/γ, so

∂(γ^{−1} g₁(z|α, β))/∂γ = −(1/γ²) g₁(z|α, β) + (1/γ) ∫₀^∞ −sin(zr + βη(r, α)) r (−z/γ) e^{−r^α} dr
  = −(1/γ²) g₁(z|α, β) + (z/γ²) ḡ₂(z|α, β),

and dividing by π gives the third result. For the last part, ∂z/∂δ = −1/γ, so
∂(γ^{−1} g₁(z|α, β))/∂δ = (1/γ) ∫₀^∞ −sin(zr + βη(r, α)) r (−1/γ) e^{−r^α} dr = (1/γ²) ḡ₂(z|α, β).

Dividing by π completes the proof.
Note that when β = 0, the score function with respect to α simplifies to

∂f/∂α (x|α, 0, γ, δ; 1) = −(1/(πγ)) h_{1+α}((x − δ)/γ | α, 0).

Because of the discontinuity of the 1-parameterization for stable distributions at α = 1, the score functions do not exist at α = 1. The 0-parameterization can be used to find expressions that are continuous in α: using (1.3),

f(x|α, β, γ, δ; 0) =
  { f(x|α, β, γ, δ + ζγ; 1) = (1/(πγ)) g₁((x − (δ + ζγ))/γ | α, β),   α ≠ 1
  { f(x|1, β, γ, δ − (2/π)βγ log γ; 1) = (1/(πγ)) g₁((x − (δ − (2/π)βγ log γ))/γ | 1, β),   α = 1.

The score functions in the 0-parameterization can be computed from this as in the 1-parameterization above.
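The δ and γ scores can be spot-checked numerically. The following sketch (not from the text) verifies two consequences of the form f(x|α, β, γ, δ; 1) = (1/(πγ)) g₁((x − δ)/γ | α, β): the δ-score equals minus the x-derivative of the density, and the γ-score satisfies the scale-family identity ∂f/∂γ = −(f + (x − δ) ∂f/∂x)/γ. It assumes scipy's levy_stable in its default "S1" parameterization agrees with S(α, β, γ, δ; 1); the check point is arbitrary.

```python
import numpy as np
from scipy.stats import levy_stable

# Arbitrary check point; assumes scipy's default "S1" parameterization
# agrees with the S(alpha, beta, gamma, delta; 1) used in Theorem 3.7.
alpha, beta, gamma, delta, x = 1.5, 0.5, 2.0, 1.0, 3.0
h = 1e-4

def f(g=gamma, d=delta, xx=x):
    return levy_stable.pdf(xx, alpha, beta, loc=d, scale=g)

df_dx = (f(xx=x + h) - f(xx=x - h)) / (2 * h)

# delta-score: f is a location family in delta, so df/ddelta = -df/dx.
score_delta = (f(d=delta + h) - f(d=delta - h)) / (2 * h)
assert abs(score_delta + df_dx) < 1e-4

# gamma-score: f = (1/(pi*gamma)) g1((x - delta)/gamma) gives the
# scale-family identity df/dgamma = -(f + (x - delta) df/dx)/gamma.
score_gamma = (f(g=gamma + h) - f(g=gamma - h)) / (2 * h)
assert abs(score_gamma + (f() + (x - delta) * df_dx) / gamma) < 1e-4
```

The same finite-difference approach can be used to validate numerical implementations of the g and h functions appearing in the explicit score formulas.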
3.5 More on parameterizations

This section collects the multiple parameterizations of stable laws that are in use, and adds another two. The unfortunate fact is that there is no ideal way of parameterizing these distributions, and various authors at different times have used ways that were most convenient for their immediate purposes. Hopefully this compilation will not confuse further; it is meant to alert the reader to the different parameterizations that have been used and to clarify what they mean.

One small variation concerns the index of stability α. Some authors use α′ = α/2 ∈ (0, 1] as the index of stability, so α′ = 1 corresponds to a Gaussian law and (α′ = 1/2, β = 0) corresponds to a Cauchy law. In this book we will always use α ∈ (0, 2].

The S(α, β, γ, δ; 2) parameterization was mentioned briefly in Chapter 1 and is motivated by a desire to give the location and scale parameters intuitive meanings that are absent in the 0- and 1-parameterizations. For the location parameter, the shift of β tan(πα/2) built into the S(α, β, γ, δ; 0) parameterization makes characteristic functions and densities continuous in all parameters, but so does any shift β tan(πα/2) + (any continuous function of α and β). Thus the location parameter is somewhat arbitrary in the S(α, β, γ, δ; 0) parameterization. Modes are easily understood: every stable distribution has a mode, and every user of the normal distribution is used to thinking of the location parameter as the mode. The 2-parameterization defines the mode to be the location parameter for stable distributions.
A confusing issue with the standard scale is that as α ↑ 2, both S(α, β, γ, δ; 0) and S(α, β, γ, δ; 1) distributions converge in distribution to a normal distribution with standard deviation γ/√2, not standard deviation γ. This is not an inherent property of stable distributions, simply an artifact of the way the characteristic function is generally specified. The definition below is one way to make the scale agree with the standard deviation in the normal.

Definition 3.2 A random variable X is S(α, β, γ, δ; 2) if

X =_d α^{−1/α} γ (Z₀ − m(α, β)) + δ,

where Z₀ ∼ S(α, β; 0) and m(α, β) is the mode of Z₀. X has characteristic function

E exp(iuX) =
  { exp( −(1/α) γ^α |u|^α [1 + iβ(tan(πα/2))(sign u)(|α^{−1/α} γ u|^{1−α} − 1)] + i[δ − α^{−1/α} γ m(α, β)] u ),   α ≠ 1
  { exp( −γ|u| [1 + iβ(2/π)(sign u) log(γ|u|)] + i[δ − γ m(1, β)] u ),   α = 1.    (3.36)
Algebraic properties of S(α, β, γ, δ; 2) random variables are given in Proposition 3.3, and Figure 1.4 shows stable densities in this parameterization. While the characteristic function is cumbersome, the 2-parameterization may be the most intuitive parameterization for users in applied fields. By definition, the location parameter δ is always the mode of a S(α, β, γ, δ; 2) density. While there is no known closed form expression for the mode, in Section 3.2.2 m(α, β) has been numerically determined to high accuracy. In addition to this clear meaning of the location parameter, this centering makes sense in various applications. In a likelihood-based stable filter for signal processing with skewed stable laws, one wants the filtered signal to be the value that maximizes the likelihood. This is natural with the 2-parameterization; it is not what happens with other parameterizations when β ≠ 0, see Chapter 6. Likewise, in regression when the residuals are skewed, it makes sense to use the 2-parameterization, which will center the regression line on the most likely line, not a line shifted by a choice of the parameterization. This is discussed in Chapter 5. While there are other choices for the scaling, this one has several advantages. When β = 0, m(α, 0) = 0 and the characteristic function is simply exp(−γ^α|u|^α/α), a direct generalization of the symmetric normal characteristic function exp(−σ²t²/2). In the Gaussian case (α = 2), γ is the standard deviation and in the Cauchy case (α = 1, β = 0), γ is the standard scale parameter. The scaling by α^{−1/α} makes the normal distribution have the highest mode, with the mode height decreasing with α; this emphasizes the heavier tails as α decreases. In contrast, the S(α, β, γ, δ; 0) and S(α, β, γ, δ; 1) parameterizations have the mode height converging to +∞ as α ↓ 0, see Section 3.2.2. Finally, Proposition 3.7 suggests that the scaling chosen for the 2-parameterization is in some sense natural.
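A small simulation illustrates the scale claim: at α = 2 the 2-parameterization scale equals the standard deviation. This sketch (not from the text) uses the standard fact that Z₀ ∼ S(2, 0; 0) is N(0, 2) and that m(2, 0) = 0; the numbers are illustrative.

```python
import numpy as np

# At alpha = 2 the 2-parameterization scale is the standard deviation.
# Uses Z0 ~ S(2, 0; 0) = N(0, 2), i.e. standard deviation sqrt(2).
rng = np.random.default_rng(0)
alpha, gamma, delta = 2.0, 3.0, 1.0

z0 = rng.normal(0.0, np.sqrt(2.0), size=1_000_000)   # Z0 ~ S(2, 0; 0)
x = alpha ** (-1 / alpha) * gamma * z0 + delta        # Definition 3.2, m(2, 0) = 0

assert abs(x.std() - gamma) < 0.01   # scale = standard deviation at alpha = 2
assert abs(x.mean() - delta) < 0.01
```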
The rest of this section is devoted to further parameterizations that are occasionally useful. Most readers can skip this material; just recall that there are other parameterizations and return to this section if those parameterizations are needed.
One drawback of the S(α, β, γ, δ; 2) parameterization is that there are no explicit expressions for m(α, β), so numerically derived values of m(α, β) must be used. This is particularly difficult for α near 0, where it is difficult to compute stable densities and hard to locate the mode accurately. Furthermore, the rules for combining two S(α, β, γ, δ; 2) random variables generally have to use values of the mode at three different βs. The fact that the shift is not linear in β makes this impractical to use for multivariate stable laws. To deal with these issues, and to better understand the behavior of stable densities for small α, we have found it useful to define a variation of the S(α, β, γ, δ; 2) parameterization that replaces the unknown value of the mode with the linear approximation m(α, β) ≈ βκ(α) discussed in Section 3.2.2. Both the location of the mode and the height of the density at the mode are bounded as α ↓ 0 in the S(α, β, γ, δ; 3) parameterization. Unlike the S(α, β, γ, δ; 2) parameterization, all quantities in this parameterization are known exactly. The program STABLE has transformations among the parameterizations S(α, β, γ, δ; k) for k = 0, 1, 2, 3 built in.

Definition 3.3 A random variable X is S(α, β, γ, δ; 3) if

X =_d α^{−1/α} γ (Z₀ − βκ(α)) + δ,

where Z₀ ∼ S(α, β; 0) and κ(α) is given in Lemma 3.10.

The next six parameterizations are defined by Zolotarev (1986). The motivation for these parameterizations was to make certain formulas and proofs easier. The scales in the 4, 5, 6, 7, and 8 parameterizations below are defined to be like the variance, not the standard deviation. With such a definition, expressions like γ^α = γ₁^α + γ₂^α for the scale of the sum of independent stable random variables are replaced by γ = γ₁ + γ₂. And rather than deal with complicated constants that come up in proofs using the standard parameterizations, Zolotarev found it easier to switch to a parameterization that simplifies the expression of interest. We generally use the S(α, β, γ, δ; 0) and S(α, β, γ, δ; 1) parameterizations throughout this book; however, we record the following definitions for completeness.

Definition 3.4 Zolotarev A parameterization X ∼ S(α, β, γ, δ; 4) if

E exp(iuX) = { exp(−γ|u|^α [1 − i(β tan(πα/2))(sign u)] + iδu),   α ≠ 1
             { exp(−γ|u| [1 + iβ(2/π)(sign u) log |u|] + iδu),   α = 1.

Definition 3.5 Zolotarev M parameterization X ∼ S(α, β, γ, δ; 5) if

E exp(iuX) = { exp(−γ|u|^α [1 + iβ tan(πα/2)(sign u)(|u|^{1−α} − 1)] + iδu),   α ≠ 1
             { exp(−γ|u| [1 + iβ(2/π)(sign u) log |u|] + iδu),   α = 1.

Definition 3.6 Zolotarev B parameterization X ∼ S(α, β, γ, δ; 6) if

E exp(iuX) = { exp(−γ|u|^α exp[−iβ(π/2)K(α)(sign u)] + iδu),   α ≠ 1
             { exp(−γ|u| [π/2 + iβ(sign u) log |u|] + iδu),   α = 1,

where K(α) = α − 1 + sign(1 − α) = { α,   α < 1;   α − 2,   α > 1 }.
When X is strictly stable, the location parameter is uniquely determined when α ≠ 1 and the skewness parameter is zero when α = 1, so only three parameters are needed. For technical reasons, it is sometimes convenient to scale things differently. The next parameterizations are restricted to the strictly stable case.

Definition 3.7 Zolotarev C parameterization Strictly stable X ∼ S(α, θ, γ; 7) if

E exp(iuX) = exp( −γ|u|^α exp(−i(παθ/2)(sign u)) ),

where |θ| ≤ min(1, (2/α) − 1). Feller (1971) uses a variation of this, with characteristic function

E exp(iuX) = exp( −γ|u|^α exp(i(πθ_Feller/2)(sign u)) ),

where θ_Feller = −αθ. Note that the range of the skewness parameter θ in the 7-parameterization varies as a function of α: when α ≤ 1, −1 ≤ θ ≤ 1, but when α > 1, −(2/α) + 1 ≤ θ ≤ (2/α) − 1. This "pinches" the range as α → 2, which is to be expected because skewness means less and less as α approaches 2. Since many of the theoretical results on stable distributions in the older literature are stated in terms of this parametrization, it is worth some discussion.

First, it is simple to see that such a characteristic function satisfies the definition of strict stability: if X, X₁, X₂ ∼ S(α, θ, γ; 7) are independent, then for any a, b > 0, aX₁ + bX₂ has characteristic function

exp(−γ|au|^α e^{−i(παθ/2)(sign(au))}) exp(−γ|bu|^α e^{−i(παθ/2)(sign(bu))}) = exp(−γ(a^α + b^α)|u|^α e^{−i(παθ/2)(sign u)}),

so aX₁ + bX₂ =_d (a^α + b^α)^{1/α} X. Next, the relationship with the 1-parameterization is given. When α ≠ 1, the characteristic function of X ∼ S(α, θ, γ; 7) can be rewritten as
exp(−γ|u|^α e^{−i(παθ/2)(sign u)})
  = exp(−γ|u|^α [cos((παθ/2)(sign u)) − i sin((παθ/2)(sign u))])
  = exp(−γ|u|^α cos(παθ/2) [1 − i tan(παθ/2)(sign u)])
  = exp(−(γ cos(παθ/2))|u|^α [1 − i (tan(παθ/2)/tan(πα/2)) tan(πα/2)(sign u)]),

so

S(α, θ, γ; 7) = S(α, tan(παθ/2)/tan(πα/2), (γ cos(παθ/2))^{1/α}, 0; 1).    (3.37)

When α = 1, |u| sign u = u, so the characteristic function is

exp(−γ|u| e^{−i(πθ/2)(sign u)}) = exp(−γ|u| [cos(πθ/2) − i sin(πθ/2)(sign u)]) = exp(−γ cos(πθ/2)|u| + iγ sin(πθ/2) u),

and thus

S(1, θ, γ; 7) = S(1, 0, γ cos(πθ/2), γ sin(πθ/2); 1).    (3.38)
For all α ∈ (0, 2], we get a strictly stable law. When α ≠ 1, any skewness is allowed; when α = 1, no skewness is allowed, but any shift is. Note that when α = 1 and θ = 1, the result is a degenerate law, a point mass centered at γ. See (3.55) for an interpretation of this. The above reasoning can be reversed, showing that when α ≠ 1,

S(α, β, γ, 0; 1) = S(α, (2/(πα)) arctan(β tan(πα/2)), γ^α [1 + β² tan²(πα/2)]^{1/2}; 7),    (3.39)

and when α = 1,

S(1, 0, γ, δ; 1) = S(1, (2/π) arctan(δ/γ), γ [1 + (δ/γ)²]^{1/2}; 7).    (3.40)
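Conversion (3.37) can be verified numerically at the level of characteristic functions. The following sketch (parameter values arbitrary) compares the 7-parameterization characteristic function with the 1-parameterization one obtained from (3.37):

```python
import numpy as np

# Check (3.37): S(alpha, theta, gamma; 7) has the same characteristic
# function as S(alpha, tan(pi*alpha*theta/2)/tan(pi*alpha/2),
#               (gamma*cos(pi*alpha*theta/2))**(1/alpha), 0; 1).
def cf_7(u, alpha, theta, gamma):
    s = np.sign(u)
    return np.exp(-gamma * np.abs(u) ** alpha
                  * np.exp(-1j * np.pi * alpha * theta / 2 * s))

def cf_1(u, alpha, beta, gamma, delta):
    s = np.sign(u)
    return np.exp(-gamma ** alpha * np.abs(u) ** alpha
                  * (1 - 1j * beta * np.tan(np.pi * alpha / 2) * s)
                  + 1j * delta * u)

alpha, theta, gamma7 = 1.5, 0.3, 2.0          # |theta| <= 2/alpha - 1 = 1/3
beta1 = np.tan(np.pi * alpha * theta / 2) / np.tan(np.pi * alpha / 2)
gamma1 = (gamma7 * np.cos(np.pi * alpha * theta / 2)) ** (1 / alpha)

u = np.linspace(-5, 5, 201)
assert np.max(np.abs(cf_7(u, alpha, theta, gamma7)
                     - cf_1(u, alpha, beta1, gamma1, 0.0))) < 1e-12
```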
A small variation of the C parameterization is the following.
Definition 3.8 Zolotarev C′ parameterization Strictly stable X is S(α, ρ, γ; 8) if

E exp(iuX) = exp( −γ(iu)^α exp[−iπαρ(sign u)] ).

Note that this is similar to the previous parameterization with ρ = (1 + θ)/2.

Definition 3.9 Zolotarev E parameterization Strictly stable X ∼ S(ν, θ, τ; 9) if

E exp(iuX) = exp( −exp( ν^{−1/2}[log |u| + τ − i(π/2)θ(sign u)] + γ_Euler(ν^{−1/2} − 1) ) ),

where ν = α^{−2} ≥ 1/4, |θ| ≤ min(1, 2√ν − 1), τ ∈ ℝ, and γ_Euler is Euler's constant.
Chronologically, the A, B, and M parameterizations came before the ones we are using. The S(α, β, γ, δ; 1) parameterization is a variation of the A parameterization, while the S(α, β, γ, δ; 0) parameterization is a variation of the M parameterization. The B parameterization has a more concise characteristic function that absorbs the tan(πα/2) term into the exponential. The C parameterization is the most compact, and probably the easiest form to use for theoretical purposes, e.g. Proposition 3.5. Tables 3.1 and 3.2 show how to convert from any parameterization to the 0- or 1-parameterization. The 0, 2, 3, and 5 parameterizations are continuous in all 4 parameters, while the others are not. The only parameterizations that are scale and shift families are the 0-, 2-, and 3-parameterizations. The only parameterizations whose scale agrees with the standard deviation in the α = 2 case are the 2- and 3-parameterizations. Only the 2 and 3 parameterizations also have the scale and shift parameters agree with the standard Cauchy scale and shift.

Finally, we state a last parameterization based on the Lévy-Khintchine representation, which expresses all α-stable distributions in terms of totally skewed α-stable distributions. See Lemma 3.18 and Section 7.5 for more information.

Definition 3.10 Lévy-Khintchine representation X ∼ S(α, γ₊, γ₋, δ; 10) if X = γ₊Z₁ − γ₋Z₂ + δ, where Z₁, Z₂ ∼ S(α, 1; 1) are independent. This corresponds to the Lévy-Khintchine representation with mass γ₊ on the positive half-line, mass γ₋ on the negative half-line, and shift δ, see Theorem 3.1. Tables 3.1 and 3.2 give conversions from all the different parameterizations to the 0- and 1-parameterizations.
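Definition 3.10 can be checked at the level of characteristic functions: for α ≠ 1, the product of the two totally skewed characteristic functions matches a single S(α, β, γ, δ; 1) law with β = (γ₊^α − γ₋^α)/(γ₊^α + γ₋^α) and γ = (γ₊^α + γ₋^α)^{1/α}, consistent with Table 3.2. A sketch (parameter values arbitrary):

```python
import numpy as np

# Characteristic function of S(alpha, beta, gamma, delta; 1), alpha != 1.
def cf_1(u, alpha, beta, gamma, delta):
    s = np.sign(u)
    return np.exp(-gamma ** alpha * np.abs(u) ** alpha
                  * (1 - 1j * beta * np.tan(np.pi * alpha / 2) * s)
                  + 1j * delta * u)

alpha, gp, gm, delta = 1.7, 1.0, 2.0, 0.5
u = np.linspace(-4, 4, 161)

# cf of gp*Z1 - gm*Z2 + delta, with Z1, Z2 ~ S(alpha, 1; 1) independent:
# note cf of -Y at u is cf of Y at -u.
lhs = cf_1(u, alpha, 1.0, gp, 0.0) * cf_1(-u, alpha, 1.0, gm, 0.0) * np.exp(1j * delta * u)

beta = (gp ** alpha - gm ** alpha) / (gp ** alpha + gm ** alpha)
gamma = (gp ** alpha + gm ** alpha) ** (1 / alpha)
assert np.max(np.abs(lhs - cf_1(u, alpha, beta, gamma, delta))) < 1e-12
```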
3.6 Tail behavior

The Paretian tail behavior of stable distributions is the basis for their use as models when heavy tails are observed. In the Gaussian case, X ∼ N(0, 1), there is a well-known formula to approximate the tail probability: as x → ∞,

P(X > x) ∼ exp(−x²/2) / (x√(2π)),    (3.41)

see Feller (1968), pg. 175. (Here and below, h(x) ∼ g(x) as x → ∞ will mean lim_{x→∞} h(x)/g(x) = 1.) When α < 2, Lévy (1925) showed that the tails of non-Gaussian stable distributions follow an asymptotic form of the law of Pareto. Theorem 1.2 states that if X ∼ S(α, β; 0) where β > −1, then as x → ∞,

F̄(x|α, β; 0) := P(X > x) ∼ (1 + β) c_α x^{−α},    (3.42)
where c_α = Γ(α) sin(πα/2)/π. It also states that the tail behavior of the density is the formal derivative of F(x|α, β; 0) = 1 − P(X > x): as x → ∞,

f(x|α, β; 0) ∼ α c_α (1 + β) x^{−(1+α)}.    (3.43)

Table 3.1 Converting stable parameterizations to the parameterizations S(α, β, γ, δ; 0) and S(α, β, γ, δ; 1). Whenever there is a brace to show different cases, the top choice corresponds to α ≠ 1, the bottom to α = 1. (Rows cover S(α, β_k, γ_k, δ_k; k) for k = 0, 1, 2 (mode centered), 3, 4 (Zolotarev A), 5 (Zolotarev M), and 6 (Zolotarev B).)

Table 3.2 Converting stable parameterizations to the parameterizations S(α, β, γ, δ; 0) and S(α, β, γ, δ; 1). Whenever there is a brace to show different cases, the top choice corresponds to α ≠ 1, the bottom to α = 1. (Rows cover S(α, θ, γ₇; 7) (Zolotarev C), S(α, ρ, γ₈; 8) (Zolotarev C′), S(ν, θ, τ; 9) (Zolotarev E, with α = ν^{−1/2}), and S(α, γ₊, γ₋, δ₁₀; 10) (Lévy-Khintchine).)
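The tail approximation (3.42) can be checked numerically. This sketch assumes scipy's levy_stable with its "S0" setting matches the S(α, β; 0) parameterization used here (setting the attribute is harmless if a scipy version ignores it, since a location shift does not change the tail asymptotics):

```python
import numpy as np
from scipy.stats import levy_stable
from scipy.special import gamma as Gamma

# Compare P(X > x) with the Paretian approximation (1 + beta) c_alpha x^{-alpha}.
levy_stable.parameterization = "S0"   # assumed to match S(alpha, beta; 0)
alpha, beta = 1.3, 0.5
c_alpha = Gamma(alpha) * np.sin(np.pi * alpha / 2) / np.pi

x = 100.0
tail = levy_stable.sf(x, alpha, beta)            # P(X > x)
pareto = (1 + beta) * c_alpha * x ** (-alpha)    # (3.42)
assert abs(tail / pareto - 1) < 0.05
```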
A plot of c_α is shown in Figure 3.8.

Fig. 3.8 Plot of the tail constant c_α.
Equation (3.42) and the reflection property show that as x → ∞,

P(|X| > x) = P(X > x) + P(X < −x) ∼ c_α(1 + β)x^{−α} + c_α(1 − β)x^{−α} = 2c_α x^{−α}.

This statement holds for all β, including β = ±1. Letting p := (1 + β)/2 ∈ [0, 1], we see that as x → ∞,

P(X > x)/P(|X| > x) → p   and   P(X < −x)/P(|X| > x) → 1 − p,

i.e. p = (1 + β)/2 is the proportion of the extreme tail probability on the right and 1 − p = (1 − β)/2 is the proportion of the extreme tail probability on the left.

Proof of Theorem 1.2 (Tail approximation) By scaling, we may assume γ = 1. The tail probability result (3.42) follows from Theorem XVII.5.1 in Feller (1971). To establish the tail density result (3.43), we need some regularity result on the density. In the stable case, we know that stable densities are unimodal from Section 3.2.2, so to the right of the mode the density f(x|α, β; 0) is monotonically decreasing. Therefore the Lemma after Theorem XII.5.4 in Feller (1971) shows (3.43).

Theorem 1.2 says that the tail of the empirical distribution function on a log-log scale should approach a straight line with slope −α if the data is stable. While simple and direct in principle, this method is not a very reliable way to estimate α from data; see Chapter 4.

When X is totally skewed, the light tail drops off faster than a power law. We state the β = 1 case; the β = −1 case is just a reflection. The exact result is given below for an S(α, 1; 1) distribution, where it is easiest to state.

Proposition 3.1 Let X ∼ S(α, 1; 1), 0 < α < 2. When α ≠ 1, define

c₁ = (2π|1 − α|)^{−1/2} [α/|cos(πα/2)|]^{1/(2−2α)}
c₂ = |1 − α| [α^α/|cos(πα/2)|]^{1/(1−α)}
c₃ = c₁|1 − α|/(c₂ α).

(a) When 0 < α < 1, the support of f(·|α, 1; 1) is [0, ∞) and as x ↓ 0,

f(x|α, 1; 1) ∼ c₁ x^{(2−α)/(2α−2)} exp(−c₂ x^{α/(α−1)})
F(x|α, 1; 1) ∼ c₃ x^{−α/(2α−2)} exp(−c₂ x^{α/(α−1)}).

(b) When α = 1, the support of f(·|α, 1; 1) is ℝ and as x → −∞,

f(x|1, 1; 1) ∼ √(π/8) exp(z/2 − e^z)
F(x|1, 1; 1) ∼ (2π)^{−1/2} exp(−z/2 − e^z),

where z = log(2/π) − 1 − (π/2)x.

(c) When 1 < α < 2, the support of f(·|α, 1; 1) is ℝ and as x → −∞,

f(x|α, 1; 1) ∼ c₁ (−x)^{(2−α)/(2α−2)} exp(−c₂ (−x)^{α/(α−1)})
F(x|α, 1; 1) ∼ c₃ (−x)^{−α/(2α−2)} exp(−c₂ (−x)^{α/(α−1)}).

These results are proved in Section 2.5 of Zolotarev (1986), where they are stated in the (B) parameterization. In that book, see equations (2.5.18) and (2.5.19) for the density and equations (2.5.21) and (2.5.22) for the distribution function. To restate in the 1-parameterization, use the transformations from the (B) to the 1-parameterization in Table 3.1. Note that a scale change results in a shift when α = 1.

For any α and any β, either both tails have an asymptotic power law decay, or if |β| = 1 one tail of the distribution has an asymptotic power law decay and the other
tail is lighter. Hence sup_{x>0} x^p P(|X| > x) is finite for all 0 < p < α. Problem 3.41 gives the following upper bound for this when X is strictly stable.

Lemma 3.14 For X ∼ S(α, β, γ, 0; 1) with α ≠ 1 or (α = 1 and β = 0),

sup_{x>0} x^p P(|X| > x) ≤ [Γ(1 − p/α) cos(pθ₀)] / [Γ(1 − p) cos(pπ/2) (cos αθ₀)^{p/α}] · γ^p.    (3.44)
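As a sanity check on (3.44) (not in the text): for the standard Cauchy S(1, 0, 1, 0; 1), θ₀ = 0 and the bound reduces to sec(pπ/2), while P(|X| > x) = (2/π) arctan(1/x) is available in closed form:

```python
import numpy as np

# Standard Cauchy: x^p P(|X| > x) = x^p (2/pi) arctan(1/x) should stay
# below the bound sec(p*pi/2) from (3.44) with gamma = 1, theta_0 = 0.
p = 0.5
x = np.linspace(0.01, 1000.0, 200_001)
lhs = np.max(x ** p * (2 / np.pi) * np.arctan(1 / x))
bound = 1 / np.cos(p * np.pi / 2)   # sec(p*pi/2)
assert 0 < lhs <= bound
```

Here the supremum is roughly 0.51 while the bound is about 1.41, so the bound holds but is not tight in this case.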
The rest of this section is concerned with determining when the Paretian tail approximation is accurate for a stable distribution. These results are from Fofack and Nolan (1999). For a symmetric stable distribution (β = 0), the mode is at x = 0 in either parameterization, and it makes sense to directly compare the quantities in (3.42). However, when β ≠ 0, stable distributions have shifted modes. Earlier it was shown that m(α, β) = mode of a S(α, β; 0) distribution is uniformly bounded, in fact |m(α, β)| ≤ 1. But the mode of a S(α, β; 1) distribution is at m(α, β) + β tan(πα/2), and this latter term is very large when α is near 1. While (3.42) will eventually hold for any fixed α, what we generally care about is the shape of the tail and not the shift. Hence, we will determine when the Paretian tail occurs in the S(α, β; 0) parameterization, not the S(α, β; 1) parameterization.

In what follows, we will restrict ourselves to normalized stable distributions in the S(α, β; 0) parameterization with density f(x|α, β; 0) and upper tail probability F̄(x|α, β; 0) = P(X > x) respectively. We note here that (3.42) also tells us about lower tails: P(X < −x) = P(−X > x) ∼ (1 − β)c_α x^{−α} as x → ∞, because of the reflection property Proposition 1.1. Also, here we restrict ourselves to cases where the upper tail is Paretian, i.e. β > −1; information about the β = −1 case was given above.

We note that equation (3.41) is in the standard normal parameterization, i.e. N(0, 1) = S(2, 0, 1/√2, 0; 0). Restating (3.41) for X ∼ S(2, 0, 1, 0; 0) gives F̄(x|2, 0; 0) ∼ (2/x) f(x|2, 0; 0) as x → ∞, whereas for α < 2 and β > −1, (3.42) gives

F̄(x|α, β; 0) ∼ (x/α) f(x|α, β; 0)   as x → ∞.

For notational convenience, we describe the Pareto family by

f_Pareto(x|α, β) = α(1 + β)c_α x^{−(1+α)}   and   F̄_Pareto(x|α, β) = (1 + β)c_α x^{−α}.

The scale above has been chosen so that f(x|α, β; 0) ∼ f_Pareto(x|α, β) and F̄(x|α, β; 0) ∼ F̄_Pareto(x|α, β) as x → +∞. As a density and tail probability, the above Pareto formulas hold for x ≥ ((1 + β)c_α)^{1/α}; for comparison purposes, we will sometimes use these formulas for all x > 0.
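The comparison can be reproduced with scipy. This sketch evaluates the density ratio f(x|α, β; 0)/f_Pareto(x|α, β) (the closeness measure L_pdf used below), assuming scipy's "S0" setting matches the 0-parameterization; for β = 0 the parameterizations agree anyway:

```python
import numpy as np
from scipy.stats import levy_stable
from scipy.special import gamma as Gamma

# Ratio of a symmetric stable density to its Pareto tail approximation.
levy_stable.parameterization = "S0"   # assumed to match S(alpha, beta; 0)
alpha, beta = 1.2, 0.0
c_alpha = Gamma(alpha) * np.sin(np.pi * alpha / 2) / np.pi

def L_pdf(x):
    f_pareto = alpha * (1 + beta) * c_alpha * x ** -(1 + alpha)
    return levy_stable.pdf(x, alpha, beta) / f_pareto

print([round(L_pdf(x), 3) for x in (2.0, 10.0, 50.0)])  # ratio drifting toward 1
```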
Because there are no explicit formulas for f(x|α, β; 0) or F̄(x|α, β; 0), we will numerically determine when the approximation is accurate; the next chapter will discuss implications for tail estimation. We start by focusing on the density. The top row of Figure 3.9 shows graphs of symmetric stable densities with the corresponding Pareto density for α = 0.7, 1.2, 1.9. For small α, the Pareto density is always above the symmetric stable one, while for large α, the Pareto crosses the stable one.

A convention is needed on when to say f(x|α, β; 0) and f_Pareto(x|α, β) are close. Requiring |f(x|α, β; 0) − f_Pareto(x|α, β)| to be small is not a good measure, because this holds as soon as both are small, which occurs quickly, especially when α is above 1.5. Two measures of closeness are used below. The first measure is useful for understanding when the tail behavior gives a good estimate of the tail index α. It is simply the ratio

L_pdf(x) = L_pdf(x; α, β) = f(x|α, β; 0)/f_Pareto(x|α, β).

Equation (3.43) implies that L_pdf(x) → 1 as x → ∞, and the goal is to understand when L_pdf(x) gets close to 1. The middle row of Figure 3.9 shows selected plots of L_pdf(x). The second measure is useful for numerical approximation purposes: compute the relative error

Relerr_pdf(x) = [f(x|α, β; 0) − f_Pareto(x|α, β)]/f(x|α, β; 0).

The bottom row of Figure 3.9 shows plots of Relerr_pdf(x). The relative error tends to 0 as x → ∞, and the goal is to understand when it is close to 0.

Fig. 3.9 Top row shows symmetric stable densities (solid) versus Pareto density (dotted) for α = 0.7, 1.2, 1.9. The middle row shows the ratio L_pdf(x) and the bottom row shows the relative error Relerr_pdf(x). The middle row plots include a band to indicate 0.9 ≤ L_pdf(x) ≤ 1.1; the bottom row includes a band to indicate −0.1 ≤ Relerr_pdf(x) ≤ +0.1. Note the different vertical scale on the rightmost graph of the middle row.

We have not been able to find a concise way of answering the question of how close the Pareto approximation is to a stable distribution, even for the symmetric case. For β ≠ 0, the behavior is more complex: graphs like Figure 3.9 vary considerably as α and β vary. In an attempt to capture some part of this behavior, both quantities L_pdf(x) and Relerr_pdf(x) were computed and a search was done to determine when the functions get close to their limit and remain close for all larger x. For a given tolerance ε, define the points

x_{L_pdf,ε} = inf{x > 0 : |L_pdf(y) − 1| ≤ ε for all y ≥ x}

and

x_{Relerr_pdf,ε} = inf{x > 0 : |Relerr_pdf(y)| ≤ ε for all y ≥ x}.

The above points were numerically located for ε = 0.1 and are plotted in Figures 3.10 and 3.11 respectively. It is useful to reformulate these raw values of x in terms of the tail probability, so the right plot in these figures shows F̄(x|α, β; 0). We were frankly skeptical of the unusual shape of these graphs, but repeated calculations have verified these results. In particular, the β = 0 curves in Figures 3.10 and 3.11 have abrupt changes at around α = 1.2 because the stable and Pareto densities cross and depart by more than ε = 0.1, forcing x_{L_pdf,0.1} and x_{Relerr_pdf,0.1} abruptly to the right. (For example, when α = 1.2 and β = 0, x_{pdf,0.1} = 1.691 and F̄(1.691|1.2, 0; 0) = 0.154, but when α = 1.25 and β = 0, x_{pdf,0.1} = 5.479 and F̄(5.479|1.25, 0; 0) = 0.0337.) For skewed stable distributions, the location of this abrupt change shifts and its magnitude can be more pronounced. Using a different measure of closeness instead of relative error would change the exact shape of these curves, but a similar sort of behavior would still result because the Pareto density crosses the stable density and can give a poor approximation for certain values of α and β.

Fig. 3.10 x_{L_pdf,0.1} (left plot) as a function of α for β = 1, 0.5, 0, −0.5, −0.9 and F̄(x_{L_pdf}|α, β; 0) (right plot).

Fig. 3.11 x_{Relerr_pdf,0.1} (left plot) as a function of α for β = 1, 0.5, 0, −0.5, −0.9 and F̄(x_{Relerr_pdf}|α, β; 0) (right plot).

We have repeated this procedure for tail probabilities to determine when F̄(x|α, β; 0) and F̄_Pareto(x|α, β) are close. Figure 3.12 is similar to Figure 3.9, comparing tail probabilities F̄(x|α, β; 0) and F̄_Pareto(x|α, β), their ratio L_d.f.(x) and the relative error Relerr_d.f.(x). As above, define

x_{L_d.f.,ε} = inf{x > 0 : |F̄(y|α, β; 0)/F̄_Pareto(y|α, β) − 1| ≤ ε for all y ≥ x}

and

x_{Relerr_d.f.,ε} = inf{x > 0 : |F̄(y|α, β; 0) − F̄_Pareto(y|α, β)|/F̄(y|α, β; 0) ≤ ε for all y ≥ x}.

Fig. 3.12 Top row shows symmetric stable d.f. (solid) versus Pareto d.f. (dotted) for α = 0.7, 1.2, 1.9. The middle row shows the ratio L_d.f.(x) and the bottom row shows the relative error Relerr_d.f.(x). The middle row plots include a band to indicate 0.9 ≤ L_d.f.(x) ≤ 1.1; the bottom row includes a band to indicate −0.1 ≤ Relerr_d.f.(x) ≤ +0.1. Note the different vertical scale on the rightmost graph of the middle row.

Figures 3.13 and 3.14 show these points as a function of (α, β), for ε = 0.1.
Fig. 3.13 x_{L_d.f.,0.1} (left plot) as a function of α for β = 1, 0.5, 0, −0.5, −0.9, and F̄(x_{L_d.f.}|α, β; 0) (right plot).

Fig. 3.14 x_{Relerr_d.f.,0.1} (left plot) as a function of α for β = 1, 0.5, 0, −0.5, −0.9, and F̄(x_{Relerr_d.f.}|α, β; 0) (right plot).
We close this section with some remarks. First, since the plots for x_L and x_Relerr are qualitatively similar, it appears that L(x) and Relerr(x) are measuring roughly the same thing. Second, the distribution functions approach the Pareto limit faster than the densities, so any inference about tail behavior is likely to be more accurate if the empirical d.f. is used than if the empirical density is used. Third, when a stable distribution is highly skewed, i.e. 1 − |β| is small, the light tail takes a very long time before the Paretian behavior appears. And finally, when α ↑ 2, the Paretian tail behavior does not occur until the tail probability F̄(x_{d.f.}|α, β; 0) is very small, making any tail estimator unreliable unless a massive data set is available. Intuitively, when α is near 2, stable distributions approach the normal distribution and the Paretian tail is only evident on the extreme tails. This will be illustrated with the Hill estimator in Chapter 4. We note that as α → 2, there is a clash between the distribution approaching a Gaussian distribution and the heavy Pareto tail. There is more detailed information on this in Uchaikin and Zolotarev (1999), Houdré and Marchal (2004), and Marchal (2005) (the last two extend results to a multivariate setting).
3.7 Moments and other transforms

Non-Gaussian stable laws always have infinite variance, so no integer moments of order 2 or above exist. However, we can consider fractional moments for powers p < α. We start with a general discussion of when fractional moments exist, then derive expressions for these moments when X is a stable r.v.

For a general random variable X with density f(x) and p > 0, the fractional absolute moment E|X|^p = ∫_{−∞}^∞ |x|^p f(x) dx exists (is finite) when the tails decay quickly enough. The tails of a Gaussian density decay exponentially fast, so all positive moments exist. Since at least one side of a non-Gaussian stable law has a tail like f(x) ∼ c|x|^{−(1+α)} as |x| → ∞, they have E|X|^p < ∞ only when p < α. (See Problem 1.11.)

For a general random variable X with density f(x) and p < 0, the existence of the fractional absolute moment depends on both the tail behavior and the behavior near the origin. If the density is positive in a neighborhood of the origin, then E|X|^p ≥ c ∫₀^ε x^p dx for some c, ε > 0, which converges only for p > −1. In particular, stable densities are positive in a neighborhood of the origin except for the one-sided laws (α < 1 and |β| = 1), so moments of order p ≤ −1 do not exist except in the case of one-sided laws. For the one-sided strictly stable laws, the density decays faster than any power near the origin by Proposition 3.1, so all negative moments exist in these cases. Also, if a one-sided stable law is shifted so that the support is bounded away from 0, then all negative moments will exist. In summary, for a stable law
3 Technical Results for Univariate Stable Distributions
$$E|X|^p < \infty \iff \begin{cases} p \in (-1,\infty) & \alpha = 2 \\ p \in (-\infty,\alpha) & \alpha \in (0,1),\ |\beta| = 1 \text{ and } 0 \text{ is not an interior point of the support of } X \\ p \in (-1,\alpha) & \text{otherwise.} \end{cases}$$
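For quick reference, the three-way classification above can be coded directly. This is a sketch; the function name and the `zero_in_support_interior` flag are ours, not notation from the book.

```python
def abs_moment_exists(p, alpha, beta, zero_in_support_interior=True):
    """Does E|X|^p < infinity for X stable with these parameters?

    Implements the three-case summary above. `zero_in_support_interior`
    should be set to False only for one-sided laws (alpha < 1, |beta| = 1)
    whose support does not contain 0 as an interior point.
    """
    if alpha == 2:
        return p > -1
    if alpha < 1 and abs(beta) == 1 and not zero_in_support_interior:
        return p < alpha
    return -1 < p < alpha

# Gaussian: all positive moments, negative moments down to (but not including) -1
assert abs_moment_exists(5.0, 2.0, 0.0)
assert not abs_moment_exists(-1.5, 2.0, 0.0)
# One-sided law bounded away from 0: all negative moments
assert abs_moment_exists(-3.0, 0.5, 1.0, zero_in_support_interior=False)
assert not abs_moment_exists(0.6, 0.5, 1.0, zero_in_support_interior=False)
# Generic stable law
assert abs_moment_exists(0.4, 0.5, 1.0)
assert not abs_moment_exists(-1.0, 1.5, 0.0)
```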
Another way to determine whether moments exist is in terms of characteristic functions: the behavior of the characteristic function near the origin determines the tail behavior of the distribution. The next result gives a necessary and sufficient condition for the existence of fractional absolute moments, and a formula for that moment, for an arbitrary r.v.

Lemma 3.15 Let X be an arbitrary r.v. with characteristic function φ(u) and let 0 < p < 2.
(a) $E|X|^p < \infty$ if and only if for some c > 0,
$$\int_0^c \frac{1-\operatorname{Re}\varphi(r)}{r^{1+p}}\,dr < \infty. \tag{3.45}$$
(b) If (3.45) holds, then
$$E|X|^p = \frac{2}{\pi}\sin(\pi p/2)\,\Gamma(p+1)\int_0^\infty \frac{1-\operatorname{Re}\varphi(r)}{r^{1+p}}\,dr.$$

Proof Note that $K := \int_0^\infty (1-\cos t)/t^{1+p}\,dt = \pi/(2\sin(\pi p/2)\Gamma(p+1))$ when 0 < p < 2, and since the integrand is nonnegative, $K(a) := \int_0^a (1-\cos t)/t^{1+p}\,dt$ is positive and increasing in a > 0 with $\lim_{a\to\infty} K(a) = K < \infty$.

For part (a), let c > 0 and suppose $E|X|^p = \int |x|^p F(dx) < \infty$. Then
$$\begin{aligned}
\int_0^c \frac{1-\operatorname{Re}\varphi(r)}{r^{1+p}}\,dr &= \int_0^c \frac{1}{r^{1+p}}\int_{-\infty}^\infty (1-\cos rx)\,F(dx)\,dr = \int_{-\infty}^\infty \int_0^c \frac{1-\cos rx}{r^{1+p}}\,dr\,F(dx) \\
&= \int_{-\infty}^\infty |x|^p \int_0^{c|x|} \frac{1-\cos t}{t^{1+p}}\,dt\,F(dx) \le K\int_{-\infty}^\infty |x|^p F(dx) < \infty.
\end{aligned}$$
To show the other direction, note that $E|X|^p = \int_{-\infty}^\infty |x|^p F(dx) = \int_{-1}^1 |x|^p F(dx) + \int_{|x|>1} |x|^p F(dx)$, and since the first integral is always finite, the p-th absolute moment is finite if and only if the last integral is finite. To show the finiteness of that integral, pick a c where (3.45) holds and compute
$$\begin{aligned}
\int_{|x|>1} |x|^p F(dx) &= \int_{|x|>1} |x|^p\,K(c)^{-1}\int_0^c \frac{1-\cos t}{t^{1+p}}\,dt\,F(dx) \\
&\le K(c)^{-1}\int_{-\infty}^\infty |x|^p \int_0^{c|x|} \frac{1-\cos t}{t^{1+p}}\,dt\,F(dx) = K(c)^{-1}\int_0^c \frac{1-\operatorname{Re}\varphi(r)}{r^{1+p}}\,dr < \infty,
\end{aligned}$$
where the last equality is obtained by the argument above.

For part (b), let c > 0 be a value where (3.45) holds. The argument above shows $\int_0^c (1-\operatorname{Re}\varphi(r))/r^{1+p}\,dr = \int_{-\infty}^\infty |x|^p K(c|x|)\,F(dx)$. The left-hand side of this equation is nondecreasing in c. Letting c → ∞ on both sides of this relation, we conclude $\int_0^\infty (1-\operatorname{Re}\varphi(r))/r^{1+p}\,dr = K\,E|X|^p$.

Problem 3.28 uses the preceding result to show $E|X|^p < \infty$ and to get an expression for $E|X|^p$ for a strictly stable X. We give a different derivation of this below.

We now give the proof of Proposition 1.2, which gives the mean of a stable r.v. when α > 1.

Proof of Proposition 1.2 (Mean of a stable r.v. with α > 1) Suppose X ∼ S(α, β, γ, δ₁; 1) with α > 1. The previous discussion shows E|X| is finite, and hence EX exists. Let X₁, X₂, ... be i.i.d. copies of X. Then (1.8) shows $X_1+\cdots+X_n \sim \mathbf{S}(\alpha,\beta,n^{1/\alpha}\gamma,n\delta_1;1)$, and Proposition 1.4(a) shows $\overline{X}_n := (X_1+\cdots+X_n)/n \sim \mathbf{S}(\alpha,\beta,n^{-1+1/\alpha}\gamma,\delta_1;1)$. Since α > 1, $n^{-1+1/\alpha} \to 0$ as n → ∞, so $\overline{X}_n$ converges in distribution to a degenerate distribution centered at δ₁. Now the law of large numbers says $\overline{X}_n \xrightarrow{d} EX$. We conclude that EX = δ₁. Using the relationship between the different parameterizations in (1.7) gives the other forms for EX.
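Lemma 3.15(b) is easy to sanity-check numerically. For the standard Cauchy law (α = 1, β = 0, γ = 1) the characteristic function is $\varphi(r) = e^{-|r|}$, and by the symmetric-case closed form (3.49) later in this section, $E|X|^{1/2} = \sec(\pi/4) = \sqrt2$. A minimal check with SciPy quadrature (our code, not from the book):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

p = 0.5
# phi(r) = exp(-|r|) for the standard Cauchy; split the integral at 1 so
# quad handles the integrable r^{-1/2} singularity at 0 and the tail separately
f = lambda r: (1.0 - np.exp(-r)) / r ** (1 + p)
integral = quad(f, 0, 1)[0] + quad(f, 1, np.inf)[0]

# Lemma 3.15(b): E|X|^p = (2/pi) sin(pi p/2) Gamma(p+1) * integral
moment = (2 / np.pi) * np.sin(np.pi * p / 2) * gamma(p + 1) * integral
assert abs(moment - 1 / np.cos(np.pi * p / 2)) < 1e-6   # sec(pi/4) = sqrt(2)
```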
Next, expressions are derived for the truncated fractional moments $E(X-a)_+^p = E\big((X-a)^p 1_{\{X>a\}}\big)$ for general X ∼ S(α, β, γ, δ; 1). When p = 0, $EX_+^0$ is interpreted as $\int_0^\infty f(x)\,dx = P(X>0)$. We first note the simple result that $E(X-a)_+^p = EY_+^p$ where Y = X − a ∼ S(α, β, γ, δ − a; 1), so an expression for fractional moments truncated at zero for general stable Y gives an expression for an arbitrary truncation point. Hence, in what follows we will focus on truncation at zero, but since we allow an arbitrary shift, the formulas derived give $E(X-a)_+^p$ for any a.

Truncated moments are of interest in finance and insurance: when X models the loss of an asset, $E(X-a)_+$ is the expected shortfall given X > a. They also arise in the stable GARCH model of Broda et al. (2013). Fractional moments can be used to estimate parameters, see Section 4.6.

Most of the known results are for X strictly stable with truncation at zero. Hardin (1984) gave an expression for $E|X|^p$ in the strictly stable case; see Corollary 3.4 below for the statement. Zolotarev (1986) gives an expression for $EX_+^p$ when X is strictly stable, and Section 8.3 of Paolella (2007) gives two derivations in the strictly stable case. Matsui and Pawlas (2016) give expressions for $E|X-a|^p$ when α > 1, β = 0, and 1 < p < α. The next result gives expressions for truncated moments of stable laws for any α ∈ (0, 2), any β ∈ [−1, 1], any scale γ > 0, any shift δ, and any valid p. That easily leads to expressions for $E|X|^p$ and related signed moments. Recall that the functions $g_d(\cdot|\alpha,\beta)$ and $\widehat{g}_d(\cdot|\alpha,\beta)$ are defined in Section 3.4.

Theorem 3.8 Let X ∼ S(α, β, γ, δ; 1) with any 0 < α < 2 and any −1 ≤ β ≤ 1, and set
$$\delta^* = \begin{cases} \delta/\gamma & \alpha \ne 1 \\ \delta/\gamma + \frac{2}{\pi}\beta\log\gamma & \alpha = 1. \end{cases}$$
For −1 < p < α, define $m_p(\alpha,\beta,\gamma,\delta) = EX_+^p$.
(a) When −1 < p < 0,
$$m_p(\alpha,\beta,\gamma,\delta) = -\gamma^p\,\frac{\Gamma(p+1)}{\pi}\left[\sin(\tfrac{\pi p}{2})\,g_{-p}(-\delta^*|\alpha,\beta) + \cos(\tfrac{\pi p}{2})\,\widehat{g}_{-p}(-\delta^*|\alpha,\beta)\right].$$
When p = 0,
$$m_0(\alpha,\beta,\gamma,\delta) = P(X>0) = \frac12 - \frac{1}{\pi}\,\widehat{g}_0(-\delta^*|\alpha,\beta).$$
When 0 < p < min(1, α),
$$m_p(\alpha,\beta,\gamma,\delta) = \gamma^p\,\frac{\Gamma(p+1)}{\pi}\left[\sin(\tfrac{\pi p}{2})\left(\frac{\Gamma(1-p/\alpha)}{p} - g_{-p}(-\delta^*|\alpha,\beta)\right) - \cos(\tfrac{\pi p}{2})\,\widehat{g}_{-p}(-\delta^*|\alpha,\beta)\right].$$
When p = 1 < α < 2,
$$m_1(\alpha,\beta,\gamma,\delta) = \gamma\left[\frac{\delta^*}{2} + \frac{1}{\pi}\big(\Gamma(1-1/\alpha) - g_{-1}(-\delta^*|\alpha,\beta)\big)\right].$$
When 1 < p < α < 2,
$$m_p(\alpha,\beta,\gamma,\delta) = \gamma^p\,\frac{\Gamma(p+1)}{\pi}\left[\sin(\tfrac{\pi p}{2})\left(\frac{\Gamma(1-p/\alpha)}{p} - g_{-p}(-\delta^*|\alpha,\beta)\right) + \cos(\tfrac{\pi p}{2})\left(\frac{\delta^*}{\alpha}\,\Gamma\Big(\frac{1-p}{\alpha}\Big) - \widehat{g}_{-p}(-\delta^*|\alpha,\beta)\right)\right].$$
(b) $EX_-^p = E(-X)_+^p = m_p(\alpha,-\beta,\gamma,-\delta)$.

Proof (a) To simplify calculations, first assume γ = 1; the adjustment for γ ≠ 1 is discussed below. First consider −1 < p < 0; we extend the method of Paolella (2007) to the non-strictly stable case. When p is in this range and u > 0, a trigonometric identity shows
$$\begin{aligned}
\int_0^\infty x^p\cos(ux+C)\,dx &= \int_0^\infty x^p\big(\cos(ux)\cos C - \sin(ux)\sin C\big)\,dx \\
&= \cos C\int_0^\infty x^p\cos(ux)\,dx - \sin C\int_0^\infty x^p\sin(ux)\,dx \\
&= \cos C\left[-u^{-p-1}\Gamma(1+p)\sin(\pi p/2)\right] - \sin C\left[u^{-p-1}\Gamma(1+p)\cos(\pi p/2)\right] \\
&= -u^{-p-1}\Gamma(1+p)\sin(C+\pi p/2),
\end{aligned} \tag{3.46}$$
where formulas 3.761.4 and 3.761.8 (with a sign correction) from Gradshteyn and Ryzhik (2000) are used in passing from the second to the third line. Therefore
$$\begin{aligned}
EX_+^p &= \int_0^\infty x^p f(x)\,dx = \frac{1}{\pi}\int_0^\infty x^p g_1(x-\delta|\alpha,\beta)\,dx \\
&= \frac{1}{\pi}\int_0^\infty x^p \int_0^\infty \cos\big(u(x-\delta)+\beta\eta(u,\alpha)\big)e^{-u^\alpha}\,du\,dx \\
&= \frac{1}{\pi}\int_0^\infty\left[\int_0^\infty x^p\cos\big(ux-u\delta+\beta\eta(u,\alpha)\big)\,dx\right]e^{-u^\alpha}\,du.
\end{aligned}$$
Continuing the above, using (3.46) on the inner integral,
$$\begin{aligned}
EX_+^p &= -\frac{\Gamma(1+p)}{\pi}\int_0^\infty \sin\big(-u\delta+\beta\eta(u,\alpha)+\pi p/2\big)\,u^{-p-1}e^{-u^\alpha}\,du \\
&= -\frac{\Gamma(1+p)}{\pi}\int_0^\infty \big[\sin(-u\delta+\beta\eta(u,\alpha))\cos(\pi p/2) + \cos(-u\delta+\beta\eta(u,\alpha))\sin(\pi p/2)\big]\,u^{-p-1}e^{-u^\alpha}\,du \\
&= -\frac{\Gamma(1+p)}{\pi}\left[\widehat{g}_{-p}(-\delta|\alpha,\beta)\cos(\pi p/2) + g_{-p}(-\delta|\alpha,\beta)\sin(\pi p/2)\right].
\end{aligned}$$
When p = 0, $EX_+^0 = \int_0^\infty f(x)\,dx = P(X>0)$, and (3.35) together with $\widehat{g}_0(x|\alpha,\beta)\to\pi/2$ as x → ∞ gives the value in terms of $\widehat{g}_0(\cdot|\alpha,\beta)$.

When 0 < p < min(1, α), Corollary 2 of Pinelis (2011) with k = ℓ = 0 shows
$$EX_+^p = \frac{\Gamma(p+1)}{\pi}\int_0^\infty \operatorname{Re}\frac{\varphi(u)-1}{(iu)^{p+1}}\,du. \tag{3.47}$$
First assume α ≠ 1. Let $\zeta = \zeta(\alpha,\beta) = -\beta\tan\frac{\pi\alpha}{2}$, and then for u > 0,
$$\begin{aligned}
\frac{\varphi(u)-1}{(iu)^{p+1}} &= \left(e^{-u^\alpha(1+i\zeta)+i\delta u} - 1\right)(-i)e^{-i(\pi/2)p}\,u^{-p-1} \\
&= -i\left(e^{-u^\alpha}\big[\cos(\delta u-\zeta u^\alpha) + i\sin(\delta u-\zeta u^\alpha)\big] - 1\right)e^{-i(\pi/2)p}\,u^{-p-1} \\
&= \left[e^{-u^\alpha}\sin(\delta u-\zeta u^\alpha) - i\big(e^{-u^\alpha}\cos(\delta u-\zeta u^\alpha)-1\big)\right]\left[\cos(\tfrac{\pi p}{2}) - i\sin(\tfrac{\pi p}{2})\right]u^{-p-1},
\end{aligned}$$
and therefore
$$\begin{aligned}
\operatorname{Re}\frac{\varphi(u)-1}{(iu)^{p+1}} &= \left[\cos(\tfrac{\pi p}{2})\,e^{-u^\alpha}\sin(\delta u-\zeta u^\alpha) - \sin(\tfrac{\pi p}{2})\big(e^{-u^\alpha}\cos(\delta u-\zeta u^\alpha)-1\big)\right]u^{-p-1} \\
&= \cos(\tfrac{\pi p}{2})\sin(\delta u-\zeta u^\alpha)\,u^{-p-1}e^{-u^\alpha} - \sin(\tfrac{\pi p}{2})\left[\big(\cos(\delta u-\zeta u^\alpha)-1\big)u^{-p-1}e^{-u^\alpha} + (e^{-u^\alpha}-1)u^{-p-1}\right].
\end{aligned}$$
Integrating this from 0 to ∞, and substituting $t = u^\alpha$ in the last term, gives
$$EX_+^p = \frac{\Gamma(p+1)}{\pi}\left[-\cos(\tfrac{\pi p}{2})\,\widehat{g}_{-p}(-\delta|\alpha,\beta) - \sin(\tfrac{\pi p}{2})\big(g_{-p}(-\delta|\alpha,\beta) - \Gamma(1-p/\alpha)/p\big)\right].$$
Next consider 0 < p < α = 1. Use (3.47) again, so we need to simplify
$$\begin{aligned}
\frac{\varphi(u)-1}{(iu)^{p+1}} &= \left(e^{-u-i\beta\eta(u,1)+i\delta u} - 1\right)(-i)e^{-i(\pi/2)p}\,u^{-p-1} \\
&= -i\left(e^{-u}\big[\cos(\delta u-\beta\eta(u,1)) + i\sin(\delta u-\beta\eta(u,1))\big] - 1\right)e^{-i(\pi/2)p}\,u^{-p-1} \\
&= \left[e^{-u}\sin(\delta u-\beta\eta(u,1)) - i\big(e^{-u}\cos(\delta u-\beta\eta(u,1))-1\big)\right]\left[\cos(\tfrac{\pi p}{2}) - i\sin(\tfrac{\pi p}{2})\right]u^{-p-1},
\end{aligned}$$
so
$$\operatorname{Re}\frac{\varphi(u)-1}{(iu)^{p+1}} = \left[\cos(\tfrac{\pi p}{2})\,e^{-u}\sin(\delta u-\beta\eta(u,1)) - \sin(\tfrac{\pi p}{2})\Big(\big(\cos(\delta u-\beta\eta(u,1))-1\big)e^{-u} + (e^{-u}-1)\Big)\right]u^{-p-1}.$$
Integrating from 0 to ∞ yields
$$EX_+^p = \frac{\Gamma(p+1)}{\pi}\left[-\cos(\tfrac{\pi p}{2})\,\widehat{g}_{-p}(-\delta|1,\beta) - \sin(\tfrac{\pi p}{2})\big(g_{-p}(-\delta|1,\beta) - \Gamma(1-p)/p\big)\right].$$
When p = 1 < α < 2, EX exists and is equal to δ. Using Corollary 2 of Pinelis (2011) with k = 1, ℓ = 0 shows
$$EX_+ = \frac{EX}{2} + \frac{\Gamma(2)}{\pi}\int_0^\infty \operatorname{Re}\frac{\varphi(u)-1}{(iu)^{2}}\,du = \frac{\delta}{2} + \frac{1}{\pi}\int_0^\infty \operatorname{Re}\frac{\varphi(u)-1}{(iu)^{2}}\,du.$$
The integrand is the same as above, with $\cos(\tfrac{\pi p}{2}) = 0$ and $\sin(\tfrac{\pi p}{2}) = 1$, so
$$EX_+ = \frac{\delta}{2} - \frac{1}{\pi}\left[g_{-1}(-\delta|\alpha,\beta) - \Gamma(1-1/\alpha)\right].$$
When 1 < p < α < 2, Corollary 2 of Pinelis (2011) with k = ℓ = 1 shows
$$EX_+^p = \frac{\Gamma(p+1)}{\pi}\int_0^\infty \operatorname{Re}\frac{\varphi(u)-1-iuEX}{(iu)^{p+1}}\,du. \tag{3.48}$$
Since α > 1, EX exists and is equal to δ. As above, for u > 0,
$$\begin{aligned}
\frac{\varphi(u)-1-iu\delta}{(iu)^{p+1}} &= \left(e^{-u^\alpha(1+i\zeta)+i\delta u} - 1 - i\delta u\right)(-i)e^{-i(\pi/2)p}\,u^{-p-1} \\
&= -i\left(e^{-u^\alpha}\big[\cos(\delta u-\zeta u^\alpha)+i\sin(\delta u-\zeta u^\alpha)\big] - 1 - i\delta u\right)e^{-i(\pi/2)p}\,u^{-p-1} \\
&= \left[e^{-u^\alpha}\sin(\delta u-\zeta u^\alpha) - \delta u - i\big(e^{-u^\alpha}\cos(\delta u-\zeta u^\alpha)-1\big)\right]\left[\cos(\tfrac{\pi p}{2}) - i\sin(\tfrac{\pi p}{2})\right]u^{-p-1}.
\end{aligned}$$
And therefore
$$\begin{aligned}
\operatorname{Re}\frac{\varphi(u)-1-iu\delta}{(iu)^{p+1}} &= \left[\cos(\tfrac{\pi p}{2})\big(e^{-u^\alpha}\sin(\delta u-\zeta u^\alpha)-\delta u\big) - \sin(\tfrac{\pi p}{2})\big(e^{-u^\alpha}\cos(\delta u-\zeta u^\alpha)-1\big)\right]u^{-p-1} \\
&= \cos(\tfrac{\pi p}{2})\left[\big(\sin(\delta u-\zeta u^\alpha)-\delta u\big)u^{-p-1}e^{-u^\alpha} + \delta(e^{-u^\alpha}-1)u^{-p}\right] \\
&\quad - \sin(\tfrac{\pi p}{2})\left[\big(\cos(\delta u-\zeta u^\alpha)-1\big)u^{-p-1}e^{-u^\alpha} + (e^{-u^\alpha}-1)u^{-p-1}\right].
\end{aligned}$$
Substituting this into (3.48) and integrating yields
$$EX_+^p = \frac{\Gamma(p+1)}{\pi}\left\{\cos(\tfrac{\pi p}{2})\left[-\widehat{g}_{-p}(-\delta|\alpha,\beta) + (\delta/\alpha)\Gamma\big((1-p)/\alpha\big)\right] + \sin(\tfrac{\pi p}{2})\left[-g_{-p}(-\delta|\alpha,\beta) + \Gamma(1-p/\alpha)/p\right]\right\}.$$
Now consider γ ≠ 1. If X ∼ S(α, β, γ, δ; 1), then $X \stackrel{d}{=} \gamma Y$, where Y ∼ S(α, β, 1, δ*; 1), so $EX_+^p = \gamma^p EY_+^p$. In symbols, $m_p(\alpha,\beta,\gamma,\delta) = \gamma^p\,m_p(\alpha,\beta,1,\delta^*)$.
(b) This follows from the reflection property: −X ∼ S(α, −β, γ, −δ; 1).
There are several corollaries to the preceding result. First, taking p = 1 in the previous result shows the following.

Corollary 3.3 If X ∼ S(α, β, γ, δ; 1) with 1 < α < 2, −1 ≤ β ≤ 1, and a ∈ ℝ, then
$$E(X-a)_+ = \frac{\delta-a}{2} + \frac{\gamma}{\pi}\left[\Gamma\Big(1-\frac{1}{\alpha}\Big) - g_{-1}\Big(\frac{a-\delta}{\gamma}\,\Big|\,\alpha,\beta\Big)\right].$$

Definition 3.11 For any complex number p, the signed power of a real variable x is
$$x^{\langle p\rangle} = (\operatorname{sign} x)|x|^p = \begin{cases} x^p & x > 0 \\ 0 & x = 0 \\ -|x|^p & x < 0. \end{cases}$$

Problem 3.27 shows the following properties.

Lemma 3.16
(a) $(x^{\langle p\rangle})^{\langle q\rangle} = x^{\langle pq\rangle}$.
(b) $x^{\langle p\rangle}y^{\langle p\rangle} = (xy)^{\langle p\rangle}$.
(c) $x^{\langle 0\rangle} = \operatorname{sign} x$.
(d) $x^{\langle p\rangle}x^{\langle q\rangle} = (\operatorname{sign} x)\,x^{\langle p+q\rangle} = |x|^{p+q}$.
(e) $(-x)^{\langle p\rangle} = -x^{\langle p\rangle}$.
(f) $(d/dx)\,x^{\langle p\rangle} = p|x|^{p-1}$ for x ≠ 0.
(g) $(d/dx)\,|x|^p = p\,x^{\langle p-1\rangle}$ for x ≠ 0.
(h) $\int |x|^p\,dx = x^{\langle p+1\rangle}/(p+1)$, p ≠ −1.
(i) $\int x^{\langle p\rangle}\,dx = |x|^{p+1}/(p+1)$, p ≠ −1.

The signed power of a random variable X is written $X^{\langle p\rangle}$. The fractional signed moment is defined by
$$EX^{\langle p\rangle} = \int_{-\infty}^{\infty} x^{\langle p\rangle} f(x)\,dx = -\int_{-\infty}^0 |x|^p f(x)\,dx + \int_0^{\infty} x^p f(x)\,dx = EX_+^p - EX_-^p.$$
If the fractional absolute moment exists, then the fractional signed moment also exists. Combining parts (a) and (b) of Theorem 3.8 yields the following.

Corollary 3.4 If X ∼ S(α, β, γ, δ; 1) with 0 < α < 2, −1 ≤ β ≤ 1, and −1 < p < α, then
$$E|X|^p = \gamma^p\,\frac{2\Gamma(p+1)}{\pi}\,\sin(\tfrac{\pi p}{2})\left[\frac{\Gamma(1-p/\alpha)}{p}\,1_{\{p>0\}} - g_{-p}(-\delta^*|\alpha,\beta)\right]$$
$$EX^{\langle p\rangle} = \gamma^p\,\frac{2\Gamma(p+1)}{\pi}\,\cos(\tfrac{\pi p}{2})\left[\frac{\delta^*}{\alpha}\,\Gamma\Big(\frac{1-p}{\alpha}\Big)\,1_{\{p>1\}} - \widehat{g}_{-p}(-\delta^*|\alpha,\beta)\right].$$

Proof $E|X|^p = EX_-^p + EX_+^p = m_p(\alpha,-\beta,\gamma,-\delta) + m_p(\alpha,\beta,\gamma,\delta)$ and $EX^{\langle p\rangle} = m_p(\alpha,\beta,\gamma,\delta) - m_p(\alpha,-\beta,\gamma,-\delta)$. Use Theorem 3.8 and the reflection properties $g_d(x|\alpha,-\beta) = g_d(-x|\alpha,\beta)$ and $\widehat{g}_d(x|\alpha,-\beta) = -\widehat{g}_d(-x|\alpha,\beta)$.

Note that as p → 0, $E|X|^p \to E1 = 1$ and $EX^{\langle p\rangle} \to -(2/\pi)\widehat{g}_0(-\delta^*|\alpha,\beta) = P(X>0) - P(X<0) = 1 - 2F(0)$. Also, as p → 1, $EX^{\langle p\rangle} \to \delta$.
In the strictly stable case, the expressions for $EX_+^p$ can be simplified using closed-form expressions for $g_d(0|\alpha,\beta)$ and $\widehat{g}_d(0|\alpha,\beta)$ when α ≠ 1.

Corollary 3.5 Let X be strictly stable, e.g. X ∼ S(α, β, γ, 0; 1) with α ∈ (0,1) ∪ (1,2) or (α = 1 and β = 0), and let −1 < p < α.
(a) The fractional moment of the positive part of X is
$$EX_+^p = \frac{\gamma^p}{(\cos\alpha\theta_0)^{p/\alpha}}\,\frac{\Gamma(1-p/\alpha)\,\sin\big(p(\pi/2+\theta_0)\big)}{\Gamma(1-p)\,\sin(p\pi)}.$$
(b) The fractional moment of the negative part of X is $EX_-^p = E(-X)_+^p$, which can be obtained from the right-hand side above by replacing θ₀ with −θ₀.
(c) The fractional absolute and signed moments are
$$E|X|^p = \frac{\gamma^p}{(\cos\alpha\theta_0)^{p/\alpha}}\,\frac{\Gamma(1-p/\alpha)\cos p\theta_0}{\Gamma(1-p)\cos(p\pi/2)} \qquad EX^{\langle p\rangle} = \frac{\gamma^p}{(\cos\alpha\theta_0)^{p/\alpha}}\,\frac{\Gamma(1-p/\alpha)\sin p\theta_0}{\Gamma(1-p)\sin(p\pi/2)}.$$
When α < 1 and |β| = 1, the results hold true for all p < α.
(d) For admissible values of p,
$$\frac{EX^{\langle p\rangle}}{E|X|^p} = \frac{\tan(p\theta_0)}{\tan(p\pi/2)} = \frac{\tan\big(\frac{p}{\alpha}\arctan(\beta\tan\frac{\pi\alpha}{2})\big)}{\tan(p\pi/2)}.$$

The right-hand side of Corollary 3.5(a), and similar expressions, should be interpreted specially in certain cases, using the identity sin pπ = 2 sin(pπ/2) cos(pπ/2):
- when α = 1 and β = 0 (so θ₀ = 0), $\sin(p(\pi/2+\theta_0))/\sin(p\pi) = \tfrac12\sec(p\pi/2)$;
- when p = 0, $\lim_{p\to0}\sin(p(\pi/2+\theta_0))/\sin(p\pi) = \frac{1}{\pi}\big(\frac{\pi}{2}+\theta_0\big) = 1 - F(0|\alpha,\beta;1)$;
- when α > 1 and p = 1, the right-hand side is interpreted as $\dfrac{\Gamma(1-1/\alpha)\sin(\pi/2+\theta_0)}{2(\cos\alpha\theta_0)^{1/\alpha}\lim_{p\to1}\Gamma(1-p)\cos(p\pi/2)} = \dfrac{\Gamma(1-1/\alpha)\sin(\pi/2+\theta_0)}{\pi(\cos\alpha\theta_0)^{1/\alpha}}$;
- when α < 1, β = 1, and p = −n, the right-hand side reduces to $\dfrac{\Gamma(1-p/\alpha)}{\Gamma(1-p)(\cos\frac{\alpha\pi}{2})^{p/\alpha}}$.
There are two cases of special interest. In the symmetric case, for any 0 < α < 2, X ∼ S(α, 0, γ, 0; 1), and −1 < p < α,
$$E|X|^p = \begin{cases} \gamma^p\,\dfrac{\Gamma(1-p/\alpha)}{\Gamma(1-p)\cos(p\pi/2)} & p \ne 1 \\[4pt] \dfrac{2}{\pi}\,\gamma\,\Gamma(1-1/\alpha) & p = 1. \end{cases} \tag{3.49}$$
Since $\lim_{p\to1}\Gamma(1-p)\cos(p\pi/2) = \pi/2$, the top case tends to the p = 1 case. The second special case is 0 < α < 1 and X ∼ S(α, 1, (cos πα/2)^{1/α}, 0; 1) = S(α, 1, 1; 7), where X is positive and θ₀ = π/2, so
$$EX^p = E|X|^p = EX^{\langle p\rangle} = \frac{\Gamma(1-p/\alpha)}{\Gamma(1-p)}, \qquad -\infty < p < \alpha. \tag{3.50}$$
If α > 1, p = 1 is possible above. In these cases the product Γ(1−p) sin(pπ) in the denominator of (a) is interpreted as the limiting value π. In (c), Γ(1−p) cos(pπ/2) → π/2 as p → 1 while |Γ(1−p) sin(pπ/2)| → ∞, and the right-hand side of the expression for $EX^{\langle p\rangle}$ is interpreted as 0, which is EX because X is strictly stable. When p → 0, sin(pθ₀)/sin(πp/2) is interpreted as the limiting value 2θ₀/π. Theorem 2.6.3 of Zolotarev (1986) shows that the above result holds for complex numbers p: the first part holds for −1 < Re p < α, the α < 1, β = 1 case holds for all Re p < α, and the α < 1, β = −1 case holds for all p.

Proof (a) Note that when X is strictly stable, δ* = 0. First assume 0 < p < min(1, α) and substitute Lemma 3.12 into this case of Theorem 3.8:
$$\begin{aligned}
EX_+^p &= \frac{\gamma^p\Gamma(p+1)}{\pi}\Bigg[\sin(\tfrac{\pi p}{2})\left(\frac{\Gamma(1-p/\alpha)}{p} - \frac{\Gamma(1-p/\alpha)}{-p}\big((\cos\alpha\theta_0)^{-p/\alpha}\cos(-p\theta_0)-1\big)\right) \\
&\qquad\qquad - \cos(\tfrac{\pi p}{2})\,\frac{\Gamma(1-p/\alpha)}{-p}\big({-(\cos\alpha\theta_0)^{-p/\alpha}}\sin(-p\theta_0)\big)\Bigg] \\
&= \frac{\gamma^p\,\Gamma(p+1)\,\Gamma(1-p/\alpha)}{\pi p\,(\cos\alpha\theta_0)^{p/\alpha}}\left[\sin(\tfrac{\pi p}{2})\cos(p\theta_0) + \cos(\tfrac{\pi p}{2})\sin(p\theta_0)\right] \\
&= \frac{\gamma^p\,\Gamma(p+1)\,\Gamma(1-p/\alpha)}{\pi p\,(\cos\alpha\theta_0)^{p/\alpha}}\,\sin(\pi p/2 + p\theta_0).
\end{aligned}$$
Using the identity $\Gamma(p+1) = \pi p/(\Gamma(1-p)\sin p\pi)$ gives the result. When p = 1 < α, again using the appropriate part of Theorem 3.8 shows
$$EX_+ = \gamma\left[0 + \frac{1}{\pi}\left(\Gamma(1-1/\alpha) - \frac{\Gamma(1-1/\alpha)}{-1}\big((\cos\alpha\theta_0)^{-1/\alpha}\cos(-\theta_0)-1\big)\right)\right] = \frac{\gamma\,\Gamma(1-1/\alpha)}{\pi}\,(\cos\alpha\theta_0)^{-1/\alpha}\cos\theta_0.$$
When 1 < p < α, using Theorem 3.8 and δ* = 0,
$$\begin{aligned}
EX_+^p &= \frac{\gamma^p\Gamma(p+1)}{\pi}\Bigg[\sin(\tfrac{\pi p}{2})\left(\frac{\Gamma(1-p/\alpha)}{p} - \frac{\Gamma(1-p/\alpha)}{-p}\big((\cos\alpha\theta_0)^{-p/\alpha}\cos(-p\theta_0)-1\big)\right) \\
&\qquad\qquad + \cos(\tfrac{\pi p}{2})\left(0 + \frac{\Gamma(1-p/\alpha)}{-p}(\cos\alpha\theta_0)^{-p/\alpha}\sin(-p\theta_0)\right)\Bigg],
\end{aligned}$$
and the rest is like the first case. The case where −1 < p < 0 is similar.
(b) Replace β with −β in (a).
(c) Use the method in the proof of Corollary 3.4. The identity sin(A+B) + sin(A−B) = 2 sin A cos B shows
$$\sin(\pi p/2 + p\theta_0) + \sin(\pi p/2 - p\theta_0) = 2\sin(p\pi/2)\cos(p\theta_0)$$
$$\sin(\pi p/2 + p\theta_0) - \sin(\pi p/2 - p\theta_0) = 2\cos(p\pi/2)\sin(p\theta_0)$$
$$\sin p\pi = 2\sin(p\pi/2)\cos(p\pi/2).$$
Using these and some simplifications gives (c).

Another consequence of Theorem 3.8 is the following.
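Both special cases can be checked numerically. The sketch below (our code, not the book's R program) verifies (3.49) for α = 1.5 two ways: for p = 1/2 by Monte Carlo, using the standard Chambers-Mallows-Stuck representation of a symmetric stable r.v., and for the negative order p = −1/2 by quadrature, where in the symmetric case the integral appearing in Corollary 3.4 reduces to $g_{1/2}(0|\alpha,0) = \int_0^\infty u^{-1/2}e^{-u^\alpha}\,du$ (this integrand is our reading of the conventions used in the proofs above):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as G

alpha, p = 1.5, 0.5
# Closed form (3.49): E|X|^p for X ~ S(alpha, 0, 1, 0; 1), p != 1
exact = G(1 - p / alpha) / (G(1 - p) * np.cos(np.pi * p / 2))

# Check 1: Monte Carlo via the Chambers-Mallows-Stuck sampler (beta = 0)
rng = np.random.default_rng(42)
n = 400_000
V = rng.uniform(-np.pi / 2, np.pi / 2, n)
W = rng.exponential(1.0, n)
X = (np.sin(alpha * V) / np.cos(V) ** (1 / alpha)
     * (np.cos((1 - alpha) * V) / W) ** ((1 - alpha) / alpha))
mc = np.mean(np.abs(X) ** p)
assert abs(mc - exact) < 0.02

# Check 2: p = -1/2 through the g-integral form of Corollary 3.4
q = -0.5
g_half = quad(lambda u: u ** (-0.5) * np.exp(-u ** alpha), 0, np.inf)[0]
cor34 = -(2 * G(q + 1) / np.pi) * np.sin(np.pi * q / 2) * g_half
exact_q = G(1 - q / alpha) / (G(1 - q) * np.cos(np.pi * q / 2))
assert abs(cor34 - exact_q) < 1e-6
```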
Corollary 3.6 If X ∼ S(α, β, γ, δ; 1) with 0 < α < 2 and −1 ≤ β ≤ 1, then the moment generating functions of log|X|, log X₊, and log X₋ are
$$M(t) = E\exp(t\log|X|) = m_t(\alpha,\beta,\gamma,\delta) + m_t(\alpha,-\beta,\gamma,-\delta)$$
$$M_+(t) = E\exp(t\log X_+) = m_t(\alpha,\beta,\gamma,\delta)$$
$$M_-(t) = E\exp(t\log X_-) = m_t(\alpha,-\beta,\gamma,-\delta),$$
for −1 < t < α.

The 1-parameterization used above is discontinuous in the parameters near α = 1, and it is not a location-scale family when α = 1. To avoid this, the 0-parameterization can be used. Using (1.3) shows that if X ∼ S(α, β, γ, δ₀; 0), then
$$EX_+^p = \begin{cases} m_p(\alpha,\beta,\gamma,\delta_0 - \beta\gamma\tan\frac{\pi\alpha}{2}) & \alpha \ne 1 \\ m_p(\alpha,\beta,\gamma,\delta_0 - (2/\pi)\beta\gamma\log\gamma) & \alpha = 1. \end{cases}$$
This quantity is continuous in all parameters.

For the above expressions for $EX_+^p$ to be of practical use, one must evaluate $g_d(\cdot|\alpha,\beta)$ and $\widehat{g}_d(\cdot|\alpha,\beta)$. When d is a nonnegative integer, Nolan (2019c) gives Zolotarev-type integral expressions for these functions. However, this is not helpful here, where negative and non-integer values of d are needed. A short R program to numerically evaluate the defining integrals for $g_d(\cdot|\alpha,\beta)$ and $\widehat{g}_d(\cdot|\alpha,\beta)$ exists. A single evaluation takes less than 0.2 milliseconds on a modern desktop. This is faster than numerically evaluating $EX_+^p = \int_0^\infty x^p f(x|\alpha,\beta,\gamma,\delta)\,dx$, because the latter requires many numerical evaluations of the density f(x|α, β, γ, δ).

Next, some integral transforms of stable laws are given: Laplace transforms, Mellin transforms, and characteristic transforms. The Laplace transform of a r.v. X with density f(x) is
$$E\exp(-uX) = \int_{-\infty}^{\infty} \exp(-ux)\,f(x)\,dx, \qquad u > 0.$$
For a stable r.v. with β = 1, the left tail is light enough that this integral converges. In every other case the left tail is a power law and the integral diverges.

Proposition 3.2 Let X ∼ S(α, 1, γ, 0; 1), 0 < α ≤ 2. The Laplace transform exists and is given by
$$E\exp(-uX) = \begin{cases} \exp\big(-\gamma^\alpha(\sec\tfrac{\pi\alpha}{2})u^\alpha\big) & \alpha \in (0,1)\cup(1,2] \\ \exp\big(-\gamma\tfrac{2}{\pi}u\log u\big) & \alpha = 1, \end{cases}$$
for any u > 0.

Proof Let X₁, X₂, ... be i.i.d. Pareto(α, 1) r.v.s. First consider α ≠ 1. Set $Y_j = c(X_j - \alpha/(\alpha-1))$, where (using the reflection identity for the gamma function)
$$c = \gamma\left(\cos\tfrac{\pi\alpha}{2}\,\Gamma(1-\alpha)\right)^{-1/\alpha} = \gamma\left(\frac{\pi}{2\sin(\pi\alpha/2)\Gamma(\alpha)}\right)^{-1/\alpha} > 0.$$
Lemma 3.5 shows $S_n = (Y_1+\cdots+Y_n)/n^{1/\alpha} \xrightarrow{d} \mathbf{S}(\alpha,1,\gamma,0;1)$. By Lemma 7.2, the Laplace transform of $Y_j$ near the origin is
$$E\exp(-uY_j) = E\exp(-ucX_j)\,e^{uc\alpha/(\alpha-1)} = \left(1 - (uc)^\alpha\Gamma(1-\alpha) - \frac{uc\alpha}{\alpha-1} + O(u^2)\right)\left(1 + \frac{uc\alpha}{\alpha-1} + O(u^2)\right) = 1 - u^\alpha\gamma^\alpha\sec\tfrac{\pi\alpha}{2} + o(u^\alpha).$$
Hence, the Laplace transform of $S_n$ is
$$E\exp(-uS_n) = E\exp\Big({-un^{-1/\alpha}}\sum_{j=1}^n Y_j\Big) = \left(E\exp(-un^{-1/\alpha}Y_1)\right)^n = \left(1 - \frac{u^\alpha\gamma^\alpha\sec\frac{\pi\alpha}{2}}{n} + o\Big(\frac1n\Big)\right)^n.$$
As n → ∞, this converges to $\exp(-\gamma^\alpha(\sec\pi\alpha/2)u^\alpha)$. The α = 1 case is similar, see Problem 3.29. Finally, the α = 2 case is just the moment generating function of a N(0, 2γ²) law.

The result is also true if u is complex with Re u > 0. Note that the constant sec(πα/2) is positive if 0 < α < 1 and negative if 1 < α ≤ 2. In particular, a S(α, 1, |cos πα/2|^{1/α}, 0; 1) law has Laplace transform
$$E\exp(-uX) = \begin{cases} \exp(-u^\alpha) & 0 < \alpha < 1 \\ \exp(u^\alpha) & 1 < \alpha \le 2. \end{cases} \tag{3.51}$$
Differentiation of the Laplace transform shows that if f(x) is the density of a S(α, 1, γ, 0; 1) law with 0 < α ≤ 2, then for u > 0,
$$\int_{-\infty}^{\infty} x^n f(x)e^{-ux}\,dx = (-1)^n\,\frac{d^n}{du^n}\,E\exp(-uX),$$
which can be computed explicitly from Proposition 3.2. For X ∼ S(α, 1, γ, 0; 1), the cumulant generating function of X is
$$\log E\exp(-uX) = \begin{cases} -\sec(\tfrac{\pi\alpha}{2})\,\gamma^\alpha u^\alpha & \alpha \ne 1 \\ -\gamma\tfrac{2}{\pi}u\log u & \alpha = 1, \end{cases} \qquad u \ge 0.$$
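The form (3.51) for α < 1 can be spot-checked by simulation. Kanter's representation (not derived here; we use it only as an assumed sampling recipe) writes a positive strictly stable r.v. with Laplace transform exp(−u^α) as a function of a uniform and an exponential variate:

```python
import numpy as np

def positive_stable(alpha, size, rng):
    """Kanter's representation of a positive strictly stable r.v. with
    Laplace transform exp(-u**alpha), 0 < alpha < 1 (the S(alpha,1,1;7) law)."""
    T = rng.uniform(0.0, np.pi, size)
    W = rng.exponential(1.0, size)
    return (np.sin(alpha * T) / np.sin(T) ** (1.0 / alpha)
            * (np.sin((1.0 - alpha) * T) / W) ** ((1.0 - alpha) / alpha))

rng = np.random.default_rng(7)
alpha = 0.6
X = positive_stable(alpha, 200_000, rng)
# Empirical Laplace transform vs exp(-u**alpha) at a few points
for u in (0.5, 1.0, 2.0):
    assert abs(np.mean(np.exp(-u * X)) - np.exp(-u ** alpha)) < 0.01
```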
We end this section with a short discussion of Mellin transforms and characteristic transforms, as discussed in Section 2.6 of Zolotarev (1986) and Sections 5.6 and 5.7 of Uchaikin and Zolotarev (1999).

The Mellin transform of a positive random variable is defined by $\mathcal{M}_X(u) = E(X^u)$, see Section A.6. The Mellin transform is defined for complex u in some vertical strip or half-plane. It uniquely determines the distribution of X and has an inverse transform. Corollary 3.5 gives the Mellin transform of the absolute value of a strictly stable r.v.:
$$\mathcal{M}_{|X|}(u) = E|X|^u = \frac{\gamma^u}{(\cos\alpha\theta_0)^{u/\alpha}}\,\frac{\Gamma(1-u/\alpha)\cos u\theta_0}{\Gamma(1-u)\cos(u\pi/2)}, \qquad -1 < \operatorname{Re} u < \alpha. \tag{3.52}$$
Note the simplification in the symmetric case (3.49) or in the positive strictly stable case (3.50). Corollary 3.4 gives the Mellin transform of |X| for a general (non-strictly) stable random variable.

To deal with random variables having both positive and negative values, there are two approaches. For the first approach, Springer (1979) decomposes a general random variable X into X₊ and X₋; knowing the Mellin transforms of X₊ and X₋ determines the distribution of X. Theorem 3.8 gives these two Mellin transforms for strictly stable random variables. Equivalently, since X₊ and X₋ have disjoint support, $E|X|^u = E(X_+)^u + E(X_-)^u$ and $EX^{\langle u\rangle} = E(X_+)^u - E(X_-)^u$, so knowing the absolute and signed moments determines the distribution of X. Note that if X is symmetric, then $E|X|^u = 2E(X_+)^u$ and $EX^{\langle u\rangle} = 0$.

For the second approach, Zolotarev uses the following related concept.

Definition 3.12 The characteristic transform of a random variable X is the 2 × 2 complex matrix function
$$W(u) = \begin{pmatrix} E|X|^{iu} & 0 \\ 0 & EX^{\langle iu\rangle} \end{pmatrix}.$$
For X ∼ S(α, β, γ, 0; 1), Problem 3.7 shows that
$$W(u) = \frac{\gamma^{iu}}{(\cos\alpha\theta_0)^{iu/\alpha}}\,\frac{\Gamma(1-iu/\alpha)}{\Gamma(1-iu)}\begin{pmatrix} \dfrac{\cos iu\theta_0}{\cos(iu\pi/2)} & 0 \\ 0 & \dfrac{\sin iu\theta_0}{\sin(iu\pi/2)} \end{pmatrix}. \tag{3.53}$$
Both of these integral transforms are used to analyze products and ratios of random variables. For example, if X and Y are independent, then M XY (u) = M X (u)MY (u), M X/Y (u) = M X (u)MY (−u), WXY (u) = WX (u)WY (u), and WX/Y (u) = WX (u)WY (−u), where the last two equations use matrix products. See Epstein (1948), Springer (1979) and Zolotarev (1957) for some general properties.
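The product and ratio rules combine neatly with (3.50). For X and Y i.i.d. positive strictly stable with $\mathcal{M}_X(u) = \Gamma(1-u/\alpha)/\Gamma(1-u)$, the ratio has Mellin transform $\mathcal{M}_{X/Y}(u) = \mathcal{M}_X(u)\mathcal{M}_Y(-u)$. The following Monte Carlo sketch (again using Kanter's representation as an assumed sampling recipe) checks this:

```python
import numpy as np
from scipy.special import gamma as G

def positive_stable(alpha, size, rng):
    # Kanter's representation: Laplace transform exp(-u**alpha), 0 < alpha < 1
    T = rng.uniform(0.0, np.pi, size)
    W = rng.exponential(1.0, size)
    return (np.sin(alpha * T) / np.sin(T) ** (1.0 / alpha)
            * (np.sin((1.0 - alpha) * T) / W) ** ((1.0 - alpha) / alpha))

alpha, p = 0.7, 0.2
rng = np.random.default_rng(1)
X = positive_stable(alpha, 200_000, rng)
Y = positive_stable(alpha, 200_000, rng)

# M_{X/Y}(p) = M_X(p) * M_Y(-p), with M(u) = Gamma(1-u/alpha)/Gamma(1-u) by (3.50)
mellin = (G(1 - p / alpha) / G(1 - p)) * (G(1 + p / alpha) / G(1 + p))
ratio_mc = np.mean((X / Y) ** p)
assert abs(ratio_mc - mellin) < 0.02
```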
3.8 Convergence of stable laws in terms of (α, β, γ, δ)

In this section we consider how stable laws vary as the parameters (α, β, γ, δ) vary. First we consider α bounded away from 0, then examine what happens as α → 0. In Section 3.13 we discuss the issue of when normalized sums of non-stable terms converge to a stable law. It is useful to have a way of quantifying how close two stable distributions are. Here is one definition using the S(α, β, γ, δ; 0) parameterization.
Definition 3.13 Let $X_j \sim \mathbf{S}(\alpha_j,\beta_j,\gamma_j,\delta_j;0)$, j = 1, 2. Define the distance between X₁ and X₂ by
$$\Delta(X_1,X_2) = |\alpha_1-\alpha_2| + |\beta_1-\beta_2| + |\gamma_1-\gamma_2| + |\delta_1-\delta_2|.$$
Δ(X₁, X₂) is a metric because the right-hand side is a metric on ℝ⁴.

Theorem 3.9 Let X ∼ S(α, β, γ, δ; 0) and $X_n \sim \mathbf{S}(\alpha_n,\beta_n,\gamma_n,\delta_n;0)$, n = 1, 2, 3, ....
(a) $X_n \xrightarrow{d} X$ if and only if $\Delta(X_n,X) \to 0$.
(b) For any feasible sets of parameters $(\alpha_j,\beta_j,\gamma_j,\delta_j)$, j = 1, 2,
$$|f(x|\alpha_1,\beta_1,\gamma_1,\delta_1;0) - f(x|\alpha_2,\beta_2,\gamma_2,\delta_2;0)| < c_1\,\Delta(X_1,X_2),$$
where $c_1 = c_1(\min(\alpha_1,\alpha_2),\min(\gamma_1,\gamma_2))$ is a positive constant.

Proof This is the one-dimensional version of Theorem 1.2 in Nolan (2010), where an explicit value of $c_1(\alpha,\gamma)$ is given; it is uniformly continuous on compact subsets of (0, 2] × (0, ∞).

The above properties of the distance do not hold uniformly in α if the S(α, β, γ, δ; 1) parameterization is used, because of the discontinuity at α = 1. It is possible to give a similar definition in any parameterization based on the S(α, β, γ, δ; 0) parameterization, but not in other parameterizations.

We now discuss what happens as α approaches 0. The graphs of stable d.f.s in Figure 1.4 suggest that as α ↓ 0, stable distributions converge to a degenerate distribution. This is indeed what happens, and it is stated precisely below. On the other hand, $|X|^\alpha$ converges to a continuous limit as α ↓ 0. To preserve the sign of X, we also look at the signed power $X^{\langle\alpha\rangle}$. These latter results were first proved by Cressie (1975).

Lemma 3.17 Fix β ∈ [−1, 1]. As α → 0, $Z(\alpha,\beta) \sim \mathbf{S}(\alpha,\beta;1) \xrightarrow{d} Z(0,\beta)$, where Z(0, β) is an (improper) discrete r.v. with
$$Z(0,\beta) = \begin{cases} -\infty & \text{with probability } (1-e^{-1})(1-\beta)/2 \\ 0 & \text{with probability } e^{-1} \\ +\infty & \text{with probability } (1-e^{-1})(1+\beta)/2. \end{cases}$$
Also, as α → 0,
$$Z(\alpha,\beta)^{\langle\alpha\rangle} \xrightarrow{d} I/E,$$
where E ∼ Exponential(1) and I = ±1 with respective probabilities (1 ± β)/2, with E and I independent. In particular, as α → 0, both $Z(\alpha,1)^\alpha$ and $|Z(\alpha,\beta)|^\alpha$ (for any β ∈ [−1, 1]) converge in distribution to 1/E, a Fréchet(ξ = 1, μ = 0, σ = 1) distribution (see Section 7.3.1), and $Z(\alpha,0)^{\langle-\alpha\rangle}$ converges to a Laplace distribution.
Proof Theorem 3.4 shows that for α < 1 and any x > 0,
$$F(x|\alpha,\beta;1) = \frac{1}{\pi}\left(\frac{\pi}{2}-\theta_0\right) + \frac{\operatorname{sign}(1-\alpha)}{\pi}\int_{-\theta_0}^{\pi/2}\exp\left(-x^{\frac{\alpha}{\alpha-1}}\,V(\theta|\alpha,\beta)\right)d\theta.$$
As α → 0, $\theta_0(\alpha,\beta) \to \beta\frac{\pi}{2}$, $x^{\alpha/(\alpha-1)} \to 1$, and $V(\theta|\alpha,\beta) \to 1$ pointwise. Thus the above tends to
$$\frac{1}{\pi}\left(\frac{\pi}{2}-\beta\frac{\pi}{2}\right) + \frac{1}{\pi}\int_{-\beta\pi/2}^{\pi/2} e^{-1}\,d\theta = \frac{1-\beta}{2} + \frac{1+\beta}{2}\,e^{-1}.$$
This is independent of 0 < x < ∞, so the limit function is constant on (0, ∞). By the reflection property, $F(-x|\alpha,\beta;1) = 1 - F(x|\alpha,-\beta;1) \to ((1-\beta)/2)(1-e^{-1})$; again the limit is constant on (−∞, 0). There is a jump of size $e^{-1}$ at 0. The d.f. of I/E is
$$G(x|0,\beta) = \begin{cases} \frac{1-\beta}{2} - \frac{1-\beta}{2}\,e^{1/x} & x < 0 \\[2pt] \frac{1-\beta}{2} & x = 0 \\[2pt] \frac{1-\beta}{2} + \frac{1+\beta}{2}\,e^{-1/x} & x > 0. \end{cases}$$
For x > 0 and α → 0,
$$P\big(Z(\alpha,\beta)^{\langle\alpha\rangle} \le x\big) = F(x^{1/\alpha}|\alpha,\beta;1) \to \frac{1}{\pi}\left(\frac{\pi}{2}-\beta\frac{\pi}{2}\right) + \frac{1}{\pi}\int_{-\beta\pi/2}^{\pi/2}\exp(-x^{-1})\,d\theta = \frac{1-\beta}{2} + \frac{1+\beta}{2}\,e^{-x^{-1}}.$$
By reflection, $P\big(Z(\alpha,\beta)^{\langle\alpha\rangle} \le -x\big) = F(-x^{1/\alpha}|\alpha,\beta;1) = 1 - F(x^{1/\alpha}|\alpha,-\beta;1) \to ((1-\beta)/2)(1-e^{-x^{-1}})$.

This result gives approximations for a stable d.f. and density when α is small, where they are numerically difficult to compute: for α near 0,
$$F(x|\alpha,\beta;1) \approx G(x^{\langle\alpha\rangle}|0,\beta)$$
$$f(x|\alpha,\beta;1) \approx \begin{cases} \alpha(1-\beta)\exp(-1/|x|^\alpha)/(2|x|^{\alpha+1}) & x < 0 \\ \alpha(1+\beta)\exp(-1/x^\alpha)/(2x^{\alpha+1}) & x > 0. \end{cases} \tag{3.54}$$
Using the fact that the 0-parameterization is a shift of the 1-parameterization by $\beta\gamma\tan\frac{\pi\alpha}{2}$, and tan(πα/2) → 0 as α → 0, it is natural to define S(0, β; 0) as equal to S(0, β; 1), and S(0, β, γ, δ; k), k = 0, 1, as the scale and shift γZ(0, β) + δ. With this definition, Problem 3.8 shows the following.

Corollary 3.7 Suppose $X_n \sim \mathbf{S}(\alpha_n,\beta_n,\gamma_n,\delta_n;k)$ for n = 1, 2, ..., with k = 0 or k = 1. If $\alpha_n \to 0$, $\beta_n \to \beta$, $\gamma_n \to \gamma$ and $\delta_n \to \delta$, then $X_n \xrightarrow{d} \gamma Z(0,\beta) + \delta \sim \mathbf{S}(0,\beta,\gamma,\delta;k)$.
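The density approximation in (3.54) is a genuine probability density: by the substitution $t = |x|^{-\alpha}$, the negative and positive halves integrate exactly to (1 − β)/2 and (1 + β)/2, so the total mass is 1. A quick quadrature check (our code):

```python
import numpy as np
from scipy.integrate import quad

alpha, beta = 0.5, 0.5

def f_approx(x):
    # small-alpha density approximation (3.54)
    if x > 0:
        return alpha * (1 + beta) * np.exp(-x ** -alpha) / (2 * x ** (alpha + 1))
    return alpha * (1 - beta) * np.exp(-abs(x) ** -alpha) / (2 * abs(x) ** (alpha + 1))

# split each half at |x| = 1 so quad handles the slow power-law tail separately
pos = quad(f_approx, 0, 1)[0] + quad(f_approx, 1, np.inf)[0]
neg = quad(f_approx, -1, 0)[0] + quad(f_approx, -np.inf, -1)[0]
assert abs(pos - (1 + beta) / 2) < 1e-4
assert abs(neg - (1 - beta) / 2) < 1e-4
assert abs(pos + neg - 1.0) < 1e-4
```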
There are also approximations for stable laws as α → 2, and for the difference between S(α, β; 0) laws and a S(1, β; 0) law as α → 1; see Section 2.9 of Zolotarev (1986) and Section 4.8 of Uchaikin and Zolotarev (1999), respectively.
Finally, for some applications it is convenient to use $X_{\alpha,\beta,\gamma} \sim \mathbf{S}(\alpha,\beta,\gamma(\cos\frac{\pi\alpha}{2})^{1/\alpha},0;1)$. With this choice of scale, we get convergence to a degenerate r.v.:
$$X_{\alpha,\beta,\gamma} \xrightarrow{d} \beta\gamma \qquad\text{as }\alpha \to 1. \tag{3.55}$$
This is true because $\log E\exp(iuX_{\alpha,\beta,\gamma})$ is
$$-\left(\gamma(\cos\tfrac{\pi\alpha}{2})^{1/\alpha}|u|\right)^\alpha\left[1 - i\beta\tan\tfrac{\pi\alpha}{2}\,\operatorname{sign} u\right] = -(\gamma|u|)^\alpha\left[\cos\tfrac{\pi\alpha}{2} - i\beta\sin\tfrac{\pi\alpha}{2}\,\operatorname{sign} u\right] \to -\gamma|u|\,[0 - i\beta\cdot1\cdot\operatorname{sign} u] = i\beta\gamma u,$$
as α → 1. Compare to (3.38). In particular, if α < 1, $X_{\alpha,1,\gamma} \sim \mathbf{S}(\alpha,1,\gamma(\cos\frac{\pi\alpha}{2})^{1/\alpha},0;1) = \mathbf{S}(\alpha,1,\gamma;7)$ is a positive stable r.v. and $X_{\alpha,1,\gamma} \to \gamma$ as α ↑ 1.
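The convergence (3.55) can be seen numerically from the log-characteristic function computation above: as α ↑ 1, the log-cf approaches iβγu, the log-cf of the point mass at βγ. A small sketch (our code):

```python
import numpy as np

def log_cf(u, alpha, beta, gamma):
    # log E exp(iuX) for X ~ S(alpha, beta, gamma*(cos(pi*alpha/2))**(1/alpha), 0; 1)
    scale = gamma * np.cos(np.pi * alpha / 2) ** (1 / alpha)
    return (-(scale * abs(u)) ** alpha
            * (1 - 1j * beta * np.tan(np.pi * alpha / 2) * np.sign(u)))

beta, gamma, u = 0.7, 2.0, 1.3
target = 1j * beta * gamma * u        # log-cf of the point mass at beta*gamma

errs = [abs(log_cf(u, a, beta, gamma) - target) for a in (0.9, 0.99, 0.999)]
# the error shrinks as alpha approaches 1
assert errs[0] > errs[1] > errs[2]
assert errs[2] < 0.01
```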
3.9 Combinations of stable random variables

This section describes how some combinations of stable random variables result in other stable random variables. See the next section for ways in which stable laws arise from other distributions. We start with the proofs of the properties of linear combinations of stable random variables.

Proof of Proposition 1.3 (a) If X ∼ S(α, β, γ, δ; 0) with α ≠ 1, then aX + b has characteristic function
$$\begin{aligned}
\varphi_{aX+b}(u) &= \varphi_X(au)\exp(ibu) \\
&= \exp\left(-\gamma^\alpha|au|^\alpha\left[1 + i\beta\tan(\tfrac{\pi\alpha}{2})(\operatorname{sign} au)(|\gamma au|^{1-\alpha}-1)\right] + i\delta au + ibu\right) \\
&= \exp\left(-(|a|\gamma)^\alpha|u|^\alpha\left[1 + i((\operatorname{sign} a)\beta)\tan(\tfrac{\pi\alpha}{2})(\operatorname{sign} u)\big(|(|a|\gamma)u|^{1-\alpha}-1\big)\right] + i(a\delta+b)u\right) \\
&\sim \mathbf{S}(\alpha,(\operatorname{sign} a)\beta,|a|\gamma,a\delta+b;0).
\end{aligned}$$
In a similar manner, when α = 1,
$$\begin{aligned}
\varphi_{aX+b}(u) &= \varphi_X(au)\exp(ibu) = \exp\left(-\gamma|au|\left[1 + i\beta(2/\pi)(\operatorname{sign} au)\log(\gamma|au|)\right] + i\delta au\right)\exp(ibu) \\
&= \exp\left(-(|a|\gamma)|u|\left[1 + i((\operatorname{sign} a)\beta)(2/\pi)(\operatorname{sign} u)\log((|a|\gamma)|u|)\right] + i(a\delta+b)u\right) \\
&\sim \mathbf{S}(1,(\operatorname{sign} a)\beta,|a|\gamma,a\delta+b;0).
\end{aligned}$$
(b) The joint continuity in all four parameters follows from Lemma 3.7(c).
(c) Let $X_1 \sim \mathbf{S}(\alpha,\beta_1,\gamma_1,\delta_1;0)$ and $X_2 \sim \mathbf{S}(\alpha,\beta_2,\gamma_2,\delta_2;0)$. To find the distribution of X₁ + X₂, first consider the case α ≠ 1. Set $\gamma^\alpha = \gamma_1^\alpha+\gamma_2^\alpha$, $\beta = (\beta_1\gamma_1^\alpha+\beta_2\gamma_2^\alpha)/\gamma^\alpha$, and $\delta = \delta_1+\delta_2+\tan(\tfrac{\pi\alpha}{2})(\beta\gamma-\beta_1\gamma_1-\beta_2\gamma_2)$. Then by independence,
$$\begin{aligned}
\varphi_{X_1+X_2}(u) &= \varphi_{X_1}(u)\,\varphi_{X_2}(u) \\
&= \exp\left(-\gamma_1^\alpha|u|^\alpha\big[1+i\beta_1\tan(\tfrac{\pi\alpha}{2})\operatorname{sign} u\,(|\gamma_1 u|^{1-\alpha}-1)\big] + i\delta_1 u\right) \\
&\quad\times\exp\left(-\gamma_2^\alpha|u|^\alpha\big[1+i\beta_2\tan(\tfrac{\pi\alpha}{2})\operatorname{sign} u\,(|\gamma_2 u|^{1-\alpha}-1)\big] + i\delta_2 u\right) \\
&= \exp\left(-\gamma_1^\alpha|u|^\alpha + i\tan(\tfrac{\pi\alpha}{2})\beta_1\gamma_1^\alpha|u|^\alpha\operatorname{sign} u + i(\delta_1-\beta_1\gamma_1\tan(\tfrac{\pi\alpha}{2}))u\right) \\
&\quad\times\exp\left(-\gamma_2^\alpha|u|^\alpha + i\tan(\tfrac{\pi\alpha}{2})\beta_2\gamma_2^\alpha|u|^\alpha\operatorname{sign} u + i(\delta_2-\beta_2\gamma_2\tan(\tfrac{\pi\alpha}{2}))u\right) \\
&= \exp\left(-(\gamma_1^\alpha+\gamma_2^\alpha)|u|^\alpha + i\tan(\tfrac{\pi\alpha}{2})(\beta_1\gamma_1^\alpha+\beta_2\gamma_2^\alpha)|u|^\alpha\operatorname{sign} u + i\big(\delta_1+\delta_2-\tan(\tfrac{\pi\alpha}{2})(\beta_1\gamma_1+\beta_2\gamma_2)\big)u\right) \\
&= \exp\left(-\gamma^\alpha|u|^\alpha + i\tan(\tfrac{\pi\alpha}{2})\beta\gamma^\alpha|u|^\alpha\operatorname{sign} u + i(\delta-\tan(\tfrac{\pi\alpha}{2})\beta\gamma)u\right) \\
&= \exp\left(-\gamma^\alpha|u|^\alpha\big[1+i\beta\tan(\tfrac{\pi\alpha}{2})\operatorname{sign} u\,(|\gamma u|^{1-\alpha}-1)\big] + i\delta u\right).
\end{aligned}$$
This is the characteristic function of a S(α, β, γ, δ; 0) r.v. Likewise, when α = 1,
$$\begin{aligned}
\varphi_{X_1+X_2}(u) &= \varphi_{X_1}(u)\,\varphi_{X_2}(u) \\
&= \exp\left(-\gamma_1|u|\big[1+i\beta_1(2/\pi)\operatorname{sign} u\log|\gamma_1 u|\big]+i\delta_1 u\right)\times\exp\left(-\gamma_2|u|\big[1+i\beta_2(2/\pi)\operatorname{sign} u\log|\gamma_2 u|\big]+i\delta_2 u\right) \\
&= \exp(-\gamma_1|u| - i(2/\pi)\beta_1\gamma_1 u\log|\gamma_1 u| + i\delta_1 u)\times\exp(-\gamma_2|u| - i(2/\pi)\beta_2\gamma_2 u\log|\gamma_2 u| + i\delta_2 u) \\
&= \exp\left(-(\gamma_1+\gamma_2)|u| - i(2/\pi)u\big[\beta_1\gamma_1\log|\gamma_1 u|+\beta_2\gamma_2\log|\gamma_2 u|\big] + i(\delta_1+\delta_2)u\right).
\end{aligned}$$
Set γ = γ₁ + γ₂, β = (β₁γ₁ + β₂γ₂)/γ, and δ = δ₁ + δ₂ + (2/π)[βγ log γ − β₁γ₁ log γ₁ − β₂γ₂ log γ₂]. Then more algebra shows
$$\begin{aligned}
\beta_1\gamma_1\log|\gamma_1 u| + \beta_2\gamma_2\log|\gamma_2 u| &= \beta_1\gamma_1(\log\gamma_1+\log|u|) + \beta_2\gamma_2(\log\gamma_2+\log|u|) \\
&= (\beta_1\gamma_1+\beta_2\gamma_2)\log|u| + \beta_1\gamma_1\log\gamma_1 + \beta_2\gamma_2\log\gamma_2 \\
&= \beta\gamma\log|\gamma u| - \beta\gamma\log\gamma + \beta_1\gamma_1\log\gamma_1 + \beta_2\gamma_2\log\gamma_2.
\end{aligned}$$
Substituting this into the above yields
$$\begin{aligned}
\varphi_{X_1+X_2}(u) &= \exp\left(-\gamma|u| - i(2/\pi)u\big[\beta\gamma\log|\gamma u| - \beta\gamma\log\gamma + \beta_1\gamma_1\log\gamma_1 + \beta_2\gamma_2\log\gamma_2\big] + i(\delta_1+\delta_2)u\right) \\
&= \exp\left(-\gamma|u| - i(2/\pi)u\beta\gamma\log|\gamma u| + i\big(\delta_1+\delta_2+(2/\pi)(\beta\gamma\log\gamma-\beta_1\gamma_1\log\gamma_1-\beta_2\gamma_2\log\gamma_2)\big)u\right) \\
&= \exp\left(-\gamma|u|\big[1+i(2/\pi)\beta\operatorname{sign} u\log|\gamma u|\big] + i\delta u\right).
\end{aligned}$$

The corresponding properties for the 1-, 2-, and 3-parameterizations, namely Proposition 1.4, Proposition 3.3, and Proposition 3.4 below, are proved in a similar manner, see Problem 3.9.

Proposition 3.3 The S(α, β, γ, δ; 2) parameterization has the following properties.
(a) If X ∼ S(α, β, γ, δ; 2), then for any a ≠ 0, b ∈ ℝ, aX + b ∼ S(α, (sign a)β, |a|γ, aδ + b; 2).
(b) The characteristic function, density, and distribution functions are jointly continuous in all four parameters (α, γ, β, δ).
(c) If X₁ ∼ S(α, β₁, γ₁, δ₁; 2) and X₂ ∼ S(α, β₂, γ₂, δ₂; 2) are independent, then X₁ + X₂ ∼ S(α, β, γ, δ; 2), where
$$\beta = \frac{\beta_1\gamma_1^\alpha+\beta_2\gamma_2^\alpha}{\gamma_1^\alpha+\gamma_2^\alpha}, \qquad \gamma^\alpha = \gamma_1^\alpha+\gamma_2^\alpha$$
$$\delta = \delta_1 + \delta_2 + d_2(\alpha,\beta,\gamma) - d_2(\alpha,\beta_1,\gamma_1) - d_2(\alpha,\beta_2,\gamma_2),$$
where
$$d_2(\alpha,\beta,\gamma) = \begin{cases} \alpha^{-1/\alpha}\gamma\big(m(\alpha,\beta)+\beta\tan\frac{\pi\alpha}{2}\big) & \alpha \ne 1 \\ \gamma\big(m(1,\beta)+\beta\frac{2}{\pi}\log\gamma\big) & \alpha = 1. \end{cases}$$
Proposition 3.4 The S(α, β, γ, δ; 3) parameterization has the following properties.
(a) If X ∼ S(α, β, γ, δ; 3), then for any a ≠ 0, b ∈ ℝ, aX + b ∼ S(α, (sign a)β, |a|γ, aδ + b; 3).
(b) The characteristic function, density, and distribution functions are jointly continuous in all four parameters (α, γ, β, δ).
(c) If X₁ ∼ S(α, β₁, γ₁, δ₁; 3) and X₂ ∼ S(α, β₂, γ₂, δ₂; 3) are independent, then X₁ + X₂ ∼ S(α, β, γ, δ; 3), where
$$\beta = \frac{\beta_1\gamma_1^\alpha+\beta_2\gamma_2^\alpha}{\gamma_1^\alpha+\gamma_2^\alpha}, \qquad \gamma^\alpha = \gamma_1^\alpha+\gamma_2^\alpha$$
$$\delta = \delta_1 + \delta_2 + d_3(\alpha,\beta,\gamma) - d_3(\alpha,\beta_1,\gamma_1) - d_3(\alpha,\beta_2,\gamma_2),$$
where
$$d_3(\alpha,\beta,\gamma) = \begin{cases} \alpha^{-1/\alpha}\beta\gamma\big(\kappa(\alpha)+\tan\frac{\pi\alpha}{2}\big) & \alpha \ne 1 \\ \beta\gamma\big(\kappa(1)+\frac{2}{\pi}\log\gamma\big) & \alpha = 1. \end{cases}$$
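The parameter arithmetic for sums is easy to verify numerically. The sketch below (our code) checks the 0-parameterization rule of Proposition 1.3(c) for α ≠ 1 by comparing the product of two characteristic functions with the characteristic function built from the combined parameters:

```python
import numpy as np

def cf0(u, alpha, beta, gamma, delta):
    # Characteristic function of S(alpha, beta, gamma, delta; 0), alpha != 1
    u = np.asarray(u, dtype=float)
    absu = np.abs(u)
    tanpa = np.tan(np.pi * alpha / 2)
    return np.exp(-gamma ** alpha * absu ** alpha
                  * (1 + 1j * beta * tanpa * np.sign(u)
                     * ((gamma * absu) ** (1 - alpha) - 1))
                  + 1j * delta * u)

# Two independent summands and the combined parameters per Proposition 1.3(c)
alpha = 1.5
b1, g1, d1 = 0.5, 1.0, 0.2
b2, g2, d2 = -0.3, 2.0, -1.0
g = (g1 ** alpha + g2 ** alpha) ** (1 / alpha)
b = (b1 * g1 ** alpha + b2 * g2 ** alpha) / g ** alpha
d = d1 + d2 + np.tan(np.pi * alpha / 2) * (b * g - b1 * g1 - b2 * g2)

u = np.array([-2.7, -0.4, 0.3, 1.0, 1.9])
lhs = cf0(u, alpha, b1, g1, d1) * cf0(u, alpha, b2, g2, d2)
rhs = cf0(u, alpha, b, g, d)
assert np.max(np.abs(lhs - rhs)) < 1e-12
```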
None of these parameterizations are ideal from the algebraic point of view. The 0-, 2-, and 3-parameterizations are scale and shift families, the 1-parameterization is not. The 0-, 2-, and 3-parameterizations are jointly continuous in all four parameters, the 1-parameterization is not. The shift parameter of a sum is the sum of the shift parameters in the 1-parameterization, but not in the others. By defining βγ tan(πα/2) α 1 d0 (α, β, γ) = βγ π2 log γ α=1 and d1 (α, β, γ) = 0, the location parameter of the sum X1 + X2 in the k-parameterization, k = 0, 1, 2, 3 can be written in the uniform notation δ1 + δ2 + dk (α, β, γ) − dk (α, β1, γ1 ) − dk (α, β2, γ2 ). Problem 3.10 shows that for each k = 0, 1, 2, 3, as α → 1 dk (α, (β1 γ1α + β2 γ2α )/(γ1α + γ2α ), (γ1α + γ2α )1/α ) − dk (α, β1, γ1 ) − dk (α, β2, γ2 ) → (3.56) dk (1, (β1 γ1 + β2 γ2 )/(γ1 + γ2 ), γ1 + γ2 ) − dk (1, β1, γ1 ) − dk (1, β2, γ2 ). We now give the proof of Proposition 1.5. Proof of Proposition 1.5 (Characterization of strict stability) Let X1, . .. , Xn be
i.i.d. S(α, β, γ, δ; 1). Then by (1.8), X1 + · · · + Xn ∼ S(α, β, n^{1/α}γ, nδ; 1). When α ≠ 1, Proposition 1.4(a) shows n^{1/α}X1 + (nδ − n^{1/α}δ) ∼ S(α, β, n^{1/α}γ, nδ; 1) also. If the terms are strictly stable, we must have nδ − n^{1/α}δ = 0, i.e. δ = 0. When α = 1, Proposition 1.4(a) shows nX1 + (2/π)βγn log n ∼ S(1, β, nγ, nδ; 1), so X1 + · · · + Xn =d nX1 + (2/π)βγn log n for all n > 1, and strict stability requires β = 0. The 0-parameterization formulation follows by (1.3).

Next is another linear property of stable distributions: a stable random variable with any skewness β can be written as a linear combination of two independent stable random variables with skewnesses β1 < β < β2. The particular case where β1 = −1 and β2 = +1 shows that any skewness can be achieved as a sum of these extreme points.

Lemma 3.18 If X1 ∼ S(α, β1, γ, 0; k) and X2 ∼ S(α, β2, γ, 0; k) are independent, and β1 < β < β2, then

((β2 − β)/(β2 − β1))^{1/α} X1 + ((β − β1)/(β2 − β1))^{1/α} X2 ∼ S(α, β, γ, δk; k),

where δk = δk(α, β1, β2, γ) is specified in the proof. In particular, taking β1 = −1 and β2 = +1,
3 Technical Results for Univariate Stable Distributions
((1 − β)/2)^{1/α} X1 + ((1 + β)/2)^{1/α} X2 ∼ S(α, β, γ, δk; k).
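Before the proof, the weights in Lemma 3.18 can be checked against the skewness- and scale-combination rules of Proposition 1.3(c); a small sketch (the function names and the sample parameter values are ours):

```python
def combined_beta(alpha, beta1, gamma1, beta2, gamma2):
    # Skewness of the sum of independent alpha-stable terms (Proposition 1.3(c)).
    return (beta1 * gamma1**alpha + beta2 * gamma2**alpha) / (gamma1**alpha + gamma2**alpha)

def lemma_318_weights(alpha, beta, beta1, beta2):
    # b1, b2 from Lemma 3.18 for beta1 < beta < beta2.
    b1 = ((beta2 - beta) / (beta2 - beta1)) ** (1 / alpha)
    b2 = ((beta - beta1) / (beta2 - beta1)) ** (1 / alpha)
    return b1, b2

alpha, gamma = 1.4, 2.0
beta1, beta2, beta = -1.0, 1.0, 0.37
b1, b2 = lemma_318_weights(alpha, beta, beta1, beta2)
# Y1 = b1*X1 has scale b1*gamma, Y2 = b2*X2 has scale b2*gamma; their sum
# should have skewness exactly beta and scale exactly gamma.
bsum = combined_beta(alpha, beta1, b1 * gamma, beta2, b2 * gamma)
gsum = ((b1 * gamma)**alpha + (b2 * gamma)**alpha) ** (1 / alpha)
print(bsum, gsum)
```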
Proof Set b1 = ((β2 − β)/(β2 − β1))^{1/α}, b2 = ((β − β1)/(β2 − β1))^{1/α}, Y1 = b1X1 and Y2 = b2X2. When k = 0, Proposition 1.3(a) shows Y1 ∼ S(α, β1, b1γ, 0; 0) and Y2 ∼ S(α, β2, b2γ, 0; 0). Then Proposition 1.3(c) shows Y1 + Y2 has skewness

(β1(b1γ)^α + β2(b2γ)^α)/((b1γ)^α + (b2γ)^α) = (β1(β2 − β)/(β2 − β1) + β2(β − β1)/(β2 − β1)) / ((β2 − β)/(β2 − β1) + (β − β1)/(β2 − β1)) = β(β2 − β1)/(β2 − β1) = β
and shift

δ0 = { (β − β1b1 − β2b2) γ tan(πα/2),           α ≠ 1,
     { −(2/π)γ (β1b1 log b1 + β2b2 log b2),     α = 1.
The same approach works for other parameterizations—the skewness of Y1 + Y2 is always β, but the shift varies. When k = 1 and α ≠ 1, Proposition 1.4 shows Y1 ∼ S(α, β1, b1γ, 0; 1) and Y2 ∼ S(α, β2, b2γ, 0; 1); however, when α = 1, one has Y1 ∼ S(1, β1, b1γ, −(2/π)β1γb1 log b1; 1) and Y2 ∼ S(1, β2, b2γ, −(2/π)β2γb2 log b2; 1). Thus

δ1 = { 0,   α ≠ 1,
     { −(2/π)γ (β1b1 log b1 + β2b2 log b2),   α = 1,

with b1 = (β2 − β)/(β2 − β1) and b2 = (β − β1)/(β2 − β1) in the α = 1 case. When k = 2 or 3, Proposition 3.3 and Proposition 3.4 show Y1 ∼ S(α, β1, b1γ, 0; k) and Y2 ∼ S(α, β2, b2γ, 0; k), so

δ2 = d2(α, β, γ) − d2(α, β1, b1γ) − d2(α, β2, b2γ),
δ3 = d3(α, β, γ) − d3(α, β1, b1γ) − d3(α, β2, b2γ).

There are also nonlinear relationships among stable random variables. One of them was given in (1.11), namely X = (γ/Z²) + δ ∼ Lévy(γ, δ) if Z ∼ N(0, 1). Since Z² is chi-squared with 1 degree of freedom (= gamma with shape parameter 1/2), this relates Lévy distributions to these families. Other nonlinear properties are given below. These results require strict stability, so it is natural to use the 1-parameterization.

Proposition 3.5 Suppose R is a positive strictly αR-stable r.v. and Y is strictly αY-stable, with R and Y independent. Then X = R^{1/αY} Y is strictly (αRαY)-stable. The precise parameters of the result depend on the values of the parameters. Throughout, let R ∼ S(αR, 1, γR, 0; 1) be positive stable, αR < 1. There are three cases.
(a) If Y ∼ S(αY, βY, γY, 0; 1), where αY ≠ 1 and αRαY ≠ 1, then R^{1/αY} Y is S(αRαY, β, γ, 0; 1), where
θ = (2/(παY)) arctan(βY tan(παY/2)),
β = tan(παRαYθ/2)/tan(παRαY/2),
γ = γY γR^{1/αY} (1 + βY² tan²(παY/2))^{1/(2αY)} (cos(παRαYθ/2)/cos(παR/2))^{1/(αRαY)}.

(b) If αRαY = 1 and Y ∼ S(αY, βY, γY, 0; 1), then R^{1/αY} Y ∼ S(1, 0, γ, δ; 1), where θ is as in (a) and

γ = [γR γY^{αY}(1 + βY² tan²(παY/2))^{1/2}]^{αR} cos(πθ/2)/cos(παR/2),
δ = [γR γY^{αY}(1 + βY² tan²(παY/2))^{1/2}]^{αR} sin(πθ/2)/cos(παR/2).

(c) If αY = 1 and Y ∼ S(1, 0, γY, δY; 1), then RY ∼ S(αR, β, γ, 0; 1), where

θ = (2/π) arctan(δY/γY),
β = tan(παRθ/2)/tan(παR/2),
γ = γR γY (1 + (δY/γY)²)^{1/2} (cos(παRθ/2)/cos(παR/2))^{1/αR}.

Proof For notational convenience, assume for the moment that Y is expressed in the 7-parameterization, e.g. Y ∼ S(αY, θ, γ7; 7). Then conditioning on positive R and using Proposition 3.2,

E exp(iuR^{1/αY}Y) = E(E[exp(i(uR^{1/αY})Y) | R])
 = E(exp(−γ7|uR^{1/αY}|^{αY} exp(−iπαYθ/2)))
 = E(exp(−[γ7|u|^{αY} exp(−iπαYθ/2)]R))
 = exp(−[γRγ7|u|^{αY} exp(−iπαYθ/2)]^{αR} sec(παR/2))
 = exp(−[sec(παR/2)(γRγ7)^{αR}]|u|^{αRαY} exp(−iπαRαYθ/2)),

so R^{1/αY}Y ∼ S(αRαY, θ, sec(παR/2)(γRγ7)^{αR}; 7).   (3.57)

So the product is strictly αRαY-stable. An alternate proof of this is given in Problem 3.30. To express the result in the 1-parameterization, consider each case separately.
(a) By (3.39), Y ∼ S(αY, βY, γY, 0; 1) = S(αY, θ, γ7; 7) with parameters θ = (2/(παY)) arctan(βY tan(παY/2)) and γ7 = γY^{αY}(1 + βY² tan²(παY/2))^{1/2}. Thus by (3.57)

R^{1/αY} Y ∼ S(αRαY, θ, sec(παR/2)[γR γY^{αY}(1 + βY² tan²(παY/2))^{1/2}]^{αR}; 7).   (3.58)
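The distributional claim itself can be spot-checked without a general stable sampler in the αRαY = 1 case of (b), using ordinary normal variates: by (1.11), R = γR/Z² ∼ Lévy(γR, 0) = S(1/2, 1, γR, 0; 1), so with αR = 1/2, αY = 2, γR = 1, and Y ∼ N(0, σ²), the product R^{1/2}Y = σZ′/|Z| should be Cauchy with scale σ. A Monte Carlo sketch (the sample size, seed, grid, and tolerance are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.5
n = 200_000

z1 = rng.standard_normal(n)          # R = 1/z1**2 is Levy(1,0) = S(1/2,1,1,0;1)
y = sigma * rng.standard_normal(n)   # Y ~ N(0, sigma^2) = S(2,0,sigma/sqrt(2),0;1)
x = y / np.abs(z1)                   # X = R^(1/2) * Y, strictly 1-stable

# Compare the empirical cdf with the Cauchy(0, sigma) cdf on a grid of points.
grid = np.array([-5.0, -2.0, -0.5, 0.0, 0.5, 2.0, 5.0])
ecdf = np.array([(x <= t).mean() for t in grid])
cauchy_cdf = 0.5 + np.arctan(grid / sigma) / np.pi
print(np.max(np.abs(ecdf - cauchy_cdf)))
```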
Next use (3.37) to convert back to the 1-parameterization: R^{1/αY} Y ∼ S(αRαY, β, γ, 0; 1), where

β = tan(παRαYθ/2)/tan(παRαY/2),
γ = [sec(παR/2)(γR γY^{αY}(1 + βY² tan²(παY/2))^{1/2})^{αR} cos(παRαYθ/2)]^{1/(αRαY)}
  = γY γR^{1/αY} (1 + βY² tan²(παY/2))^{1/(2αY)} (cos(παRαYθ/2)/cos(παR/2))^{1/(αRαY)}.
(b) By assumption, αR < 1 and αRαY = 1, so we must have αY > 1. In this case, (3.58) still holds, but we must use (3.38) to convert to the 1-parameterization: R^{1/αY} Y ∼ S(1, 0, γ, δ; 1), where

γ = sec(παR/2)[γR γY^{αY}(1 + βY² tan²(παY/2))^{1/2}]^{αR} cos(πθ/2),
δ = sec(παR/2)[γR γY^{αY}(1 + βY² tan²(παY/2))^{1/2}]^{αR} sin(πθ/2).

(c) When αY = 1, strictly 1-stable Y ∼ S(1, 0, γY, δY; 1) = S(1, θ, γ7; 7) with θ = (2/π) arctan(δY/γY) and γ7 = γY(1 + (δY/γY)²)^{1/2} by (3.40). So by (3.57),

RY ∼ S(αR, θ, sec(παR/2)[γR γY(1 + (δY/γY)²)^{1/2}]^{αR}; 7).

Using (3.37) again, RY ∼ S(αR, β, γ, 0; 1), where

β = tan(παRθ/2)/tan(παR/2),
γ = [sec(παR/2)(γR γY(1 + (δY/γY)²)^{1/2})^{αR} cos(παRθ/2)]^{1/αR}
  = γR γY (1 + (δY/γY)²)^{1/2} (cos(παRθ/2)/cos(παR/2))^{1/αR}.

Problem 3.11 gives a different proof of this result. Also, Problem 3.12 shows that β in (a) has a limited range when αRαY > 1, so that not every strictly stable distribution can be realized as R^{1/αY} Y. The above result is stated in different ways: X is substable with dominating stable Y and random scale R^{1/αY}, or X is dominated by Y with stable subordinator R, or X is conditionally (given R) αY-stable. Particular cases of Proposition 3.5 are stated below.

Corollary 3.8 Let R and Y be independent.
(a) Every symmetric α-stable random variable is sub-Gaussian: if R ∼ S(α/2, 1, γR, 0; 1) and Y ∼ N(0, σ²) = S(2, 0, σ/√2, 0; 1), then X = R^{1/2}Y ∼ S(α, 0, σ√(γR/2)(sec(πα/4))^{1/α}, 0; 1). Choosing γR = 2(cos(πα/4))^{2/α} makes R^{1/2}Y ∼ S(α, 0, σ, 0; 1).
(b) Every symmetric αR-stable distribution is substable to every symmetric αY-stable distribution for any αR < αY ≤ 2: X ∼ S(αR, 0, γ, 0; 1) can be expressed as
X =d R^{1/αY} Y, where Y ∼ S(αY, 0, 1, 0; 1) and R ∼ S(αR/αY, 1, γ^{αY}(cos(παR/(2αY)))^{αY/αR}, 0; 1).
(c) For α < 1, any strictly α-stable r.v. with −1 < β < 1 is substable with respect to a shifted Cauchy distribution.
(d) If R ∼ S(αR, 1, γR, 0; 1) and Y ∼ S(αY, 1, γY, 0; 1), where both 0 < αR < 1 and 0 < αY < 1, then R^{1/αY} Y ∼ S(αRαY, 1, γ, 0; 1), where

γ = γY γR^{1/αY} (1 + tan²(παY/2))^{1/(2αY)} (cos(παRαY/2)/cos(παR/2))^{1/(αRαY)}.

A different combination of stable laws arises in the study of fractional kinetics.

Definition 3.14 Let X1 ∼ S(α1, θ1, 1; 7) and X2 ∼ S(α2, θ2, 1; 7) with respective densities f1(·) and f2(·). Define for −∞ < p < ∞ a fractionally stable distribution: Y = Y(α1, α2, θ1, θ2, p) = X1/X2^p. The density of Y can be expressed as

fY(y) = ∫_{−∞}^{∞} f1(y t^p) |t|^p f2(t) dt.

Proposition 3.5 shows that Y(α1, α/α1, 0, 1, −1/α1) = S(α, 0, 1; 7). Other properties of these laws are given in Appendix B of Uchaikin and Sibatov (2013). Next are two stochastic orderings of stable distributions.

Proposition 3.6 For fixed 0 < α < 1, the family S(α, β, γ, δ; 1) is stochastically ordered in β: if Xi ∼ S(α, βi, γ, δ; 1) for i = 1, 2, then β1 < β2 implies

P(X1 ≤ x) > P(X2 ≤ x) for all x.
(3.59)
Proof By shifting and scaling, we can assume γ = 1 and δ = 0. Using Lemma 3.18, we can write Xi =d ai Zi,1 − bi Zi,2, where the Zi,j are i.i.d. S(α, 1, 1, 0; 1) with common density f(z), 0 ≤ a1 < a2, and 0 ≤ b2 < b1. Since Zi,1 and Zi,2 are nonnegative,

P(Xi ≤ x) = ∬_{Ai} f(z1) f(z2) dz1 dz2, where Ai = {(z1, z2) : z1 ≥ 0, z2 ≥ 0, ai z1 − bi z2 ≤ x}.

A sketch shows that A2 is strictly contained in A1 for every x (consider x ≥ 0 and x < 0 separately). Since f(z) > 0 for all z > 0, this shows the result.

In the 1-parameterization, Problem 3.22 shows that (3.59) fails when 1 < α < 2. Also in the 1-parameterization, when α = 1, the location is confounded with the scale, and it appears (3.59) holds for some γ and not others. On the other hand, when we switch to the 0-parameterization, i.e. Xi ∼ S(α, βi, γ, δ; 0), numerical evidence shows that (3.59) fails when 0 < α < 1/2, but it appears to hold when 1/2 ≤ α < 2. In the 2-parameterization and 3-parameterization defined
in Section 3.5, numerical evidence suggests that the distributions are stochastically ordered in β for any 0 < α < 2. In the 2-parameterization, there is also a dispersive ordering with respect to α in the symmetric case. The following result is due to Zieliński (2000).

Proposition 3.7 Let X1 ∼ S(α1, 0, γ, 0; 2) and X2 ∼ S(α2, 0, γ, 0; 2) with 0 < α1 < α2 ≤ 2. Then for all t > 0, P(|X1| ≤ t) ≤ P(|X2| ≤ t).

Proof By symmetry, it suffices to show F1(t) := P(X1 ≤ t) ≤ F2(t) := P(X2 ≤ t) for all t > 0. To do this, use Proposition 3.5 to write X1 =d A^{1/α2} X2, where A ∼ S(α1/α2, 1, γ0 = α1^{−α2/α1} α2 (cos(π(α1/α2)/2))^{α2/α1}, 0; 1) is positive stable. Then for t > 0,

F1(t) = P(A^{1/α2} X2 ≤ t) = E[P(X2 ≤ t A^{−1/α2} | A)] = E F2(t A^{−1/α2}).

Since F2''(t) = f2'(t) < 0 for t > 0, F2(·) is concave on (0, ∞). Hence F1(t) = E F2(t A^{−1/α2}) ≤ F2(t E A^{−1/α2}). We will show E A^{−1/α2} < 1, and conclude that F1(t) ≤ F2(t). To show this bound, use Corollary 3.5:

E A^{−1/α2} = γ0^{−1/α2} (cos(π(α1/α2)/2))^{1/α1} Γ(1 + 1/α1)/Γ(1 + 1/α2) = α1^{1/α1} Γ(1 + 1/α1) / (α2^{1/α2} Γ(1 + 1/α2)).
Problem 3.23 shows that the function α^{1/α} Γ(1 + 1/α) is strictly increasing for α ∈ (0, 2), so the ratio above is strictly less than 1. Problem 3.24 shows that this ordering does not hold in the 0- or 1-parameterization. However, by scaling from the 0- or 1-parameterization, one can apply this result by adding constants: for X1 ∼ S(α1, 0, γ1, 0; j) and X2 ∼ S(α2, 0, γ2, 0; j), j = 0, 1, with 0 < α1 < α2 ≤ 2, then for all t > 0, P(|X1| ≤ t) ≤ P(|X2| ≤ ct), where c = (α2^{1/α2}γ2)/(α1^{1/α1}γ1).
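The monotonicity fact cited from Problem 3.23 is easy to check numerically; a sketch (the grid and the helper name are ours):

```python
import math

def g(alpha):
    # g(alpha) = alpha^(1/alpha) * Gamma(1 + 1/alpha); Problem 3.23 asserts this
    # is strictly increasing on (0, 2), which forces E A^(-1/alpha2) < 1 above.
    return alpha ** (1 / alpha) * math.gamma(1 + 1 / alpha)

vals = [g(0.1 + 0.05 * k) for k in range(38)]  # grid over (0, 2)
print(all(a < b for a, b in zip(vals, vals[1:])))
```

On this grid g runs from about 3.6e-4 at α = 0.1 up through g(1) = 1 toward g(2) ≈ 1.2533, strictly increasing throughout.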
3.10 Distributions derived from stable distributions

3.10.1 Log-stable

When X is normal, Y = e^X is called log-normal. Following this terminology, a random variable Y is called log-stable if Y = e^X, where X is a stable random variable. The distribution function is FY(y) = FX(log y) and the density is fY(y) = fX(log y)/y. When X is α-stable with 0 < α < 2 and β ≠ −1, Y = e^X has very heavy tails, and no fractional moments exist. On the other hand, if β = −1, all positive moments of Y are finite. (See Problem 3.13 for both of these claims.) Let
Z ∼ S(α, −1, γ, 0; 1) be a totally skewed negative stable r.v. Problem 3.14 shows Y = e^Z has Mellin transform, for Re u > 0,

MY(u) = E(Y^u) = { exp(−γ^α sec(πα/2) u^α),   α ≠ 1,
                 { exp((2/π)γ u log u),        α = 1.   (3.60)
3.10.2 Exponential stable

A random variable Z is called exponential-stable if Z = log X, where X is a positive stable random variable. In some problems it is convenient to introduce a scale and location family based on a particular choice of positive stable X as follows: we will say Z ∼ ExpS(α, μ, σ) if Z =d μ + σ log X, where 0 < α < 1 and X ∼ S(α, 1, (cos(πα/2))^{1/α}, 0; 1) = S(α, 1, 1; 7). More generally, if X is any stable random variable, then Y = log|X| can be considered. If X is α-stable, P(log|X| > y) = P(|X| > exp(y)) ∼ c exp(−αy) for large y, so the upper tail of log|X| decays exponentially. The lower tail decays at least exponentially, and thus log|X| has moments of all orders p > 0. In the strictly stable case, the moments can be derived from the following.

Lemma 3.19 Let X ∼ S(α, β, γ, 0; 1) with α ≠ 1 or X ∼ S(1, 0, γ, 0; 1). Then log|X| has moment generating function

E exp(u log|X|) = (γ/(cos αθ0)^{1/α})^u Γ(1 − u/α) cos(uθ0) / (Γ(1 − u) cos(uπ/2)),   −1 < u < α,

and characteristic function

E exp(iu log|X|) = (γ/(cos αθ0)^{1/α})^{iu} Γ(1 − iu/α) cos(iuθ0) / (Γ(1 − iu) cos(iuπ/2)).

The mean and variance are

E(log|X|) = γEuler (1/α − 1) + log(γ/(cos αθ0)^{1/α}),
Var(log|X|) = π²(1 + 2/α²)/12 − θ0²,
where γEuler ≈ 0.57721 is Euler’s constant. Proof The moment generating function follows from E exp (u log |X |) = E |X | u and Corollary 3.5. Differentiating this result and evaluating at u = 0 gives the first two moments. The characteristic function follows from E exp (iu log |X |) = E |X | iu and (3.52).
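Lemma 3.19 can be spot-checked in the standard Cauchy case X ∼ S(1, 0, 1, 0; 1), where θ0 = 0 and the lemma gives E log|X| = 0 and Var(log|X|) = π²(1 + 2)/12 = π²/4 ≈ 2.4674. A Monte Carlo sketch (the sample size, seed, and tolerances are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000
u = rng.uniform(-0.5, 0.5, n)
x = np.tan(np.pi * u)        # standard Cauchy = S(1, 0, 1, 0; 1)
w = np.log(np.abs(x))

print(w.mean(), w.var())     # approximately 0 and pi^2/4 = 2.4674
```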
The choice of scaling for exponential stable distributions was made so that the standardized distribution has a simple Laplace transform: standardized Z ∼ ExpS(α, 0, 1) has θ0 = π/2, so the mgf is

E exp(uZ) = E exp(u log X) = Γ(1 − u/α)/Γ(1 − u),   (3.61)

with mean E Z = γEuler (1/α − 1) and Var(Z) = (π²/6)(1/α² − 1).
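For α = 1/2 the standardized choice is explicit: X ∼ S(1/2, 1, (cos(π/4))², 0; 1) = Lévy(1/2, 0), so by (1.11) X =d 0.5/Z² with Z ∼ N(0, 1), and (3.61) predicts E X^u = Γ(1 − 2u)/Γ(1 − u); e.g. E X⁻¹ = Γ(3)/Γ(2) = 2 and E X⁻² = Γ(5)/Γ(3) = 12. A quick Monte Carlo sketch (the sample size, seed, and tolerances are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.standard_normal(1_000_000)
x = 0.5 / z**2            # X ~ S(1/2, 1, cos(pi/4)^2, 0; 1) = Levy(1/2, 0)

# (3.61): E X^u = Gamma(1 - u/alpha)/Gamma(1 - u) with alpha = 1/2
m1 = (1 / x).mean()       # should be near Gamma(3)/Gamma(2) = 2
m2 = (1 / x**2).mean()    # should be near Gamma(5)/Gamma(3) = 12
print(m1, m2)
```

Note 1/x = 2z² and 1/x² = 4z⁴, so these averages converge quickly; positive moments E X^u with 0 < u < 1/2 also match (3.61) but converge much more slowly because of the heavy right tail.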
3.10.3 Amplitude of a stable random variable

The amplitude of any r.v. X is R = |X|. It has cdf and pdf

FR(r) = P(−r ≤ X ≤ r) = FX(r) − FX(−r),
fR(r) = fX(r) + fX(−r).   (3.62)

These can be numerically computed when X is stable using the basic routines. When X ∼ N(0, 1), R² = X² is χ² with 1 degree of freedom. When X is stable with 0 < α < 2, X² may also be of interest. This is most likely useful for the symmetric case β = δ = 0, but the following discussion holds in general. Let X ∼ S(α, β, γ, δ; 0) with cdf FX and pdf fX. Then S = X² has cdf and pdf

FS(s) = P(X² ≤ s) = P(−√s ≤ X ≤ √s) = FX(√s) − FX(−√s),
fS(s) = (fX(√s) + fX(−√s))/(2√s).   (3.63)

When X = (X1, . . . , Xd) is N(0, I), R² = X1² + · · · + Xd² is χ² with d degrees of freedom and the amplitude distribution R can be specified in terms of a χ² distribution. There does not seem to be a simple expression for the distribution of R² when 0 < α < 2 and the terms are independent. However, Nolan (2013) gives expressions for the distribution of R = |X| when X = (X1, . . . , Xd) is isotropic stable. (Note that in the Gaussian case, independent components is equivalent to isotropic. In the stable case with 0 < α < 2, independent components and isotropic are distinct cases.) Since X can have any shift in the above discussion, the discussion above includes the folded distribution Y = |X − a|. Replacing FX with FX−a and fX with fX−a, (3.62) gives the cdf and pdf of Y and (3.63) gives the cdf and pdf of Y².
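Formulas (3.62) and (3.63) apply to any continuous X; at the Gaussian endpoint α = 2 they can be checked against the χ²₁ law. A sketch with X ∼ N(0, 1) (the helper names are ours):

```python
import math

def Phi(x):
    # standard normal cdf
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def F_square(s):
    # (3.63): cdf of S = X^2 for X ~ N(0,1); should match chi-squared, 1 d.f.
    return Phi(math.sqrt(s)) - Phi(-math.sqrt(s))

print(F_square(1.0))   # P(chi2_1 <= 1) = 0.6827
print(F_square(3.84))  # about 0.95, the usual 5% critical value
```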
3.10.4 Ratios and products of stable terms

In some situations, the ratio U = X/Y and product V = XY of independent stable random variables are of interest. For example, Davis and Resnick (1986) and later
Fig. 3.15 Density and distribution function of the ratio U = X/Y, where X ∼ S(α, 0; 1) and Y ∼ S(α/2, 1; 1), for α = 0.75, 1.25, 1.75.
Mikosch et al. (1995) show that the limiting distributions of parameter estimates for heavy tailed time series are given by the ratio of independent stable terms. Motivated by those applications, we will focus on the case where Y is nonnegative, so αY < 1 and βY = +1. Standard expressions for the distribution of the ratio show that U = X/Y has d.f. and density

FU(u) = P(U ≤ u) = ∫_0^∞ FX(uy) fY(y) dy,
fU(u) = ∫_0^∞ y fX(uy) fY(y) dy.
These integrals can be evaluated numerically, e.g. Figure 3.15. Note that the density of U has a vertical asymptote at the origin, because when u = 0, the integral above for fU(0) reduces to fX(0) EY = +∞. The tails of U are determined by the tails of the numerator, so the tails of U are regularly varying of index αX. Quantiles of U can be found by numerically inverting FU(u). In a similar way, the d.f. and density of the product V = XY can be evaluated by the standard formulas
Fig. 3.16 Density and distribution function of the product V = XY, where X ∼ S(α, 0; 1) and Y ∼ S(α/2, 1; 1), for α = 0.75, 1.25, 1.75.
FV(v) = P(V ≤ v) = ∫_0^∞ FX(v/y) fY(y) dy,
fV(v) = ∫_0^∞ (1/y) fX(v/y) fY(y) dy.
These integrals can be evaluated numerically, e.g. Figure 3.16. The tails of V are regularly varying with index min(αX , αY ).
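Since it is the quadrature formulas (not the stable laws themselves) being checked, a stand-in pair with a known answer is convenient: if X ∼ N(0, 1) and Y = |Z| with Z ∼ N(0, 1) independent, then U = X/Y is standard Cauchy, so the integral for fU should reproduce 1/(π(1 + u²)). A sketch (the grid, cutoff, and tolerances are ours):

```python
import numpy as np

SQRT2PI = np.sqrt(2 * np.pi)

def f_X(x):
    # N(0, 1) density, standing in for the stable density f_X
    return np.exp(-x**2 / 2) / SQRT2PI

def f_Y(y):
    # half-normal density on y > 0, standing in for the positive stable f_Y
    return 2 * f_X(y)

def f_U(u, ny=4001, ymax=12.0):
    # f_U(u) = int_0^inf y f_X(u*y) f_Y(y) dy, via the trapezoid rule
    y = np.linspace(1e-9, ymax, ny)
    vals = y * f_X(u * y) * f_Y(y)
    dy = y[1] - y[0]
    return (vals.sum() - 0.5 * (vals[0] + vals[-1])) * dy

for u in (0.0, 1.0, 2.5):
    print(f_U(u), 1 / (np.pi * (1 + u**2)))  # the two columns agree
```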
3.10.5 Wrapped stable distribution

Jammalamadaka and SenGupta (2001), Section 2.28, define a wrapped stable distribution by taking a stable random variable X and wrapping it around a circle. This can be defined as Y = X (mod 2π), where Y is interpreted as an angle on the unit circle. They focus on the symmetric case, and show that these wrapped stable distributions allow circular distributions with higher peaks, heavier “tails” in the region antipodal to the peak, but with lower “shoulders”. They report a good fit to directional data of ant movements. For more information see Gatto and Jammalamadaka (2003) and Pewsey (2008).
3.11 Stable distributions arising as functions of other distributions
3.10.6 Discretized stable distributions

In many engineering applications, observations are quantized and truncated. For any continuous random variable X and any integers a < b,

Y = Y(a, b, X) = { a,    X ≤ a,
                 { [X],  a < X < b,
                 { b,    X ≥ b.

3.11 Stable distributions arising as functions of other distributions

3.11.1 Exponential power distributions

An exponential power distribution has a density proportional to exp(−γ^α |x|^α) for α > 0 and γ > 0. When 0 < α < 1, this is sometimes called a stretched exponential distribution. They are related to the Kohlrausch-Williams-Watts functions, see (2.3). When 0 < α ≤ 2, the characteristic function of such a distribution is a multiple of a symmetric α-stable density:

∫_{−∞}^{∞} e^{iux} h(x) dx = (γ/(2Γ(1 + 1/α))) ∫_{−∞}^{∞} e^{iux} e^{−γ^α|x|^α} dx = (πγ/Γ(1 + 1/α)) f(u|α, 0, γ, 0; 0).

If α > 2, then the characteristic function of the exponential power distribution is a trans-stable function, which is negative at some values. Gneiting (1997) shows that exponential power laws arise as scale mixtures of normal laws, where the mixing distribution involves a stable law. A one-sided exponential power distribution has densities of the form

h+(x) = (γ/Γ(1 + 1/α)) e^{−γ^α x^α},   x > 0,

for α > 0 and γ > 0. When 0 < α ≤ 2, the characteristic function is

∫_0^∞ e^{iux} h+(x) dx = (γ/Γ(1 + 1/α)) ∫_0^∞ e^{iux} e^{−γ^α x^α} dx = (g1(u/γ|α, 0) + i g̃1(u/γ|α, 0))/Γ(1 + 1/α),
where g1 and g̃1 are defined in Section 3.4. It follows that symmetric stable densities g1(u|α, 0), and more generally g1(u|α, 0) + i g̃1(u|α, 0), are themselves positive definite.
3.11.2 Stable mixtures of extreme value distributions

There are interesting connections between extreme value distributions: appropriate location mixtures of Gumbel distributions are Gumbel, appropriate scale mixtures of Weibull are Weibull, and appropriate scale mixtures of Fréchet are Fréchet. (See Section 7.3.1 for a definition of the Gumbel, Weibull, Fréchet, and generalized extreme value distributions (EVD), and their connection with the concept of max-stability.) The basic connections are given next.
Proposition 3.9 Let α ∈ (0, 1), let S ∼ S(α, 1, (cos πα/2)^{1/α}, 0; 1) be a positive α-stable r.v., and let σ > 0, ξ > 0.
(a) If X ∼ Gumbel(μ, σ) is independent of S, then σ log S + X ∼ Gumbel(μ, σ/α).
(b) If X ∼ Fréchet(μ, σ, ξ) is independent of S, then S^{1/ξ}(X − μ) ∼ Fréchet(0, σ, αξ).
(c) If X ∼ Weibull(μ, σ, ξ) is independent of S, then S^{−1/ξ}(X − μ) ∼ Weibull(0, σ, αξ).

Proof (a) In the notation of Section 3.10, M = σ log S ∼ ExpS(α, 0, σ), with mgf Γ(1 − uσ/α)/Γ(1 − σu) given by Lemma 3.19. Gumbel X has mgf e^{uμ} Γ(1 − σu), e.g. Kotz and Nadarajah (2000). Hence the mgf of M + X is
E e^{uM} E e^{uX} = [Γ(1 − uσ/α)/Γ(1 − σu)] e^{uμ} Γ(1 − σu) = e^{uμ} Γ(1 − uσ/α),
which is Gumbel(μ, σ/α). (b) Let Y = S^{1/ξ}(X − μ); then log Y = (1/ξ) log S + log(X − μ). Using Lemma 7.3, log(X − μ) is Gumbel(log σ, 1/ξ), so (a) shows log Y is Gumbel(log σ, 1/(αξ)) and thus Y is Fréchet(0, σ, αξ). (c) −(X − μ)^{−1} is Fréchet(0, σ, ξ) by Lemma 7.3, so apply (b) to the term in brackets and invert and negate.

Since 0 < α < 1, the scale increases from σ to σ/α in (a), and the tail index decreases from ξ to αξ in (b) and (c). To keep the original endpoint of the distributions in (b) and (c), use S^{1/ξ}(X − μ) + μ. Problem 3.31 gives an alternate proof of this result using conditioning. A generalization to multivariate mixtures is given in Fougères et al. (2009) and Fougères et al. (2013). The results can be restated in terms of the (generalized) extreme value distribution. Problem 3.32 shows that if X ∼ GEV(γ, μ, σ) and S is positive stable as above, then Y = S^γ X + (1 − S^γ)(μ − σ/γ) is GEV(μ, σ/α, γ/α). Note that if X has a finite endpoint, then so does Y and it is at the same place. The discussion after Lemma 3.17 shows that if Z(α, 1) ∼ S(α, 1; 1), then Z(α, 1)^α →d Y as α ↓ 0, where Y is Fréchet(ξ = 1, μ = 0, σ = 1).
Proposition 3.10 Let α ∈ (0, 1) and Sj ∼ S(α, 1, (cos πα/2)^{1/α}, 0; 1) = S(α, 1; 7), j = 1, 2, 3, . . . be i.i.d. positive α-stable.
(a) The sum Σ_{j=1}^∞ α^j log Sj has a Gumbel(0, 1) law.
(b) The product Π_{j=1}^∞ Sj^{α^j/ξ} has a Fréchet(0, 1, ξ) law.
(c) The product −Π_{j=1}^∞ Sj^{−α^j/ξ} has a Weibull(0, 1, ξ) law.

Proof (a) The mgf of Σ_{j=1}^n α^j log Sj is

Π_{j=1}^n E e^{uα^j log Sj} = Π_{j=1}^n Γ(1 − uα^j/α)/Γ(1 − α^j u) = Γ(1 − u)/Γ(1 − α^n u) → Γ(1 − u) as n → ∞,

which is the mgf of the Gumbel(0, 1) law. (b) Let X be the sum in (a); then exp(X/ξ) is Fréchet(0, 1, ξ). (c) Let X be the sum in (a); then −exp(−X/ξ) is Weibull(0, 1, ξ).
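Part (a) can be simulated without a general stable sampler when α = 1/2, since then Sj ∼ S(1/2, 1, cos(π/4)², 0; 1) = Lévy(1/2, 0) =d 0.5/Z² for Z ∼ N(0, 1) by (1.11). The Gumbel(0, 1) limit has mean γEuler ≈ 0.5772 and variance π²/6 ≈ 1.6449. A Monte Carlo sketch (truncation level, sample size, seed, and tolerances are ours):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, n_terms, n = 0.5, 40, 100_000

# S_j = 0.5 / Z^2 is S(1/2, 1, cos(pi/4)^2, 0; 1); sum_{j>=1} alpha^j log S_j
# should be Gumbel(0, 1). Truncating at 40 terms is ample since alpha^j -> 0.
z = rng.standard_normal((n, n_terms))
log_s = np.log(0.5) - 2 * np.log(np.abs(z))
weights = alpha ** np.arange(1, n_terms + 1)
g = log_s @ weights

print(g.mean(), g.var())  # near Euler's constant 0.5772 and pi^2/6 = 1.6449
```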
3.12 Stochastic series representations

The following representation is important for theoretical purposes, especially its generalization to processes. It expresses a stable distribution as a dependent sum of simpler terms. Fix 0 < α < 2, let Γi = e1 + · · · + ei, where {ei}_{i=1}^∞ is a sequence of i.i.d. Exponential(1) random variables, and let the Wi be i.i.d. random variables with E|W1|^α < ∞ when α ≠ 1 or E|W1 log|W1|| < ∞ when α = 1. Set

ki(α) = { 0,   0 < α < 1,
        { E(W1 ∫_{|W1|/i}^{|W1|/(i−1)} x^{−2} sin x dx),   α = 1,
        { (α/(α − 1))(i^{(α−1)/α} − (i − 1)^{(α−1)/α}) E W1,   1 < α < 2.
an → a > 0 and bn → b and aX + b =d Y, i.e. X and Y are of the same type.

Proof of Theorem 1.4 (Generalized Central Limit Theorem) Let X1, X2, X3, . . . be an i.i.d. sequence of random variables and, for notational convenience, let Sn = X1 + X2 + · · · + Xn. Suppose for some an > 0 and bn,
an(X1 + X2 + · · · + Xn) − bn = an Sn − bn →d Z.
(3.67)
We will show Z is stable. Fix any integer n > 1. For m = 1, 2, 3, . . ., define Ym = anm Snm − bnm; then (3.67) shows Ym →d Z. Next split Snm into n summands: Snm = T1(m) + T2(m) + · · · + Tn(m), where each term includes m of the Xj, differing by n:

T1(m) = X1 + Xn+1 + X2n+1 + · · · + X(m−1)n+1
T2(m) = X2 + Xn+2 + X2n+2 + · · · + X(m−1)n+2
T3(m) = X3 + Xn+3 + X2n+3 + · · · + X(m−1)n+3
. . .
Tn(m) = Xn + X2n + X3n + · · · + Xmn.
Set Am = Am(n) = am/anm and Bm = Bm(n) = am bnm/anm − nbm. Some algebra and (3.67) show

Am Ym + Bm = (am/anm)(anm Snm − bnm) + (am/anm) bnm − nbm
 = am Snm − nbm
 = (am T1(m) − bm) + (am T2(m) − bm) + · · · + (am Tn(m) − bm)
 →d Z1 + Z2 + · · · + Zn,

where Z1, . . . , Zn are i.i.d. copies of Z. By the convergence of types theorem, we conclude Am(n) → Cn > 0 and Bm(n) → Dn, and Cn(Z1 + · · · + Zn) + Dn =d Z. Since n was arbitrary, we have Z1 + · · · + Zn =d Cn^{−1} Z − Cn^{−1} Dn for all n > 1, i.e. Z is stable by Definition 1.2.

Conversely, suppose Z ∼ S(α, β, γ, δ; 1). Then taking the Xj to be i.i.d. copies of Z and an = n^{−1/α} and bn = (n^{1−1/α} − 1)δ if α ≠ 1; bn = (2/π)βγ log n if α = 1, (1.8) shows for all n, an(X1 + · · · + Xn) − bn =d Z, which is stronger than (3.67).
The classical Central Limit Theorem gives simple expressions for the norming constants an and bn when the summands X have a finite variance. Here is a similar statement for convergence to a stable limit in the special case when the tails of X are asymptotically a power.

Theorem 3.12 (Explicit Form of the Generalized Central Limit Theorem) Let X1, X2, . . . be i.i.d. copies of X, where X has characteristic function φX(u) and satisfies the tail conditions

x^α F(−x) → c−   and   x^α(1 − F(x)) → c+   as x → ∞,   (3.68)

where c− ≥ 0, c+ ≥ 0 and 0 < c− + c+ < ∞.
(a) If 0 < α < 2, set

β = (c+ − c−)/(c+ + c−),
an = (2Γ(α) sin(πα/2)/(π(c+ + c−)))^{1/α} n^{−1/α},
bn = 0 for 0 < α < 1,
(3.70)
for some a > 0. If the scaling constants can be chosen as (3.70), then X is said to be in the domain of normal attraction of Z. (This can be a confusing terminology if the adjective normal gets confused with the normal distribution: there are distributions in the domain of non-normal attraction of a normal distribution and there are distributions in the domain of normal attraction of a non-normal stable law. Problem 3.34 shows a Pareto(2,1) distribution is an example of the former and a Pareto(α, 1) law with 0 < α < 2 is an example of the latter by Theorem 3.12. The phrase domain of regular attraction would avoid this confusion.) It can be shown that (3.70) can be used to scale when 0 < α < 2 if and only if (3.68) holds, e.g. Ibragimov and Linnik (1971). The domain of attraction of the Gaussian law is slightly bigger than distributions with finite variance. Section 35 of Gnedenko and Kolmogorov (1954) shows the following. Theorem 3.13 A r.v. X with distribution function F(x) is the in domain of attraction of a Gaussian law if and only if ∫ t 2 |x |>t dF(x) ∫ → 0 as t → ∞. (3.71) x 2 dF(x) |x | 0, L(ax) = 1. lim x→∞ L(x) A function G(x) is regularly varying of index ρ at infinity if G(x) = x ρ L(x), where L(x) is slowly varying at infinity. (In general ρ can be any number, positive or negative, below we will only focus on ρ = −α < 0.) We note that ∫an equivalent statement of (3.71) is that the truncated variance t function V(t) = −t x 2 dF(x) = E(X 2 1 |X | ≤t ) (the denominator in (3.71)) is slowly varying at infinity, see section XVII.5 of Feller (1971). See Problem 3.36 for examples of slowly varying functions and Problem 3.37 for a distribution that has tails that are not regularly varying. There is an extensive theory of regularly varying functions, see Bingham et al. (1987), Geluk and de Haan (1987) or Resnick (1987). For our purposes, they are important because of following result which characterizes stable domains of attraction. 
The essential idea is that if P(|X| > x) is regularly varying of index −α at infinity, 0 < α < 2, then X is in the domain of attraction of an α-stable random variable. To determine β, we need to look at the relative behavior of the tails P(X ≤ −x) = F(−x) and P(X > x) = 1 − F(x).

Theorem 3.14 Let X be a r.v. with d.f. F(x) and let 0 < α < 2. Define for x > 0,

L(−x) = x^α F(−x)   and   L(x) = x^α (1 − F(x)).

Then X ∈ DA(Z(α, β)) if and only if there exist constants c− ≥ 0 and c+ ≥ 0 with c− + c+ > 0 such that

β = (c+ − c−)/(c+ + c−),
c+ > 0 ⇒ L(x) is slowly varying at infinity,
c− > 0 ⇒ L(−x) is slowly varying at infinity,
L(x)/(L(x) + L(−x)) = P(X > x)/P(|X| > x) → c+/(c+ + c−) as x → ∞.

A proof of this theorem using the theory of infinitely divisible distributions can be found in Section 35 of Gnedenko and Kolmogorov (1954) or Chapter 9 of Breiman (1992). A direct proof using the theory of regular variation can be found in Geluk and de Haan (2000). Pareto laws, where the tail probability is exactly a power P(|X| > x) = cx^{−α}, are in the domain of normal attraction of stable laws, as we saw above. However, in general the slowly varying component can be more complicated, see Example 3.1 for some examples.
3.13 Generalized Central Limit Theorem and Domains of Attraction
In non-extreme cases, both c+ > 0 and c− > 0, L(x) and L(−x) are both slowly varying at infinity, and both tails have similar behavior. When c− = 0, only the right tail probability has to be regularly varying of index −α, while the left tail can be anything that satisfies P(X ≤ −x)/P(X > x) → 0, and the limiting distribution will be totally skewed to the right. Similar statements hold when c+ = 0, and the limiting distribution will be totally skewed to the left. There is another variation of the domain of attraction concept. If the convergence in (3.69) does not hold when n → ∞ through all n, but does hold for some subsequence a_{nk}, then X is said to be in the domain of partial attraction. This case is not considered here; we refer the reader to § 37 of Gnedenko and Kolmogorov (1954), where it is shown that the possible limits are all infinitely divisible laws. Next is a short diversion on tail behavior of convolutions.

Theorem 3.15 Let X1 and X2 be independent random variables with P(Xj > x) = x^{−αj} Lj(x) as x → ∞, where L1(·) and L2(·) are slowly varying at infinity. Then as x → ∞,

P(X1 + X2 > x) ∼ x^{−α1} L1(x) + x^{−α2} L2(x).

Proof For any x > 0 and δ > 0, it is straightforward that

{X1 > (1 + δ)x, X2 > −δx} ∪ {X1 > −δx, X2 > (1 + δ)x} ⊂ {X1 + X2 > x}.

Hence P(X1 > (1 + δ)x)P(X2 > −δx) + P(X1 > −δx)P(X2 > (1 + δ)x) ≤ P(X1 + X2 > x). For any 0 < ε < 1/2, there is an x1 = x1(ε) such that δx > x1 implies P(Xj > −δx) > 1 − ε. Hence for x large enough

P(X1 + X2 > x) ≥ [P(X1 > (1 + δ)x) + P(X2 > (1 + δ)x)](1 − ε)
 = [((1 + δ)x)^{−α1} L1((1 + δ)x) + ((1 + δ)x)^{−α2} L2((1 + δ)x)](1 − ε)
 = [((1 + δ)^{−α1} L1((1 + δ)x)/L1(x)) x^{−α1} L1(x) + ((1 + δ)^{−α2} L2((1 + δ)x)/L2(x)) x^{−α2} L2(x)](1 − ε)
 ≥ [x^{−α1} L1(x) + x^{−α2} L2(x)](1 − 2ε),   (3.72)

where the last step uses the fact that δ can be arbitrarily small and L1(·) and L2(·) are slowly varying. For an upper bound, let 0 < δ < 1/2. Then a sketch shows

{X1 + X2 > x} ⊂ {X1 > (1 − δ)x} ∪ {X2 > (1 − δ)x} ∪ {X1 > δx, X2 > δx}.
Hence for δx large enough,
P(X1 + X2 > x) ≤ P(X1 > (1 − δ)x) + P(X2 > (1 − δ)x) + P(X1 > δx)P(X2 > δx)
 = ((1 − δ)x)^{−α1} L1((1 − δ)x) + ((1 − δ)x)^{−α2} L2((1 − δ)x) + (δx)^{−α1} L1(δx)(δx)^{−α2} L2(δx)
 ≤ [((1 − δ)x)^{−α1} L1((1 − δ)x) + ((1 − δ)x)^{−α2} L2((1 − δ)x)](1 + ε)
 ≤ [x^{−α1} L1(x) + x^{−α2} L2(x)](1 + 2ε),   (3.73)

as in (3.72). Combining (3.72) and (3.73) shows the result.
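The tail-addition rule of Theorem 3.15, and the related fact (discussed below) that for regularly varying tails the sum of n terms is tail-comparable to the maximum of n terms, can be illustrated by simulation; a sketch comparing a heavy-tailed Pareto case to a light-tailed exponential case (sample sizes, seed, and thresholds are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, n_terms, n_reps = 0.5, 50, 2000

# Pareto(alpha, 1) samples via inversion: P(X > x) = x^(-alpha), x >= 1
x = rng.uniform(size=(n_reps, n_terms)) ** (-1 / alpha)
heavy_ratio = (x.max(axis=1) / x.sum(axis=1)).mean()

# Light-tailed comparison: exponential summands
e = rng.exponential(size=(n_reps, n_terms))
light_ratio = (e.max(axis=1) / e.sum(axis=1)).mean()

print(heavy_ratio, light_ratio)  # heavy-tailed ratio is far larger
```

For the Pareto case a single summand typically carries most of the sum, while for exponential summands the maximum is a vanishing fraction of the sum as n grows.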
Thus the convolution of two random variables with similar power tails (α1 = α2) has the same tail behavior. Using this repeatedly shows for i.i.d. r.v.s with P(X1 > x) = x^{−α}L(x),

P(X1 + X2 + · · · + Xn > x) ∼ n x^{−α} L(x).

Furthermore, Problem 3.39 shows that P(max(X1, . . . , Xn) > x) ∼ n P(X1 > x). Thus, when a distribution has regularly varying tails, the tail of the sum of n terms is comparable to the tail of the maximum of n terms. In other words, large values of the sum are likely to come from a single large value in the summands. This is very different from the light tailed case. On the other hand, the convolution of two random variables with different tails has tail behavior dominated by the heavier term. So if X1 is in the domain of attraction of an α1-stable law and X2 is in the domain of attraction of an α2-stable law, then X1 + X2 is in the domain of attraction of a min(α1, α2)-stable law. In particular, if X1 has finite variance, but X2 does not, then X1 + X2 is not in the domain of attraction of a normal law.

Example 3.1 In these examples, we restrict to 0 < α < 2. The verification of these examples is left to Problem 3.38.
1. If X ∼ Pareto(α, 1), then X ∈ DA(Z(α, 1)) (compare to the argument on page 59).
2. Let f(x) be the density of a Pareto(α, 1) distribution and 0 ≤ p ≤ 1. Then the mixture distribution with density p f(x) + (1 − p) f(−x) is in DA(Z(α, 2p − 1)).
3. If Xj ∼ Pareto(α, 1), j = 1, 2, then X = aX1 − bX2 ∈ DA(Z(α, (a − b)/(a + b))).
4. If X has density c(1 + x²)^{−(1+α)/2}, then X ∈ DA(Z(α, 0)).
5. If X has density f(x) = c(log x)/x^{1+α}, x > e, c = α²e^α/(1 + α), it is in the non-normal domain of attraction of Z(α, 1). Then for −1 ≤ β ≤ 1, g(x) = ((1 + β)/2) f(x) + ((1 − β)/2) f(−x) is the density of a r.v. in the non-normal domain of attraction of Z(α, β).
6. Let Xj, j = 1, 2 have tail behavior P(Xj > x) = cj+ x^{−αj} + o(x^{−αj}) and P(Xj < −x) = cj− x^{−αj} + o(x^{−αj}) with cj+ + cj− > 0 for j = 1, 2.
Note that this includes cases where Xj is discrete. Then X1 ∈ DA(Z(α1, β1)), X2 ∈ DA(Z(α2, β2)), and

            ⎧ DA(Z(α1, β1))   α1 < α2
X1 + X2 ∈   ⎨ DA(Z(α1, β))    α1 = α2
            ⎩ DA(Z(α2, β2))   α1 > α2,

where β1 = (c1+ − c1−)/(c1+ + c1−), β2 = (c2+ − c2−)/(c2+ + c2−), and β = ((c1+ + c2+) − (c1− + c2−))/(c1+ + c1− + c2+ + c2−).
7. If U ∼ Uniform(0, 1), then U^{−1/α} ∈ DA(Z(α, 1)). If U ∼ Uniform(−1, 1), then sign(U)|U|^{−1/α} ∈ DA(Z(α, 0)).
8. If X has a density with f(0) > 0, then sign(X)|X|^{−1/α} ∈ DA(Z(α, 0)).
9. If X has a density with both c+ = lim_{x↓0} f(x) and c− = lim_{x↑0} f(x) existing with c+ + c− > 0, then sign(X)|X|^{−1/α} ∈ DA(Z(α, (c+ − c−)/(c+ + c−))).
10. Let X have tail behavior P(X > x) = c+ x^{−p} + o(x^{−p}) and P(X < −x) = c− x^{−p} + o(x^{−p}), where 0 < p < ∞. Then for any q > 0,

                 ⎧ DA(Z(p/q, (c+ − c−)/(c+ + c−)))   p/q < 2
sign(X)|X|^q ∈   ⎩ DA(Z(2, 0))                       otherwise,

          ⎧ DA(Z(p/q, 1))   p/q < 2
|X|^q ∈   ⎩ DA(Z(2, 0))     otherwise.

11. Let X ∼ S(α, β, γ, δ; j), j = 0, 1. For p > 0,

                 ⎧ DA(Z(α/p, β))   α/p < 2
sign(X)|X|^p ∈   ⎩ DA(Z(2, 0))     otherwise,

          ⎧ DA(Z(α/p, 1))   α/p < 2
|X|^p ∈   ⎩ DA(Z(2, 0))     otherwise.
12. If X1 and X2 are independent and satisfy x^α P(|Xj| > x) → cj as x → ∞, then X1 X2 is also in the domain of attraction of an α-stable law. This follows from Theorem 2.1 of Rosiński and Woyczynski (1987).
13. Let T be the number of tosses of a fair coin until there are an equal number of heads and tails. Combinatoric arguments, e.g. Problem 19, § 3.3 of Breiman (1992), show that P(T > n) ∼ √(2/π) n^{−1/2}. Hence T ∈ DA(Z(1/2, 1)).
14. A discrete Pareto distribution with P(X = n) = c n^{−(1+α)}, c = 1/ζ(1 + α), is in DA(Z(α, 1)).

An interesting thing happens with skewness in domain of attraction results. Let X ∼ Pareto(α, 1); note that for all α > 0, X is one sided and supported on [1, ∞). If α ≥ 2, X ∈ DA(Z(2, 0)), and if α < 2, X ∈ DA(Z(α, 1)). Thus if α ≥ 2, the limiting distribution is symmetric and supported on R; if 1 ≤ α < 2, the limiting distribution is supported on R, but totally skewed with a light left tail; and if 0 < α < 1, the limiting distribution is totally skewed and supported on a half line. Thus the limiting distribution can have no skewness or a lot, depending on the heaviness of the tails, even though the original X terms are all totally skewed and qualitatively similar.

To explore finer tail behavior of convolutions, there is a concept of second-order regular variation, e.g. P(X > x) = c1 x^{−α1} + c2 x^{−α2} + o(x^{−α2}), where α1 < α2; see Geluk et al. (2000) and Geluk and Peng (2000). One use of second-order regular variation is to study the rate of convergence of normalized sums to a stable law. Some references to this are Cramér (1962), Cramér (1963), Hall (1981b), Christoph and Wolf (1992), de Haan and Peng (1999), and Kuske and Keller (2001). An elegant (and mathematically sophisticated) approach to rates of convergence using probability metrics is due to Zolotarev, see Rachev (1991). Here is one result from Cramér (1963).
3 Technical Results for Univariate Stable Distributions
Theorem 3.16 Let the d.f. F(x) satisfy

F(−x) = c− x^{−α} + O(x^{−p})   and   1 − F(x) = c+ x^{−α} + O(x^{−p})

for some 1 < α < p < 2. Set a_n = ([π(c+ + c−)n] / [2Γ(α) sin(πα/2)])^{−1/α} and b_n = n a_n E X. Then for i.i.d. random variables with d.f. F,

F_n(x) := P(a_n(X1 + · · · + Xn) − b_n ≤ x) → G(x),

where G(x) is the d.f. of a S(α, β, 1, 0; 1) law, and the convergence is uniform: sup_{−∞<x<∞} |F_n(x) − G(x)| → 0.

One can formally take values of α beyond the stable range in the definition of s(x|α, β) and still get a continuous function that satisfies a convolution equation, although in these cases Re S(x|α, β) < 0 for some x, so it is no longer a probability density. These are the "trans-stable" functions of Section 2.11 of Zolotarev (1986). When m/n is a rational number in the set (0, 1) ∪ (1, 2), the function s(x) = s(x|m/n, β) satisfies a linear differential equation relating the derivatives of x^{1+m/n} s(x) to those of x^m s(x), involving the factor e^{−im(π/2−θ0)}; the exact form is given in the references below.
More on this result can be found in Section 2.8 of Zolotarev (1986) and Chapter 6 of Uchaikin and Zolotarev (1999). In the references, special values of α = m/n yield differential equations connected to certain special functions, e.g. Fresnel integrals (α = 1/2), Macdonald, Airy, and Whittaker functions.

Next, stable semi-groups and their relation to space fractional diffusions are discussed. Fix α ∈ (0, 2], β ∈ [−1, 1], γ > 0, δ ∈ R and for t ≥ 0, define k_t(x) = f(x|α, β, (γt)^{1/α}, δt; 1), i.e. the stable density in the 1-parameterization. In this setting, k_0(x) is interpreted as a Dirac delta function at x = 0. The convolution

k_t ∗ k_s(x) = ∫_{−∞}^{∞} k_t(x − y) k_s(y) dy

corresponds to the density of the sum of a S(α, β, (γt)^{1/α}, δt; 1) r.v. and an independent S(α, β, (γs)^{1/α}, δs; 1) r.v., which by Lemma 1.4 is a S(α, β, (γ(t + s))^{1/α}, δ(t + s); 1) r.v. Hence

k_t ∗ k_s = k_{t+s},        (3.76)

and the family {k_t(·) : t ≥ 0} forms a convolution semi-group. This convolution semi-group naturally defines a semi-group of operators on C^∞ = all infinitely differentiable functions from R to R. For t ≥ 0, define the operator K_t : C^∞ → C^∞ by K_t f = k_t ∗ f. The semi-group property (3.76) implies that K_t K_s = K_{t+s}, so {K_t : t ≥ 0} is a semi-group of operators. It is well known, e.g. Feller (1971) IX.6, that this semi-group of operators has a generator A = lim_{h→0}(K_h − I)/h given by an integral equation. Specifically,

A = c+ A+ + c− A− + δ d/dx,

where A+ and A− are integral operators defined in terms of the increments f(x − y) − f(x), e.g. ∫_0^∞ [f(x − y) − f(x)] y^{−1−α} dy, with the exact form depending on the range of α.

3.16 Problems

Problem 3.34 Let X have density f(x) = c(log x)/x³, x > e, c = (4/3)e². Show that Var(X) = ∞, but X ∈ DA(Z(2, 0)). This X is in the non-normal domain of attraction of a normal law.

Problem 3.35 Let X be a Cauchy(0, 1) r.v. and define Y = X². Find the d.f. and density of Y. Note that Y is not stable, but it is in the domain of attraction of an S(α = 1/2, β = 1; 1) law.

Problem 3.36 Show that the following functions are slowly varying at infinity:
(a) non-zero constant functions
(b) L(x) = c1 + c2 x^{−p} (c1 ≠ 0, p > 0) or any function L(x) that approaches a non-zero limit
(c) log x, 1/log x, log(log x), log(log(log x)), (log ax^p)^q (a > 0, p > 0, q ∈ R). Note that the first example has lim_{x→∞} L(x) = ∞, so slowly varying functions may be unbounded, and that the second has lim_{x→∞} L(x) = 0, so slowly varying functions need not have a positive limit.
(d) (log Γ(x))/x
(e) L(x) = exp((log x)^{1/3} cos((log x)^{1/3})). Note that lim inf_{x→∞} L(x) = 0 and lim sup_{x→∞} L(x) = ∞, showing that a function can have infinitely many oscillations of unbounded amplitude and still be slowly varying. (This example and others can be found on pg. 16 of Bingham et al. (1987). They also give properties of slowly varying functions, e.g. the sum, product, and ratio of slowly varying functions are slowly varying.)
(f) If L(x) is slowly varying, then for any p ∈ R, |L(x)|^p is slowly varying.
Show that the following functions are not slowly varying:
(g) x^p (p ≠ 0) and exp(x)
(h) sin x and sin(log x) (slowly varying is not the same as slowly oscillating)

Problem 3.37 Show that the discrete r.v. X that gives mass 2^{−k} to the point 2^k, k = 1, 2, 3, . . ., is not regularly varying. It is said to be dominatedly varying: lim sup_{x→∞} (1 − F(x/2))/(1 − F(x)) < ∞, see Goldie (1978).

Problem 3.38 Verify the examples in the domain of attraction Example 3.1.

Problem 3.39 Let X have tail behavior P(X > x) = 1 − F(x) = x^{−p} L(x) with L(x) slowly varying at infinity.
(a) Let X1 and X2 be independent with the same law as X. Then P(X1 > t | X1 + X2 > t) → 1/2 as t → ∞. In words, a large value of the sum is likely to be due to the contribution of one of the two variables.
(b) Let X1, . . . , Xn be i.i.d. with the same law as X. Show that Mn = max(X1, . . . , Xn) has tail behavior P(Mn > x) = (1 − F(x)) Σ_{j=0}^{n−1} F^j(x) ∼ n(1 − F(x)) as x → ∞.
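The single-large-jump behavior in Problem 3.39 is easy to see numerically. The following sketch (function and variable names are ours) simulates pairs of Pareto(α, 1) variables via the inverse transform U^{−1/α} and compares the tail of the sum with the tail of the maximum at a fixed threshold t:

```python
import numpy as np

def pareto_pairs(alpha, size, rng):
    """Pareto(alpha, 1) samples via the inverse transform U^(-1/alpha)."""
    return rng.random((size, 2)) ** (-1.0 / alpha)

rng = np.random.default_rng(0)
alpha, t, N = 1.0, 50.0, 200_000
x = pareto_pairs(alpha, N, rng)
p_sum = np.mean(x.sum(axis=1) > t)   # estimate of P(X1 + X2 > t)
p_max = np.mean(x.max(axis=1) > t)   # estimate of P(max(X1, X2) > t)
ratio = p_max / p_sum                # approaches 1 as t grows
print(p_sum, p_max, ratio)
```

Since Pareto(α, 1) variables are at least 1, max(X1, X2) > t implies X1 + X2 > t, so the ratio is always at most 1; at t = 50 it is already above 0.9.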
Problem 3.40 This problem gives an alternate way of proving that there are characteristic functions with the form (3.8). For α ∈ (0, 1) ∪ (1, 2), fix a positive integer n, let W1, . . . , Wn be i.i.d. Uniform(0, n) r.v.s, and define Yn = Σ_{k=1}^n W_k^{−1/α}. Show that the ch. f. of the sum Yn is

φ_{Yn}(u) = [φ_{W^{−1/α}}(u)]^n = [ (1/n) ∫_0^n e^{iuw^{−1/α}} dw ]^n → exp( −c|u|^α [1 + i(sign u) tan(πα/2)] ),

as n → ∞.

Problem 3.41 Show that any random variable X with finite pth absolute moment, p > 0, satisfies

sup_{x>0} x^p P(|X| > x) ≤ E|X|^p.
Use this and Corollary 3.5 to establish (3.44).

Problem 3.42 Theorem 3.10 gives a way to approximately simulate a general stable law. Picking an appropriate W that can be simulated, generating i.i.d. exponentials to simulate the Γi, and truncating the series given in the theorem will yield an approximation to an arbitrary stable law. To simulate symmetric stable distributions, any symmetric r.v. W with E|W|^α < ∞ will do, e.g. normal, uniform on (−1, 1), etc. For β ≠ 1 cases, you need to balance the probability to the right and left of 0 to achieve the asymmetry, e.g. uniform on (−a, b). Section 3.4 of Janicki and Weron (1994) finds that the convergence of this series is very slow.

Problem 3.43 Another way to approximately simulate stable random variables is to use a domain of attraction argument. Theorem 3.14 shows that if X1, X2, . . . are i.i.d. Pareto(α, 1), then Sn := n^{−1/α}(X1 + X2 + · · · + Xn) →^d Z, where Z ∼ S(α, 1, γ, δ; 1). By taking a large number of terms, the sum Sn will be approximately stable. Take a weighted difference pSn − qS′n of independent copies to get skewed stable.

Problem 3.44 The rate of convergence in Problem 3.43 is faster if the distribution of X is close to a stable law before one starts summing. For symmetric stable laws and 0 < α < 2, taking two normal laws Z1 and Z2, the ratio X = Z1/|Z2|^{1/α} is symmetric and unimodal with the appropriate tail behavior to be in the domain of attraction of a symmetric α-stable distribution. The normalized sum Sn converges more quickly to a stable limit than when using Pareto terms. This method is due to Mantegna (1994), who further found that the convergence can be accelerated by taking a nonlinear transformation of the X above: Y = X(a exp(−bX) + 1) is quite close to a symmetric α-stable law if a and b are chosen correctly. The constants a and b depend on α: Mantegna chooses a so that the densities match at the origin and b by a numerical procedure.

Problem 3.45 Simulating stable random variables, or ones in the domain of attraction of a stable distribution, for small α. First, Corollary 3.7 suggests simulating
stable random variables by ε(−log U)^{−1/α}, where ε = ±1 with respective probabilities (1 ± β)/2 and U ∼ Uniform(0, 1). To simulate r.v.s in the domain of attraction of Z(α, 0), use X = εU^{−1/α} ∈ DA(Z(α, 0)). The need for ε can be eliminated by using: U^{−1/α} if U ≤ 1/2 and −(U − 1/2)^{−1/α} otherwise. This last r.v. has a gap in its support, P(−1 < X < 1) = 0, and this gap will slow down the convergence to a stable limit. There are various ways to improve this convergence; the simplest is X = ε[U^{−1/α} − 1]. If α = 1/k, then U^{−1/α} = U^{−k} can be computed by multiplication.

Problem 3.46 Compare the different methods of generating stable random variables by simulation. Compare the accuracy and the computational time of the approximate methods to the method of Section 3.3.3.

Problem 3.47 Use the definition of entropy (3.74) to show that for any continuous r.v. X, H(aX + b) = H(X) + log|a|. Use the explicit formulas for the Gaussian, Cauchy, and Lévy densities to compute the entropy expressions in Section 3.14.

The following problems are open research problems.

Problem 3.48 Prove that m(α, ·) is decreasing and convex up as a function of β.

Problem 3.49 Derive a formula for m(α, β), the mode of a S(α, β; 0) distribution.
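The domain-of-attraction recipe of Problem 3.43 takes only a few lines to code. A sketch under our parameter choices (α = 0.8 < 1, where no centering of the sum is needed); the tail of the normalized sum should already show the Paretian decay x^{−α} of the stable limit:

```python
import numpy as np

def approx_stable_sum(alpha, n_terms, size, rng):
    """Approximate totally skewed stable samples as normalized sums of
    Pareto(alpha, 1) variables: S_n = n^(-1/alpha) * (X_1 + ... + X_n)."""
    u = rng.random((size, n_terms))
    return (u ** (-1.0 / alpha)).sum(axis=1) * n_terms ** (-1.0 / alpha)

rng = np.random.default_rng(1)
s = approx_stable_sum(alpha=0.8, n_terms=500, size=20_000, rng=rng)
# With this normalization P(S_n > x) ~ x^(-0.8), so doubling the
# threshold should shrink the empirical tail probability by about 2^0.8.
p1, p2 = np.mean(s > 20), np.mean(s > 40)
print(p1 / p2)
```

Doubling the threshold cuts the tail probability by roughly 2^0.8 ≈ 1.74, consistent with an index-0.8 power tail.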
Chapter 4
Univariate Estimation
This chapter describes various methods of estimating stable parameters: among them tail, quantile, empirical characteristic function, and maximum likelihood methods. Unless stated otherwise, we will assume that X1, X2, . . . , Xn are a stable sample, i.e. independent and identically distributed random variables with a common stable law. In most cases, the continuous S(α, β, γ, δ; 0) parameterization is used to avoid problems arising from the discontinuity at α = 1. Diagnostics for assessing stability are discussed and applications to several data sets are given. The performance of the methods is compared in simulations in Section 4.9.

Most traditional estimators cannot be applied to stable data. For example, the standard method of moments fails because integer moments don't exist. And since there is no closed form expression for stable densities, one cannot express the likelihood in closed form, so analytical maximum likelihood is not possible. These difficulties have led to several novel estimation methods. In addition to estimating stable parameters, these approaches may be of interest to researchers using other probability distributions where standard estimation methods do not apply.

In some applications, there are small to moderate data sets and possible skewness, so the emphasis is on fitting all four parameters as efficiently as possible. In signal processing, there are very large data sets, strong theoretical and empirical evidence for symmetry, and a requirement for real-time processing. In such cases, fast estimators for α, γ, and δ may be of interest, even if these estimators are not the most efficient. In signal filtering problems, one may calibrate to set the index α and scale γ, and then require very fast estimates of only the location parameter δ. In data stream problems, one chooses α, β, and δ and may only need a fast estimate of the scale γ. Because of these very distinct goals, some sections below focus on these special cases.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4_4
4.1 Order statistics

The order statistics of any random variables X1, . . . , Xn are defined by

X1:n = min(X1, . . . , Xn)
X2:n = second smallest of X1, . . . , Xn
. . .
Xk:n = kth smallest of X1, . . . , Xn
. . .
X(n−1):n = second largest of X1, . . . , Xn
Xn:n = max(X1, . . . , Xn).

We will only consider the case when the Xi are continuous, in which case the probability of ties is zero. The order statistics for the case when X1, . . . , Xn are independent stable r.v.s follow the standard theory of order statistics for continuous random variables, e.g. Section 5.5 of Casella and Berger (1990). Let F(x) and f(x) denote respectively the common distribution function and density of the Xi's. The jth order statistic has distribution function

F_{Xj:n}(x) = P(Xj:n ≤ x) = Σ_{k=j}^{n} [n! / (k!(n − k)!)] F(x)^k (1 − F(x))^{n−k}.
The kth term in the sum corresponds to exactly k terms less than or equal to x. Differentiating and some algebra leads to the density

f_{Xj:n}(x) = [n! / ((j − 1)!(n − j)!)] f(x) [F(x)]^{j−1} [1 − F(x)]^{n−j}.
Figure 4.1 shows examples of such densities for different values of α and β. The joint distribution of the order statistics is also given in the reference above. Problem 4.1 shows that the tails of the order statistic distributions are asymptotically power laws. While a stable distribution will have limited moments, non-extreme order statistics from a stable sample can have finite moments of high order, see Problem 4.2. When the components are independent, the standard asymptotic theory also holds, e.g. Chapter 13 of Ferguson (1996): for large sample sizes n, the order statistics are jointly normal, with known mean and covariance. In particular, for any fixed p ∈ (0, 1), as n → ∞,

√n ( X_{np:n} − x_p ) →^d N( 0, p(1 − p)/f(x_p)² ),        (4.1)
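The normal approximation (4.1) can be checked by simulation. For the Cauchy (α = 1, β = 0) both f and the quantiles are explicit, so the asymptotic variance p(1 − p)/(n f(x_p)²) can be compared with the observed spread of the sample quantile; the sample size, p, and replication count below are our choices:

```python
import numpy as np
from math import pi, tan

rng = np.random.default_rng(2)
n, p, reps = 1_000, 0.75, 2_000

x_p = tan(pi * (p - 0.5))               # Cauchy 0.75-quantile = 1
f_xp = 1.0 / (pi * (1.0 + x_p ** 2))    # Cauchy density at x_p

samples = rng.standard_cauchy((reps, n))
q = np.quantile(samples, p, axis=1)     # sample 0.75-quantiles
observed_var = q.var()
predicted_var = p * (1 - p) / (n * f_xp ** 2)
print(observed_var / predicted_var)     # ≈ 1
```

Note that the sample quantile is well behaved here even though the Cauchy itself has no moments, illustrating the remark above about non-extreme order statistics.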
[Six panels of Figure 4.1: (α, β) = (1.8, 0), (1.8, 1), (1.3, 0), (1.3, 1), (0.8, 0), (0.8, 1), with densities plotted over −4 ≤ x ≤ 4.]
Fig. 4.1 Densities for order statistics for a stable sample of size n = 5 for varying α and β. The curve with the leftmost mode is for X1:5, the second from the left is for X2:5, etc. The densities are in the 0-parameterization.
i.e. for large samples, the npth order statistic is approximately normally distributed with mean x_p and variance p(1 − p)/(n f²(x_p)). If one is dealing with a massive data set, it can be expensive, in both space and time, to save and sort the data. There are methods for approximating quantiles without sorting and without saving the entire data set. One such method is described by Liechty et al. (2003) and McDermott et al. (2007).
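For intuition, here is one simple way, not the method of the papers just cited, to track a quantile in one pass with O(1) memory: a Robbins-Monro recursion that nudges the current estimate up by c·p/n or down by c·(1 − p)/n at each observation. The constant c and the toy target below are our choices.

```python
import numpy as np

def streaming_quantile(stream, p, c=2.0):
    """One-pass quantile tracker: q <- q + (c/n) * (p - 1{x <= q})."""
    q = 0.0
    for n, x in enumerate(stream, start=1):
        q += (c / n) * (p - (x <= q))
    return q

rng = np.random.default_rng(3)
data = rng.standard_normal(200_000)
q = streaming_quantile(data, p=0.5)
print(q)   # near the true median 0
```

The estimate never stores or sorts the data, at the cost of slower convergence than the exact sample quantile.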
4.2 Tail-based estimation

These methods use the asymptotic Pareto tail behavior to estimate the index α. McCulloch (1997) shows that using the generalized Pareto model suggested by DuMouchel (1983) or the Hill estimator on stable data when 1 < α < 2 leads to overestimates of α. McCulloch points out that several researchers have used such misleading tail estimates of α to conclude that various data sets were not stable. In Section 3.6, it is shown that where the Pareto behavior starts to occur depends heavily on the parameterization, and that even when we shift so that the mode is near or at zero, the place where the power decay starts to be accurate is a complicated function of α and β. In particular, when α is close to 2, one must go extremely far out on the tail before the power decay is accurate.

We start with a discussion of the simplest estimate of the index of stability α. It is common to take the top p% of a sample and estimate α by plotting the upper p% on a log-log scale and estimating the slope. The accuracy of this or any other tail procedure depends on (i) shifts in the distribution and (ii) where the Paretian tail behavior starts. For data coming from a stable distribution, some guidelines can be given for the first issue: shifting the data to center on the mode or median will generally lead to quicker convergence to Paretian tails. In contrast, little advice can be given on point (ii): the figures in Section 3.6 imply that the value of p one should take is very dependent on α and β. If one knows what α and β are, then these figures can be used to give a conservative value of p to use. Of course, this is not helpful in a general estimation procedure where the purpose is to estimate α and β.

Another estimate of α is based on the QQ-plot in Kratz and Resnick (1996). Since it depends on the asymptotic tail behavior, it has the same difficulties as the above method.
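The log-log slope procedure is easy to state in code. The sketch below (our function name; the choice of the top 1% is ours) applies it to exact Pareto data, where none of the difficulties just discussed arise; on stable data the quality of the estimate depends heavily on p, α, and β:

```python
import numpy as np

def loglog_tail_index(data, top_fraction=0.01):
    """Estimate alpha as minus the slope of log(empirical tail probability)
    versus log(x) over the largest top_fraction of the sample."""
    x = np.sort(data)[::-1]                    # descending order
    k = max(int(len(x) * top_fraction), 10)
    tail_prob = np.arange(1, k + 1) / len(x)   # P(X > x_(i)) ≈ i/n
    slope, _ = np.polyfit(np.log(x[:k]), np.log(tail_prob), 1)
    return -slope

rng = np.random.default_rng(4)
alpha = 1.5
data = rng.random(100_000) ** (-1.0 / alpha)   # Pareto(1.5, 1) sample
print(loglog_tail_index(data))                  # ≈ 1.5
```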
4.2.1 Hill estimator

Hill (1975) proposed a method for estimating the tail behavior that does not assume a parametric form for the entire distribution function, but focuses only on the tail behavior. The Hill estimator is used to estimate the Pareto index α when the (upper) tail of the distribution has the form P(X > x) ≈ C x^{−α}. Here the goal is to study the behavior of Hill's estimator when it is applied to a non-Gaussian stable distribution. Let X(i) = X_{n−i+1:n} be the reversed order statistics drawn from a population with distribution G, ordered from largest to smallest: X(1) ≥ X(2) ≥ · · · ≥ X(n). The Hill estimate of α based on the k largest values is

α̂_{Hill,k} = [ (1/k) Σ_{i=1}^{k} log( X(i)/X(k) ) ]^{−1}.
In practice, the ordered pairs (k, α̂_{Hill,k}) are plotted and one looks for a region where the plot levels off to identify the correct α. This procedure is widely used, see Resnick (1997), Resnick and Stărică (1995), Resnick and Stărică (1997), and Embrechts et al. (1997).

In order to determine how the Hill estimate is affected by α for a stable sample, we generated 10000 symmetric standardized (β = 0, γ = 1, δ = 0) stable random variables for each of α = 0.5, 1.2, 1.9 using the algorithm of Chambers et al. (1976), then used the positive values in the sample as the basis for deriving the Hill estimate. In order to remove the random effects associated with simulation, the Hill estimator of α was also derived from exact stable quantiles. Hence, convergence of the Hill estimator to the true parameter is assessed from two different data sets: a sample of 10000 simulated standardized symmetric stable random variables with α = 0.5, 1.2, 1.9 and 10000 exact quantiles from the stable distribution with the same α. From Figures 3.13 and 3.14, it is seen that in the symmetric case (β = 0) convergence of a stable distribution to a Pareto is quickest when α = 1.2, where the corresponding tail probability is maximal. On the other hand, when α → 2, the tail probability F̄(x_pdf; α, β) converges to 0. This implies that accuracy of the Hill estimate will require a relatively large sample size when α is large, because only the extreme tail has Paretian behavior. The Hill estimates for these data are shown in Figure 4.2, for both quantile and simulated data.
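The Hill estimator itself is a few lines. The sketch below (our function name) applies it to exact Pareto data, where it behaves well; on stable data with α near 2 it inherits the problems described in the text:

```python
import numpy as np

def hill_estimate(data, k):
    """Hill estimate of the tail index from the k largest observations:
    1 / mean of log(X_(i)/X_(k)) for i = 1, ..., k."""
    x = np.sort(data)[::-1]               # X_(1) >= X_(2) >= ...
    return 1.0 / np.mean(np.log(x[:k] / x[k - 1]))

rng = np.random.default_rng(5)
alpha = 1.2
data = rng.random(10_000) ** (-1.0 / alpha)   # Pareto(1.2, 1) sample
print(hill_estimate(data, k=1000))             # ≈ 1.2
```

In practice one would evaluate hill_estimate over a range of k and plot the result, looking for a stable plateau.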
Fig. 4.2 Hill plots of tail index estimate for symmetric stable distributions when α = 0.5, 1.2, 1.9 using simulated data (top row) and exact quantiles (bottom row). The dotted horizontal line is the true value of α.
164
4 Univariate Estimation
The true value of the parameter α is known in the three cases, and the issue is to estimate what percent p of the (upper) distribution is required to ensure convergence of the Hill estimate α̂ to the true parameter value. This indirectly determines the size n of a sample needed to estimate α: if p is very small, then n must be very large before np, the approximate number of points in the upper tail required to get a good estimate of α, is appreciable.

In the first case considered here, α = 0.5, Figures 3.13 and 3.14 both show that it takes a while before either L_cdf(x) or Relerr_cdf(x) gets close to its limiting value. However, the Hill estimator actually does quite well using a large portion of the sample. An explanation for this can be found in Figure 3.12: when α = 0.5, L_cdf(x) never gets very far from 1, in contrast to the α = 1.9 case discussed below. As expected, when α = 1.2, the Hill estimator is close to the true index α for a large proportion of the sample. However, when α = 1.9, the Hill estimator never gives a reasonable estimate of α. This can be understood by studying the plot of L_cdf(x) from the previous chapter. Formally, the stable tail is exactly F̄(x; α, β) = L_cdf(x) F̄_Pareto(x; α, β). Resnick (1997) considers the Hill estimator for distributions of this type when L_cdf(x) is slowly varying. In the stable case, we have more than slow variation because L_cdf(x) → 1 as x → ∞. Yet even with this stronger condition, the Hill estimator performs poorly because L_cdf(x) gets quite large and takes a long time to converge to 1. For example, when α = 1.9 and β = 0, x_{L_cdf,0.1} = 8.33 and, more importantly, F̄(8.33; 1.9, 0) = 0.00093864. This last value shows that a massive sample is needed to see the Paretian tail behavior when α is near 2. The situation gets more involved if we consider the nonsymmetric case, e.g. values of L_cdf(x) in the hundreds occur for highly skewed data.
For a stable distribution with α > 1.5, the Hill estimator generally does a poor job of estimating α. Resnick (1997) includes one such stable distribution in his “Hill Horror Plot” category for this reason. This contrasts with the fact that the maximum likelihood estimate of α is highly efficient for α near 2, see Figure 4.10. Even with the smoothed Hill plots and Alternate Hill plots (change of horizontal scale) described by Resnick, it is not clear how useful the Hill plot is for stable data. In fact, we see little reason to use the Hill estimator for a stable distribution since other methods of estimation are now available. For an arbitrary (not necessarily stable) distribution having a Paretian tail, it is impossible to give general guidelines on what percentage of the tail is needed by the Hill estimator to get a good approximation of the tail index.
4.3 Extreme value estimate of α

Given a data set of i.i.d. values X1, . . . , Xn from an α-stable law, partition them into l blocks of size k:

B1 = {X1, . . . , Xk}
B2 = {Xk+1, . . . , X2k}
. . .
Bj = {X(j−1)k+1, . . . , Xjk}
. . .
Bl = {Xn−k+1, . . . , Xn}.

For each block Bj, we assume there are positive and negative values in the block. This will usually happen if the blocks are large and the data is symmetric or not highly skewed. Define the log of the block maximum Mj = log(max Bj); it is approximately the max of exponential stable r.v.s. Problem 4.3 shows Var(Mj) = π²/(6α²), or α = π/√(6 Var(Mj)). Likewise, define the log of the block minimum mj = −log(−min Bj); then Var(mj) = Var(Mj). Tsihrintzis and Nikias (1996) use the sample versions of these quantities to estimate α: compute the sample averages M̄ = (M1 + · · · + Ml)/l and m̄ = (m1 + · · · + ml)/l, and sample variances s²_M = Σ_{j=1}^{l}(Mj − M̄)²/(l − 1) and s²_m = Σ_{j=1}^{l}(mj − m̄)²/(l − 1). Their estimator of α is the average of the two resulting estimators:

α̂ = (π/(2√6)) ( 1/s_m + 1/s_M ).

The performance of this estimator is shown in Figure 4.3. Note that when α is small or moderate, the estimator works well when the block sizes are moderate. However, for 1.5 < α < 2, the estimator will overestimate α unless the block size is large. This is due to the fact that the Pareto tail behavior of the distribution isn't very evident for large α unless the samples are large. When α = 2, there is no Pareto tail and the estimator performs terribly. Using min(α̂, 2) will prevent the worst behavior, but still gives an overestimate of α in the range 1.5 < α < 2. There are other methods of estimating α from the extremes of a data set. Some use block maxima as above; others use peaks-over-threshold as in the Hill estimator. See Chapter 3 of de Haan and Ferreira (2006) for more information. Unfortunately, these methods tend to do poorly when 1.5 < α < 2 because the power law approximation to the tail is not accurate until very far out on the tail for stable laws.
[Four panels of Figure 4.3: (k, l) = (100, 100), (1000, 100), (100, 1000), (1000, 1000), with α̂ plotted against α over 0.5 ≤ α ≤ 2.]
Fig. 4.3 Performance of block maxima estimator of α for 50 evenly spaced α with k and l as shown. The sample sizes were n = kl and β = 0.
4.4 Quantile-based estimation

A second approach to estimating stable parameters is based on quantiles of stable distributions. Fama and Roll (1968) noticed certain patterns in tabulated quantiles of symmetric stable distributions and proposed estimates of α, scale γ, and location for symmetric stable distributions. McCulloch (1986) extended these ideas to the general (nonsymmetric) case, eliminated the bias, and obtained consistent estimators for all four stable parameters in terms of five sample quantiles (the 5th, 25th, 50th, 75th, and 95th percentiles). If the data set is stable and the sample is large, this last method gives reliable estimates of stable parameters. To describe the method, let x_p be the pth quantile of a S(α, β, γ, δ; 0) distribution, and define the quantities

να(α, β, γ, δ) = (x0.95 − x0.05) / (x0.75 − x0.25)
νβ(α, β, γ, δ) = (x0.05 + x0.95 − 2 x0.50) / (x0.95 − x0.05)
νγ(α, β, γ, δ) = x0.75 − x0.25
νδ(α, β, γ, δ) = −x0.50.
Fig. 4.4 να(α, β, 1, 0), νβ(α, β, 1, 0), νγ(α, β, 1, 0), and νδ(α, β, 1, 0) as functions of α and β.
Figure 4.4 shows the behavior of these quantities as a function of α and β, with γ = 1 and δ = 0. It is an empirical fact that να(α, β, 1, 0) and νβ(α, β, 1, 0) are strictly monotonic in α for fixed β and conversely. Knowledge of the quantiles x0.05, x0.25, x0.50, x0.75, and x0.95 and the functions in Figure 4.4 allows the recovery of the parameters (α, β, γ, δ). For X ∼ S(α, β, γ, δ; 0) and Z ∼ S(α, β; 0), the scaling property of the 0-parameterization guarantees that the quantiles of X and Z are related by x_p = γ z_p + δ. Straightforward algebra shows

να(α, β, γ, δ) = να(α, β, 1, 0)
νβ(α, β, γ, δ) = νβ(α, β, 1, 0)
νγ(α, β, γ, δ) = γ νγ(α, β, 1, 0)
νδ(α, β, γ, δ) = γ νδ(α, β, 1, 0) − δ.

Thus να and νβ are independent of the scale and shift. If the quantiles are known, then by monotonicity α and β can be recovered numerically by searching through the tabulated values of να(α, β, 1, 0) and νβ(α, β, 1, 0). Once α and β are known, the last two equations above can be solved for the scale and location: first γ = νγ(α, β, γ, δ)/νγ(α, β, 1, 0) and then δ = γ νδ(α, β, 1, 0) − νδ(α, β, γ, δ).
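The sample versions of these quantities are immediate to compute. As a check, for Cauchy data (α = 1, β = 0, γ = 1, δ = 0) the population quantiles are x_p = tan(π(p − 1/2)), giving να = tan(0.45π) ≈ 6.31, νβ = 0, νγ = 2, and νδ = 0 (a sketch, our function name):

```python
import numpy as np

def nu_stats(data):
    """Sample versions of the quantile statistics nu_alpha, nu_beta,
    nu_gamma, nu_delta defined from the 5th-95th percentiles."""
    q05, q25, q50, q75, q95 = np.quantile(data, [0.05, 0.25, 0.50, 0.75, 0.95])
    nu_alpha = (q95 - q05) / (q75 - q25)
    nu_beta = (q05 + q95 - 2 * q50) / (q95 - q05)
    nu_gamma = q75 - q25
    nu_delta = -q50
    return nu_alpha, nu_beta, nu_gamma, nu_delta

rng = np.random.default_rng(7)
na, nb, ng, nd = nu_stats(rng.standard_cauchy(100_000))
print(na, nb, ng, nd)   # ≈ 6.31, 0, 2, 0
```

In the full procedure one would then look up (na, nb) in tables of να(α, β, 1, 0) and νβ(α, β, 1, 0) to recover α and β.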
For estimation of the parameters (α, β, γ, δ) from an i.i.d. sample from a stable distribution, the sample quantiles are used in place of the population quantiles in the procedure described above. Equation (4.1) shows that the sample quantiles converge in probability to the population quantiles as the sample size tends to infinity, so the above estimators are consistent when α ≥ 0.5. McCulloch computed the values of να, νβ, νγ, and νδ for a grid of α = 0.5(0.1)2 and β = 0(0.25)1. Ojeda-Revah (2001) improved the program by extending the range and making the mesh of the grid finer: the revised program uses a grid of α = 0.1(0.1)2 and β = 0(0.1)1. This enlargement of the grid gives a little more accuracy, but turned up an unexpected problem: when α < 0.5, the curves in Figure 4.4 cross and there are no longer unique solutions for the parameters, so the estimators are not consistent when α is small. An attempt was made to deal with this problem by replacing the tabulated values of να(α, β, 1, 0), etc., with modified values that don't intersect; while this solves the non-uniqueness problem, it does not appear to do any better job of estimating the parameters when α is small.

If a rough quick estimate of the location and scale is needed, say for an initial normalization for a more involved estimator, there are several empirical findings about stable quantiles that may be of use. The following graphs show some numerical results for β ≥ 0; by the reflection property the results for β ≤ 0 are similar. It is essential that the 0-parameterization be used whenever skewed distributions are considered. For the location parameter, a rough estimate is the median: δ̂0 = x0.5. (Note that the lower right plot in Figure 4.4 shows −x0.50 for a S(α, β; 0) distribution.) For a symmetric distribution, the location is equal to the median for all α, and when the distribution is slightly skewed, the median is still close to the location parameter.
Only when |β| > 0.5 and α is small does the difference between the location parameter and the median become large. In particular, for a S(α, β, γ, δ; 0) distribution with α > 0.7 and any β, median(X) − γ ≤ δ ≤ median(X) + γ.

For the scale parameter, there are two robust estimates based on quantiles. The first is the interquartile range IQR = x_{0.75} − x_{0.25}, the distance between the 75th percentile and the 25th percentile. The top plot in Figure 4.5 (a close-up of the lower left plot in Figure 4.4) shows that IQR/γ = ν_γ(α, β, 1, 0) is close to 2 for all α ≥ 1 and all −1 ≤ β ≤ 1, suggesting the estimate γ̂ = IQR/2. For α ↓ 0, IQR appears to be unbounded. The bottom graph in Figure 4.5 shows that IQR^α appears to be bounded. In the symmetric case, (3.54) and ad hoc fitting give

IQR^α ≈ 1.46 − 0.0077α + 0.5501α²,

with relative error less than 0.01 for 0 < α ≤ 2. Hence

γ̂_IQR = IQR / (1.46 − 0.0077α + 0.5501α²)^{1/α}
is a robust estimate of the scale if α is known and β ≈ 0. This estimator is independent of the location δ, and hence gives the same answer in the 0-parameterization and the 1-parameterization. The second quantile-based estimate of the scale γ is the sample median of |X|. The top graph in Figure 4.6 shows this quantity is almost independent of β, and for 1 ≤ α ≤ 2, median(|X|) is close to 1 for a standardized S(α, β; 0) distribution.
Fig. 4.5 IQR (top) and IQR^α (bottom) for a S(α, β; 0) distribution as a function of α, with β as shown.
Fig. 4.6 Median of |X| (top) and median(|X|)^α (bottom) for a S(α, β; 0) distribution.
For α ↓ 0, the quantity grows without bound. As above, raising to the power α (bottom plot) appears to eliminate the singularity. Using (3.54) and some ad hoc fitting suggests

median(|X|)^α ≈ h(α) := { 1.44 − 0.79α + 0.35α²   0 < α < 1
                          1.0918 − 0.0918α         1 ≤ α ≤ 2.

Hence

γ̂_median = median(|X₁|, |X₂|, ..., |Xₙ|) / h(α)^{1/α}
is a robust estimate of the scale if α is known, for any β. Unlike the IQR estimator, this does depend on the shift and is only valid when δ = 0 in the 0-parameterization.
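Both quantile-based scale estimates are straightforward to compute when α is known. The sketch below (function names are mine) uses the ad hoc polynomial fits quoted above, which are only claimed accurate for 0 < α ≤ 2; note that different quantile conventions give slightly different sample IQRs.

```python
import statistics

def h(alpha):
    """Ad hoc fit to median(|X|)^alpha from the text, for 0 < alpha <= 2."""
    if alpha < 1:
        return 1.44 - 0.79 * alpha + 0.35 * alpha ** 2
    return 1.0918 - 0.0918 * alpha

def gamma_iqr(x, alpha):
    """IQR-based scale estimate; valid for beta ~ 0, in either parameterization."""
    q = statistics.quantiles(x, n=4)          # q[2] - q[0] is the sample IQR
    iqr = q[2] - q[0]
    return iqr / (1.46 - 0.0077 * alpha + 0.5501 * alpha ** 2) ** (1.0 / alpha)

def gamma_median(x, alpha):
    """Median-of-|x| scale estimate; requires delta0 = 0, works for any beta."""
    m = statistics.median(abs(xi) for xi in x)
    return m / h(alpha) ** (1.0 / alpha)
```

Either function takes a raw sample and a (known or separately estimated) α; as noted in the text, the second one must be applied to data already shifted so that δ₀ = 0.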
4.5 Characteristic function-based estimation

Since closed forms are known for the characteristic functions of stable laws, estimates of stable parameters can be based on the empirical or sample characteristic function, the sample analog of the (population) characteristic function φ(u) = E exp(iuX) = ∫ exp(iux) dF(x). The empirical distribution function based on an i.i.d. sample X₁, ..., Xₙ from F is

F_n(x) = #{j : X_j ≤ x}/n = (1/n) Σ_{j=1}^n 1_{(−∞,x]}(X_j).

This is a (random) discrete distribution function, having mass 1/n at each sample point X_j. The empirical distribution function F_n(x) converges to the population distribution function F(x) as n → ∞; this is sometimes called the Fundamental Theorem of Statistics. The study of F_n(·) is the subject of empirical processes. Using F_n in place of F in the definition of the characteristic function gives the definition of the empirical characteristic function:

φ_n(u) = ∫ exp(iux) dF_n(x) = (1/n) Σ_{j=1}^n exp(iuX_j).

The empirical characteristic function is a function of the empirical distribution function, and φ_n(·) converges to φ(·), see Problem 4.4. Because the characteristic function exists for every distribution, no assumption about the underlying distribution is necessary. Feuerverger and Mureika (1977) and Chapter 3 of Ushakov (1999) give general information about empirical characteristic functions.

Press (1972) seems to have been the first to use the empirical characteristic function with stable laws. Modifications have been made to this approach by Paulson et al. (1975), Feuerverger and McDunnough (1981), Koutrouvelis (1980), Koutrouvelis (1981) and Kogon and Williams (1998). All but the last of these papers used the
1-parameterization; to avoid difficulties near α = 1, we will use the 0-parameterization throughout.

The general stable characteristic function is complex; separating the real part and the imaginary part:

φ(u) = φ₁(u) + iφ₂(u) = exp(−γ^α [|u|^α + iβη(γu|α; 0)] + iδu),

where η(·|α; 0) is from (3.19). Straightforward manipulations show

|φ(u)| = (φ₁²(u) + φ₂²(u))^{1/2} = exp(−γ^α |u|^α),

which can be rewritten as

log(−log |φ(u)|) = log(γ^α) + α log |u|.   (4.2)
This is linear in log |u|, with slope α and intercept b := log(γ^α). Given two non-zero values u₁ and u₂ with |u₁| ≠ |u₂|, one can solve for α and b, and then γ = exp(b/α). Further,

Arg φ(u) = arctan(φ₂(u)/φ₁(u)) = −γ^α βη(γu|α; 0) + δu.   (4.3)

(The principal branch of arctan will be used throughout.) With the known values of α and γ, (4.3) can likewise be solved for β and δ. Thus, if φ(u) is known at two non-zero points with |u₁| ≠ |u₂|, one can determine all four parameters exactly. The empirical characteristic function method replaces φ₁(u) and φ₂(u) with the sample analogs:

φ_{n,1}(u) = Re φ_n(u) = (1/n) Σ_{j=1}^n cos(uX_j)
φ_{n,2}(u) = Im φ_n(u) = (1/n) Σ_{j=1}^n sin(uX_j).

Evaluating these at u₁ and u₂ gives values for the left hand sides of (4.2) and (4.3), and therefore estimates of the parameters.
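As a concrete illustration (names are mine, not from any published implementation), the sketch below computes φ_n and solves (4.2) at two points for α and γ; solving (4.3) for β and δ proceeds analogously once α̂ and γ̂ are in hand.

```python
import cmath
import math

def ecf(u, x):
    """Empirical characteristic function: phi_n(u) = (1/n) sum_j exp(i*u*x_j)."""
    return sum(cmath.exp(1j * u * xj) for xj in x) / len(x)

def alpha_gamma_two_point(x, u1, u2):
    """Solve (4.2) at two points 0 < u1 < u2:
    log(-log|phi(u)|) = alpha*log(gamma) + alpha*log(u)."""
    y1 = math.log(-math.log(abs(ecf(u1, x))))
    y2 = math.log(-math.log(abs(ecf(u2, x))))
    alpha = (y2 - y1) / (math.log(u2) - math.log(u1))
    b = y1 - alpha * math.log(u1)          # intercept b = log(gamma^alpha)
    return alpha, math.exp(b / alpha)
```

For a sample from a distribution whose characteristic function is known (e.g. the Cauchy, with |φ(u)| = e^{−|u|}), the recovered slope and intercept should be close to α = 1 and log γ = 0.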
4.5.1 Choosing values of u

First, φ(0) = φ_n(0) = 1 for all values of α, β, γ, and δ, so there is no value in considering u = 0. Next, as φ₁(·) is an even function and φ₂(·) is an odd function, there is no gain in considering u < 0. So we will restrict to u > 0 in what follows. The first papers on this topic used two u values as discussed above, but Koutrouvelis (1980) improved accuracy by using multiple values 0 < u₁ < u₂ < ... < u_k in (4.2) and (4.3) and then using regression to estimate the parameters.
Figures 4.7 and 4.8 show plots of the exact log(−log |φ(u)|) and Arg(φ(u)) (the left sides of (4.2) and (4.3) respectively) and simulated sample values for α ∈ {0.5, 1.2, 1.9}, β ∈ {0, 1}, and n ∈ {25, 50, 100}. In all these plots, the scale is γ = 1 and the location is δ = 0. If γ ≫ 1, then φ₁(·) decays very quickly, and if γ ≪ 1, it decays very slowly. Likewise, if |δ| is not near 0, then the characteristic function oscillates quickly and it will be hard to get good estimates of the parameters. So, the grid of uᵢ values should depend on the scale and the location.

Fig. 4.7 Plots of log(−log |φ(u)|) (left column) and Arg(φ(u)) (right column) for stable characteristic functions. In all cases, γ = 1 and δ = 0. The horizontal axis is v = log u. The rows are different values of α with β = 0. Each plot shows the exact value (solid curve), and the empirical characteristic function for simulated samples of size n = 25, 50, 100.
Fig. 4.8 Plots of log(−log |φ(u)|) (left column) and Arg(φ(u)) (right column) for stable characteristic functions. In all cases, γ = 1 and δ = 0. The horizontal axis is v = log u. The rows are different values of α with β = 1. Each plot shows the exact value (solid curve), and the empirical characteristic function for simulated samples of size n = 25, 50, 100.
We describe the approach of Kogon and Williams (1998). First, as mentioned above, they use the 0-parameterization to avoid discontinuities in the characteristic function. They also require some preliminary estimate γ₀ of the scale and δ₀ of the location. These can be found using any of the previous methods, e.g. the quantile method. Then the data is normalized, yᵢ = (xᵢ − δ₀)/γ₀, and these values are used to compute φ_{n,1}(u) and φ_{n,2}(u). They found that the sample variation of φ_{n,1}(·) and φ_{n,2}(·) was large when u > 1, so they used the uniform grid u₁ = 0.1, u₂ = 0.2, ..., u₁₀ = 1. The parameters θ* = (α*, β*, γ*, δ*) are estimated using regression
in (4.2) and (4.3). Finally, the scale and location are unnormalized: γ̂ = γ₀γ* and δ̂ = δ₀ + γ₀δ*.

Krutto (2018) takes a different approach and improves estimation when α < 1. In that approach, only two values u₁ and u₂ are used, but they are chosen using the data. First, the data is shifted by an estimate of the location; the median can be used, or a more involved quantile estimator. The u points are chosen so that log |φ(u₁)| = −0.1 and log |φ(u₂)| = −1. This avoids having to estimate the scale and moves u₁ closer to 0 when α < 1, improving the estimation in these cases. One issue that arises with this method is how to find u₁ and u₂. That author suggests using a grid search for u₁ to find where log |φ(u₁)| is close to −0.1, possibly finding multiple values, especially if n is not large. If there are multiple values, take u₁ to be the mid-range of those values. Likewise, u₂ is found by searching for where log |φ(u₂)| is close to −1.

Zhang (2018) found by simulation that the choice of grid in the Kogon and Williams approach works well when α ≥ 1, but poorly when α < 1. Examining Figures 4.7 and 4.8 shows that when α ≥ 1, the estimated sample values of log |φ(u)| can be inaccurate for v = log u < −2, presumably because without a large sample, it is hard to recover the shape of |φ(u)| near the origin. On the other hand, when α < 1, there appears to be useful information for v < −2. Bearing this in mind, Zhang normalizes with preliminary estimates γ₀ and δ₀ as in Kogon and Williams (1998) above, but then makes two adjustments to improve the estimation. First, rather than using a grid uniform in the u variable, use a grid that is uniform in v = log u, which is the variable that is used in the regression for α based on (4.2). Specifically, the initial grid is v₁ = −2, v₂ = −1.8, ..., v₁₀ = 0. (This is comparable to the recommended grid of Kogon and Williams above, where v₁ = log 0.1 = −2.302, v₂ = log 0.2 = −1.609, ..., v₁₀ = 0.)
Second, they use a two-pass method: an initial estimate of all four parameters is made. If α ≥ 1 is found, just use those estimates. If α < 1 is found, then the grid is adjusted to include values of u closer to 0 and the parameters are re-estimated. Based on repeated simulations, they suggest using v₁ = 10α − 12 and a v grid spread uniformly from this lower bound to v₁₀ = 0. This adaptive approach was found to work better when α was small. Simulations found no appreciable gain from increasing the number of grid points beyond k = 10 or from iterating multiple times with successive estimates of α.
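To make the regression step concrete, here is a minimal sketch of the amplitude half of the procedure: fit (4.2) by ordinary least squares over the Kogon–Williams grid u = 0.1, 0.2, ..., 1.0 to get α̂ and γ̂. The β/δ regression on (4.3), the preliminary normalization, and the adaptive grid adjustments above are all omitted; function names and the clamping constants are mine.

```python
import cmath
import math

def fit_alpha_gamma(x, grid=None):
    """Least-squares fit of log(-log|phi_n(u)|) = alpha*log(gamma) + alpha*log(u)
    over a grid of u values; the slope is alpha, the intercept is alpha*log(gamma)."""
    if grid is None:
        grid = [0.1 * k for k in range(1, 11)]        # Kogon-Williams grid u = 0.1,...,1.0
    n = len(x)
    pts = []
    for u in grid:
        phi = sum(cmath.exp(1j * u * xj) for xj in x) / n
        pts.append((math.log(u), math.log(-math.log(abs(phi)))))
    m = len(pts)
    vbar = sum(v for v, _ in pts) / m
    ybar = sum(y for _, y in pts) / m
    slope = sum((v - vbar) * (y - ybar) for v, y in pts) / \
        sum((v - vbar) ** 2 for v, _ in pts)
    alpha = min(max(slope, 0.1), 2.0)                 # clamp to a plausible range
    gamma = math.exp((ybar - slope * vbar) / alpha)
    return alpha, gamma
```

On data whose characteristic function is known exactly (e.g. Cauchy, |φ(u)| = e^{−|u|}), the fit recovers α ≈ 1 and γ ≈ 1 up to sampling error.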
4.6 Moment-based estimation

It is possible to use the sample mean to estimate the location parameter when α > 1. Assume X₁, ..., Xₙ are i.i.d. S(α, β, γ, δ; 1). When α ≠ 1, equation (1.8) and Proposition 1.4(a) show that

X̄ₙ = (X₁ + ··· + Xₙ)/n ∼ S(α, β, n^{1/α−1}γ, δ; 1).   (4.4)

Thus the sample mean is itself stable. When α > 1, the scale parameter n^{1/α−1}γ converges to zero and X̄ₙ is a consistent estimator of δ. However, X̄ₙ is itself
heavy tailed and does not have a finite variance. Also, the rate of convergence is slow, especially when α is close to 1. When α ≤ 1, the population mean does not exist and the sample mean X̄ₙ should not be used to estimate the location. Consider what happens in these cases. When α = 1, following the same argument as in (4.4), X̄ₙ ∼ S(1, β, γ, δ + (2/π)βγ log n; 1). This has the same amount of spread as the original distribution, so any single data value has as much information about the location parameter as the sample mean. It also shifts the distribution by an increasing amount as n increases if β ≠ 0. When α < 1, (4.4) holds and it shows that the spread in X̄ₙ is larger than the spread in the original distribution, and X̄ₙ does not converge as the sample size increases. Note that if α is slightly larger than 1, then X̄ₙ will be very variable and give an unreliable estimate of E X. We note that if the sample is in the domain of attraction of an α-stable distribution, then X̄ₙ will have approximately a stable distribution when the sample size is large, and the scale behaves similarly to the exact stable case: if 1 < α < 2, then the sample mean is a consistent estimator of the population mean; if α ≤ 1, then the population mean does not exist and X̄ₙ does not converge as n → ∞.

For symmetric stable distributions (β = δ = 0), Nikias and Shao (1995) estimate the parameters α and γ using fractional and negative moments. Corollary 3.5 shows

E |X|^p = { c(α, p)γ^p   −1 < p < α
            +∞           otherwise,   (4.5)

where c(α, p) = Γ(1 − p/α)/(Γ(1 − p) cos(πp/2)). Note that if 0 < p < min(1, α), the product

E |X|^p E |X|^{−p} = c(α, p)c(α, −p)

is an expression involving α that does not depend on γ. Problem 4.5 shows that the product is strictly decreasing in α, so that it can be solved numerically for α. Once α is known, (4.5) can be used to solve for γ.
The fractional lower order moment (FLOM) method computes the sample fractional absolute moments (1/n) Σ_{i=1}^n |xᵢ|^p and (1/n) Σ_{i=1}^n |xᵢ|^{−p} and uses them in place of the population moments. This method is fast: the sample moment calculation is straightforward and solving for α is a one-dimensional root-finding problem for a monotonic function. Below we shall see that the FLOM method is not very efficient, so large samples may be needed to get good estimates of α and γ. A practical problem is what value of p to pick. On the one hand, if 0 < p < min(1, α)/2, the p-th and −p-th sample fractional absolute moments will have finite mean and variance, so they will be well behaved, converging in the familiar way to E |X|^p and E |X|^{−p} as n → ∞. So using a small value of p gives better estimates of the fractional absolute moments. However, Problem 4.5 shows that the product c(α, p)c(α, −p) is relatively flat for α > 1 when p is small, making the estimation of α more variable. In general, one does not know α, so choosing p may be tricky. In many engineering applications, one has prior knowledge that α is above 1 and large data sets, so the FLOM method can be useful when a fast estimation method is needed.
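A sketch of the FLOM computation for symmetric data centered at the origin, with a user-chosen p < min(1, α): c(α, p) is as in (4.5), and since the product c(α, p)c(α, −p) is strictly decreasing in α, simple bisection suffices for the root-find. Function names and the bracketing constants are mine.

```python
import math

def c(alpha, p):
    """c(alpha, p) = Gamma(1 - p/alpha) / (Gamma(1 - p) * cos(pi*p/2)),
    valid for -1 < p < alpha."""
    return math.gamma(1.0 - p / alpha) / (math.gamma(1.0 - p) * math.cos(math.pi * p / 2.0))

def flom_estimate(x, p=0.2):
    """Estimate (alpha, gamma) by matching E|X|^p * E|X|^{-p} = c(alpha,p)c(alpha,-p)."""
    n = len(x)
    mp = sum(abs(xi) ** p for xi in x) / n        # sample p-th absolute moment
    mm = sum(abs(xi) ** (-p) for xi in x) / n     # sample (-p)-th absolute moment
    target = mp * mm
    lo, hi = p + 1e-6, 2.0                        # need p < alpha; product decreasing in alpha
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if c(mid, p) * c(mid, -p) > target:       # product too big => alpha is larger
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    gamma = (mp / c(alpha, p)) ** (1.0 / p)       # then solve (4.5) for gamma
    return alpha, gamma
```

If the sample product falls outside the range of c(α, p)c(α, −p) on (p, 2], the bisection simply returns the nearer endpoint; a production version would flag this case.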
Dance and Kuruoğlu (1999) generalized this approach to the nonsymmetric case using signed moments, which allows β to be estimated also. If the data is strictly stable, this is straightforward. If the data is not strictly stable and α ≠ 1, some method must be used to find the shift that makes the data strictly stable, e.g. the quantile estimates. Kuruoğlu takes pairwise differences x₁ − x₂, x₃ − x₄, etc. to get symmetric data and then uses the above estimate. They also describe a method based on the mean and variance of the light tailed log |X| in the strictly stable case using Lemma 3.19.
4.7 Maximum likelihood estimation

To simplify notation in this section, we denote the parameter vector by θ = (α, β, γ, δ) and the density by f(x|θ; 0). The likelihood function for an i.i.d. stable sample x = (x₁, ..., xₙ) is given by

L(θ) = L(θ|x) = Π_{i=1}^n f(xᵢ|θ; 0),

and the log-likelihood by

ℓ(θ) = ℓ(θ|x) = log L(θ) = Σ_{i=1}^n log f(xᵢ|θ; 0).
The parameter space is Θ = (0, 2] × [−1, 1] × (0, ∞) × (−∞, ∞). As the notation implies, we will always use the S(α, β, γ, δ; 0) parameterization, both because it is continuous in all four parameters and because it is a location-scale family. There are several issues that make maximization complicated with stable laws. First, since there is no explicit formula for the density, the likelihood must be computed numerically and the maximization is done using a numerical optimization algorithm. Since many optimization programs minimize a function, we will restate the maximum likelihood problem as a minimization problem: θ̂ = argmin_{θ∈Θ} −ℓ(θ). Second, when α < 2, −log f(x|θ) is non-convex in x, so in general −ℓ(θ) is non-convex. Figure 4.9 shows −log f(x|θ) for varying α with β = 0 and an example of non-convexity. The problem is more noticeable when α is small, when sample sizes are small, or when the parameters are far from the true values. One place this can happen is when a poor initialization is used for the maximization routine. Another place is in signal processing problems, where a smoothing filter may be "calibrated" with fixed α, β and γ, and the location parameter is estimated on sliding windows of small size. A numerical optimization routine may get stuck in a local minimum and not find the true minimum. In such cases, which arise in signal processing problems, numerical methods that look for a global minimum should be used, as in Chapter 6.
Fig. 4.9 The left figure shows −log f(x|θ = (α, 0, 1, 0); 0). For α < 2, the curves flare out and are non-convex. The right figure shows −ℓ(θ = (0.8, 0, γ, δ)) for a sample of size 32 from a S(0.8, 0, 1, 0; 0) distribution with γ as shown in the legend and δ varied from −3 to +3. For small γ, the non-convexity of −log f(x|θ) causes −ℓ(θ) to be non-convex.
The third problem with maximum likelihood estimation of stable parameters is that the likelihood is unbounded as α → 0. To illustrate this, let x₁, ..., xₙ be a sample from a stably distributed population. Assume that the value of x₁ occurs only once in the sample. (If that value occurs more than once, then relabel points. If every value occurs more than once, the following argument can be adjusted, and in fact, the problem is even worse there.) The likelihood for a stable model with parameters θ = (α, β = 0, γ = 1, δ = x₁) is

L(θ) = Π_{i=1}^n f(xᵢ|θ; 0) = f(0|α, 0, 1, 0; 0) × Π_{i=2}^n f(xᵢ − x₁|α, 0, 1, 0; 0)
     = Γ(1/α)/(πα) × Π_{i=2}^n f(|xᵢ − x₁| |α, 0, 1, 0; 0).   (4.6)
Problem 4.7 shows that as α → 0 the first factor tends to ∞ and it dominates the remaining factors. This argument can be repeated at each xᵢ, and thus the likelihood is unbounded at multiple points as α → 0.

In spite of these problems, maximum likelihood estimation is possible and reliable if done carefully. While not easily accessible, DuMouchel (1971) gives a wealth of information on estimating stable parameters at a remarkably early date. In that work, an approximate maximum likelihood method was developed based on grouping the data set into bins, and using an approximation to the density at the middle of the bins (using the fast Fourier transform for central values of x and series expansions for the tails) to compute an approximate log-likelihood function. This function was then numerically maximized. Most of this work was published in a series of papers
of DuMouchel (1973a,b, 1975, 1983). For the special case of ML estimation for symmetric stable distributions, see Brorsen and Yang (1990) and McCulloch (1998b). Finally, Brant (1984) proposes a method for approximating the likelihood using the characteristic function directly.

The program STABLE implements full maximum likelihood estimation for all four parameters by calculating −ℓ(θ) numerically and then using an optimization routine to minimize it, see Nolan (2001). It does not appear to get stuck in local minima, even for small samples, because: (i) a good initial estimate of θ is used, (ii) α is bounded away from 0, and (iii) while non-convexity can occur, the multivariate optimization problem appears to avoid getting stuck in a local minimum in one variable because the likelihood surface is sloping upward in another variable. For example, in the right-hand side of Figure 4.9, ∂ℓ(θ)/∂δ may be zero at multiple points, but ∂ℓ(θ)/∂γ > 0, and when a multivariate minimization routine moves toward the optimal γ, the non-convexity in δ smooths out.

We note that it is sometimes desirable to restrict some of the parameters when using maximum likelihood. For example, restricting β = 0 will result in a symmetric stable fit, and restricting (0 < α < 1 and β = 1) will result in a one-sided stable fit. The STABLE program has a restricted maximum likelihood estimation option that allows one to specify a restricted parameter space.
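To make the "compute −ℓ(θ) numerically and minimize" recipe concrete without a numerical stable density, the sketch below carries it out for the one symmetric case with a closed-form density, the Cauchy S(1, 0, γ, δ; 0). This is only an illustration of the setup, not the algorithm used by STABLE, and the crude two-stage grid search stands in for a real optimizer.

```python
import math

def cauchy_nll(x, gamma, delta):
    """Negative log-likelihood for S(1, 0, gamma, delta; 0), i.e. a Cauchy with
    density f(x) = gamma / (pi * (gamma**2 + (x - delta)**2))."""
    return sum(math.log(math.pi * (gamma ** 2 + (xi - delta) ** 2) / gamma) for xi in x)

def fit_cauchy(x):
    """Minimize the negative log-likelihood over (gamma, delta) by a crude
    two-stage grid search; a production code would use a quasi-Newton routine
    started from, e.g., the quantile estimates."""
    xs = sorted(x)
    g_lo, g_hi = 0.05, 5.0
    d_lo, d_hi = xs[len(xs) // 4], xs[3 * len(xs) // 4]   # search near the middle of the data
    for _ in range(3):                                    # refine the grid three times
        gs = [g_lo + (g_hi - g_lo) * k / 30 for k in range(31)]
        ds = [d_lo + (d_hi - d_lo) * k / 30 for k in range(31)]
        nll, g, d = min((cauchy_nll(x, gg, dd), gg, dd) for gg in gs for dd in ds)
        g_half = (g_hi - g_lo) / 10
        d_half = (d_hi - d_lo) / 10
        g_lo, g_hi = max(0.01, g - g_half), g + g_half
        d_lo, d_hi = d - d_half, d + d_half
    return g, d
```

The same outer structure applies to the general stable case; the only change is replacing `cauchy_nll` with a numerically computed stable density, which is exactly where the difficulties described in this section arise.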
4.7.1 Asymptotic normality and Fisher information matrix

DuMouchel (1971) showed that when the parameter vector θ₀ is in the interior of the parameter space Θ, the ML estimator follows the standard theory, so it is consistent and asymptotically normal with mean θ₀ and covariance matrix given by n⁻¹Σ, where Σ = (σᵢⱼ)ᵢ,ⱼ₌₁,...,₄ is the inverse of the 4 × 4 Fisher information matrix I. The entries of I are given by

Iᵢ,ⱼ = ∫_{−∞}^{∞} (∂f/∂θᵢ)(∂f/∂θⱼ) (1/f) dx.

The STABLE program was used to numerically compute these partial derivatives, and the resulting values for the integrands were then numerically integrated. The original computations in Nolan (2001) had some inaccuracies; those values have been recomputed here and a larger set of values of α and β has been used. Further work and some values for α < 0.5 can be found in Matsui and Takemura (2006) and Barker (2015). There is also information on the Fisher information for α as α ↑ 2 in Section 9.7 of Uchaikin and Zolotarev (1999).

When θ is near the boundary of the parameter space, the finite sample behavior of the estimators is not precisely known. When θ approaches the boundary of the parameter space, i.e. α = 2 or β = ±1, the asymptotic normal distribution for the estimators tends to a degenerate distribution at the boundary point and the ML estimators are super-efficient. See DuMouchel (1971) for more information on these cases.
Fig. 4.10 Standard deviations σ_α, σ_β, σ_γ and σ_δ of the MLE estimators of stable parameters as a function of α and β (γ = 1, δ = 0, in the 0-parameterization). The different curves are for different values of β, which vary from 0 to 1 in steps of 0.1 in the indicated direction.
Away from the boundary of Θ, large sample confidence intervals for each of the parameters are given by

θ̂ᵢ ± z_{α/2} σ_{θᵢ}/√n,

where σ_{θ₁}, ..., σ_{θ₄} are the square roots of the diagonal entries of Σ. The values of σ_{θᵢ}, i = 1, ..., 4 have been computed and are plotted in Figure 4.10 when γ = 1 and δ = 0. The correlation coefficients ρ_{θᵢθⱼ} = σᵢⱼ/√(σᵢᵢσⱼⱼ) have also been computed; they are plotted in Figure 4.11. These values are tabulated in Appendix D on a grid of α, β values. When β < 0, the standard deviations are the same as for |β| and the correlation coefficients are expressed in terms of the β > 0 case as (−1)^{i+j}ρᵢⱼ. For a general scale γ and location δ, σ_α, σ_β and ρ_{θᵢθⱼ} are unchanged, but σ_γ and σ_δ are γ times the tabulated values.
Fig. 4.11 Correlations ρ_{θᵢθⱼ} between the MLE estimators of the parameters as a function of α. The different curves are for different values of β, which vary from 0 to 1 in steps of 0.1 when neither θᵢ nor θⱼ is β; when one of them is β, β varies from 0 to 0.9 as indicated.
The grid values were chosen to give a spread over the parameter space and show behavior near the boundary of the parameter space: α = 2 and β = ±1. (Because of computational difficulties, these values have not been tabulated for small values of α.) We note that when β = 0, stable densities are symmetric and all the correlation coefficients involving β are 0. When β = 1, σ_β = 0 and all the correlation coefficients involving β are undefined. To estimate the information matrix for a subset of the parameters, start with the full information matrix, delete the rows and columns corresponding to the known parameters, and invert to get the covariance matrix for the remaining parameters.

Fig. 4.12 Graph of twice the standard error of α̂ as a function of α for various sample sizes.

Some of these values have been given in DuMouchel (1971), pg. 93. For α not near 1, most of the values given there agree with our results. That author uses the S(α, β, γ, δ; 1) parameterization, so near α = 1, one would expect different values.

Some general observations about the accuracy of parameter estimates can now be made. The parameter of most interest is usually α. Twice the standard error of α̂, 2 S.E.(α̂) = 2σ_α̂/√n, is plotted in Figure 4.12 for 0.5 ≤ α ≤ 2, n = 100, 1000, and 10000 and β = 0, β = 1. (The graphs for 0 < |β| < 1 are between the given ones.) Unless α is close to 2, it is clear that a large data set will be necessary to get small confidence intervals, e.g. when α = 1.5 and β = 0, sample sizes of 100, 1000, and 10000 yield S.E.(α̂)'s of 0.318, 0.100, and 0.0318 respectively. Since no other estimation method is asymptotically more efficient than ML, any other method of estimating α will likely yield larger confidence intervals. In contrast, when α ↑ 2,
S.E.(α̂) approaches 0. Similar calculations of standard errors for β̂, γ̂ and δ̂₀ also show that large samples will always be necessary for small confidence intervals. As an extreme, as α ↑ 2, S.E.(β̂) → ∞. In practice, this is of little import because β means little as α ↑ 2.

Sometimes one may wish to estimate the shift parameter δ₁ in the 1-parameterization. For example, if α > 1, the population mean exists and is equal to δ₁. Estimating it by the sample mean is intuitive, but the standard confidence interval is not appropriate (the population variance is infinite). While it is possible to estimate δ₁ directly by expressing the likelihood in terms of the densities f(x|α, β, γ, δ₁; 1) and recomputing the Fisher information matrix, it is preferable to estimate all four parameters in the 0-parameterization and use δ̂₁ = δ̂₀ − β̂γ̂ tan(πα̂/2) to estimate δ₁. To find confidence intervals for δ₁, express the covariance matrix Σ₁ of the 1-parameterization estimators in terms of the covariance matrix Σ of the 0-parameterization estimators. To do this, use Cramer's Theorem (e.g. Chapter 7 of Ferguson (1996)) on the transformation (α, β, γ, δ₀) → (α, β, γ, δ₀ − βγ tan(πα/2)) to show that Σ₁ = GΣGᵀ, where the matrix of partial derivatives of the transformation between parameterizations is

G = [ 1                      0               0               0
      0                      1               0               0
      0                      0               1               0
      −(π/2)βγ sec²(πα/2)    −γ tan(πα/2)    −β tan(πα/2)    1 ].

In particular, the term used in the confidence interval for δ₁ is the (4,4) entry of Σ₁. The upper left 3 × 3 sub-matrices of Σ and Σ₁ are identical, but all the terms in the last row and the last column of Σ₁ are different. The terms tan(πα/2) and sec²(πα/2) will be very large when α is near 1, so the confidence interval for δ₁ will not be reliable when α is near 1.
4.7.2 The score function for δ

The score function for the location parameter is g(x|α, β, γ, δ; k) = (∂/∂δ) log f(x|α, β, γ, δ; k) = −f′(x|α, β, γ, δ; k)/f(x|α, β, γ, δ; k), where f′ is the derivative with respect to x. We will use the same notation as before for the standardized score function g(x|α, β; k) := g(x|α, β, 1, 0; k). In the 0-parameterization, the stable laws are a location and scale family, so it suffices to calculate g for the standardized case: g(x|α, β, γ, δ; 0) = γ⁻¹g((x − δ)/γ|α, β; 0). In the 1-parameterization with α ≠ 1 the same relation holds; when α = 1 and β ≠ 0 it does not. The score function has an explicit form in three cases.

1. Gaussian case (α = 2, β = 0): g(x|2, 0; 0) = g(x|2, 0; 1) = x/2. (Recall S(2, 0; 0) is N(0,2), not N(0,1).)
2. Cauchy case (α = 1, β = 0): g(x|1, 0; 0) = g(x|1, 0; 1) = 2x/(1 + x²).
3. Lévy case (α = 1/2, β = 1): g(x|1/2, 1; 1) = (1 − 3x) exp(−1/x)/(2x⁵).
Fig. 4.13 Graphs of the score function g(x|α, β, 1, 0; 0) for various values of α. The top graph shows the symmetric case β = 0; the bottom shows a skewed case β = 0.5.
In other cases, the function can be computed numerically. Figure 4.13 shows numerically derived values of g(x|α, β, 1, 0; 0). Note that the score function is nonlinear, except when α = 2. For any 0 < α < 2 and β ≠ −1, the tail approximation and regularity of the density show that g(x|α, β, 1, 0; k) → (1 + α)/x as x → ∞. The left tail is similar by the reflection property, except in the totally skewed cases, where the light tail behavior is different. Note that in contrast to the linear Gaussian case, for non-Gaussian stable laws, the tails of g decay to 0 as |x| → ∞, indicating that large values of |x| are downplayed when maximum likelihood estimates of the location parameter are found. As in the Gaussian case, values near the origin don't have much effect on the estimate of δ. In the non-Gaussian stable case, moderate values of |x| influence the location parameter most. In signal processing, the score function is called the nonlinear function and is used in the locally optimal detector, see Nikias and Shao (1995).
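The closed-form cases give a convenient check on the numerical route: a centered finite difference of log f reproduces g(x) = −(d/dx) log f(x) for any density that can be evaluated. The sketch below does this for the Gaussian and Cauchy cases; the density formulas are standard, and the function names are mine.

```python
import math

def score_numeric(logpdf, x, h=1e-5):
    """g(x) = -(d/dx) log f(x), approximated by a centered finite difference."""
    return -(logpdf(x + h) - logpdf(x - h)) / (2.0 * h)

# Closed-form scores from the text:
g_gauss = lambda x: x / 2.0                     # S(2, 0; 0) is N(0, 2)
g_cauchy = lambda x: 2.0 * x / (1.0 + x * x)    # S(1, 0; 0) is standard Cauchy

# Log-densities for the same two cases:
log_n02 = lambda x: -x * x / 4.0 - 0.5 * math.log(4.0 * math.pi)
log_cauchy = lambda x: -math.log(math.pi * (1.0 + x * x))
```

For a numerically computed stable density, the same `score_numeric` routine can be applied to a log-density interpolant, which is how the curves in Figure 4.13 could be reproduced.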
4.8 Other methods of estimation This section mentions a few other estimation methods, focusing on the symmetric case.
4.8.1 Log absolute value estimation

Let X ∼ S(α, 0, γ, 0; 1) and set Z = log |X|. Lemma 3.19 with θ₀ = 0 gives

E(Z) = γ_Euler(1/α − 1) + log γ   (4.7)
Var(Z) = π²(1 + 2/α²)/12.   (4.8)

The method of log absolute values uses the above expressions to estimate the parameters α and γ. If x₁, ..., xₙ is a sample from the above symmetric stable distribution, define the transformed sample zᵢ = log |xᵢ|, i = 1, ..., n. Compute the sample mean z̄ and sample variance s_z² of the transformed values; substituting the latter in (4.8) gives the estimate

α̂ = min( (6s_z²/π² − 1/2)^{−1/2}, 2 ).

(The cutoff at 2 is done to guarantee that the result is in the interval (0, 2].) Substituting α̂ and z̄ in (4.7) gives an estimator for γ:

γ̂ = exp(z̄ − γ_Euler(1/α̂ − 1)).
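The estimators above are a few lines of code. The sketch assumes a symmetric sample centered at the origin with no zero values (so log |xᵢ| is finite); γ_Euler ≈ 0.5772 is Euler's constant, and the guard against a too-small sample variance is my addition.

```python
import math
import statistics

EULER = 0.57721566490153286   # Euler's constant gamma_Euler

def logabs_estimate(x):
    """Estimate (alpha, gamma) from the mean and variance of z_i = log|x_i|,
    using (4.7)-(4.8)."""
    z = [math.log(abs(xi)) for xi in x]
    zbar = statistics.mean(z)
    s2 = statistics.variance(z)                 # sample variance, n-1 denominator
    arg = 6.0 * s2 / math.pi ** 2 - 0.5
    # arg <= 0.25 would give alpha >= 2 (or an invalid value), so cut off at 2:
    alpha = 2.0 if arg <= 0.25 else arg ** -0.5
    gamma = math.exp(zbar - EULER * (1.0 / alpha - 1.0))
    return alpha, gamma
```

Because only the first two moments of log |X| are used, this is fast but, as Section 4.9 indicates, not especially efficient.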
4.8.2 U-statistic-based estimation

Fan (2006) used a U-statistics approach to derive estimators of α and γ for a symmetric stable distribution. Let x₁, ..., xₙ be a sample from a S(α, 0, γ, 0; 0) law. Define the two functions

h_α(x₁, x₂) = log |x₁ + x₂| − (log |x₁| + log |x₂|)/2
h_γ(x₁, x₂) = (log |x₁| + log |x₂|)/2 − γ_Euler(h_α(x₁, x₂)/log 2 − 1)
            = (1 + γ_Euler/log 2)(log |x₁| + log |x₂|)/2 − (γ_Euler/log 2) log |x₁ + x₂| + γ_Euler.
By Lemma 3.19, E log |Xᵢ| = γ_Euler(1/α − 1) + log γ. For i ≠ j, Xᵢ + Xⱼ ∼ S(α, 0, 2^{1/α}γ, 0; 0) and therefore

E log |Xᵢ + Xⱼ| = γ_Euler(1/α − 1) + log(2^{1/α}γ) = γ_Euler(1/α − 1) + (log 2)/α + log γ = E log |Xᵢ| + (1/α) log 2.

Hence for i ≠ j, E h_α(Xᵢ, Xⱼ) = (log 2)/α. Define

U_{n,α} = (2/(n(n − 1))) Σ_{1≤i<j≤n} h_α(xᵢ, xⱼ),

the sample estimate of E h_α(Xᵢ, Xⱼ), and solve for α to get the estimator

α̂ = (log 2)/U_{n,α}.

In a similar way, E h_γ(Xᵢ, Xⱼ) = log γ, so we define

U_{n,γ} = (2/(n(n − 1))) Σ_{1≤i<j≤n} h_γ(xᵢ, xⱼ).

Since U_{n,γ} is an unbiased estimator of E h_γ(Xᵢ, Xⱼ) = log γ, exponentiating yields the estimator

γ̂ = exp(U_{n,γ}).

Since this method looks at all pairs of data points, the execution time is of order n².
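A direct transcription of the estimators, assuming symmetric data with no zero values and no pair summing to exactly zero (so all logarithms are finite); function names are mine. As noted, the pairwise loop makes this O(n²).

```python
import math
from itertools import combinations

EULER = 0.57721566490153286   # Euler's constant gamma_Euler
LOG2 = math.log(2.0)

def h_alpha(x1, x2):
    return math.log(abs(x1 + x2)) - 0.5 * (math.log(abs(x1)) + math.log(abs(x2)))

def h_gamma(x1, x2):
    return 0.5 * (math.log(abs(x1)) + math.log(abs(x2))) \
        - EULER * (h_alpha(x1, x2) / LOG2 - 1.0)

def ustat_estimate(x):
    """alpha-hat = log(2)/U_alpha and gamma-hat = exp(U_gamma),
    averaging h_alpha and h_gamma over all pairs."""
    pairs = list(combinations(x, 2))
    u_alpha = sum(h_alpha(a, b) for a, b in pairs) / len(pairs)
    u_gamma = sum(h_gamma(a, b) for a, b in pairs) / len(pairs)
    return LOG2 / u_alpha, math.exp(u_gamma)
```

For large samples one would subsample pairs rather than enumerate all of them, trading a little efficiency for a large reduction in running time.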
4.8.3 Miscellaneous methods

There are other methods of estimating stable parameters. The details of those methods will not be discussed here, but some references are given for those who want to explore them. A Bayesian approach has been used by Buckle (1993), Buckle (1995), Godsill (1999), Godsill (2000), Peters et al. (2012), and Tsionas (1999). Antoniadis et al. (2006) describe a wavelet approach to estimation based on the empirical characteristic function. Garcia et al. (2011) and Lombardi and Calzolari (2005) use indirect estimation to estimate stable parameters, using skewed t laws as auxiliary models. Finally, minimum distance estimators are given in Fan (2009).
4.9 Comparisons of estimators

4.9.1 Using x̄ and s to estimate location and scale

When X ∼ S(α, β, γ, δ; 1) and α > 1, E(X) = δ, so the sample mean x̄ can be used as an estimator of δ. Some also use the sample standard deviation s as an estimator of the scale γ, even though the population variance is undefined. To assess how this works, one thousand samples of size n = 50, 100, 250, 500 were simulated from a S(1.5, 0, 1, 0; 1) distribution. For each sample, the sample mean x̄ and sample standard deviation s were computed, and simultaneously the maximum likelihood estimators α̂, β̂, γ̂, δ̂ were computed. Figure 4.14 shows boxplots comparing x̄ vs. δ̂ and s vs. γ̂. The figure shows that the values of δ̂ are much more concentrated around the correct value of δ = 0 than x̄. The bottom figure shows a similar behavior: γ̂ is much more concentrated around the correct value of γ = 1 than s. In fact, as the sample size increases, the moment estimators get worse, whereas the maximum likelihood estimators get better. These problems get worse as α gets smaller. It is clear that x̄ is a poor estimate of the location and s is a poor estimate of the scale, even when the distribution is symmetric.
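The instability of s can be seen directly in a small simulation. The sampler below is a sketch of the Chambers-Mallows-Stuck transform (simulation is covered in Section 3.3.3); the code and names are ours, not the book's:

```python
import numpy as np

def rsas(alpha, n, rng):
    """Simulate n standard symmetric alpha-stable variates (gamma = 1,
    delta = 0) via the Chambers-Mallows-Stuck transform; valid for alpha != 1."""
    theta = rng.uniform(-np.pi / 2, np.pi / 2, n)
    w = rng.exponential(1.0, n)
    return (np.sin(alpha * theta) / np.cos(theta) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * theta) / w) ** ((1.0 - alpha) / alpha))

# median of the sample standard deviation s over 100 replications: because
# the population variance is infinite when alpha = 1.5, s drifts upward
# as the sample size n grows instead of settling down
rng = np.random.default_rng(3)
med_s = {n: np.median([rsas(1.5, n, rng).std(ddof=1) for _ in range(100)])
         for n in (100, 10_000)}
```

The typical value of s grows roughly like n^{1/α − 1/2}, so for α = 1.5 the larger sample gives a noticeably larger s, in line with the boxplots in Figure 4.14.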
4.9.2 Statistical efficiency and execution time

Simulations were performed to compare the efficiency of the common estimators. The following plots and tables show the root mean square error (RMSE) of M = 1000 estimates of all four parameters α, β, γ, and δ from samples of size n = 100. The parameters α and β varied as shown, while γ = 1 and δ = 0 were fixed. In all cases, the 0-parameterization was used. Here and below, the methods are labeled with abbreviations: MLE for the maximum likelihood estimator, QUANT for the quantile-based method, ECF for the empirical characteristic function method of Kogon and Williams (1998) (without any of the refinements mentioned above), FLOM for the fractional lower order moment method, LOGABS for the log absolute value method, and USTAT for the U-statistic method. We note that occasionally some of these estimation methods fail. The FLOM, LOGABS, and USTAT methods are undefined if any xi is zero. (This is because the FLOM method computes a negative-order sample moment and the other two methods compute log |xi|.) If any xi is close to zero, the methods can be unreliable. Furthermore, these three methods assume that the population is symmetric and centered at the origin, and always return β̂ = 0 and δ̂ = 0; in these cases, the RMSE is exactly 0 for both β and δ. For all values of β and δ, Problem 4.8 shows that the RMSE for β̂ is exactly |β| and the RMSE for δ̂ is exactly |δ| for these three methods, independent of n, α and γ. For this reason, it doesn't make sense to discuss the RMSE of the FLOM, LOGABS, and USTAT methods for β and δ.
Fig. 4.14 Comparison of estimators for the location (top) and scale (bottom). In the top graph, boxplots for the sample mean x¯ and the maximum likelihood estimator of the location δˆ are shown side by side for different sample sizes n. In the bottom graph, boxplots for sample standard deviation s and maximum likelihood estimator of the scale γˆ are shown side by side. In all cases M = 1000 samples were simulated from a S (1.5, 0; 1) distribution.
In the following plots, the horizontal dashed line is the true value of the parameter, which is known from the simulation design. The boxplots show the dispersion around this value for the different parameters and different methods. We consider the cases separately. As an abbreviation, we will use A ≈ B to mean methods A and B have approximately the same RMSE and A < B to mean method A has a smaller RMSE than method B.

1. α = 0.5, β = 0 (Figure 4.15) For α and γ, MLE ≈ QUANT ≈ ECF ≈ FLOM ≈ LOGABS ≈ USTAT. For β and δ, MLE < QUANT < ECF.
2. α = 0.5, β = 0.5 (Figure 4.16) For α, MLE ≈ QUANT ≈ ECF ≈ FLOM ≈ LOGABS ≈ USTAT. For β and δ, MLE < ECF < QUANT. For γ, MLE
1.4. For an arbitrary distribution, there is no general statement that can be made about what fraction of the tail is appropriate.
4.10.1 Graphical diagnostics

Graphical diagnostics are a useful tool for selecting a model and assessing how well a model fits data. If a distribution has heavy tails, then we expect large values in a data set. While these values may be a small proportion of the data, they can be a feature of the data that we wish to model. The term outlier is frequently
method    α      β      γ      δ
MLE       0.156  0.337  0.108  0.177
QUANT     0.227  0.345  0.135  0.195
ECF       0.178  0.380  0.116  0.181
FLOM      0.313  0.500  0.177  0.000
LOGABS    0.303  0.500  0.171  0.000
USTAT     0.247  0.500  0.166  0.000
Fig. 4.22 The top plot shows the distribution of the different estimators when α = 1.5, β = 0.5, γ = 1, δ = 0, and n = 100. The bottom table gives the RMSE.
used for recording errors or "fluke" extreme values. In contrast, here we assume that these extreme values are not mistakes in data recording or gathering, but a regular, although perhaps infrequent, part of the problem. We also assume that understanding these extreme values is an important part of the analysis, so removing them from the sample is not appropriate. Indeed, in some cases these extreme values may be the most important part of the data; examples are large losses in a financial market, large claims against an insurance company, maximum wind speed in a hurricane, minimum temperature, etc. The commonly used graphical diagnostics (empirical density, empirical cumulative distribution function (ecdf), and Q-Q plots) generally perform poorly with heavy tailed data; see Figure 4.25. An empirical density plot (or histogram) should
method    α      β      γ      δ
MLE       0.137  0.058  0.094  0.183
QUANT     0.269  0.380  0.164  0.282
ECF       0.193  0.261  0.116  0.194
FLOM      0.327  1.000  0.189  0.000
LOGABS    0.317  1.000  0.183  0.000
USTAT     0.256  1.000  0.167  0.000
Fig. 4.23 The top plot shows the distribution of the different estimators when α = 1.5, β = 1, γ = 1, δ = 0, and n = 100. The bottom table gives the RMSE.
be done to examine the center of the data, e.g. to look for more than one mode or for gaps in the support, which rule out stability. It may be necessary to trim the extremes of the data to get a useful value of the smoothing parameter, especially if the tails are really heavy. But empirical density plots behave erratically on the tails, where there are few points that are widely scattered, resulting in isolated bumps in a kernel smoothed density; see the upper right plot in Figure 4.25. Comparing such an estimator to a model for the data is meaningless on the tails. Standard ecdf plots focus on the center of the distribution, with the tails monotonically approaching 0 on the left and 1 on the right. For heavy tailed data, a standard ecdf plot looks like a step function, with most of the data visually compressed in a small interval with a steep rise. It is hard to assess tail behavior or compare to a model with such a plot. In the presence of heavy tails, Q-Q plots are visually dominated
method    α      β      γ      δ
MLE       0.054  0.642  0.074  0.148
QUANT     0.186  0.430  0.120  0.188
ECF       0.034  0.966  0.076  0.182
FLOM      0.294  0.000  0.130  0.000
LOGABS    0.294  0.000  0.129  0.000
USTAT     0.219  0.000  0.161  0.000
Fig. 4.24 The top plot shows the distribution of the different estimators when α = 2, β = 0, γ = 1, δ = 0, and n = 100. The bottom table gives the RMSE.
by the extreme values, with most of the data concentrated in a compressed central region. Furthermore, the inherent variability of the extreme values makes for large deviations away from the diagonal line even when the data is being compared to the true model. Unlike the familiar light tailed case, in the heavy tailed case, large sample sizes exacerbate these problems, because then there are likely to be more extreme values. These issues are compounded in the multivariate case, where heavy tails and directional dependence make it hard to explore multivariate data. Finally, P-P plots (not shown) can be used, but by construction, these plots are squeezed into the lower left and upper right corners of the plot region and tail behavior is hard to distinguish.
Fig. 4.25 Standard diagnostics with a data set of n = 10,000 simulated Cauchy random variates. The range for this data was from −7525.765 to 18578.677. The upper left plot shows a kernel density estimator using the default Gaussian kernel and default bandwidth in the R function density, clipped to the region (−50, 50). The upper right plot shows the same kernel density estimator on the interval (50, 100). The lower left shows the ecdf. The lower right plot shows a Q-Q plot of this simulated data vs. the exact equally spaced quantiles of a Cauchy distribution.
Next, a nonparametric graphical diagnostic is described, based on a nonlinear transform of the empirical distribution function, where both axes are logarithmically scaled at the extremes. In Nolan (2019a) this is called an ecdfHT plot—an acronym for empirical cumulative distribution function (ecdf) for heavy tails (HT). It is assumed that a large data set is available; if not, conclusions about heavy tails are likely to be unreliable. This proposed plot has the advantage that power law behavior on the tails will appear as a straight line, regardless of the value of the tail exponents. The scaling is determined by quantiles of the data and does not depend on any parametric model for the data. The ecdfHT package to draw these diagnostic plots is available on CRAN1, the open source R software repository. Such plots may be useful for non-stable distributions, e.g. extreme value data. Given a data set x1, . . . , xn, let x1:n, . . . , xn:n be the sorted values and let pi = (i − 1/2)/n be the empirical cdf at xi:n. (We subtract the 1/2 in this definition as a simple way to avoid p values of 0 and 1 at the extremes. If there are repeats, the package handles them correctly by tallying the number of repeats at a particular x and having a jump of the appropriate size at each repeat.) The standard ecdf plot shows the pairs (xi:n, pi), i = 1, . . . , n, as in the lower left plot in Figure 4.25. See D'Agostino and Stephens (1986) for more information. Pick three values 0 ≤ q1 ≤ q2 ≤ q3 ≤ 1 (called "scale quantiles") and define the corresponding data quantiles ti = F̂^{−1}(qi). Use these values to define the functions (see Figure 4.26):

1 See https://CRAN.R-project.org/package=ecdfHT
Fig. 4.26 The functions h(x |t1 = −1, t2 = 0, t3 = 1) and g(p |q1 = 1/4, q2 = 1/2, q3 = 3/4). The dashed lines mark the cut points in the function definitions.
h0(x) = −1 − log(−x),   x < −1
      = x,              −1 ≤ x ≤ 1
      = 1 + log(x),     x > 1

h(x) = h(x | t1, t2, t3) = h0((x − t2)/(t2 − t1)),   x < t2
                         = h0((x − t2)/(t3 − t2)),   x ≥ t2                    (4.9)

g(p) = g(p | q1, q2, q3) = (q1 − q2) + q1 log(p/q1),                  p < q1
                         = p − q2,                                    q1 ≤ p ≤ q3
                         = (q3 − q2) − (1 − q3) log((1 − p)/(1 − q3)),  p > q3.
Fig. 4.27 Basic ecdfHT plot with simulated Cauchy data, n = 10,000.
An ecdfHT plot graphs the transformed pairs (h(xi:n), g(pi)), i = 1, . . . , n. This plot is somewhat like a two-sided complementary cdf plot on a log-log scale, but orients things differently on the lower tail to show the ecdf, and it gives a direct view of what happens in the mid-range. Figure 4.27 shows a basic ecdfHT plot with simulated Cauchy data. The linear behavior on the tails is characteristic of a heavy tailed data set. Note that the coefficients are chosen so that both functions are continuous, monotonically increasing, linear in the middle, and with logarithmic scales on the outer intervals. If the data is symmetric and q1 = 1 − q3, the two line segments in the middle of h have the same slope. The h(·) function pulls in extreme x values and the g(·) function spreads out values near the endpoints p = 0 and p = 1. This makes it possible to see both the behavior in the middle and the tail behavior on one plot. For symmetric data, it makes sense to use q2 = 1/2 and q1 = 1 − q3, with, e.g. q1 = 1/4. For one-sided data that has a finite left endpoint, it makes sense to use 0 = q1 = q2 < q3, say q3 = 3/4. Likewise, use 0 < q1 < q2 = q3 = 1 for data with a finite right endpoint. Note that in these one-sided cases, the functions h and g are not defined on the truncated side, but in those regions the functions are not needed because no data values fall there. In addition to just displaying a data set, it is useful to be able to add more annotations and to compare it to one or more models to select an appropriate model for the data. Since neither axis is linear, it may be useful to label specific points on the axes. Figure 4.28 shows more labels on the vertical axis, manually chosen labels on the horizontal axis, and grid lines at each of the tick marks. One can also compare the data to one (or more) models by adding the exact quantiles of a model to the basic plot. (This may require estimating parameters for the model from the data.)
Figure 4.28 adds comparison curves of the exact cdf for a Cauchy law and a Gaussian fit. Pointwise confidence bounds can also be drawn for a model; these are shown as dotted lines of the same color around each model in the figure. These bounds are
Fig. 4.28 Univariate plot with more annotations and comparison of two models to the simulated Cauchy data, with n = 10,000. The solid red curve corresponds to a Cauchy cdf, the green curve to a Gaussian cdf. The dashed lines show pointwise 95% confidence bounds for the two models.
computed by using the standard confidence interval for a binomial parameter and transforming by the function g(p). Note how the plot clearly shows non-Gaussian behavior over virtually the entire range: the data is leptokurtic with more values near the origin and much heavier tails than a Gaussian model, even though the estimated variance for the fitted Gaussian model is inflated by the extreme values in the data.
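The transforms in (4.9) and the transformed confidence bounds just described can be sketched as follows; this code is ours (the CRAN ecdfHT package is the reference implementation in R, and also handles ties and plotting):

```python
import numpy as np

def h0(x):
    """Base transform of (4.9): identity on [-1, 1], logarithmic outside."""
    x = np.asarray(x, dtype=float)
    return np.where(x < -1, -1.0 - np.log(-np.clip(x, None, -1.0)),
                    np.where(x > 1, 1.0 + np.log(np.clip(x, 1.0, None)), x))

def h(x, t1, t2, t3):
    """Horizontal transform, piecewise around the middle data quantile t2."""
    x = np.asarray(x, dtype=float)
    return np.where(x < t2, h0((x - t2) / (t2 - t1)), h0((x - t2) / (t3 - t2)))

def g(p, q1, q2, q3):
    """Vertical transform: linear on [q1, q3], logarithmic in the tails."""
    p = np.asarray(p, dtype=float)
    lo = (q1 - q2) + q1 * np.log(np.clip(p, 1e-300, None) / q1)
    hi = (q3 - q2) - (1.0 - q3) * np.log(np.clip(1.0 - p, 1e-300, None) / (1.0 - q3))
    return np.where(p < q1, lo, np.where(p > q3, hi, p - q2))

def ecdfht_points(x, q=(0.25, 0.5, 0.75)):
    """Transformed pairs (h(x_{i:n}), g(p_i)) of an ecdfHT-style plot."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    p = (np.arange(1, n + 1) - 0.5) / n          # p_i = (i - 1/2)/n
    t1, t2, t3 = np.quantile(xs, q)              # data quantiles t_i
    return h(xs, t1, t2, t3), g(p, *q)

def g_confidence_band(p, n, q, z=1.96):
    """Pointwise binomial confidence bounds for the cdf, pushed through g."""
    se = z * np.sqrt(p * (1.0 - p) / n)
    return g(np.clip(p - se, 0.0, 1.0), *q), g(np.clip(p + se, 0.0, 1.0), *q)

rng = np.random.default_rng(4)
hx, gp = ecdfht_points(rng.standard_cauchy(1000))
```

Both transforms are continuous and monotone, so the transformed pairs are monotone as well; plotting hx against gp reproduces the straight-line tail behavior seen in Figure 4.27.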
4.10.2 Likelihood ratio tests and goodness-of-fit tests

If a data set X = (X1, . . . , Xn) comes from the stable family, then a likelihood ratio test (LRT) can be used to examine whether the data is from a specified subset of stable laws. The most common case is to test H0: normal distribution vs. H1: a non-normal stable distribution using the test statistic

λ(X) = L(θ̂NORM | X) / L(θ̂ML | X),

where L(θ | X) = ∏_{j=1}^{n} f(Xj | θ) is the likelihood of θ = (α, β, γ, δ) given the data X, θ̂NORM = (2, 0, s/√2, x̄) is the unrestricted maximum likelihood normal fit to the data, and θ̂ML is the maximum likelihood estimate of the stable parameters. The standard theory for LRTs shows that −2 log λ(X) is asymptotically χ² with 2 d.f. The power of the LRT or other goodness-of-fit method depends on α: for α very close to 2, it is difficult to distinguish between normality and α-stable. However, as α decreases, it gets easier and easier to reject a null hypothesis of normality. Figure 4.29 shows a simulation based evaluation of the power of the LRT to reject normality. That evaluation looped through a range of α values and sample sizes n;
Fig. 4.29 Empirical power to detect non-normality using the likelihood ratio test when the sample is simulated from a symmetric stable law with varying α and sample size n.
for each pair (α, n), M = 1000 data sets of size n were simulated and the LRT score was evaluated. If that score exceeded a χ² critical value with type I error 0.05, the null hypothesis of normality was rejected. The plot shows the symmetric case, but other simulations with β = 0.5 and 1 show essentially the same plot, so the LRT works just as well in the nonsymmetric case. Using the LRT for other purposes is straightforward. Two possibilities are to test for a symmetric stable law (A = {β = 0}) or a one-sided stable law (A = {α < 1, β = 1}). In such cases, replace θ̂NORM in the ratio λ(X) above with θ̂A, the ML estimate of the parameters when the search is restricted to the appropriate subset A of the parameter space. It is possible to use more general goodness-of-fit tests to test for non-normality. This was explored through simulation using the following methods: Kolmogorov-Smirnov, Shapiro-Wilk, Cramer-von Mises, Anderson-Darling, Lilliefors, and Jarque-Bera. The results showed that the Kolmogorov-Smirnov test had noticeably lower power than the stable LRT; see Figure 4.30. However, the Shapiro-Wilk and the other tests had approximately the same power as the stable LRT; compare Figure 4.29 and Figure 4.31. Other simulations with β = 0.5 and 1 showed very similar power curves as in the symmetric figures shown here. Since the Shapiro-Wilk test is easily accessible (function shapiro.test in base R) and it is fast, we suggest it as a test for non-normality in a stable data set. In general, it appears to be challenging to find an omnibus test for stability. A standard χ² goodness-of-fit test bins the data and loses information about the tail behavior, making it hard to distinguish stable from non-stable data. Tests based on the empirical cumulative distribution function (ecdf) have more promise, as they take into account all values of the data, including tail values. Zhang (2018) did
Fig. 4.30 Empirical power to detect non-normality using the Kolmogorov-Smirnov test.
Fig. 4.31 Empirical power to detect non-normality using the Shapiro-Wilk test.
a small comparison study of the Kolmogorov-Smirnov, Kuiper, Cramer-von Mises, and Anderson-Darling goodness-of-fit tests. In addition, that work compared two methods based on the empirical characteristic function φ̂(·). The first was a modified "shape" test of Csörgő (1987) based on the empirical characteristic function. Let 0 < u1 < u2 < · · · < uk be a grid where the empirical characteristic function will be evaluated. (See Section 4.5 for discussion of the choice of the grid, including u0 below.) Csörgő's method picks a u0 > uk and defines multiple estimators of the stable index: for j = 1, . . . , k
α̂j = ( log |log |φ̂(uj)|| − log |log |φ̂(u0)|| ) / log(uj/u0).
The modified shape statistic is Tn = n^{1/2} (max_j α̂j − min_j α̂j). (The original shape method of Csörgő searched for the locations u_max (respectively u_min) where α̂ is maximized (respectively minimized).) The second proposed ECF method was a simple squared distance test statistic

D² = Σ_{j=1}^{k} |φ(uj | θ̂) − φ̂(uj)|²,

where φ(· | θ̂) is the exact characteristic function of a stable law with ML estimated parameters θ̂ = (α̂, β̂, γ̂, δ̂). Since all these tests require the estimation of parameters, the bootstrap method was used to determine critical values. The limited simulations in the above work showed that the empirical distribution function tests and the empirical characteristic function tests are able to reliably detect large deviations from stability, e.g. a uniform distribution or clear bimodality in the data. However, when applied to simulated samples from a t-distribution with 2 to 10 d.f. and n = 1000, the only test with observed power above 0.05 was the ECF shape test, where observed power was in the range 0.2 (for 2 d.f.) to 0.39 (for 10 d.f.). Since the t-distributions are unimodal and have heavy tails, it is a challenging problem to distinguish such data sets from a stable law. At the current time, we are unaware of any powerful tests for making this discrimination.
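The ECF shape estimates are easy to compute; the sketch below is ours, checked against a hypothetical standard Cauchy sample, whose exact characteristic function is exp(−|u|), so every α̂j should be near 1:

```python
import numpy as np

def ecf(x, u):
    """Empirical characteristic function phi_hat(u) = mean(exp(i*u*x))."""
    return np.exp(1j * np.outer(u, x)).mean(axis=1)

def shape_alphas(x, u_grid, u0):
    """Csorgo-style index estimates (a sketch of the formula above):
    alpha_j = (log|log|phi(u_j)|| - log|log|phi(u0)||) / log(u_j/u0)."""
    phi = np.abs(ecf(x, np.append(u_grid, u0)))
    ll = np.log(np.abs(np.log(phi)))
    return (ll[:-1] - ll[-1]) / np.log(np.asarray(u_grid) / u0)

rng = np.random.default_rng(6)
x = rng.standard_cauchy(50_000)
alphas = shape_alphas(x, np.linspace(0.1, 1.0, 10), u0=2.0)
# modified shape statistic T_n = sqrt(n) * (max alpha_j - min alpha_j)
Tn = np.sqrt(len(x)) * (alphas.max() - alphas.min())
```

In practice the critical value for Tn would come from a bootstrap, as described above; the grid here is an arbitrary illustrative choice, not the one from Section 4.5.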
4.11 Applications

Simulated stable data set

A stable data set with α = 1.3, β = 0.5, γ = 5, δ = 10, and n = 1000 values was generated using the method of Section 3.3.3. The maximum likelihood estimates with 95% confidence intervals are α̂ = 1.284 ± 0.092, β̂ = 0.466 ± 0.130, γ̂ = 5.111 ± 0.369, δ̂ = 10.338 ± 0.537. Figure 4.32 shows an ecdfHT plot.

Acoustic noise

In tropical waters, some species of shrimp (families Alphaeus and Synalpheus) produce loud popping noises by snapping their large claw. When there are many shrimp in an area, they can create a significant amount of acoustic noise, making it difficult for underwater communication systems to operate and for sonar to detect objects underwater. The noise can be impulsive, with spiky behavior that causes
Fig. 4.32 ecdfHT plot of simulated stable data with n = 1000. The circles are data values, the solid line shows the quantiles of a stable distribution with the estimated parameters, and the dotted lines are the pointwise 95% confidence intervals for the quantiles.
standard linear filters to perform poorly. This section describes the analysis of one underwater noise sample collected in Singapore waters.2 The data file contains 10,000 samples from a signal that is bandpass filtered between 45 kHz and 100 kHz and sampled at 250 kHz. The maximum likelihood estimated parameters, with 95% confidence intervals, are α̂ = 1.756 ± 0.0284, β̂ = 0.000 ± 0.0998, γ̂ = 0.001886 ± 0.0000344, δ̂ = 0.0000 ± 0.0000648 (the 0-parameterization is used throughout). For all practical purposes, the skewness parameter β and the location parameter δ are zero, so the data can be modeled by a symmetric stable distribution. Figure 4.33 shows the data and diagnostic plots. The top plot shows the original data—note the spikes in the data. The bottom plot shows a comparison to a stable and a normal model for the data. The comparison curves show that a Gaussian model poorly describes the data, missing about 5% of both tails. A stable distribution with α = 1.756 and β = 0 does a very good job of modeling the data, following the data all the way out to the extremes.

CRSP stock prices

McCulloch (1997) analyzed 40 years (January 1953 to December 1992) of monthly stock price data from the Center for Research in Security Prices (CRSP). The data set

2 This data was kindly provided by Prof. Mandar Chitre of the Acoustic Research Laboratory at the National University of Singapore. See Chitre et al. (2006) for some of their work.
Fig. 4.33 Underwater acoustic data from south Asia. The top plot shows the raw data, the bottom plot shows the ecdfHT plot with scale quantiles=(0.25, 0.50, 0.75) and comparison to two models. The red curve is a stable model with α = 1.756, β = 0, γ = 0.001886, δ = 0 in the 0-parameterization; the green curve is for a N(μ = 3.3838 × 10−7, σ 2 = 1.6174 × 10−5 ) model.
Fig. 4.34 A diagnostic plot for the CRSP stock price data, n = 480.
consists of 480 values of the CRSP value-weighted stock index, including dividends, and adjusted for inflation. The quantile estimates using McCulloch's method were α̂ = 1.965, β̂ = −1, γ̂ = 2.755 and δ̂ = 0.896. McCulloch used ML with an approximation for symmetric stable distributions to fit this data and obtained α̂ = 1.845, β̂ = 0, γ̂ = 2.712 and δ̂ = 0.673. Our ML estimates with 95% confidence intervals are α̂ = 1.855 ± 0.110, β̂ = −0.558 ± 0.615, γ̂ = 2.711 ± 0.213 and δ̂0 = 0.871 ± 0.424. The diagnostics in Figure 4.34 show a close fit to the full ML fit. We note that the confidence interval for α̂ is close to the upper bound of 2 for α and the one for β̂ is large and extends beyond the lower bound of −1, so the asymptotic normality of these parameters has not been achieved and the naive confidence intervals cannot be strictly believed. Still, it is very likely that α is less than 2, while the bounds on β are not very important for α this large.

Radar noise

This is a very large data set with n = 320,000 pairs of data points. The two values correspond to the in-phase and quadrature components of sea clutter radar noise. The parameter estimates for the in-phase component are α̂ = 1.7783 ± 0.0049, β̂ = 3.671 × 10−9 ± 0.0186, γ̂ = 0.4081 ± 0.00129 and δ̂ = −0.000381 ± 0.002473. The quadrature component has very similar parameter estimates. With this large sample size, the confidence intervals for the ML parameter estimates are very small. Again, the correct question is not how tight the parameter estimates are, but whether or not the fit accurately describes the data. The ecdfHT plot in Figure 4.35 shows a close stable fit. Figure 4.36 shows the same plot with an expanded vertical scale.
Fig. 4.35 ecdfHT plot for the in-phase component of sea clutter radar noise, n = 320000. The green line is the stable fit, the red is a normal fit.
Bond data

This application uses a data set3 of returns of an index based on AAA corporate bonds, 5–7 year term, for the period 31 December 1997 to 10 November 2009. It consists of daily returns for this period, with a total of 3101 values. The maximum likelihood estimates of the stable parameters are α̂ = 1.69, β̂ = −0.115, γ̂ = 0.192, and δ̂ = 0.0363. Figure 4.37 shows the diagnostic plot, with a comparison to the stable fit and a normal fit. Here we see that the stable model does a good job of describing the bulk of the distribution, say from the 0.01 quantile to the 0.99 quantile. The extreme tails of the data do not seem to be as extreme as this fit predicts. There may

3 The data is used with permission from ICE Data Indices, LLC.
Fig. 4.36 Same plot as above, but with expanded vertical limits to see more of the Gaussian tail in red.
be multiple reasons for this. One is that there is some serial correlation in the data and burstiness—periods where the volatility is high. (It is possible to use a GARCH model to deal with the burstiness, but that will not be pursued here. Nolan (2014) shows that fitting the residuals after a GARCH fit results in a higher value of α, i.e. a model with lighter tails, though the observed tail is still lighter than this revised fit.) Figure 4.38 shows a close-up view of the same data, focusing on the lower left part of the graph, i.e. looking at losses. (A similar analysis and conclusion holds for the upper tail.) The vertical scale was extended to show more of the normal model fit to the data. Here the difference between the stable fit and the data does not look so bad. Notice that the normal model, which is frequently used in practice, is 11 orders of magnitude different from the observed data! The Gaussian model grossly underestimates the risks of extreme losses, which occurred multiple times during
Fig. 4.37 Graphical diagnostic for the corporate bond data. The circles show the data, the black curve shows the stable fit, and the red curve shows a normal fit.
this 12 year time period. One can use other models for this data, e.g. a Student t-distribution with 5 d.f., but then you lose the convolution property: cumulative returns are no longer of the same type as the individual returns. Using a model that overestimates the tails might be preferable to one that underestimates them for risk-averse investors. In particular, financial regulators might favor such a conservative approach.

Simulated non-stable data

We simulated several data sets that were not stable and used our diagnostics to assess the fit with a stable model. Since the data is not stable, we shouldn't expect a good fit to the data; rather our focus is on whether or not we can detect the non-stability in a data set. The first example is a location mixture—a 50%–50% mix of two α = 1.5 stable distributions with different location parameters δ = ±3. This data set is clearly bimodal, which shows up in a histogram or density plot. It also shows up in the diagnostic plot in Figure 4.39, where the data (circles) are not following the stable fit (solid curve).
Fig. 4.38 Close up of the lower left portion of Figure 4.37.
Fig. 4.39 Diagnostic plot for a simulated location mixture with two terms n1 = 250 observations from a S (1.5, 0, 1, −3; 0) law and n2 = 250 from a S (1.5, 0, 1, +3; 0) law. The diagnostic shows the sample quantiles differ from the maximum likelihood fit of α = 1.93, β = 0.485, γ = 2.56, and δ = 0.075.
A second simulation is an α mix of two stable laws. The data set has 900 values from a normal distribution and 100 values from an α = 1.5 distribution. Figure 4.40 shows an ecdfHT plot of one such sample. Not surprisingly, the MLE estimates of the parameters give an intermediate value of α = 1.781. This fit does a good job of describing the data over most of the range but underestimates the extreme tails, which come from the 1.5-stable term.
Fig. 4.40 Diagnostic plot for a simulated α mixture with two terms n1 = 900 observations from a Gaussian (S (2, 0; 0)) law and n2 = 100 from a S (1.5, 0; 0) law. The diagnostic shows the sample quantiles differ from the maximum likelihood fit of α = 1.781, β = 0.089, γ = 1.107, and δ = 1.014.
4.12 Estimation when in the domain of attraction

When X is in the domain of attraction of a stable law, it may be of interest to estimate the parameters of the limiting stable law. The methods in Sections 4.2 and 4.3 only estimate the tail index α. The methods in Sections 4.4–4.8 should not be used directly because they all use the whole distribution. Recall that the parameters of the limiting stable law depend only on the tail behavior, so anything can happen for small and moderate values of X, including gaps in the support, a discrete range, or multimodality. Such behavior will make the previous estimation methods inappropriate. Because of this arbitrary behavior over most of the range, it is unlikely that there are reliable methods in such general cases. Nonetheless, some estimation method may still be of use when there is some regularity in the distribution. Assume that we have an i.i.d. sample X1, . . . , Xn, each a copy of X. We describe two methods—one uses sums of sample values and the other uses the tail behavior. The first method uses the fact that sums of copies of X get closer and closer to a stable law. So pick some integer k ≥ 2 and take sums of k terms, say

Y1 = X1 + · · · + Xk
Y2 = Xk+1 + · · · + X2k
. . .
Ym = X(m−1)k+1 + · · · + Xmk,
where m = n/k. We now take this i.i.d. data set Y1, . . . , Ym and use any of the previous methods to estimate the parameters (α̂Y, β̂Y, γ̂Y, δ̂Y). Then the parameters of the limiting stable law are

$$\hat\alpha = \hat\alpha_Y, \qquad \hat\beta = \hat\beta_Y, \qquad \hat\gamma = \hat\gamma_Y\,k^{-1/\hat\alpha},$$

$$\hat\delta = \begin{cases} \bigl(\hat\delta_Y - \hat\gamma\,\hat\beta\tan(\pi\hat\alpha/2)\,(k^{1/\hat\alpha} - k)\bigr)/k, & \hat\alpha \neq 1,\\ \bigl(\hat\delta_Y - \hat\gamma\,\hat\beta\,(2/\pi)\,k\log k\bigr)/k, & \hat\alpha = 1, \end{cases} \tag{4.10}$$
see Problem 4.12. The practical problem is how to choose the number of terms k in each sum above. Picking a large k will make Y1, . . . , Ym closer to the stable limit. But this makes m, the size of the Yj data set, small and makes the parameter estimates more variable. One suggestion is to use the graphical diagnostics described below to see if this sample is close to stable.

There are variations on this approach. First, summing non-consecutive values may help if the sample has serial dependence. If it is known that the limiting law is symmetric stable, one can take alternating signs, e.g. Y1 = X1 − X2 + X3 − X4, etc., when k = 4. These terms are symmetric around 0, so we only need to estimate the two parameters α and γ.

The second method uses the tails of the data to estimate α, β and γ using the R package ecdfHT, see Section 4.10.1. Then the median is used to estimate δ. We will assume |β| < 1, so that both tails of the limiting stable law have the same asymptotic power law decay. First, choose some thresholds for the lower tail and upper tail quantiles, say p0 = 0.10 to use the leftmost 10% of the data to estimate the lower tail and the rightmost 10% to estimate the upper tail. The steps are:

1. Use function ecdfHT(x, c(p0, 1/2, 1 − p0)) to do a preliminary plot of the data. Check that the tails are approximately linear on the ecdfHT plot.
2. Use function ecdfHT.fit to estimate tail behavior for the cdf: F(x) ∼ c1/|x|^α1 on the left and 1 − F(x) ∼ c2/x^α2 on the right. Examine the two estimated tail indices; if they are noticeably different, this implies that the two tails have different decay rates and the data cannot be in the domain of attraction of a stable law with |β| < 1.
3. Average the left and right tail estimates from step 2 to get an estimate α̂ of the index of stability.
4. Using the estimated tail constants, let β̂ = (c2 − c1)/(c2 + c1) be an estimate of the skewness.
5. Estimate the scale with γ̂ = [ ( c1/(cα(1 − β̂)) + c2/(cα(1 + β̂)) )/2 ]^(1/α̂), where cα = sin(πα̂/2)Γ(α̂)/π.
6. Finally, estimate the location parameter by δ̂ = median(X1, . . . , Xn) − γ̂ median(α̂, β̂), where median(α, β) is the median of an S(α, β; 0) r.v.

Problem 4.13 justifies the steps above. A difficulty with this approach is the familiar problem that there is no general way to say when the power law tail begins, e.g. what value of p0 to choose. If
the sample is actually from a stable law, then, as we saw in Section 3.6, unless α is small, one has to go far out on the tails to see the asymptotic tail behavior, leaving a small effective sample. While it is possible that the data may have tails that show the power law decay better than the limiting stable law, in general we do not know where to measure the tail. Also, the estimate of the location δ̂ is likely to be unreliable if the distribution of X is poorly behaved around median(X1, . . . , Xn). Finally, Problem 4.14 examines the cases where β = ±1.
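The back-out step in (4.10) for the first (block-sum) method is a small deterministic function. The following sketch assumes the 0-parameterization summation rules from (1.8); the function name is illustrative:

```python
import math

def limit_stable_params(alpha_Y, beta_Y, gamma_Y, delta_Y, k):
    # Invert the k-fold summation rules (0-parameterization) to recover
    # the parameters of the limiting stable law from a fit to the sums.
    alpha, beta = alpha_Y, beta_Y
    gamma = gamma_Y * k ** (-1.0 / alpha)
    if alpha == 1.0:
        delta = (delta_Y - gamma * beta * (2.0 / math.pi) * k * math.log(k)) / k
    else:
        delta = (delta_Y
                 - gamma * beta * math.tan(math.pi * alpha / 2.0)
                 * (k ** (1.0 / alpha) - k)) / k
    return alpha, beta, gamma, delta
```

Any of the earlier estimators can supply (α̂Y, β̂Y, γ̂Y, δ̂Y) from the block sums; the function only performs the adjustment in (4.10).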
4.13 Fitting stable distributions to concentration data

This section is motivated by a problem in geology, described in Meerschaert et al. (2002). To measure how fluid flows underground, geologists drill a series of holes in the ground, insert a pipe at each location x1, . . . , xn and put a cap on the pipe, making a sequence of wells. For simplicity here, we will assume the holes are in a straight line. At one of the wells, a tracer is injected into the ground. After a certain time t, a sample of groundwater is taken from each well and the concentration of tracer yi at well location xi is measured. Figure 4.41 illustrates the experiment. The objective is to find a function f(x) that describes the data, e.g. so that yi ≈ f(xi). When a toxic pollutant is flowing underground, using an accurate model is important to predict how long it will take to reach a point, say an aquifer. Such data arises in other situations, e.g. measuring the concentration of a chemical in a chromatography experiment. Note that this is not a standard estimation setup where we have an i.i.d. sample of observations. Instead, we are measuring a concentration and fitting that with a scaled density.

In the traditional model for this situation, the tracer is diffusing as a Brownian motion and the concentration values follow a scaled normal distribution. In practice, there is sometimes super-diffusion: some of the tracer is moving much faster than a normal distribution allows. This gives heavy tails for the concentration curve, motivating the use of stable curves to fit this data. We now describe an approach to this problem given in Rishmawi (2005). It provides a plausible physical model for this experiment and a way to analyze such data. Molecules of the tracer solution diffuse through the ground on a microscopic scale. In addition, there may be a drift, where everything in the medium moves in a particular direction. We will assume a constant drift velocity.
The concentration in a sample is the number of molecules of the tracer present divided by the volume of the sample. We assume all sample volumes are the same and model this system as an urn model: a given molecule of the tracer is in urn/well j at time t with probability pj. Let pn+1 be the probability of a molecule not being in any of the wells. Since the volume of each well is small, we expect to have pj

P(X_{j:n} > t) ∼ c2 t^{−(n−j+1)α}. The densities satisfy f_{X_{j:n}}(−t) ∼ jα c1 t^{−jα−1} and f_{X_{j:n}}(t) ∼ (n−j+1)α c2 t^{−(n−j+1)α−1}. Also, the average of the order statistic densities is the density of the terms: (1/n) Σ_{j=1}^n f_{X_{j:n}}(x) = f(x). These facts are from Rimmer (2014).

Problem 4.2 Show that if X(j) is the j-th order statistic from a sample of size n from an α-stable distribution, then E|X(j)|^m is finite if m/α < j < n − 1 − m/α. When |β| < 1, this condition is necessary and sufficient for the m-th moment to be finite. See Mohammadi and Mohammadpour (2011).

Problem 4.3 Show that in Section 4.3, Var(Mj) = Var(mj) = π²/(6α²).

Problem 4.4 Let F be an arbitrary cdf, φ(·) the associated characteristic function, Fn the empirical cdf for an i.i.d. sample from F, and φn(·) the empirical characteristic function. (a) Show that for any fixed u and large n, φn(u) is approximately normal with mean φ(u) and variance (1 − |φ(u)|²)/n. (b) Show that for fixed u, v, Cov(φn(u), φn(v)) = (φ(u − v) − φ(u)φ(−v))/n. An interesting geometric interpretation of the empirical characteristic function is given by Epps (1993). The uniform convergence of φn(·) to φ(·) is more involved. In particular, under mild regularity conditions, √n(φn(u) − φ(u)) converges weakly to a continuous Gaussian process on any compact set |u| ≤ K. More information about this convergence can be found in Csorgo (1981) and Marcus (1981).

Problem 4.5 Show that when 0 < p < min(1, α) and c(α, p) is given by (4.5), the product c(α, p)c(α, −p) is strictly monotonic for p < α < 2.
Further, show that for small p, the product is relatively flat for α > 1.

Problem 4.6 (Open question) Can one use Theorem 3.8 to estimate all four parameters of a stable law in the non-strictly stable case?

Problem 4.7 Show that the log likelihood in (4.6) tends to ∞ as α → 0.

Problem 4.8 Consider the FLOM, LOGABS, or USTAT methods of estimating parameters. When β ≠ 0, the RMSE for β̂ is exactly |β|, and when δ ≠ 0, the RMSE for δ̂ is exactly |δ|.

Problem 4.9 Simulate n = 500 data values from a S(1.5, 0; 0) distribution. Fit the parameters and use standard p-p and q-q plots, as well as stabilized p-p and q-q plots. Repeat with α = 1.0, 0.5. Discuss the results.

Problem 4.10 Simulate n = 500 data values from some non-stable distribution. Fit it with a stable distribution, do the diagnostics, and discuss.
Problem 4.11 Find reasonable value(s) of the smoothing parameter for the sample pdf when the data is S(α, β, γ, δ; 0) with α < 1.5.

Problem 4.12 Use (1.8) to justify the estimators in (4.10).

Problem 4.13 Use Theorem 3.12 to explain the tail based algorithm on page 214 for estimating the parameters of the limiting stable law for X in the domain of attraction.

Problem 4.14 Adapt the tail based algorithm on page 214 to handle the cases β = ±1.

The program stable is available at the website given at the end of the preface. It will compute stable densities, d.f., quantiles, simulate stable variates, and estimate parameters. Also at that website are directions on how to use that program and data sets for the following problems.

Problem 4.15 Use the stable program to estimate the parameters (α, β, γ, δ) for the data in the file Estimate1.dat.

Problem 4.16 Use the stable program to estimate the parameters (α, β, γ, δ) for the data in the file Estimate2.dat.

Problem 4.17 Use the stable program to estimate the parameters (α, β, γ, δ) for the data in the file Estimate3.dat.
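The empirical characteristic function of Problem 4.4 is straightforward to compute directly. A quick numerical check with a Cauchy (S(1, 0; 0)) sample, using NumPy and illustrative names:

```python
import numpy as np

def ecf(u, x):
    # Empirical characteristic function: phi_n(u) = (1/n) sum_j exp(i u X_j)
    u = np.atleast_1d(u)
    return np.exp(1j * np.outer(u, x)).mean(axis=1)

rng = np.random.default_rng(0)
x = rng.standard_cauchy(20_000)        # S(1, 0; 0) sample
u = np.array([0.5, 1.0, 2.0])
phi_true = np.exp(-np.abs(u))          # Cauchy characteristic function exp(-|u|)
err = np.abs(ecf(u, x) - phi_true)
```

For each u, the error is on the order of sqrt((1 − |φ(u)|²)/n), as in Problem 4.4(a).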
Chapter 5
Stable Regression
Ordinary least squares (OLS) is a well-established and important procedure for solving regression problems. In the case of regression with normally distributed errors, the OLS solution is the same as the maximum likelihood solution. Although small departures from normality do not affect the model greatly, errors from a heavy tailed distribution will generally result in extreme observations that can greatly affect the estimated OLS regression coefficients. In this chapter, we are interested in regression when the error terms are from a heavy tailed stable law. First linear regression is developed, then different ways of giving confidence intervals for the parameters. A simulated and a financial data set are analyzed. Then nonlinear regression is developed, followed again by confidence interval construction and finally a simulated exponential growth example.

The standard linear regression model is

$$y_i = \sum_{j=1}^{k} x_{i,j}\,\theta_j + \epsilon_i, \qquad i = 1, \ldots, n, \tag{5.1}$$
where the xi,j are independent variables, the yi are the response variables, the θj are the coefficients of the regression, to be estimated, and the error terms εi are i.i.d. random variables. The reasoning used to justify the normal model for the error terms in (5.1) is that the error is the sum of many unmeasured terms. If the unmeasured terms have a finite variance, then the central limit theorem says that the error terms will be approximately normal. But when the unmeasured terms are heavy tailed, the generalized central limit theorem says that the error terms will be approximately stable.

For linear regression when the errors are normally distributed, the OLS solution is the same as the ML solution. When the errors are not normally distributed, these two methods are not the same. Robust estimation methods use some technique that removes or downplays outliers. Here we do not use robust estimation, but instead use robust modeling, as advocated in Lange et al. (1989).

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4_5

In contrast to a robust
estimation method, robust modeling is interested both in estimating the regression coefficients and in fitting the error distribution. Specifically, we use a heavy tailed stable model for the errors and ML to estimate all the parameters simultaneously.

There are several papers in the literature on this topic. Blattberg and Sargent (1971) proposed two methods for estimating a single regression coefficient in the infinite variance stable case. First, they developed a best linear unbiased estimator (BLUE), assuming symmetric errors and that the index of stability α is known. Those authors credit Wise (1966) with suggesting this approach. Second, they applied minimum sum of absolute error (MSAE), also called least absolute deviation or L1 regression, where the regression coefficients minimize the sum of absolute deviations Σ_{i=1}^n |yi − (θ1 xi,1 + · · · + θk xi,k)|. They showed through simulation that when 1 < α < 1.7, BLUE and OLS perform approximately the same, and that MSAE is better than either. In contrast, when 1.7 ≤ α < 2, OLS outperforms BLUE, which slightly outperforms MSAE. Heathcote (1982) discussed functional least squares, using a loss function that depends on the sample characteristic function of the data. These are all robust estimation procedures, and they do not yield a parametric fit to the residuals. El Barmi and Nelson (1997) considered a stable regression model, but they have restrictive assumptions: they assume α is known, β is known and zero, and they do not estimate the scale γ. McCulloch (1998a) used ML to estimate the linear regression coefficients when the errors are symmetric stable. More recently, Samorodnitsky et al. (2007) reconsidered BLUE regression in a more general setting, where the independent variables can also be random, with the independent variables and the dependent variables having any combination of light or heavy tailed distributions.
Their analysis is informative and interesting: it shows when the BLUE estimates are consistent and gives optimal convergence rates for BLUE in terms of the tails of both the error distribution and the independent variable distribution. Autoregressive time series models for α-stable models are studied in Andrews et al. (2009). Hallin et al. (2011) and Hallin et al. (2013) use rank-based methods to fit linear models in the presence of infinite variance error terms. The results in this chapter are based on dissertation work in Ojeda-Revah (2001) and subsequent developments in Nolan and Ojeda-Revah (2013). The ML approach described here extends previous work in several ways. It allows for nonsymmetric error distributions, allows a larger range for α than McCulloch (1998a), simultaneously estimates multiple regression coefficients θ1, . . . , θk and the parameters of the stable error distribution, increases the accuracy and speed of the computations, gives asymptotic joint confidence regions for all the parameters with normal distributions for the estimates and n^{−1/2} rate of convergence, and extends the ML approach to nonlinear regression with stable error terms.
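Model (5.1) with stable errors is easy to simulate. A minimal sketch using scipy's levy_stable; note that scipy defaults to the 1-parameterization, which differs from the book's parameterizations by a location shift when β ≠ 0:

```python
import numpy as np
from scipy.stats import levy_stable

# Simulate y_i = theta_1 + theta_2 x_i + eps_i with 1.5-stable errors.
# Caution: scipy's levy_stable uses the 1-parameterization by default.
rng = np.random.default_rng(1)
n = 200
theta = np.array([10.0, 0.1])
X = np.column_stack([np.ones(n), rng.uniform(0, 100, size=n)])
eps = levy_stable.rvs(1.5, 0.5, loc=0, scale=1, size=n, random_state=rng)
y = X @ theta + eps
```

Running OLS on such data shows the sensitivity to the extreme errors discussed above.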
5.1 Maximum likelihood estimation of the regression coefficients

The regression model (5.1) can be written in matrix form as
$$y = X\theta + \epsilon,$$

where X = (xi,j)_{n×k} is the design matrix, θ = (θ1, θ2, . . . , θk)^T are the regression coefficients, and ε = (ε1, ε2, . . . , εn)^T are the errors. We will assume that there is a constant term in the model, say xi,1 = 1 for all i. Then there are k + 3 parameters to be estimated: (φ, θ) = (α, β, γ, θ1, . . . , θk). There is no explicit location parameter for the stable distribution, as it is replaced by the intercept term θ1. Let f(ε|φ) = f(ε|α, β, γ) be the density of a S(α, β, γ, 0; 2) random variable. The motivation for using the 2-parameterization is discussed below. The log-likelihood under model (5.1) is

$$\ell(\phi, \theta) = \sum_{i=1}^{n} \log f(\epsilon_i \mid \phi) = \sum_{i=1}^{n} \log f\bigl(y_i - (x_{i,1}\theta_1 + \cdots + x_{i,k}\theta_k) \mid \phi\bigr). \tag{5.2}$$
The coefficients are found by numerically maximizing ℓ(φ, θ) with respect to the k + 3 parameters (α, β, γ, θ1, . . . , θk). The specific steps are:

• Perform an initial OLS fit to the data.
• Using the residuals from the previous step, perform a trimmed OLS fit. Two quantiles are used to select a trimmed data set. The defaults are p1 = 0.1 and p2 = 0.9, i.e. the lowest and highest 10% are trimmed away. A second OLS fit is performed on the trimmed data to find an initial estimate θ̂0 of the regression coefficients.
• The residuals from the previous step are fitted by ML to get initial estimates α̂0, β̂0, γ̂0, δ̂0 for the stable parameters.
• The initial estimate for all parameters is (α̂0, β̂0, γ̂0, θ̂0,1 − δ̂0, θ̂0,2, . . . , θ̂0,k). This is used as the starting value for a numerical optimization routine to find the maximum of the log-likelihood ℓ(φ, θ) using all the data.

This algorithm for stable ML estimation of the linear regression coefficients has been implemented in C for speed reasons. There are interfaces to both the R program and the MATLAB program to make it easier to use interactively. It uses the STABLE program to numerically evaluate the likelihood (5.2). The method is quite fast: using the approximation to stable densities gives execution times of a second on a standard desktop computer for moderately sized data sets. The program works for all 0.2 < α ≤ 2 and all −1 ≤ β ≤ 1. For α ≤ 0.2, there are numerical difficulties in computing stable densities; we are unaware of applications that require such heavy tails.

After the ML fit is computed, it is useful to use diagnostics to assess the fit. One standard diagnostic is to plot the residuals and look for any patterns. The distribution of the residuals can also be compared to the stable distribution with the final parameters α̂, β̂ and γ̂. The examples below show both of these plots.

The choice of parameterization for the stable error distribution affects the meaning of the intercept θ1.
This is related to the meaning of regression in the presence of skewed errors. Suppose the error term ε is stable. If α > 1 and β = δ = 0, then E(ε) = 0 and thus E(y) = E(Σj xj θj + ε) = Σj xj θj, so the estimates are unbiased as in OLS. However, this is not the case when there is skewness if we use a continuous parameterization: when α > 1 and β ≠ 0, then E(ε) ≠ 0, and so E(y) ≠ Σj xj θj. When α ≤ 1, E(y) does not exist, so the concept of being unbiased is undefined. Using the S(α, β, γ, δ; 2) parameterization guarantees that we are doing “modal” regression when the errors are nonsymmetric. This means that the fitted regression line/plane goes through the center of the data, which is not the same thing as centering on the mean when there is skewness. The program uses the 2-parameterization internally, as this is numerically well behaved, and centers the regression curve on the mode of the data. However, once the optimization is complete, the program will output the stable parameters in any parameterization the user chooses. If the 2-parameterization is specified, the stable parameter estimates will be (α̂, β̂, γ̂, 0). If the 0- or 1-parameterizations are used, a (generally non-zero) value of δ̂ will be given. In this case, the user can adjust the intercept from θ̂1 to θ̂1 + δ̂ to make the error distribution have location δ = 0. In the 1-parameterization when α > 1, this will result in unbiased regression. Note that this will be numerically unstable for α near 1, and can result in shifting the regression line/plane so that it is not centered on the data.
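The initialization steps of the fitting algorithm above (OLS, then trimmed OLS) can be sketched as follows. The final stable ML optimization is omitted, since it needs numerical stable densities (the STABLE program); the function name and defaults are illustrative:

```python
import numpy as np

def trimmed_ols_init(X, y, p1=0.1, p2=0.9):
    # Step 1: ordinary least squares on all the data.
    theta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Step 2: keep rows whose OLS residuals lie between the p1 and p2
    # residual quantiles, then refit OLS on the trimmed data set.
    r = y - X @ theta_ols
    lo, hi = np.quantile(r, [p1, p2])
    keep = (r >= lo) & (r <= hi)
    theta0, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    return theta0
```

The returned θ̂0, together with a stable fit to the trimmed residuals, supplies the starting value for the numerical maximization of (5.2).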
5.1.1 Parameter confidence intervals: linear case

The ML approach to this regression problem has many desirable properties. In particular, the regression coefficients are asymptotically normal and maximally efficient as long as the stable parameters are in the interior of the parameter space (0, 2] × [−1, 1] × (0, ∞) × (−∞, ∞). This result follows from the argument in DuMouchel (1973a).¹ Let ψ = (α, β, γ, θ1, . . . , θk) and let J be the (k + 3) × (k + 3) Fisher information matrix of ψ. The entries of the matrix J are given by

$$J_{ik} = E\left[\frac{\partial}{\partial \psi_i}\log f(\epsilon \mid \phi)\;\frac{\partial}{\partial \psi_k}\log f(\epsilon \mid \phi)\right].$$

Next we show how to express J in terms of the Fisher information matrix for the estimation of (α, β, γ, δ) in the i.i.d. stable case. Let I = (I_{j,l}) = (I_{j,l}(α, β, γ, δ)) be the 4 × 4 Fisher information matrix of (α, β, γ, δ) from the ML estimation problem where the data is an i.i.d. sample from a stable distribution, which was computed in Section 4.7. Ojeda-Revah (2001) showed that the Fisher information matrix J for the regression problem is given by
$$J = \begin{pmatrix} n\,I_{1:3,1:3} & I_{4,1:3}\,x_{\cdot 1} \;\; I_{4,1:3}\,x_{\cdot 2} \;\; \cdots \;\; I_{4,1:3}\,x_{\cdot k} \\ \bigl(I_{4,1:3}\,x_{\cdot 1} \;\; I_{4,1:3}\,x_{\cdot 2} \;\; \cdots \;\; I_{4,1:3}\,x_{\cdot k}\bigr)^{T} & I_{4,4}\,X^{T}X \end{pmatrix} \tag{5.3}$$

¹ DuMouchel (1973a) did not use a continuous parameterization, and hence had to limit the skewness as α → 1. Using a continuous parameterization avoids this limitation.
where

$$I_{1:3,1:3} = \begin{pmatrix} I_{1,1} & I_{1,2} & I_{1,3}\\ I_{2,1} & I_{2,2} & I_{2,3}\\ I_{3,1} & I_{3,2} & I_{3,3} \end{pmatrix}, \qquad I_{4,1:3} = \begin{pmatrix} I_{4,1}\\ I_{4,2}\\ I_{4,3} \end{pmatrix}, \qquad x_{\cdot j} = \sum_{i=1}^{n} x_{i,j}.$$
The form of the upper left part of J is straightforward; we now give a sketch of the proof for the remaining elements. Let g(x|α, β, γ, δ) be the pdf of a stable distribution with all four parameters, fix i, and let εi = yi − (θ1 xi,1 + · · · + θk xi,k) be the error at the i-th data point. Using the fact that we have a scale-location family,

$$f(\epsilon_i \mid \phi) = g\bigl(y_i - (\theta_1 x_{i,1} + \cdots + \theta_k x_{i,k}) \mid \alpha, \beta, \gamma, 0\bigr) = g\bigl(y_i \mid \alpha, \beta, \gamma, \theta_1 x_{i,1} + \cdots + \theta_k x_{i,k}\bigr).$$

Thus for 1 ≤ m ≤ k,

$$\frac{\partial}{\partial \theta_m}\log f(\epsilon_i \mid \phi) = \frac{x_{i,m}\,(\partial/\partial\delta)\,g(y_i \mid \alpha, \beta, \gamma, \theta_1 x_{i,1} + \cdots + \theta_k x_{i,k})}{g(\epsilon_i \mid \alpha, \beta, \gamma, 0)} = x_{i,m}\,\frac{(\partial/\partial\delta)\,g(\epsilon_i \mid \alpha, \beta, \gamma, 0)}{g(\epsilon_i \mid \alpha, \beta, \gamma, 0)} = x_{i,m}\,\frac{\partial}{\partial\delta}\log g(\epsilon_i \mid \alpha, \beta, \gamma, 0).$$

Therefore

$$E\left[\frac{\partial}{\partial \theta_m}\log f(\epsilon_i \mid \phi)\,\frac{\partial}{\partial \theta_l}\log f(\epsilon_i \mid \phi)\right] = x_{i,m}\,x_{i,l}\,E\left[\left(\frac{\partial}{\partial\delta}\log g(\epsilon_i \mid \alpha, \beta, \gamma, 0)\right)^{2}\right] = x_{i,m}\,x_{i,l}\,I_{4,4}.$$

The entry of J corresponding to θm and θl is the sum over i of such terms, and hence J_{3+m,3+l} = Σ_{i=1}^n xi,m xi,l I_{4,4}. This shows that the bottom right block of (5.3) is correct. The upper right and lower left blocks are similar, involving products of a partial derivative with respect to a stable parameter and a partial derivative with respect to some θm.

The asymptotic variances and covariances of the parameters ψ are obtained from J^{−1}. If one of the parameters α or β is on (near) a boundary of the parameter space, then this approach will not work, as inverting J is impossible (poorly conditioned). In the examples below, the diagonal entries of J^{−1} are used to give confidence intervals for the individual parameters, but elliptical confidence regions for multiple parameters are straightforward using Draper and Smith (1981). An example is given in the next section. As noted in DuMouchel (1975) and Nolan (2001), the information matrix I is not block diagonal in general: every parameter is correlated with every other parameter, and J inherits this property.

We note that this approach is general and works with any model f(x) for the error distribution that is a scale-location family. Suppose, that is, that the distribution of errors has m parameters φ = (φ1, . . . , φm) and X is the n × k data matrix as above. For notational convenience, assume the parameters are ordered so that the last parameter is the location. The m × m Fisher information matrix for ML estimation of an i.i.d. sample from f(x) can be used to derive the (k + m − 1) × (k + m − 1) Fisher information matrix for the regression problem (5.1) with this error distribution in the same way as above.
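Given the i.i.d.-case information matrix I and a design matrix X, assembling J in (5.3) is only a few lines; a sketch with an illustrative function name:

```python
import numpy as np

def regression_fisher_info(I, X):
    # Assemble the (k+3) x (k+3) matrix J of (5.3) from the 4 x 4
    # i.i.d.-case Fisher information I and the n x k design matrix X.
    n, k = X.shape
    colsum = X.sum(axis=0)                  # x_{.j} = sum_i x_{i,j}
    J = np.empty((k + 3, k + 3))
    J[:3, :3] = n * I[:3, :3]               # stable-parameter block
    J[:3, 3:] = np.outer(I[:3, 3], colsum)  # cross terms I_{j,4} x_{.m}
    J[3:, :3] = J[:3, 3:].T
    J[3:, 3:] = I[3, 3] * (X.T @ X)         # regression-coefficient block
    return J
```

When I is symmetric, J is symmetric, and the asymptotic variances come from the diagonal of its inverse.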
A second approach to finding confidence intervals is to use profile likelihood as in Venzon and Moolgavkar (1988). Let ψ* = (φ*, θ*) be the ML estimator of the parameters. To obtain a confidence interval for ψ*j, define ℓj(t) = ℓ(ψ*1, . . . , ψ*_{j−1}, t, ψ*_{j+1}, . . . , ψ*_{k+3}), i.e. the log-likelihood with all parameters except ψj fixed. The profile likelihood confidence interval is found by varying t on either side of ψ*j until ℓj(ψ*j) − ℓj(t) = χ²_{1−p}(1)/2, where χ²_q(1) is the critical value of a χ² distribution with 1 degree of freedom and significance level q. The (1 − p) confidence interval for ψj is then (t−, t+), where the endpoints are the points to the left and right of ψ*j described above.

A third way to obtain confidence intervals is by simulation. Given a set of parameters ψ* = (φ*, θ*) and a data matrix X, simulate M data sets using the model (5.1) and save each set of parameter estimates ψ̂1, . . . , ψ̂M. The (1 − p) simulation confidence intervals are the p/2 and 1 − p/2 empirical quantiles of these values. This is computationally intensive, but may better show the variability in the parameter estimates for small sample sizes.

To compare the methods for finding parameter confidence intervals, simulations were performed. We first simulated a simple linear model yi = θ1 + θ2 xi + εi, with ε distributed as S(α, β, 40, 0; 0) with θ1 = 10, θ2 = 0.1, α = 1.5 and β = 0.5, and xi uniformly spaced on (0, 100) with n = 5, 10, 15, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500 points. For each value of n, the asymptotic ML confidence intervals were computed using J from (5.3). Each simulation was repeated M = 500 times, parameter estimates were found using ML, and the profile likelihood confidence intervals were computed. At the end of the simulations, for each value of n, the empirical quantiles of the parameter estimates were used to get simulation-based confidence intervals for each parameter.
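The profile likelihood endpoint search can be sketched with a one-parameter stand-in likelihood. Here a Poisson log-likelihood replaces the stable one (which requires numerical stable densities), following the usual likelihood-ratio asymptotics; all names and data are illustrative:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

x = np.array([3.0, 5.0, 4.0, 6.0, 2.0, 4.0, 5.0, 3.0, 4.0, 4.0])

def loglik(lam):
    # Poisson log-likelihood, dropping the log(x!) constant
    return np.sum(x * np.log(lam) - lam)

lam_hat = x.mean()                     # ML estimate of the rate
crit = chi2.ppf(0.95, df=1) / 2.0      # half the 95% chi-square(1) value
g = lambda t: loglik(lam_hat) - loglik(t) - crit
lo = brentq(g, 1e-8, lam_hat)          # search left of the MLE
hi = brentq(g, lam_hat, 100.0)         # search right of the MLE
```

For the regression problem, loglik would be the profiled stable log-likelihood ℓj(t) and the same root search gives (t−, t+).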
Figure 5.1 shows the results, with one plot for each of the parameters α, β, γ, θ1, θ2. The solid curve shows the half-interval width of confidence intervals based on asymptotic ML; as expected it decays at rate 1/√n. The dashed curve shows the mean of the M different profile likelihood half-widths, and the dotted curve shows the half-width of the simulation-based confidence interval. As n increases, the curves converge to common values with decay 1/√n. For the parameters α, β, θ1 and θ2, there is close agreement between the interval widths for the theoretical asymptotic ML confidence intervals and the simulation-based confidence intervals for n ≥ 30. For these same parameters, the profile likelihood method seems to underestimate the parameter variability. In contrast, for γ, the theoretical asymptotic ML confidence intervals and the profile likelihood are close, but both underestimate the variability in γ̂. With moderate and large sample sizes, the three methods give comparable confidence intervals; therefore we will show only the asymptotic ML confidence intervals in the examples below.
Fig. 5.1 Results of simulations assessing the confidence interval methods. The model is y = θ1 + θ2 x + ε = 10 + 0.1x + ε. The curves (asymptotic ML, profile likelihood, and simulation) show the half-widths for 95% confidence intervals for each of the five parameters α, β, γ, θ1, θ2 as a function of the sample size n.
5.1.2 Linear examples

In this section, we present examples of linear regression with simulated data to compare the accuracy of the regression coefficients fitted by OLS and stable ML regression.

The first example involves simulated data. Five hundred random x values were generated uniformly from the interval (0, 100), and then y values were generated by y = 10 + 0.1x + ε, where ε ∼ S(1.5, 0.5, 1, 0; 2). Figure 5.2 shows the scatter plot with the OLS and stable ML regression lines, as well as the parameter estimates and diagnostics on the residuals. In this data set there are a large number of data points, and several very large extremes. The ML method gives the best results; those values with 95% ML confidence intervals are θ̂1 = 10.02 ± 0.43 and θ̂2 = 0.0989 ± 0.0072. The program developed is quite fast: on a 2.4 GHz Intel processor (single threaded),
a simulation like the one above with n = 500 data points takes approximately 0.2 seconds.
[Figure 5.2 panels: scatter plot, n = 500, with OLS and stable ML regression lines. OLS coefficients: 11.17, 0.1022; stable ML coefficients: 10.01, 0.09877, with alpha = 1.42, beta = 0.605, gamma = 1.98 (2-parameterization). Lower panels plot the residuals.]
Fig. 5.2 Simulated data set 1: y = 10 + 0.1x + ε with stable errors having parameters α = 1.5, β = 0.5 and γ = 1. The vertical limits on the scatter plot leave out several extreme data points. The upper right corner shows the parameter estimates for OLS and stable ML regression.
A second example is also a simulation, using the same equation y = 10 + 0.1x + ε, but the errors are Pareto with tail index α = 1.5. These errors are heavy tailed but not stable. Figure 5.3 shows the results, with the stable regression giving the best estimates of the regression coefficients, even though the errors are not stable. The errors are in the domain of attraction of a stable law with index α = 1.5 and β = 1. The OLS fit is poor; the estimate of the slope is heavily influenced by a few large extreme values. Note that the estimated stable parameters, α̂ = 0.723 and β̂ = 1.00, are not meaningful in this more complicated situation; yet even though the errors are not stable, the stable ML regression coefficients give much better estimates of the coefficients than OLS. In this case, a model allowing skewed heavy tails, even when the error distribution is incorrectly specified, performs well. Thus stable regression seems to be a robust method for heavy tailed errors, working reasonably with heavy tailed
non-stable data. However, using the confidence interval methods described above is inappropriate when the error terms are not close to stable, so we omit them here. The stable parameters (α, β, γ) may be treated as nuisance parameters in cases like this, where the residuals are not stably distributed.
[Figure 5.3 panels: scatter plot, n = 500, with OLS and stable ML regression lines. OLS coefficients: 12.73, 0.09981; stable ML coefficients: 11.14, 0.1, with alpha = 0.723, beta = 1, gamma = 0.164 (2-parameterization). Lower panels plot the residuals.]
Fig. 5.3 Simulated data set 2: y = 10 + 0.1x + ε with Pareto α = 1.5 errors. The vertical limits on the scatter plot leave out several extreme data points.
The last example of linear regression uses the U.S. Federal Reserve Board weekly data on interest rates.2 We selected the rates for 10 year U.S. constant maturity bonds and AAA corporate bonds for the time period 2008 to 2009. Week to week differences were computed and a linear regression was performed with x being the difference in 10 year bond rates and y being the difference in AAA bond rates. This example is of interest because the slope estimate can be used to hedge a bond position. The results are shown in Figure 5.4. Here the residuals are heavy tailed and skewed toward the right, and OLS and the stable regression method give different estimates of the slope and intercept: 0.0145 and 0.693 for OLS vs. −0.00257 and 0.746 for stable ML. The five estimated parameters and their 95% confidence intervals using 2 Available online at www.federalreserve.gov/releases/h15/data.htm.
stable ML regression are α̂ = 1.34 ± 0.28, β̂ = 0.611 ± 0.367, γ̂ = 0.0274 ± 0.0047, θ̂1 = −0.00257 ± 0.00731, and θ̂2 = 0.746 ± 0.060, with n = 104. The asymptotic full 5-dimensional elliptical 95% confidence region is given by

$$(\theta - \theta^*)^{T} J\,(\theta - \theta^*) \le \chi^2_{0.05}(5) = 11.0705,$$

where θ* = (α̂, β̂, γ̂, θ̂1, θ̂2) = (1.34, 0.611, 0.0274, −0.002567, 0.7456), and the computed Fisher information matrix is

$$J = \begin{pmatrix} 55.47 & -8.46 & -403.44 & -647.12 & 4.48\\ -8.46 & 33.24 & -374.75 & 640.44 & -4.43\\ -403.44 & -374.75 & 198244.97 & -40408.32 & 279.75\\ -647.12 & 640.44 & -40408.32 & 96386.28 & -667.29\\ 4.48 & -4.43 & 279.75 & -667.29 & 1066.74 \end{pmatrix}.$$
[Figure 5.4 panels: scatter plot, n = 104, with OLS and stable ML regression lines. OLS coefficients: 0.01451, 0.6934; stable ML coefficients: −0.002567, 0.7456, with alpha = 1.34, beta = 0.611, gamma = 0.0274 (2-parameterization). Lower panels plot the residuals.]
Fig. 5.4 Linear regression of change in weekly interest rates for 10 year U.S. bonds (x) vs. AAA corporate bonds (y) for 2008–2009.
5.1 Maximum likelihood estimation of the regression coefficients
To get a broader perspective on the performance of the algorithms when the errors are stable, large scale simulations were performed. We used a simple linear model y_i = θ1 + θ2 x_i + ε_i, with ε_i distributed as S(α, β, γ, 0; 2), with θ1 = 50, θ2 = 20, α = 1.8, 1.3, 0.8, β = 0, and γ = 40, for sample sizes n = 20, 30, 50, 75, 100, 250, 500, 1000 and random x_i drawn from a uniform distribution on (0, 100). Ojeda-Revah (2001) shows the results are similar for other simulations, including the nonsymmetric case. For each combination of the parameters above, we simulated M = 200 data sets, and from each sample, we estimated the regression coefficients by OLS and stable ML. From each set of estimates θ_{i,1}, …, θ_{i,M} for parameter i, we computed the root mean square error

RMSE(θ_i) = ( (1/M) ∑_{j=1}^{M} (θ_{i,j} − θ_i)² )^{1/2}.

The results of these simulations are presented in Figure 5.5 as plots of RMSE(θ_i) against sample size.
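The simulation design above can be sketched in code. The book's computations are in R; the following is an illustrative Python stand-in that generates symmetric stable errors with the Chambers–Mallows–Stuck formula and tracks the RMSE of the OLS slope. The stable ML step is omitted here, since it requires numerical evaluation of the stable density; the function names are ours, not from the book's software.

```python
import numpy as np

def rsym_stable(alpha, gamma, size, rng):
    """Symmetric alpha-stable variates via the Chambers-Mallows-Stuck
    formula (valid for alpha != 1)."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    x = (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
         * (np.cos(u - alpha * u) / w) ** ((1 - alpha) / alpha))
    return gamma * x

def ols_rmse(alpha, theta1=50.0, theta2=20.0, gamma=40.0, n=100, M=200, seed=0):
    """RMSE of the OLS slope estimate over M simulated data sets."""
    rng = np.random.default_rng(seed)
    est = np.empty(M)
    for j in range(M):
        x = rng.uniform(0, 100, n)
        y = theta1 + theta2 * x + rsym_stable(alpha, gamma, n, rng)
        A = np.column_stack([np.ones(n), x])   # design matrix with intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        est[j] = coef[1]
    return np.sqrt(np.mean((est - theta2) ** 2))
```

Comparing `ols_rmse(1.8)` with `ols_rmse(0.8)` reproduces the qualitative pattern in Figure 5.5: the OLS error grows dramatically as α decreases.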
Fig. 5.5 RMSE of OLS (solid line) and stable ML (dotted line) estimates of the regression coefficients of a simple regression model y = θ1 + θ2 x = 50 + 20x, with β = 0 and γ = 40. The left column shows the RMSE for θ1 , the right column shows the RMSE for θ2 , with the rows showing different values of α. Note the varying vertical scales.
The most obvious information that we get from these graphs is that in all cases the stable ML estimates have the least variability. Furthermore, the method improved the fit, and the observed error converged rapidly to the asymptotic error. The amount of improvement we get from stable ML estimation depends on the values of the parameters of the error distribution. As expected, the performance of OLS decreases as α decreases. We can see this clearly in Figure 5.5, where the same regression model is used with different values of α. As α decreases, the variability of the OLS estimates becomes much larger, but the variability of the stable ML estimates doesn't change much. The OLS error was at most twice the stable ML error when α = 1.8, but went up to 500 times as large for α = 0.8. In all cases, the error of the constant/intercept term θ1 was much greater than that of the other terms. Another point to observe is that in many cases, the RMSE of OLS starts decreasing as n increases, but then jumps to a very high value. This behavior is consistent with the sampling distribution of this estimator being heavy tailed, as pointed out by Ojeda-Revah (2001). What happens is that as the sample size increases, it is more likely that extreme values will occur. As α decreases, extremes get more and more likely, leading to very unreliable estimates of the parameters. Hence large samples do not guarantee a reduction of error in the OLS estimator; indeed, in the presence of heavy tailed errors, OLS parameter estimates can deteriorate as the sample size increases. To understand the role that extreme values of the errors play in stable regression, it is useful to look at the score function for the location parameter, −(d/dx) log f(x) = −f′(x)/f(x). This was computed in Section 4.7; see Figure 4.13 for selected values of α and β with scale γ = 1 and location δ = 0.
When α = 2, the score function for the normal density is linear and large x values get a large score. When α < 2, the score function is S-shaped and extreme values are discounted, resulting in a procedure that is robust with respect to extreme values. As α decreases, less weight is given to extremes and more weight is given to moderate values of x. Two recent papers apply this method to real data. Walls and McKenzie (2019) examine movie revenues. Box office hits lead to very large values in the data and this paper uses multiple regressor variables: budget to make the film, how many screens the film opens on, whether or not a popular star is in the movie, and whether or not the movie is a sequel to a previous movie. The paper by Rodriguez-Aguilar et al. (2019) examines Mexican electricity prices, which vary widely due to spikes in demand.
5.2 Nonlinear regression

We next consider a general nonlinear model with possibly nonhomogeneous error terms of the form

y_i = g(x_i, θ) + h(x_i, θ) ε_i,   i = 1, …, n,
where θ = (θ1, …, θk) is a k-vector of parameters and the ε_i are i.i.d. with stable distribution depending on parameters φ = (α, β, γ). A particular case of this is equation (5.5) below, where g(x, θ) = h(x, θ), i.e. the error terms are proportional to the value of y. The analysis proceeds by noting that under the model, the normalized residuals

r_i = r_i(x_i, y_i, θ) = (y_i − g(x_i, θ)) / h(x_i, θ)
are i.i.d. stable, so the log-likelihood of the model is

ℓ(φ, θ) = ∑_{i=1}^{n} log f(r_i(x_i, y_i, θ) | φ).
The ML estimators are
(φ*, θ*) = argmax_{(φ, θ)} ℓ(φ, θ).
We implemented this in R through the following steps:
• Fit an ordinary nonlinear model to the data. In our case, we used the function nls in R. This gives an initial value θ0.
• Use θ0 to compute initial residuals and use these to get an initial estimate φ0 of the stable parameters (α, β, γ).
• Using (φ0, θ0) as an initial point, use a multivariate optimization routine to maximize the log-likelihood.
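As an illustration of these three steps (not the book's implementation, which uses R and the numerical stable density), here is a Python sketch. It uses scipy's `curve_fit` in place of nls and, for the final likelihood maximization, substitutes the closed-form Cauchy density (the α = 1, β = 0 stable case); the function names are ours.

```python
import numpy as np
from scipy.optimize import curve_fit, minimize

def g(x, t1, t2):
    """Exponential growth model y = t1 * exp(t2 * x); errors proportional to g."""
    return t1 * np.exp(t2 * x)

def fit_stable_nls(x, y):
    # Step 1: ordinary nonlinear least squares (analog of R's nls) gives theta0.
    theta0, _ = curve_fit(g, x, y, p0=(1.0, 0.1))
    # Step 2: initial normalized residuals and a crude scale estimate phi0.
    r = (y - g(x, *theta0)) / g(x, *theta0)
    gamma0 = max(0.5 * (np.quantile(r, 0.75) - np.quantile(r, 0.25)), 1e-3)
    # Step 3: maximize the joint log-likelihood over (gamma, theta).
    # A Cauchy density stands in for the general stable density here;
    # the book evaluates the stable density numerically instead.
    def negloglik(p):
        gam, t1, t2 = p
        if gam <= 0 or t1 <= 0:
            return np.inf
        ri = (y - g(x, t1, t2)) / g(x, t1, t2)
        return -np.sum(np.log(gam / (np.pi * (gam**2 + ri**2))))
    res = minimize(negloglik, x0=[gamma0, *theta0], method="Nelder-Mead")
    return res.x  # (gamma, theta1, theta2)
```

On data simulated from model (5.5) with heavy tailed errors, the joint fit is much less influenced by extreme observations than the least squares fit of step 1 alone.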
5.2.1 Parameter confidence intervals: nonlinear case

Approximate confidence intervals can be obtained by three methods: linearization, profile likelihood, and simulation. To describe the linearization approach, consider the residual function

r(x, y, θ) = (y − g(x, θ)) / h(x, θ).

For θ = θ* + Δθ, i.e. expanding around the ML estimator θ*,

r(x, y, θ) ≈ r(x, y, θ*) + ∑_{j=1}^{k} (∂r/∂θ_j)(x, y, θ*) Δθ_j,   (5.4)

where

(∂r/∂θ_j)(x, y, θ) = [ −(∂g/∂θ_j)(x, θ) h(x, θ) − (y − g(x, θ)) (∂h/∂θ_j)(x, θ) ] / h²(x, θ).

In the case g(x, θ) = h(x, θ), this simplifies to
−y (∂g/∂θ_j)(x, θ) / g²(x, θ).
Defining X = (∂r/∂θ_j(x_i, y_i, θ))_{1≤i≤n, 1≤j≤k} and x_{·j} = the j-th column sum of X, a large sample approximation to the Fisher information matrix for the nonlinear regression coefficients is given by (5.3). Profile likelihood confidence intervals and the simulation method are similar to the linear case. If one sees error terms that are reasonably modeled by a stable law, the simulation-based method is likely to be most reliable, especially for small sample sizes. The above large sample approximation relies on both large n and on the accuracy of the linear approximation (5.4) near θ*.
5.2.2 Nonlinear example

We show a simulated example, the exponential growth model

y_i = θ1 exp(θ2 x_i)(1 + ε_i).   (5.5)

Here ε_i ∼ S(α, β, γ, 0; 2) and the error terms are proportional to the size of the quantity. Figure 5.6 shows a scatter plot of simulated data with parameters α = 1.5, β = 0, γ = 0.25, θ1 = 30, θ2 = 0.1. Here x is a sequence from 0 to 20 in steps of size 0.25. The plot shows the exact model, the fit with nls assuming normal errors, and the fit from the stable model. The estimated parameter values from the stable ML method, with simulation-derived 95% confidence intervals (M = 250 repetitions), are α = 1.482 ± 0.376, β = 0.591 ± 0.792, γ = 0.299 ± 0.072, θ1 = 29.33 ± 4.615, and θ2 = 0.1 ± 0.008. (For comparison, the nls estimates are 42.605 for θ1 and 0.08 for θ2.) In this case, the large values in the sample heavily influenced the normal model, and pulled the curve up noticeably. In contrast, the stable model allows for extreme data values and gives a better fit to the data. To assess how well this method works in general, we simulated the above exponential growth model M = 250 times and estimated the parameters each time both with the R function nls and with the ML estimates for stable regression described above. Figure 5.7 shows boxplots of the results. The plots show that the stable regression estimates of θ1 and θ2 are concentrated much more tightly around the correct values. In summary, it is now possible to quickly and accurately estimate regression coefficients by ML in the presence of stably distributed errors, making it practical to apply these models to real data. There are three methods of estimating confidence intervals for the parameters, all giving comparable interval widths with moderately sized samples. Simulations show that the stable ML method outperforms OLS when the errors are stable, with the performance improvement getting more pronounced as the error distribution moves away from normality. Since the stable distributions include the normal distributions as special cases, the method suffers no penalty when the errors are normally distributed.
Fig. 5.6 Nonlinear regression for y = θ1 exp(θ2 x)(1 + ε), with θ1 = 30, θ2 = 0.1, and ε ∼ S(1.5, 0, 0.25, 0; 2). The plot shows the exact curve and the nls and stable fits.

The simultaneous estimation of
the parameters of the error distribution allows one to check for heavy tailed errors and assess the fit. In contrast, other methods of robust regression, e.g. L1 or quantile regression, focus on estimating only the regression coefficients, and ignore the error distribution. It is possible to use one of these methods to estimate the regression coefficients and then fit the residuals with a stable distribution. However, this two-step procedure loses the dependence structure between the regression coefficients, the parameters of the error distribution, and the design matrix. The stable ML approach described here captures that dependence in the Fisher information matrix J, making it possible to give confidence regions for all parameters simultaneously.
Fig. 5.7 Boxplots of the parameter estimates from model (5.5) with M = 250 simulations of nonlinear regression. The horizontal lines in each plot show the exact values. The boxplots for the normal model are labeled nls for the R function that computes it; the stable case is based on the methods derived here.
5.3 Problems

The program stablereg is available at the website given at the end of the preface. It will perform linear regression when the errors are stable. Also at that website are directions on how to use that program and data sets for the following problems.

Problem 5.1 Perform a simple univariate linear regression on the data in the file Regression1.dat.

Problem 5.2 Perform a simple bivariate linear regression on the data in the file Regression2.dat.

Problem 5.3 Perform a multivariate linear regression on the data in the file Regression3.dat.

Problem 5.4 Find a real data set where the errors appear to be heavy tailed and analyze it using the stablereg program.
Chapter 6
Signal Processing with Stable Distributions
In many engineering problems, heavy tailed noise occurs in a variety of settings. In the engineering literature, this is referred to as impulsive or spiky noise. Impulsive noise can appear in telephone wires (Stuck and Kleiner 1974); in communication networks (Yang and Petropulu 2001; Gonzalez 1997; Georgiadis 2000; Jaoua et al. 2014; Li et al. 2019); in radar clutter (Nikias and Shao 1995; Kapoor et al. 1999); in image processing (Arce 2005; Carasso 2002); in underwater acoustics (Chitre et al. 2006); in aerial acoustics (Zhidkov 2018); and in blind source separation (Kidmose 2001; Zha and Qiu 2006). The sonar data in Figure 4.33 is one example where this occurs; that data is well described by an α = 1.74 symmetric stable law. This chapter focuses on filtering the additive noise model (6.1) when the noise is α-stable. Most of this chapter is from Nolan (2008) and Robust Analysis Inc (2009). Classes of nonlinear filters are derived under the assumption of stable noise and it is shown how they can outperform a linear filter. The simplest filter, called an unweighted stable filter, is discussed first. The following section defines weighted/matched stable filters. Section 6.3 discusses calibration and numerical issues, while Section 6.4 shows some performance comparisons. The chapter ends with a brief discussion of some of the other applications of stable laws in signal processing.
6.1 Unweighted stable filters

The standard additive noise model is that the observed signal x_i is the sum of a signal s_i and a noise term ε_i with density f(x):

x_i = s_i + ε_i,   i = 1, …, n.   (6.1)
The noise comes from a variety of sources—static, nearby electrical activity, lightning, choppy seas, reflections in urban environments, snapping shrimp, bearing noise, fluctuations within the detector circuits, etc. (With apologies to electrical engineers, we will not make the distinction between the noise within the detector circuits and external noise or clutter.) We will assume that the noise terms ε_i are i.i.d. Figure 6.1 compares simulated Gaussian noise (top) with stable noise (bottom). The main qualitative difference is the spikes or impulses that appear in the stable case. These spikes are produced by something in the environment and cause problems for standard linear filters. The right side of the figure shows smoothed density plots of both noises, but these do not easily reveal the spikes because they occur infrequently. So a cursory look at the data to verify normality can be misleading. In Section 4.10.2 formal tests are discussed to distinguish between Gaussian and non-Gaussian stable data sets. There we observed that the Shapiro–Wilk test for normality is fast and as powerful as any of the other tests for rejecting normality.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4_6
Fig. 6.1 Simulated Gaussian white noise (top) and impulsive stable white noise with α = 1.5 (bottom); the right panels show smoothed density plots.
Figure 6.2 shows a more interesting signal on the left, a sinusoidal signal with Gaussian and two different stable noises: S(2, 0; 0), S(1.5, 0; 0), and S(0.75, 0; 0) respectively. A linear moving average filter of width m = 10 is applied to each signal and the output in each case is plotted on the right. To get an output signal of the same length as the input signal, the noisy signal is padded with some values on the left and the right of x_1, …, x_n to get an extended signal of length n + m − 1, say z_1, …, z_{n+m−1}. For simplicity here the signal is padded on the left by repeating the first value x_1 (m + 1)/2 times, and padded on the right by repeating the last value x_n m − (m + 1)/2 times. Then the standard linear filter is

y_i = (1/m) ∑_{j=1}^{m} z_{i+j−1},   i = 1, …, n.   (6.2)
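The padded moving average (6.2) can be sketched as follows; this is an illustrative Python version (the exact split of the m − 1 padded values between the two ends is a minor convention):

```python
import numpy as np

def linear_filter(x, m):
    """Moving-average filter (6.2): pad x to length n+m-1 by repeating the
    end values, then average sliding windows of width m."""
    n = len(x)
    left = (m + 1) // 2      # copies of x[0] prepended
    right = m - 1 - left     # copies of x[-1] appended
    z = np.concatenate([np.full(left, x[0]), x, np.full(right, x[-1])])
    # 'valid' convolution of length (n+m-1) - m + 1 = n
    return np.convolve(z, np.ones(m) / m, mode="valid")
```

The output has the same length n as the input, and a constant input passes through unchanged.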
Figure 6.2 shows the output of the linear filter in the right plots for the three noise types. The top right plot shows that for Gaussian noise, the linear filter performs well: the original sinusoidal signal is clearly visible. However, the middle right plot shows that for the α = 1.5 stable noise, the linear filter performs poorly: some of the impulses pass through the filter without much attenuation and the path is more erratic than the signal. And the bottom right plot shows that for the α = 0.75 stable noise, the linear filter performs even worse. Arguably, the filtered output is worse than the input noisy signal. Increasing the window width will smooth out the signal, but if the window gets too wide, the output gets flattened out and the sinusoidal nature and strength of the signal is lost. The moving average used to filter the signal is sensitive to extreme values in the signal. To be specific, pick a single window of values x_1, …, x_m and compute the linear filter output

δ0 = (1/m) ∑_{i=1}^{m} x_i.   (6.3)

When the noise is Gaussian, say ε_i ∼ S(2, 0, γ, δ; 0), then δ0 is also Gaussian with mean δ and scale γ/m^{1/2}, which tends to 0 as m → ∞. But if the noise ε_i ∼ S(α, 0, γ, 0; 1), then δ0 is the sum of SαS terms and so is SαS with location δ, and therefore it has infinite variance. The scale of this stable r.v. is γ/m^{1−1/α}. When 1 < α < 2, the scale tends to zero as m increases, but it does so slowly. When α = 1, the scale is γ for any m: taking a moving average over an arbitrary number of terms gives the same amount of information as a single value. Even worse, when 0 < α < 1, the scale tends to ∞ as m → ∞: taking a moving average gives less information about the center δ; in fact, the more values used, the worse the output is. The stable filter described next is nonlinear and built with the assumption that the noise is α-stable. For fixed α, β and γ, let f(x) = f(x|α, β, γ, 0) and ρ(x) = −log f(x).
Define the cost function for an (unweighted) stable filter by

C0(δ; x_1, …, x_m) = −log ∏_{i=1}^{m} f(x_i − δ) = ∑_{i=1}^{m} ρ(x_i − δ).   (6.4)
With this cost function, the maximum likelihood estimator of δ is δ0 = argmin_δ C0(δ; x_1, …, x_m). The examples below consider the symmetric case, β = 0, but the approach works when β ≠ 0. In the nonsymmetric case, it may be helpful to use the 2-parameterization, which would make δ0 estimate the mode. When α = 2, the cost function is of the form

c(γ, m) + (1/(4γ²)) ∑_{i=1}^{m} (x_i − δ)²,   (6.5)
Fig. 6.2 Signal x(t) = 3 sin(0.03t) with additive Gaussian (α = 2, top), symmetric stable noise (α = 1.5, middle), and symmetric stable noise (α = 0.75, bottom) on the left. On the right is the output of a linear moving average filter of width 10.
and it is minimized when δ0 = x̄, the sample mean, i.e. the linear filter (6.3). When α = 1 and β = 0, the cost function is

c(γ, m) + ∑_{i=1}^{m} log(1 + ((x_i − δ)/γ)²),

and minimization results in the myriad filter of Gonzalez (1997) and Arce (2005). In the case when α = 1/2 and β = 1, it is also possible to express the cost function explicitly, but in general the cost function has to be evaluated and minimized numerically.
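A sketch of the unweighted stable filter for the α = 1 (myriad) case, where ρ has a closed form; here the cost (6.4) is minimized by a simple grid search over δ in each window (the book's STABLE software instead uses a fast density approximation and a global optimization routine), and the function names are ours:

```python
import numpy as np

def myriad_window(x, gamma=1.0, ngrid=201):
    """Minimize the cost (6.4) over a grid of delta values, for the alpha = 1
    (Cauchy) case, where rho(x) = log(gamma^2 + x^2) up to a constant."""
    grid = np.linspace(x.min(), x.max(), ngrid)
    cost = [np.sum(np.log(gamma**2 + (x - d) ** 2)) for d in grid]
    return grid[int(np.argmin(cost))]

def stable_filter(x, m, gamma=1.0):
    """Sliding-window nonlinear filter: apply myriad_window to each
    padded window of width m (same padding as the linear filter)."""
    n = len(x)
    left = (m + 1) // 2
    right = m - 1 - left
    z = np.concatenate([np.full(left, x[0]), x, np.full(right, x[-1])])
    return np.array([myriad_window(z[i:i + m], gamma) for i in range(n)])
```

On a window containing one huge impulse, the grid minimizer stays near the bulk of the data, while the window mean is dragged toward the impulse; this is the robustness the score-function discussion predicts.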
Practical issues with calibrating this filter, how the minimization is done, and speed are discussed below. But first the performance of this filter is demonstrated through simulations. Figure 6.3 considers the case when there is no signal, just noise. A perfect filter would output a flat horizontal line. Note how the impulsive noise passes through the linear filter, but the stable filter correctly filters it out. Next consider two cases where there is a meaningful signal with additive impulsive noise. Figure 2.10 shows the behavior of this stable filter when there is a sinusoidal signal with additive stable noise. Figure 6.4 is similar, with a step function as the signal and additive stable noise. In these cases, the stable filter does a much better job of suppressing the impulsive noise. Simulation results that offer a more thorough comparison of these filters are presented below.
Fig. 6.3 No signal, simulated additive stable noise with α = 1.3 and window size m = 50. The left plot shows the noisy signal, the top right plot shows the output of a linear filter and the bottom right shows the output of an unweighted stable filter. Note the smaller vertical scale on the right side plots.
Since δ0 is a maximum likelihood estimator, it has nice properties. First, for large m, the estimator is approximately normal with Var(δ0) ≈ kγ/√m. This is in stark contrast to the moving average (6.3), which has infinite variance for all m. The method is designed to be robust to extremes, and simulations show that it works well for other heavy-tailed noise distributions. Of course these benefits do not come for free: much more processing needs to be done; see below.
Fig. 6.4 A step function signal with simulated additive stable noise. As in Figure 2.10, α = 1.3 and window size is m = 50. The left plot shows the received signal, the top right plot shows the output of a linear filter and the bottom right shows the output of an unweighted stable filter. Note the smaller vertical scale on the right side plots.
6.2 Weighted and matched stable filters

There are several generalizations of the filter that are given by replacing the cost function (6.4). The simplest extension is to allow non-negative weights inside the weighted cost function:

C_wt(δ; x_1, …, x_m) = ∑_{i=1}^{m} ρ(w_i (x_i − δ)).   (6.6)

The weighted filter output is δ_wt = argmin_δ C_wt(δ; x_1, …, x_m). This weighted filter allows the possibility of giving more importance to some terms, e.g. the most recent terms. In detection problems, one looks for a known pattern s_1, …, s_n in the signal, which is contaminated by stable noise:

x_t = θ s_t + ε_t,   t = 1, 2, 3, …

The goal here is to estimate θ, the magnitude of the received signal. The null case corresponds to θ = 0 (no signal); the case where a signal is present corresponds to θ ≠ 0. One way to do this is to allow signed weights as in the myriad filter in Arce (2005). This gives an approximate matched filter, using cost function
C_sign(θ; x_1, …, x_m) = ∑_{i=1}^{m} ρ(|s_i|((sign s_i) x_i − θ)) = ∑_{i=1}^{m} ρ(s_i x_i − |s_i| θ)
and estimate θ_sign = argmin_θ C_sign(θ; x_1, …, x_m). A second way to approach the detection problem is a true stable matched filter, with cost function

C_match(θ; x_1, …, x_m) = ∑_{i=1}^{m} ρ(x_i − θ s_i),   (6.7)

and estimate θ_match = argmin_θ C_match(θ; x_1, …, x_m). A place where a matched filter is useful is in radar processing, where a known signal is broadcast and the radar receiver listens to see if that particular pattern is reflected from an object. For simplicity we consider a univariate signal, not the bivariate case where in-phase and quadrature components are observed. Frequently a linear frequency modulation (LFM) chirp is used; it is given by s_i = sin(2πk i²). Figure 6.5 shows an example with k = 0.0005. A chirp embedded in symmetric stable noise is shown in Figure 6.6, along with the output of a linear matched filter and the output of a stable matched filter. (See Section 6.4 below for the meaning of signal-to-noise ratio (SNR) in the infinite variance case.) Note how the stable filter suppresses most of the noise and highlights the location of the center of the chirp. In contrast, more of the noise propagates through the linear filter. Radars typically set a threshold, and if the processed signal exceeds that threshold, the system decides that the chirp has been detected, i.e. the radar signal has bounced off of some object. When the processed signal exceeds the threshold, it may be a true detection, or it may be a false alarm (type I error). The threshold is set to keep the probability of a false alarm at some fixed level, called the false alarm rate. If a radar system sets a threshold based on a linear filter, an impulsive noise environment will set off many false alarms. This can be annoying, indeed dangerous if it distracts a pilot. Raising the threshold will lower the false alarm rate, but will also lower the power of detection. A stable matched filter can suppress the false alarms without lowering the power, see Figure 6.6. Since radars operate in the megahertz and gigahertz frequency range, there can be millions of chirps broadcast per second, and improving the filtering can lead to improvement in the probability of detection.
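The matched filter estimate can be sketched the same way, scanning a grid of θ values for the minimizer of the cost (6.7). As in the earlier sketches, the Cauchy (α = 1) ρ stands in for the general stable case, and the grid and function name are ours:

```python
import numpy as np

def matched_theta(x, s, gamma=1.0, thetas=None):
    """Estimate the signal magnitude theta by grid-minimizing the matched
    cost (6.7), with the Cauchy (alpha = 1) choice of rho as a stand-in."""
    if thetas is None:
        thetas = np.linspace(-2.0, 2.0, 401)
    cost = [np.sum(np.log(gamma**2 + (x - t * s) ** 2)) for t in thetas]
    return thetas[int(np.argmin(cost))]
```

For example, with an LFM chirp s_i = sin(2π·0.0005·i²) scaled by θ = 0.8 and buried in heavy tailed noise, the grid minimizer recovers a value near 0.8, while θ = 0 (no signal) gives a visibly larger cost.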
There are relationships between these three filters. First, if we set the weights to be w_i = |s_i|, then

C_sign(θ; x_1, …, x_m) = C_wt(θ; (sign s_1)x_1, …, (sign s_m)x_m),   (6.8)

so the signed filter is a nonnegative weighted filter on the signed data (sign s_i)x_i. Likewise, if β = 0, the signed filter is a matched filter for the data |s_i|x_i:

C_match(θ; |s_1|x_1, …, |s_m|x_m) = C_sign(θ; x_1, …, x_m).   (6.9)

On the other hand, if β = 0 and all s_i ≠ 0, then
Fig. 6.5 Typical radar chirp.
Fig. 6.6 The top plot shows an LFM chirp (red) with additive 1.7-stable noise with SNR 20. The middle plot shows the output of a linear matched filter, while the bottom shows the output of a stable matched filter.
C_sign(θ; x_1/|s_1|, …, x_m/|s_m|) = C_match(θ; x_1, …, x_m).   (6.10)
If some s_i = 0, then those terms in the sum for C_match are constant with respect to θ, so they can be left out and a cost function with fewer terms minimized. Thus, in the symmetric case, the matched filter is always equivalent to a related signed filter. Finally, it is possible to use non-symmetric stable distributions for the noise terms. For example, Kuruoglu and Zerubia (2003) use skewed stable laws to model textures in images. In the skewed case, one should use the 2-parameterization, which centers the distribution at the mode. Note that in the non-symmetric case, (6.8) is true, but (6.9) and (6.10) are not.
6.3 Calibration and numerical issues

6.3.1 Calibration

In the examples above, we assumed the parameters of the noise are known. In certain cases where there is a known history of the system, this may be reasonable. In many signal processing problems, the noise is symmetric, in which case taking β = 0 reduces the number of parameters to two: stable index α and scale γ. If it is possible to get a pure noise signal, e.g. s_i = 0 for all i in (6.1), then estimating parameters is the standard i.i.d. estimation problem and any of the methods in Chapter 4 can be used. Maximum likelihood is recommended, especially as it can be used when some of the parameters are fixed, e.g. β = 0 and δ = 0. If speed is a concern, some of the other methods may be useful, e.g. empirical characteristic function or fractional moment estimators. When there is a signal, it is harder to estimate the parameters. In this case, one can pick a set of parameters, say α = 1, β = 0, γ = the interquartile range of the data, and δ = 0. Run the signal x_1, …, x_n through the filter to get an estimated signal y_1, …, y_n. Now examine the residuals r_i = x_i − y_i and estimate new parameters α, β, γ and δ. The graphical diagnostic in Section 4.10.1 can be used to assess whether the residuals are close to stable. If not, this process can be repeated a few times to see if better estimates of the parameters result. In more complicated environments, the character of the noise may vary over time. For example, an airplane radar may operate in a variety of settings: over a flat sea or a choppy sea, or over open land, or over an urban environment. Each of these environments may have different noise characteristics. Here one can use a variation of the above methods at periodic times. Pick a time interval ΔT and estimate the parameters. Then run the filter with those parameters until time ΔT, at which point the parameters are estimated again and those values are used to calibrate the filter for the next interval.
This procedure is repeated every ΔT units of time, leading to a continuously running adaptive filter. Figure 6.7 shows an example of this. The top graph shows a signal with two chirps that we would like to detect. The middle graph
shows the chirps with simulated Gaussian noise in the first half of the time interval and stable noise with α = 1.7 in the second half of the interval. The bottom plot shows the output of a matched linear filter in blue, and an adaptive filter that readjusts its parameters periodically in red. Note how the Gaussian filtered output has a false detection around t = 1.4 × 10⁴ and perhaps one or two more around t = 1.475 × 10⁴ and t = 1.75 × 10⁴.
[Figure 6.7 panels: clean signal s(t); received (noisy) signal with α = 1.70, γ = 0.20; matched filter results for the Gaussian and adaptive-stable filters.]
Fig. 6.7 An adaptive stable filter. The middle plot has the two chirps with superimposed Gaussian noise (first half) and α = 1.7 SαS noise (second half). The bottom plot shows the output of a linear matched filter in blue and an adaptive matched stable filter in red.
In practice, the performance of the filters seems somewhat insensitive to the parameter values. Figure 6.8 shows the probability of detection for a range of SNR and three different α values. It shows that changing α by a bit doesn’t have much effect on the probability of detection. Gross misspecification of α may cause problems, but precise calibration doesn’t seem to be needed.
Fig. 6.8 Comparison of power to detect a chirp using different α (probability of detection vs. SNR in dB, with false alarm probability Pfa = 0.1, for α = 1.9, 1.8, 1.7). The true value of α in this simulation was 1.7.
6.3.2 Evaluating and minimizing the cost function

In general, the cost functions described above have to be evaluated numerically. This is done in the STABLE package by using the quick approximation to the log density log f(x); see Robust Analysis Inc (2009). This does not involve any numerical integration and is much faster: approximately 7,000,000 values of log f(x) can be evaluated per second on a desktop computer. The cost function has to be minimized, and it is not generally convex, so it may have to be evaluated at many points. This is adequate for doing real time processing of audio signals, but slower than real time for radar signal processing on standard hardware. Given the computation time for the stable filter, high throughput systems like radar may want to use a hybrid combination of linear and stable filters: run a linear filter normally, but when an alarm is called, switch to a stable filter in "spotlight" mode to suppress false positives. While it is not implemented yet, some of these algorithms can be parallelized, either on multiple cores or using GPUs. When α = 2, the cost function is given by (6.5), which is convex in δ. It therefore has a unique minimum, which as already pointed out is the sample mean x̄. In contrast, when
0 < α < 2, the cost function is generally not convex, e.g. Figure 4.9. There may be local minima, so a standard minimization routine may get stuck in a local minimum. To find a global minimum, the STABLE package uses a branch and bound method; see Núñez et al. (2008).
6.4 Evaluation of stable filters

To evaluate how well a stable filter works in a variety of settings, a measure of signal-to-noise ratio (SNR) is needed. The standard definition for a signal of amplitude A with Gaussian noise with standard deviation σ is

SNR = A²/σ².

If α < 2, a stable law has infinite variance, so the standard SNR is 0 for any non-Gaussian stable error term. To replace the standard definition, we use the idea of geometric power introduced in Gonzalez (1997). That thesis defines the geometric power of a r.v. X to be

S0 = S0(X) = exp(E log |X|).

We will restrict to X with E log |X| < ∞, and hence 0 ≤ S0 < ∞, which Gonzalez calls logarithmic-order laws. The following properties are from the above thesis; see the Problems at the end of the chapter.

Proposition 6.1 For logarithmic-order X and Y and real c:
(a) S0 ≥ 0.
(b) S0(cX) = |c| S0(X).
(c) S0(c) = |c|.
(d) 0 ≤ c1 ≤ |X| ≤ c2 implies c1 ≤ S0(X) ≤ c2.
(e) S0(X) = 0 if and only if P(X = 0) > 0.
(f) S0(XY) = S0(X) S0(Y).
(g) S0(X/Y) = S0(X)/S0(Y).
(h) S0(X^c) = S0(X)^c.
(i) S0(|X| + |Y|) ≥ S0(X) + S0(Y).

Proposition 6.2 For X ∼ SαS(γ), S0(X) = C_g^{(1/α)−1} γ, where C_g = exp(γ_Euler) ≈ 1.78107.

Proposition 6.3 Let M_{|X|}(u) = E|X|^u be the Mellin transform of |X| and suppose (d/du)E|X|^u = E(d/du)|X|^u. Then S0(X) = exp(M′_{|X|}(0)).
Table 3.1 in the thesis mentioned above gives the geometric power of Pareto, uniform, generalized Cauchy and generalized t distributions. Problem 6.4 calculates the geometric power of an arbitrary stable r.v. It also shows that the sample or empirical geometric power of x1, . . . , xn
Ŝ₀ = exp( (1/n) Σ_{i=1}^n log |x_i| ) = ( Π_{i=1}^n |x_i| )^{1/n},

is a consistent estimator of S₀. The fact that the last term above is the geometric mean of the sample is the motivation for calling S₀ the geometric power.

With this preparation, we can extend the definition of SNR. Let A be the amplitude of a signal with noise geometric power S₀. Then the geometric signal-to-noise ratio, also abbreviated SNR, is

SNR = A²/(2 C_g S₀²).

The constant 2 C_g guarantees that for the Gaussian noise case, the definition of the SNR coincides with that of the standard SNR. Generally, as the SNR decreases, the probability of detection decreases.

Figure 6.9 shows the results of simulations comparing stable filters with Gaussian and clipped Gaussian filters. These simulations used symmetric stable clutter with α = 1.4. For each SNR level in the set {−14, −12, −10, …, 8, 10}, simulations were done in the null case to determine a threshold to keep a constant false alarm rate of 0.05 for each of the three methods. With this threshold, multiple data sets were simulated with a chirp and noise at that SNR level. A count was kept of how often the chirp was detected by each method, and this was normalized to get the probability of detection. This process was repeated for all the SNR levels. The plot shows that the stable filter is much more powerful than the Gaussian linear filter. The clipped Gaussian filter is intermediate in power, though it has the additional problem of how to choose the clipping level. As α decreases, the stable filter will perform even better compared to the linear filter and the clipped linear filter.

We end with mention of some other applications. With an understanding of bivariate isotropic stable noise, it is possible to develop bivariate stable filters, e.g. to process in-phase and quadrature components in a radar system. It is also possible to process synthetic aperture radar (SAR). Prototypes of both of these exist.
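The sample geometric power and the geometric SNR defined above are easy to compute. The Python sketch below (illustrative, not from the STABLE package) also checks Proposition 6.2 numerically: for SαS with α = 1 (Cauchy), S₀ = C_g^{(1/α)−1} γ = γ, and for Gaussian noise the geometric SNR agrees with the standard A²/σ².

```python
import math
import random

C_G = math.exp(0.5772156649015329)   # exp(gamma_Euler) ~ 1.781

def geometric_power(xs):
    # S0_hat = exp((1/n) sum log|x_i|) = geometric mean of the |x_i|
    return math.exp(sum(math.log(abs(x)) for x in xs) / len(xs))

def geometric_snr(amplitude, s0):
    # SNR_g = A^2 / (2 C_g S0^2); the factor 2 C_g makes this agree with
    # the standard A^2 / sigma^2 when the noise is Gaussian.
    return amplitude ** 2 / (2.0 * C_G * s0 ** 2)

random.seed(1)
gamma = 2.0
# Cauchy(gamma) is SalphaS with alpha = 1; Proposition 6.2 then gives
# S0 = C_g^(1/alpha - 1) * gamma = gamma.
cauchy = [gamma * math.tan(math.pi * (random.random() - 0.5))
          for _ in range(200000)]
s0_cauchy = geometric_power(cauchy)

gauss = [random.gauss(0.0, 1.0) for _ in range(200000)]
snr_g = geometric_snr(1.0, geometric_power(gauss))   # ~ A^2/sigma^2 = 1
```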
Chapter 2 gives some other engineering applications of stable laws. The webpage given in the preface has a bibliography that lists many other references, including extended Kalman filters, particle filters, and other engineering applications.
[Plot: Pd vs. SNR (Pfa = 0.05; α = 1.4) — probability of detection (0 to 1) against SNR (−15 to 10) for Gaussian, clipped Gaussian, and stable filters.]
Fig. 6.9 Probability of detection for a filter with false alarm rate 0.05. The clutter was SαS with α = 1.4 and window width m = 1024.
6.5 Problems

Problem 6.1 Prove the properties in Proposition 6.1.

Problem 6.2 Prove Proposition 6.2.

Problem 6.3 Prove Proposition 6.3.

Problem 6.4 Compute the geometric power of a general (non-symmetric) stable r.v.

The program stablefilter is available at the website given at the end of the preface. It will process an input file with an (unmatched) stable filter. Also at that website are directions on how to use that program and data sets for the following problems.

Problem 6.5 Use the stablefilter program on the data in the file Filter1.dat using α = 1.5 and γ = 0.3. Compare plots of the original data and filtered output.
Problem 6.6 Use the stablefilter program on the data in the file Filter2.wav with α = 2 and with α = 1.5, both with γ = 1. Listen to the input file and both output files.

Problem 6.7 Find a real data set where the noise appears to be heavy tailed and analyze it using the stablefilter program.
Chapter 7
Related Distributions
This chapter contains brief discussions of distributions related to stable laws. Except when there is something related to our main interests, proofs are not given.
7.1 Pareto distributions

The density and cumulative distribution function of a Pareto(α, c) distribution with α > 0, c > 0 are

f(x) = α c^α / x^{1+α},   x > c,
F(x) = 1 − (c/x)^α,   x ≥ c.
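A quick numerical check of these formulas (an illustrative Python sketch, not from any package): inverting F gives the simulation method X = c U^{−1/α} for uniform U, and for α > 1 the sample mean should approach αc/(α − 1).

```python
import random

def rpareto(alpha, c, n, rng):
    # inverse-cdf method: if U ~ U(0,1), then c * U**(-1/alpha) ~ Pareto(alpha, c);
    # 1 - U is used so the base is in (0, 1] and never exactly 0
    return [c * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

rng = random.Random(42)
alpha, c = 3.0, 1.0
xs = rpareto(alpha, c, 200000, rng)
mean_hat = sum(xs) / len(xs)
mean_theory = alpha * c / (alpha - 1.0)   # = 1.5, finite since alpha > 1
```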
Note that c is a scale parameter: if Z ∼ Pareto(α, 1), then cZ ∼ Pareto(α, c). Solving F(x_p) = p for x_p shows that the quantiles are given by x_p = c(1 − p)^{−1/α}, 0 ≤ p < 1. Since there is this explicit formula for F^{−1}, one can simulate a Pareto r.v. in terms of a uniform (0,1) U: X = c(1 − U)^{−1/α} =^d c U^{−1/α} ∼ Pareto(α, c). Straightforward integration gives expressions for moments:

E|X|^p = E X^p = (α/(α − p)) c^p   for −∞ < p < α,   and   E|X|^p = +∞   for p ≥ α.

In particular, the mean exists if and only if α > 1, in which case E X = αc/(α − 1), and the variance exists if and only if α > 2, in which case Var(X) = E X² − (E X)² = αc²/((α − 1)²(α − 2)).

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4_7

There is no closed form for the characteristic function of a Pareto law, but there is an expression for it in terms of some special functions. Series representations for these functions give the behavior of the characteristic function near the origin, which is what is needed for convergence of normalized sums. The generalized Fresnel
integrals C(x, ν) and S(x, ν) are defined for x > 0 and Re ν < 1 in Prudnikov et al. (1990), pg. 764, by

C(x, ν) = ∫_x^∞ u^{ν−1} cos u du,    S(x, ν) = ∫_x^∞ u^{ν−1} sin u du.
The hypergeometric functions pF_q(a_1, …, a_p; b_1, …, b_q; x) are defined by the series

pF_q(a_1, …, a_p; b_1, …, b_q; x) = Σ_{r=0}^∞ [ (a_1)_r ⋯ (a_p)_r / ((b_1)_r ⋯ (b_q)_r) ] x^r / r!,    (7.1)
where (a)_r = a(a + 1)(a + 2) ⋯ (a + r − 1), (a)_0 = 1 for a ≠ 0. In what follows, p < q, in which case the power series (7.1) converges for all x ∈ C. More information can be found in Mathai (1993), pg. 96.

Lemma 7.1 The characteristic function of a Pareto(α, 1) distribution, α > 0, has the following properties:

φ_α(u) = α|u|^α [ C(|u|, −α) + i(sign u) S(|u|, −α) ]
       = −Γ(1 − α) [ cos(πα/2) − i(sign u) sin(πα/2) ] |u|^α
         + ₁F₂(−α/2; 1/2, (2 − α)/2; −u²/4) + iu (α/(α − 1)) ₁F₂((1 − α)/2; 3/2, (3 − α)/2; −u²/4).

For α > 0,

φ_{α+1}(u) = e^{iu} + (iu/α) φ_α(u).    (7.2)
For α ∈ (0, 1) ∪ (1, 2), as u → 0,

φ_α(u) = 1 − Γ(1 − α) [ cos(πα/2) − i(sign u) sin(πα/2) ] |u|^α + iu α/(α − 1) + O(u²).

For α = 1, as u → 0,

φ₁(u) = 1 − (π/2)|u| − iu log |u| + iu(1 − γ_Euler) + O(u² log |u|),
where γ_Euler ≈ 0.57721 is Euler's constant.

Proof Since cos(ux) = cos(|u|x) and sin(ux) = (sign u) sin(|u|x), substituting t = |u|x yields

φ_α(u) = ∫_1^∞ α x^{−α−1} e^{iux} dx = α ∫_1^∞ x^{−α−1} [ cos(|u|x) + i(sign u) sin(|u|x) ] dx
       = α|u|^α ∫_{|u|}^∞ t^{−α−1} [ cos t + i(sign u) sin t ] dt
       = α|u|^α [ C(|u|, −α) + i(sign u) S(|u|, −α) ].
Using the expressions for C(|u|, −α) and S(|u|, −α) on pg. 764 of Prudnikov et al. (1990), this can be written as

φ_α(u) = α|u|^α { [ cos(πα/2) Γ(−α) − (|u|^{−α}/(−α)) ₁F₂(−α/2; 1/2, (2 − α)/2; −u²/4) ]
         + i(sign u) [ −sin(πα/2) Γ(−α) − (|u|^{1−α}/(1 − α)) ₁F₂((1 − α)/2; 3/2, (3 − α)/2; −u²/4) ] }
       = −Γ(1 − α)|u|^α [ cos(πα/2) − i(sign u) sin(πα/2) ]
         + ₁F₂(−α/2; 1/2, (2 − α)/2; −u²/4) + iu (α/(α − 1)) ₁F₂((1 − α)/2; 3/2, (3 − α)/2; −u²/4).

For (7.2), integration by parts shows that for x > 0 and ν < 0,

C(x, ν) = −(x^ν cos x)/ν + (1/ν) S(x, ν + 1),    S(x, ν) = −(x^ν sin x)/ν − (1/ν) C(x, ν + 1).
Substituting these into the first form for the characteristic function and simplifying shows that for any α > 0,

φ_α(u) = e^{iu} + |u|^α [ −S(|u|, 1 − α) + i(sign u) C(|u|, 1 − α) ].    (7.3)

This and |u|^{α+1} = (sign u) u |u|^α imply

φ_{α+1}(u) = e^{iu} + |u|^{α+1} [ −S(|u|, −α) + i(sign u) C(|u|, −α) ]
           = e^{iu} + iu |u|^α [ C(|u|, −α) + i(sign u) S(|u|, −α) ]
           = e^{iu} + (iu/α) φ_α(u),
establishing (7.2).

For the behavior of the characteristic function near the origin, the power series representation (7.1) for the hypergeometric function shows ₁F₂(a; b₁, b₂; −u²/4) = 1 + O(u²). When α ∈ (0, 1) ∪ (1, 2), substituting this into the second expression for φ_α(u) shows the behavior as u → 0:

φ_α(u) = −Γ(1 − α)|u|^α [ cos(πα/2) − i(sign u) sin(πα/2) ] + 1 + O(u²) + iu (α/(α − 1))(1 + O(u²))
       = 1 − Γ(1 − α)|u|^α [ cos(πα/2) − i(sign u) sin(πα/2) ] + iu α/(α − 1) + O(u²).

When α is an integer, one of the terms cos(πα/2) or sin(πα/2) is zero (if α is odd or even respectively), so the exact asymptotics are more delicate. For α = 1, (7.3) above and (5.2.5), (5.2.26), (5.2.27), (5.2.14), and (5.2.16) of Abramowitz and Stegun (1972) show
φ₁(u) = e^{iu} + |u| [ −S(|u|, 0) + i(sign u) C(|u|, 0) ]
      = e^{iu} + |u| [ Si(|u|) − π/2 − i(sign u) Ci(|u|) ]
      = cos u − (π/2)|u| + |u| Si(|u|) + i( sin u − u Ci(|u|) )
      = 1 − O(u²) − (π/2)|u| + |u|( |u| + O(|u|³) ) + i[ u − O(u³) − u( γ_Euler + log |u| + O(u²) ) ]
      = 1 − (π/2)|u| − iu log |u| + iu(1 − γ_Euler) + O(u² log |u|).

When α ≥ 2, (7.2) can be used to get the asymptotics.
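The recursion (7.2) is convenient for checking a numerical implementation. The Python sketch below (illustrative, direct trapezoid quadrature of the defining integral) verifies φ_{α+1}(u) = e^{iu} + (iu/α) φ_α(u) at one point.

```python
import cmath

def phi_pareto(alpha, u, upper=2000.0, n=200000):
    # ch.f. of Pareto(alpha, 1) by trapezoid quadrature of
    # int_1^upper alpha x^(-alpha-1) e^{iux} dx; the neglected tail is
    # bounded in modulus by upper**(-alpha).
    h = (upper - 1.0) / n
    def g(x):
        return alpha * x ** (-alpha - 1.0) * cmath.exp(1j * u * x)
    total = 0.5 * (g(1.0) + g(upper))
    for k in range(1, n):
        total += g(1.0 + k * h)
    return h * total

u, alpha = 0.7, 1.5
lhs = phi_pareto(alpha + 1.0, u)
rhs = cmath.exp(1j * u) + (1j * u / alpha) * phi_pareto(alpha, u)
```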
There are similar expressions for the Laplace transform of a Pareto, using the incomplete gamma function Γ(a, z) = ∫_z^∞ t^{a−1} e^{−t} dt.

Lemma 7.2 The Laplace transform of X_α ∼ Pareto(α, 1), α > 0, is

L_α(u) = E exp(−u X_α) = α u^α Γ(−α, u),   u ≥ 0.

For α > 1,

L_α(u) = e^{−u} − (u/(α − 1)) L_{α−1}(u).

For α ∈ (0, 2), as u ↓ 0,

L_α(u) = 1 − Γ(1 − α) u^α + (α/(1 − α)) u + O(u²)   if α ≠ 1,
L_α(u) = 1 + (γ_Euler − 1 + log u) u + O(u²)   if α = 1.
Proof For u > 0, the first part is just a change of variable: set t = ux and

E exp(−uX) = ∫_1^∞ α e^{−ux} x^{−α−1} dx = α u^α ∫_u^∞ t^{−α−1} e^{−t} dt = α u^α Γ(−α, u).

For α > 1, an integration by parts shows the relation between L_α(u) and L_{α−1}(u). For the behavior near the origin when α ∈ (0, 1) ∪ (1, 2), use the power series for Γ(−α, u) from 8.354.2 of Gradshteyn and Ryzhik (2000):

α u^α Γ(−α, u) = α u^α [ Γ(−α) + u^{−α}/α + u^{1−α}/(1 − α) − ⋯ ]
              = α Γ(−α) u^α + 1 + (α/(1 − α)) u + O(u²)
              = 1 − Γ(1 − α) u^α + (α/(1 − α)) u + O(u²).

For α = 1, L₁(u) = ∫_1^∞ t^{−2} e^{−ut} dt = E₂(u), the exponential integral of order 2. Equation 5.1.12 of Abramowitz and Stegun (1972) shows that for u near 0, E₂(u) = 1 + (γ_Euler − 1 + log u) u − u²/2 + u³/12 − ⋯.

The parameterization used above is not universal. In cases where a distribution with support [0, ∞) is needed, some authors shift X ∼ Pareto(α, c) by c, i.e. Y = X − c;
then Y has density f(y) = α c^α (y + c)^{−1−α}, y > 0. More generally, an arbitrary shift parameter may be added, e.g. X − δ.

The generalized Pareto distributions (GPD) are defined by the d.f.

F(x) = 1 − (1 + (x/α))^{−α},   x ≥ 0, α > 0,
F(x) = 1 − e^{−x},   x ≥ 0, α = 0,
F(x) = 1 − (1 + (x/α))^{−α},   0 ≤ x ≤ |α|, α < 0.

For α > 0, this has power decay as x → +∞; for α < 0, this has a power singularity as x → |α|. It is possible to add a general scale γ and shift δ, e.g. γX + δ, to get a three parameter family, denoted GPD(α, γ, δ); see, e.g., Embrechts et al. (1997) or Reiss and Thomas (2001).
7.2 t distributions

The t distribution with r > 0 degrees of freedom and scale γ > 0 has density

f(x | r, γ) = c(r) γ^r / (γ² + x²/r)^{(r+1)/2},   −∞ < x < ∞,

where c(r) = Γ((r + 1)/2)/(√(πr) Γ(r/2)). When γ = 1 and r is an integer, this is the standardized t distribution used in small sample inference about a population mean. When r = 1, it coincides with a Cauchy(γ, 0) = S(1, 0, γ, 0; 0) distribution. In general, r and γ can be any positive real numbers. The tail is of the form f(x | r, γ) ≈ c x^{−(r+1)}, so the pth absolute moments exist if and only if p < r. In particular, the mean is defined if and only if r > 1 and the variance is finite if and only if r > 2. Values of r > 2 allow distributions with finite variance, but a limited number of moments. All t distributions are scale mixtures of Gaussians: if G ∼ N(0, 1) and Y is Gamma(r, 1), then

X = Y^{−1/2} G ∼ t(r, γ),    (7.4)
see Problem 7.1.
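The r = 1 case can be checked numerically: with c(1) = Γ(1)/(√π Γ(1/2)) = 1/π, the density above reduces to the Cauchy(γ, 0) density. An illustrative Python sketch:

```python
import math

def t_pdf(x, r, gamma):
    # f(x | r, gamma) = c(r) gamma^r / (gamma^2 + x^2/r)^((r+1)/2)
    c = math.gamma((r + 1.0) / 2.0) / (math.sqrt(math.pi * r) * math.gamma(r / 2.0))
    return c * gamma ** r / (gamma ** 2 + x * x / r) ** ((r + 1.0) / 2.0)

def cauchy_pdf(x, gamma):
    # density of Cauchy(gamma, 0) = S(1, 0, gamma, 0; 0)
    return gamma / (math.pi * (gamma ** 2 + x * x))

gap = abs(t_pdf(0.7, 1.0, 2.0) - cauchy_pdf(0.7, 2.0))   # r = 1 is Cauchy
```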
7.3 Other types of stability

Stable distributions are defined by the fact that they retain their type under addition. (Recall that two random variables X and Y are of the same type if there are constants a ≠ 0 and b ∈ R such that aX + b =^d Y.) When addition is replaced by some other operation, different types of stability are possible. The next sections give a brief
summary of other forms of stability. Throughout this section let X, X1, X2, . . . be i.i.d. random variables.
7.3.1 Max-stable and min-stable

Definition 7.1 X is max-stable if for every n > 0, max(X₁, X₂, …, X_n) is of the same type as X. X is min-stable if for every n > 0, min(X₁, X₂, …, X_n) is of the same type as X.

Since min(X₁, …, X_n) = −max(−X₁, …, −X_n), understanding max-stable laws is sufficient for understanding min-stable ones. Max-stable laws are characterized by the following result, which was stated by Fisher and Tippett (1928), and rigorously proved by Gnedenko (1943). There are three classes of distributions that are max-stable, and they are the classical extreme value distributions — Gumbel, Fréchet, and Weibull distributions (the latter sometimes called stretched exponential distributions) respectively.

Proposition 7.1 In what follows, σ > 0, μ ∈ R and ξ > 0. A distribution is max-stable if and only if it is one of the following three laws:
(a) Max-stable of type I (Gumbel): X ∼ Gumbel(μ, σ) with cdf and pdf
F(x) = exp(−e^{−(x−μ)/σ}),
f(x) = (1/σ) exp( −(x − μ)/σ − e^{−(x−μ)/σ} ).
(b) Max-stable of type II (Fréchet): X ∼ Fréchet(μ, σ, ξ) with cdf and pdf
F(x) = 0 for x ≤ μ,   F(x) = exp( −((x − μ)/σ)^{−ξ} ) for x > μ,
f(x) = (ξ/σ) ((x − μ)/σ)^{−ξ−1} exp( −((x − μ)/σ)^{−ξ} ),   x > μ.
(c) Max-stable of type III (Weibull): X ∼ Weibull(μ, σ, ξ) with cdf and pdf
F(x) = exp( −((μ − x)/σ)^ξ ) for x < μ,   F(x) = 1 for x ≥ μ,
f(x) = (ξ/σ) ((μ − x)/σ)^{ξ−1} exp( −((μ − x)/σ)^ξ ),   x < μ.

… for p > 0, and −1/Y is Weibull(ξ, 0, 1). (c) If Z is Weibull(ξ, 0, 1), then −log(−Z) is Gumbel(0, 1), −(−Z)^p is Weibull(ξ/p, 0, 1) for p > 0, and −1/Z is Fréchet(ξ, 0, 1).
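Max-stability of the Fréchet law can be verified in closed form: if X₁, …, X_n are i.i.d. Fréchet(0, 1, ξ), then n^{−1/ξ} max(X₁, …, X_n) has exactly the same distribution, since F(x/a_n)^n = F(x). A short Python check (function name illustrative):

```python
import math

def frechet_cdf(x, xi, mu=0.0, sigma=1.0):
    # Frechet(mu, sigma, xi) d.f.; zero at and below mu
    if x <= mu:
        return 0.0
    return math.exp(-((x - mu) / sigma) ** (-xi))

xi, n, x = 2.0, 50, 1.3
a_n = n ** (-1.0 / xi)
# P(a_n * max <= x) = F(x / a_n)^n, and max-stability says this equals F(x)
lhs = frechet_cdf(x / a_n, xi) ** n
rhs = frechet_cdf(x, xi)
```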
These distributions can be combined into a three parameter family called the (generalized) extreme value distribution: X ∼EV(γ, μ, σ) has cdf P(X ≤ x) = exp(−(1+γ(x−μ)/σ)−1/γ ). If γ < 0, this corresponds to a Weibull(1/γ, μ−σ/γ, σ/γ) distribution; if γ = 0, it corresponds to a Gumbel(μ, σ) distribution (take a limit as γ → 0); if γ > 0, it corresponds to a Fréchet(1/γ, μ − σ/γ, σ/γ) distribution.
Fig. 7.1 Densities of standardized (μ = 0, σ = 1) extreme value distributions: Weibull (ξ = 1), Gumbel, and Fréchet (ξ = 1).

It is known that in the three cases above, a_n max(X₁, …, X_n) + b_n =^d X if

a_n = 1, b_n = −log n   if X is type I max-stable,
a_n = n^{−1/ξ}, b_n = 0   if X is type II max-stable,
a_n = n^{1/ξ}, b_n = 0   if X is type III max-stable.

There is a definition of domain of attraction for max-stable laws similar to the one for sum stable laws. More information on max-stable and min-stable can be found
in Embrechts et al. (1997), Rachev and Mittnik (2000), Kotz and Nadarajah (2000), and Beirlant et al. (2004).
7.3.2 Multiplication-stable

A positive random variable is multiplication-stable if there exist constants a_n, b_n > 0 such that X =^d a_n (X₁ X₂ ⋯ X_n)^{b_n} for all n > 0. Note that if Y = exp(X), where X is sum stable, then

Π_{i=1}^n Y_i = Π_{i=1}^n exp(X_i) = exp( Σ_{i=1}^n X_i ) =^d exp(a_n X + b_n) = e^{b_n} Y^{a_n}.
7.3.3 Geometric-stable distributions and Linnik distributions

Geometric-stable laws are a four parameter family, with characteristic functions

ψ(u) = [ 1 + γ^α |u|^α (1 − iβ(sign u) tan(πα/2)) − iδu ]^{−1}   if α ≠ 1,
ψ(u) = [ 1 + γ|u| (1 + iβ(2/π)(sign u) log |u|) − iδu ]^{−1}   if α = 1,    (7.5)

where α ∈ (0, 2], β ∈ [−1, 1], γ > 0 and δ ∈ R. Section 4.4.4 of Kotz et al. (2001) gives many facts about geometric-stable laws.

A univariate symmetric Linnik distribution with index α ∈ (0, 2] and scale parameter γ > 0 has characteristic function

ψ(u) = 1/(1 + γ^α |u|^α),   u ∈ R.
This is a symmetric case of (7.5), taking β = 0 and δ = 0. (Some authors call geometric-stable laws nonsymmetric Linnik distributions). It can be shown that if X ∼ S (α, 0, γ, 0; 0) and E is an independent Exponential (1) random variable, then Y = E 1/α X is a Linnik distribution. See Section 4.3 of Kotz et al. (2001) for this and other facts on Linnik distributions, including ways to simulate them and multivariate Linnik distributions.
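The representation Y = E^{1/α} X is a convenient way to simulate Linnik laws. For α = 1 the SαS factor is Cauchy, so everything can be done with elementary functions; the Python sketch below (a Monte Carlo illustration, not a package routine) compares the empirical value of E cos(uY) with the Linnik characteristic function 1/(1 + γ^α |u|^α).

```python
import math
import random

random.seed(3)
alpha, gamma, n = 1.0, 1.0, 400000

# For alpha = 1 the SalphaS(gamma) law is Cauchy(gamma); multiplying by
# E^{1/alpha} with E ~ Exponential(1) gives a symmetric Linnik r.v.
ys = []
for _ in range(n):
    x = gamma * math.tan(math.pi * (random.random() - 0.5))
    e = random.expovariate(1.0)
    ys.append(e ** (1.0 / alpha) * x)

u = 1.0
chf_hat = sum(math.cos(u * y) for y in ys) / n        # Y symmetric: E cos(uY)
chf_theory = 1.0 / (1.0 + (gamma * abs(u)) ** alpha)  # = 0.5 here
```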
7.3.4 Discrete stable

Discrete stable distributions (not to be confused with the discretized stable distributions discussed in Sections 3.10.6 and 4.14) are integer valued distributions with the convolution property. This class was introduced in Steutel and van Harn (1979). It is a two parameter family, α ∈ (0, 1] and λ > 0, with probability generating function

P_X(z) = E(z^X) = exp(−λ(1 − z)^α),   α ∈ (0, 1], λ > 0.

If X₁ and X₂ are discrete stable and independent, then X₁ + X₂ is discrete stable because X₁ + X₂ has probability generating function

E(z^{X₁+X₂}) = E(z^{X₁}) E(z^{X₂}) = exp(−λ(1 − z)^α) exp(−λ(1 − z)^α) = exp(−2λ(1 − z)^α),

which is discrete stable. Note that we do not allow arbitrary coefficients of X₁ and X₂. When α = 1, this is a Poisson(λ) distribution. There is no closed form expression for the probabilities P(X = k) in general, but there is a series expansion. These laws are infinitely divisible and self decomposable; moments E X^r are finite for 0 ≤ r < α < 1. Doray et al. (2009) discuss estimation for the discrete stable laws.
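Both the α = 1 reduction to the Poisson law and the convolution property can be checked directly from the probability generating function; an illustrative Python sketch:

```python
import math

def discrete_stable_pgf(z, alpha, lam):
    # P_X(z) = exp(-lam (1 - z)^alpha), alpha in (0, 1], lam > 0
    return math.exp(-lam * (1.0 - z) ** alpha)

def poisson_pgf(z, lam, terms=100):
    # E z^X = sum_k z^k e^{-lam} lam^k / k!
    return sum(math.exp(-lam) * (lam * z) ** k / math.factorial(k)
               for k in range(terms))

lam, z = 2.5, 0.4
a = discrete_stable_pgf(z, 1.0, lam)   # alpha = 1 is Poisson(lam)
b = poisson_pgf(z, lam)

# convolution: pgf of an independent sum multiplies, doubling lam
c = discrete_stable_pgf(z, 0.7, 1.0) ** 2
d = discrete_stable_pgf(z, 0.7, 2.0)
```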
7.3.5 Generalized convolutions and generalized stability

Urbanik (1964) introduced the concept of a generalized convolution. Based on that concept, generalized infinite divisibility can be defined and examined. A review of this topic can be found in Volkovich et al. (2010). The concept of stability has been generalized to an abstract setting by Davydov et al. (2008). They replace the operation + on the real numbers with a general operation on a semigroup. This topic is beyond the scope of this book on univariate stable laws.
7.4 Mixtures of stable distributions: scale, sum, and convolutions

We will briefly discuss two other classes derived from stable laws. Little seems to be known about these classes, but it seems worthwhile to define them. The first class is a generalization of the mixture of normals: the density is a weighted sum of stable densities, each with the same α but possibly different (β_i, γ_i, δ_i), say

f(x) = Σ_{i=1}^n p_i f_i(x),    (7.6)
where fi (x) is the density of a S (α, βi, γi, δi ; k) distribution and pi ≥ 0 and p1 + · · · + pn = 1. To avoid technicalities, we assume that every pi is strictly positive and
that there are no repeats among the parameters, i.e. (β_i, γ_i, δ_i) ≠ (β_j, γ_j, δ_j) for i ≠ j. The resulting distribution is not stable, however convolutions of such densities are of the same form. To be precise, if X has density f(x) of form (7.6) and Y has density g(x) = Σ_{j=1}^m q_j g_j(x), where q₁, …, q_m is a probability vector and g_j is the density of a S(α, β_j, γ_j, δ_j; k) r.v., then X + Y has density

Σ_{i=1}^n Σ_{j=1}^m p_i q_j h_{ij}(x),
where h_{ij}(x) = (f_i ∗ g_j)(x), which is the density of a S(α, β_{ij}, γ_{ij}, δ_{ij}; k) distribution with parameters given by Proposition 1.3 or 1.4. This class of distributions may be used to fit a variety of data sets, including skewed and multi-modal data. These distributions are easy to describe and tractable computationally because the density and d.f. are weighted sums of computable stable densities and d.f.s. Ravishanker and Dey (2000) use this class to model frailty distributions.

A different combination of stable laws is from sums of stable laws with different indices of stability, say

X = X₁ + ⋯ + X_n,    (7.7)

where X_i ∼ S(α_i, β_i, γ_i, δ_i; k) and all the terms are independent. The resulting distribution is stable if and only if all the α_i's are the same. In general it is not stable, but has a density that is the convolution of the stable densities f(x | α_i, β_i, γ_i, δ_i; k) (which is not the same thing as (7.6)). Since each of the terms in (7.7) is infinitely divisible, the sum is infinitely divisible. In general, this is a rather large and inaccessible class. One can combine all the terms with the same index of stability using the rules for sums of stable laws, so it suffices to assume all the α_i's are distinct. If we consider limits in distribution of all r.v.s of form (7.7), then the class can be represented by characteristic functions of the form
E exp(iuX) = exp( −∫_{(0,2]} ω(u | α, β(α); k) M(dα) + iδu ),
where M is a finite measure on (0, 2]. There does not seem to be research on numerical computation for these laws.

A special case that is of interest in physics, astronomy, and meteorology is the Voigt distribution. It is defined to be the sum of a Gaussian and a Cauchy distribution (which in those fields is called a Lorentz term). Here there are physical justifications for using such a convolution, the Gaussian term coming from a Doppler effect and the Cauchy or Lorentz term from a rotation. See Mihalas (1978) or Olver et al. (2010) for more information and Wells (1999) for numerical algorithms for computing the resulting density. Zolotarev (1981) discusses a multivariate stable distribution similar to the above: convolving spherically symmetric stable distributions with different αs.
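Mixtures of form (7.6) are computationally simple whenever the component densities are computable. Using Cauchy components (the α = 1 symmetric stable case, which has a closed-form density) as a stand-in for general stable components, a Python sketch:

```python
import math

def cauchy_pdf(x, gamma, delta):
    # density of S(1, 0, gamma, delta; 0)
    return gamma / (math.pi * (gamma ** 2 + (x - delta) ** 2))

def mixture_pdf(x, comps):
    # comps: list of (p_i, gamma_i, delta_i) with the p_i summing to 1
    return sum(p * cauchy_pdf(x, g, d) for p, g, d in comps)

comps = [(0.3, 0.5, -2.0), (0.7, 1.5, 3.0)]   # a skewed, bimodal mixture
# sanity check: the mixture density integrates to ~1 (midpoint rule)
lo, hi, n = -4000.0, 4000.0, 400000
h = (hi - lo) / n
mass = h * sum(mixture_pdf(lo + (k + 0.5) * h, comps) for k in range(n))
```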
7.5 Infinitely divisible distributions

Definition 3.1 gives the definition of a univariate infinitely divisible law. All possible infinitely divisible laws can be represented by a shift δ, a σ ≥ 0, and a measure ν in the following way.

Theorem 7.1 Lévy-Khintchine Representation X is an infinitely divisible random variable if and only if it has characteristic function of the form
E exp(iuX) = exp( iδu − σ²u²/2 + ∫_R ( e^{iux} − 1 − iux/(1 + x²) ) ν(dx) ),

where δ ∈ R, σ ≥ 0, and ν is a measure on R that satisfies ∫_{x≠0} min(1, |x|²) ν(dx) < ∞.

A proof of this result can be found in Breiman (1968), Section 9.5 or Sato (1999), Section 8. As noted earlier, stable laws are one kind of infinitely divisible law where all the summands are required to be of the same type as the original X. It can be shown that in the α stable case, we have either α = 2 (Gaussian case), in which case ν = 0, or 0 < α < 2 (non-Gaussian stable case), in which case the Lévy-Khintchine measure has a radial decomposition (see Breiman (1968), Section 9.9)

dν/dx = (γ₊ / x^{1+α}) 1_{(x>0)} + (γ₋ / |x|^{1+α}) 1_{(x<0)}.    (7.8)
This gives Theorem 3.1.

Examples of nonstable infinitely divisible distributions are Γ, Poisson, t-distributions, lognormal, products of a stable and a power of an independent Γ, generalized inverse Gaussian, generalized Γ distributions (f(x) = c₁ x^{β−1} exp(−c₂ x^α), x > 0, |α| < 1), generalized F-distributions (f(x) = c₁ x^{β−1} (1 + c₂ x^α)^{−γ}, x > 0, |α| < 1), and hyperbolic distributions. More information can be found in Steutel and Van Harn (2004) and Sato (1999).

A related class of interest is the tempered stable distributions. Koponen (1995) defined tempered stable laws to be infinitely divisible laws with Lévy-Khintchine measure of the form

dν/dx = (γ₊ e^{−λx} / x^{1+α}) 1_{(x>0)} + (γ₋ e^{−λ|x|} / |x|^{1+α}) 1_{(x<0)}.

The difference with (7.8) is in the exponential terms in the numerators. These terms "temper" the tails of the Lévy-Khintchine measure, resulting in a distribution that looks like an α-stable law in the center, but has tails that decay faster. The new parameter λ controls this tempering: λ = 0 is no tempering, so it is just a stable law; λ > 0 lightens the tails. A more general definition of tempered stable is to replace the exponential terms above with a completely monotone function q(x); see Rosiński (2007) for that general definition and many results on this class.
7.6 Problems

Problem 7.1 Show (7.4) by directly calculating the density of X. Show (7.4) by using the Mellin transform.

Problem 7.2 Prove Lemma 7.3.
Appendix A
Mathematical Facts
Here are some basic facts about random variables and characteristic functions.
A.1 Sums of random variables

Let X and Y be two independent random variables with cdfs F_X(x) = P(X ≤ x) and F_Y(y) = P(Y ≤ y). The sum Z = X + Y is a new random variable with cdf F_Z(z) = P(Z ≤ z) = ∫_{−∞}^∞ F_X(z − y) F_Y(dy). If X and Y have pdfs, then so does Z and it is given by

f_Z(z) = ∫_{−∞}^∞ f_X(z − y) f_Y(y) dy.    (A.1)
These are called the convolution formulas. In general it is difficult to compute convolutions, unless X and Y have special forms, e.g. the stable laws.
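As an illustration of (A.1), the convolution of two U(0, 1) densities is the triangular density on [0, 2] with peak 1 at z = 1; a Python sketch evaluating the convolution integral by the midpoint rule:

```python
def uniform_pdf(x):
    # density of U(0, 1)
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def conv_pdf(z, n=100000):
    # f_Z(z) = int f_X(z - y) f_Y(y) dy for X, Y ~ U(0, 1) i.i.d.,
    # evaluated by the midpoint rule over the support of f_Y
    h = 1.0 / n
    return h * sum(uniform_pdf(z - (k + 0.5) * h) for k in range(n))

# triangular density on [0, 2]: conv_pdf(1.0) ~ 1, conv_pdf(0.5) ~ 0.5
```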
A.2 Symmetric random variables

A random variable X is symmetric if P(X ≤ −x) = P(X ≥ x) for all x ≥ 0. If X has a pdf f(x), then it is symmetric if and only if f(−x) = f(x) for all x ≥ 0. If independent X and Y are symmetric, then aX and X + Y are symmetric for any a ∈ R. If X is any r.v., its symmetrization is defined by Y = X₁ − X₂, where X₁ and X₂ are i.i.d. copies of X. The phrase "X is symmetric" will always mean X is symmetric around 0. The non-zero shift of a symmetric r.v., say X + a, is therefore not symmetric; in this case it is said that X + a is symmetric around a.
A.3 Moments

For any continuous random variable X, the mean or expected value of X is defined by E X = ∫_{−∞}^∞ x f(x) dx, provided the integral makes sense. (When X is discrete, the integral is replaced by a sum, which may have some of the problems discussed below.) If the density is bounded, say f(x) ≤ M for all x, then the convergence of the integral depends on how the pdf behaves on the tails. If both tails decay quickly, then both the integral over the negative axis and the integral over the positive axis are finite, and the mean exists. If one or both integrals diverge, then the mean does not exist.

If p is a positive integer, then E(X^p) is defined as E(X^p) = ∫_{−∞}^∞ x^p f(x) dx. It will converge or diverge depending on how fast x^p f(x) → 0 as x → ±∞. When p is not an integer, x^p is generally undefined for x negative, so the integral ∫_{−∞}^∞ x^p f(x) dx is generally not defined. To avoid this, fractional absolute moments are defined by E|X|^p = ∫_{−∞}^∞ |x|^p f(x) dx. If E|X|^p < ∞, then E|X|^q < ∞ for all 0 < q ≤ p.

In certain cases, it is possible to allow negative powers p. The negative moment E|X|^p is defined by the same formula, but the problems of convergence now also must consider how the pdf behaves near 0. The existence of fractional moments for α-stable random variables is discussed in Section 3.7. One difficulty of working with fractional moments is that there is no simple formula for moments of shifts E(X + a)^p when p is not an integer. In Section 3.7, exact formulas for fractional moments of stable distributions are available in all cases.
A.4 Characteristic functions

Definition A.1 The characteristic function (ch.f.) of any random variable X is φ(u) = E exp(iuX).

If X is absolutely continuous with density f(x), then the above expectation is φ(u) = ∫_{−∞}^∞ exp(iux) f(x) dx. If X is discrete with probability function f(x), the expectation is the sum φ(u) = Σ_j exp(iux_j) f(x_j). Up to a multiplicative constant, the characteristic function is the same as the Fourier transform. The following basic properties of characteristic functions can be found in Feller (1971).

Theorem A.1 (a) φ(u) exists for any random variable. (b) Characteristic functions uniquely determine the distribution of a r.v., i.e. if two random variables have the same characteristic function, then they have the same distribution. (c) In general, φ(u) is complex valued and φ(−u) is the complex conjugate of φ(u). φ(u) is real valued if and only if X is symmetric around 0, in which case φ(−u) = φ(u). (d) φ(u) is continuous for all u ∈ R. (e) Characteristic functions are uniformly bounded, in particular φ(0) = 1 and |φ(u)| ≤ 1 for all u ∈ R.
(f) For any a, b ∈ R, φ_{aX+b}(u) = e^{iub} φ_X(au). (g) If X₁ and X₂ are independent, then φ_{X₁+X₂}(u) = φ_{X₁}(u) φ_{X₂}(u). (h) X has a density f(x) if ∫_{−∞}^∞ |φ(u)| du < ∞, in which case f(x) = (2π)^{−1} ∫_{−∞}^∞ exp(−iux) φ(u) du. If ∫_{−∞}^∞ |u^n φ(u)| du < ∞, then the pdf f(x) is n times differentiable. (i) If X is symmetric and φ″(0) exists, then Var(X) = −φ″(0).
(j) Let X1, X2, . . . be r.v. with ch.f.s φ1 (u), φ2 (u), . . . If Xn −→X, where X is some r.v. with ch.f. φ(u), then φn (u) → φ(u) pointwise. Conversely, suppose φn (u) → φ(u) pointwise and φ(u) is continuous at u = 0, then φ(u) is a ch.f., i.e. there exists a r.v. X whose ch.f. is φ(u). A key characterization is the following. Theorem A.2 Bochners Theorem. A continuous function φ(u) is a characteristic function if and only if φ(0) = 1 and φ(u) is positive definite: for any complex numbers λ1, . . . , λn , and any real u1, . . . , un , n n
λ j λk φ(u j − uk ) ≥ 0.
j=1 k=1
Behavior of φ(u) near the origin determines the tail behavior of X. Some results along this line are given in Wolfe (1973), Wolfe (1975b) and Wolfe (1975a).
A.5 Laplace transforms Definition A.2 The Laplace transform of a random variable X is L(u) = E exp(−uX). ∫If∞X is absolutely continuous with density f (x), then the above expectation is L(u) = exp(−ux) f (x)dx. If X is discrete with probability function f (x), the expectation −∞ is the sum L(u) = j exp(−ux j ) f (x j ). Unlike the characteristic function, the Laplace transform does not always exist. For positive random variables, the expectation will be finite, but when a distribution has a heavy left tail, the expectation will not be finite. For this reason, Laplace transforms are not useful for general stable distributions. Most applications assume that the Laplace transform is defined on a neighborhood of the origin, but it also is useful in cases where it is only defined on an interval containing the origin, e.g. [0, ). If the Laplace transform is defined on a neighborhood of the origin, then it is related to the moment generating function MX (u) = E exp(uX) = L X (−u). The following basic properties about Laplace transforms can be found in Feller (1971). Theorem A.3 (a) Laplace transforms uniquely determine the distribution of a r.v., i.e. if two random variables have the same Laplace transform on an interval [0, ), then they have the same distribution.
(b) L(u) is real valued, continuous, and L(0) = 1. (c) For any a, b ∈ R, L_{aX+b}(u) = e^{−ub} L_X(au). (d) If X₁ and X₂ are independent, then L_{X₁+X₂}(u) = L_{X₁}(u) L_{X₂}(u). (e) Let X₁, X₂, … be r.v.s with respective Laplace transforms L₁(u), L₂(u), …. If X_n →^d X, where X is some r.v. with Laplace transform L(u), then L_n(u) → L(u) pointwise. Conversely, suppose L_n(u) → L(u) pointwise and L(u) is continuous at u = 0; then L(u) is a Laplace transform, i.e. there exists a r.v. X whose Laplace transform is L(u).
A.6 Mellin transforms

Definition A.3 The Mellin transform of a positive random variable X is the function defined for complex u by M_X(u) = E(X^u). When X has a density, M_X(u) = ∫_0^∞ x^u f(x) dx.

If X has a pth moment for some p > 0, then the Mellin transform exists. The Mellin transform is defined in some vertical strip or half-plane in C. The standard mathematical definition of the Mellin transform is a shift of this; e.g. Section 17.41 of Gradshteyn and Ryzhik (2000) uses M*_X(u) = ∫_0^∞ x^{u−1} f(x) dx. Note that M*_X(u) = M_X(u − 1). The standard definition has advantages in the theory of functions, but for our purposes working with random variables, particularly with powers of random variables, Definition A.3 is more convenient. For references to the use of Mellin transforms in probability, see Zolotarev (1957) and Springer (1979).

The Mellin transform uniquely determines the distribution of X and has an inverse transform; see Gradshteyn and Ryzhik (2000). Basic properties of the Mellin transform for positive X, Y, a, with X and Y independent, are as follows:

M_{X^p}(u) = M_X(pu)
M_{aX}(u) = a^u M_X(u)
M_{XY}(u) = M_X(u) M_Y(u)
M_{X/Y}(u) = M_X(u) M_Y(−u)
M_{exp(X)}(u) = E(e^{uX}).

Because of these properties, the Mellin transform is useful for powers, products, and ratios of random variables, similar to the way the characteristic function is useful for sums of random variables. Table A.1 lists some Mellin transforms. Note that there is no simple formula for the Mellin transform of a shift X + a from the Mellin transform of X.

To deal with random variables having both positive and negative values, Springer (1979) decomposes a general random variable X = X⁺ − X⁻, where X⁺ = max(X, 0) and X⁻ = max(−X, 0). Knowing the Mellin transforms of X⁺ and X⁻ determines the
distribution of X. Equivalently, since X+ and X− have disjoint support, E|X|^u = E(X+)^u + E(X−)^u and E[sign(X)|X|^u] = E(X+)^u − E(X−)^u, so knowing the absolute and signed moments determines the distribution of X. Note that if X is symmetric, then E|X|^u = 2E(X+)^u and E[sign(X)|X|^u] = 0, so knowing the Mellin transform of |X| determines the distribution of X. Corollary 3.5 gives the Mellin transform of the absolute value of a strictly stable distribution. Note the simplification in the symmetric case (3.49) and in the positive strictly stable case (3.50). Theorem 3.8 gives these Mellin transforms for cutoffs of strictly stable random variables.
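For a concrete symmetric example, if X ~ Laplace(c), then |X| is Exponential with scale c, so E|X|^u = c^u Γ(u + 1) (the |Laplace(c)| row of Table A.1), while the signed moment E[sign(X)|X|^u] vanishes. A small Monte Carlo sketch (the seed and sample size are arbitrary choices of ours):

```python
import math
import random

random.seed(12345)           # illustrative seed; any seed works
c, u, n = 2.0, 1.5, 200000

# X ~ Laplace(c): a random sign times an Exponential(scale c) magnitude.
xs = [random.choice((-1, 1)) * random.expovariate(1 / c) for _ in range(n)]

abs_moment = sum(abs(x) ** u for x in xs) / n                  # E|X|^u
signed_moment = sum(math.copysign(abs(x) ** u, x) for x in xs) / n
exact = c ** u * math.gamma(u + 1)                             # Table A.1 value

assert abs(abs_moment - exact) / exact < 0.05  # within Monte Carlo error
assert abs(signed_moment) < 0.1                # ~0 by symmetry
print(abs_moment, exact)
```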
A.7 Gamma and related functions

The gamma function is defined for x > 0 by

Γ(x) = ∫_0^∞ r^(x−1) e^(−r) dr.    (A.2)

It comes up repeatedly when dealing with integrals involving powers of x and e^(−x), and integrals involving e^(−x^α). A frequently used property of the gamma function is that Γ(x + 1) = xΓ(x). If the region of integration is split in two, the incomplete gamma functions are obtained:

γ(x, t) = ∫_0^t r^(x−1) e^(−r) dr   and   Γ(x, t) = ∫_t^∞ r^(x−1) e^(−r) dr.    (A.3)

The digamma function is ψ(x) = (d/dx) log Γ(x) = Γ′(x)/Γ(x). The trigamma function is the derivative ψ′(x). The related beta function is defined for p > 0, q > 0 by

B(p, q) = ∫_0^1 t^(p−1) (1 − t)^(q−1) dt = Γ(p)Γ(q)/Γ(p + q) = B(q, p).    (A.4)
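The identities above can be verified numerically with the standard library alone; the sketch below (function names trap, beta, and digamma are ours, and the quadrature settings are rough illustrative choices) checks the recurrence, the split of Γ(x) into the incomplete gammas of (A.3), the symmetry of (A.4), and the digamma value ψ(1) = −0.5772... (minus the Euler-Mascheroni constant).

```python
import math

x, t = 2.7, 1.5

# Recurrence: Gamma(x + 1) = x * Gamma(x).
assert math.isclose(math.gamma(x + 1), x * math.gamma(x))

# Beta function via (A.4): B(p, q) = Gamma(p)Gamma(q)/Gamma(p + q) = B(q, p).
def beta(p, q):
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

assert math.isclose(beta(2.0, 3.5), beta(3.5, 2.0))

# Incomplete gammas (A.3) by trapezoidal quadrature; splitting the integral
# at t should recover Gamma(x) = gamma(x, t) + Gamma(x, t).
def trap(f, a, b, n=8000):
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f(a + i * h) for i in range(1, n)) + f(b) / 2)

integrand = lambda r: r ** (x - 1) * math.exp(-r)
lower = trap(integrand, 0.0, t)     # gamma(x, t)
upper = trap(integrand, t, 60.0)    # Gamma(x, t); the tail beyond 60 is negligible
assert abs(lower + upper - math.gamma(x)) < 1e-3

# Digamma psi(x) = d/dx log Gamma(x), via a central difference of lgamma.
def digamma(x, h=1e-6):
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

print(round(digamma(1.0), 4))  # psi(1) = -0.5772 (minus Euler-Mascheroni)
```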
Table A.1 Mellin transforms of some distributions. Note that all distributions that take on negative values are symmetric and the table lists M_|X|(u).

S(α, 1, γ, 0; 1), α < 1
  Density: no closed form, see Section 3.2
  M_X(u) = γ^u Γ(1 − u/α) / [(cos αθ₁)^(u/α) Γ(1 − u)], −∞ < Re u < α

|S(α, β, γ, 0; 1)|
  Density: no closed form, see Section 3.2
  M_|X|(u) = γ^u Γ(1 − u/α) cos(uθ₁) / [(cos αθ₁)^(u/α) Γ(1 − u) cos(uπ/2)], −1 < Re u < α

log-stable, X = exp(−S(α, 1, γ, 0; 1))
  Density: no closed form, see Section 3.10
  M_X(u) = exp(−γ^α sec(πα/2) u^α) for α ≠ 1; exp(−γ (2/π) u log u) for α = 1

Amplitude R of isotropic stable (in d dimensions)
  Density: no closed form, see Nolan (2013)
  M_R(u) = (2γ₀)^u Γ(1 − u/α) Γ((d + u)/2) / [Γ(1 − u/2) Γ(d/2)], −d < Re u < α

|N(0, σ²)|
  Density: (1/(√(2π) σ)) e^(−x²/(2σ²)), x ∈ R
  M_|X|(u) = 2^(u/2) σ^u Γ((u + 1)/2) / √π, −1 < Re u

Gamma(p, c)
  Density: x^(p−1) e^(−x/c) / (c^p Γ(p)), x > 0
  M_X(u) = c^u Γ(p + u) / Γ(p), −p < Re u

|Laplace(c)|
  Density: (1/(2c)) e^(−|x|/c), x ∈ R
  M_|X|(u) = c^u Γ(u + 1), −1 < Re u

F(m, n)
  Density: [Γ((m + n)/2) (m/n)^(m/2) / (Γ(m/2) Γ(n/2))] x^(m/2−1) (1 + (m/n)x)^(−(m+n)/2), x > 0
  M_X(u) = (n/m)^u Γ(m/2 + u) Γ(n/2 − u) / [Γ(m/2) Γ(n/2)], −m/2 < Re u < n/2

Uniform(a, b), 0 ≤ a < b
  Density: 1/(b − a), a < x < b
  M_X(u) = (b^(u+1) − a^(u+1)) / [(u + 1)(b − a)], −1 < Re u

U(0, c) = |Uniform(−c, c)|
  Density: 1/(2c), −c < x < c
  M_|X|(u) = c^u / (u + 1), −1 < Re u

product of n U(0, 1)
  Density: (log(1/x))^(n−1) / (n − 1)!, 0 < x < 1
  M_X(u) = 1/(u + 1)^n, −1 < Re u

ratio of two U(0, 1)
  Density: 1/2 for 0 < x ≤ 1, 1/(2x²) for x > 1
  M_X(u) = 1 / [(1 + u)(1 − u)], −1 < Re u < 1

|Triangular(c)|
  Density: (c − |x|)/c², −c < x < c
  M_|X|(u) = 2c^u / [(u + 1)(u + 2)], −1 < Re u

Beta(a, b)
  Density: x^(a−1) (1 − x)^(b−1) / B(a, b), 0 < x < 1
  M_X(u) = B(a + u, b) / B(a, b), −a < Re u

Weibull(b, c)
  Density: (c/b)(x/b)^(c−1) e^(−(x/b)^c), x > 0
  M_X(u) = b^u Γ(1 + u/c), −c < Re u

Rayleigh(c)
  Density: (x/c²) e^(−x²/(2c²)), x > 0
  M_X(u) = (2c²)^(u/2) Γ(u/2 + 1), −1 < Re u
272 A Mathematical Facts
Appendix B
Stable Quantiles
The following tables present selected quantiles zλ(α, β) = F^(−1)(λ) for a standardized stable distribution in the 0-parameterization. The tables are for α ∈ {0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 1.95, 1.99, 2} and β ∈ {0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}. For β < 0, use the reflection property: zλ(α, −β) = −z1−λ(α, β). Note: when α = 2, β is irrelevant. Also, S(2, 0; 0) is N(0, 2), not N(0, 1), so the α = 2 quantiles are not the standard normal quantiles.
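The α = 2 entries can be sanity-checked against the normal distribution: since S(2, 0; 0) is N(0, 2), zλ(2, β) = √2 Φ^(−1)(λ). A quick check with the standard library (the helper name stable_quantile_alpha2 is ours):

```python
import math
from statistics import NormalDist

# S(2, 0; 0) is N(0, 2), so the alpha = 2 quantile is sqrt(2) times
# the standard normal quantile.
def stable_quantile_alpha2(lam):
    return math.sqrt(2) * NormalDist().inv_cdf(lam)

# Matches the first entry of the alpha = 2 table: z_{0.00001} = -6.0315.
print(round(stable_quantile_alpha2(0.00001), 4))  # -> -6.0315

# For beta = 0 the distribution is symmetric, so z_lambda = -z_{1-lambda};
# this is the alpha = 2 special case of the reflection property.
assert math.isclose(stable_quantile_alpha2(0.3), -stable_quantile_alpha2(0.7))
```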
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4
Stable quantiles zλ(α, β), α = 2 (the same column applies for every β)

λ          zλ(2, β)
0.00001    -6.0315
0.00010    -5.2595
0.00050    -4.6535
0.00100    -4.3702
0.00500    -3.6428
0.01000    -3.2900
0.02000    -2.9044
0.03000    -2.6598
0.04000    -2.4758
0.05000    -2.3262
0.06000    -2.1988
0.07000    -2.0871
0.08000    -1.9871
0.09000    -1.8961
0.10000    -1.8124
0.20000    -1.1902
0.30000    -0.7416
0.40000    -0.3583
0.50000    0
0.60000    0.3583
0.70000    0.7416
0.80000    1.1902
0.90000    1.8124
0.91000    1.8961
0.92000    1.9871
0.93000    2.0871
0.94000    2.1988
0.95000    2.3262
0.96000    2.4758
0.97000    2.6598
0.98000    2.9044
0.99000    3.2900
0.99500    3.6428
0.99900    4.3702
0.99950    4.6535
0.99990    5.2595
0.99999    6.0315
274 B Stable Quantiles
Stable quantiles zλ(α, β), α = 1.99

λ       0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0  -22.8 -7.5761 -4.9746 -4.5476 -3.6945 -3.3203 -2.9216 -2.6717 -2.4847 -2.3332 -2.2044 -2.0918 -1.9910 -1.8994 -1.8152 -1.1909 -0.7417 -0.3583 0 0.3583 0.7417 1.1909 1.8152 1.8994 1.9910 2.0918 2.2044 2.3332 2.4847 2.6717 2.9216 3.3203 3.6945 4.5476 4.9746 7.5761 22.8
β = .1  -21.64 -7.2387 -4.9331 -4.5251 -3.6867 -3.3150 -2.9179 -2.6687 -2.4822 -2.3309 -2.2024 -2.0899 -1.9892 -1.8978 -1.8137 -1.1899 -0.7409 -0.3576 0.0006802 0.359 0.7425 1.1919 1.8167 1.9011 1.9927 2.0936 2.2065 2.3354 2.4873 2.6747 2.9252 3.3256 3.7024 4.5705 5.0177 7.9025 23.91
β = .2  -20.41 -6.8937 -4.8931 -4.5030 -3.6789 -3.3097 -2.9142 -2.6657 -2.4796 -2.3287 -2.2004 -2.0880 -1.9875 -1.8962 -1.8121 -1.1889 -0.7401 -0.3569 0.00136 0.3597 0.7433 1.1929 1.8183 1.9027 1.9945 2.0955 2.2085 2.3377 2.4898 2.6776 2.9289 3.3310 3.7102 4.5938 5.0625 8.2176 24.97
β = .3  -19.1 -6.5518 -4.8546 -4.4812 -3.6711 -3.3044 -2.9105 -2.6628 -2.4771 -2.3264 -2.1983 -2.0861 -1.9858 -1.8945 -1.8106 -1.1878 -0.7393 -0.3561 0.002041 0.3604 0.7441 1.1939 1.8198 1.9043 1.9962 2.0974 2.2106 2.3400 2.4924 2.6806 2.9326 3.3363 3.7182 4.6176 5.1091 8.5220 25.98
β = .4  -17.7 -6.2347 -4.8174 -4.4598 -3.6634 -3.2991 -2.9069 -2.6598 -2.4746 -2.3242 -2.1963 -2.0843 -1.9840 -1.8929 -1.8091 -1.1868 -0.7385 -0.3554 0.002721 0.3611 0.7449 1.1950 1.8213 1.9059 1.9980 2.0993 2.2126 2.3422 2.4950 2.6836 2.9363 3.3417 3.7261 4.6417 5.1576 8.8166 26.96
β = .5  -16.18 -5.9647 -4.7816 -4.4388 -3.6557 -3.2938 -2.9032 -2.6569 -2.4720 -2.3219 -2.1943 -2.0824 -1.9823 -1.8913 -1.8076 -1.1858 -0.7377 -0.3547 0.003401 0.3618 0.7458 1.1960 1.8229 1.9076 1.9997 2.1012 2.2147 2.3445 2.4975 2.6866 2.9400 3.3471 3.7340 4.6663 5.2079 9.1022 27.9
β = .6  -14.51 -5.7477 -4.7470 -4.4181 -3.6480 -3.2885 -2.8996 -2.6539 -2.4695 -2.3197 -2.1922 -2.0805 -1.9806 -1.8897 -1.8060 -1.1848 -0.7369 -0.354 0.004082 0.3625 0.7466 1.1970 1.8244 1.9092 2.0015 2.1031 2.2167 2.3468 2.5001 2.6896 2.9437 3.3525 3.7420 4.6913 5.2603 9.3795 28.82
β = .7  -12.61 -5.5749 -4.7135 -4.3978 -3.6404 -3.2833 -2.8959 -2.6510 -2.4670 -2.3174 -2.1902 -2.0787 -1.9788 -1.8881 -1.8045 -1.1838 -0.7361 -0.3533 0.004762 0.3632 0.7474 1.1980 1.8260 1.9109 2.0032 2.1049 2.2188 2.3490 2.5026 2.6926 2.9474 3.3579 3.7501 4.7168 5.3147 9.6492 29.7
β = .8  -10.39 -5.4349 -4.6811 -4.3778 -3.6327 -3.2780 -2.8923 -2.6480 -2.4644 -2.3152 -2.1882 -2.0768 -1.9771 -1.8865 -1.8030 -1.1828 -0.7353 -0.3526 0.005442 0.364 0.7482 1.1990 1.8275 1.9125 2.0050 2.1068 2.2209 2.3513 2.5052 2.6956 2.9512 3.3633 3.7581 4.7428 5.3712 9.9118 30.57
β = .9  -7.5721 -5.3185 -4.6497 -4.3581 -3.6251 -3.2728 -2.8886 -2.6451 -2.4619 -2.3129 -2.1861 -2.0749 -1.9754 -1.8848 -1.8015 -1.1818 -0.7345 -0.3519 0.006123 0.3647 0.749 1.2001 1.8291 1.9141 2.0067 2.1087 2.2229 2.3536 2.5078 2.6986 2.9549 3.3687 3.7662 4.7692 5.4297 10.17 31.4
β = 1.0 -5.9835 -5.2195 -4.6193 -4.3387 -3.6176 -3.2676 -2.8850 -2.6421 -2.4594 -2.3107 -2.1841 -2.0731 -1.9737 -1.8832 -1.8000 -1.1808 -0.7337 -0.3512 0.006803 0.3654 0.7498 1.2011 1.8306 1.9158 2.0085 2.1106 2.2250 2.3559 2.5104 2.7016 2.9586 3.3741 3.7743 4.7961 5.4904 10.42 32.22
Stable quantiles zλ(α, β), α = 1.95

λ       0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0  -54.74 -16.98 -7.7907 -5.8347 -3.9415 -3.4569 -2.9961 -2.7225 -2.5227 -2.3629 -2.2284 -2.1115 -2.0074 -1.9133 -1.8270 -1.1935 -0.7421 -0.3582 0 0.3582 0.7421 1.1935 1.8270 1.9133 2.0074 2.1115 2.2284 2.3629 2.5227 2.7225 2.9961 3.4569 3.9415 5.8347 7.7907 16.98 54.74
β = .1  -51.86 -16.1 -7.4229 -5.6196 -3.8956 -3.4279 -2.9768 -2.7072 -2.5096 -2.3514 -2.2180 -2.1020 -1.9987 -1.9051 -1.8193 -1.1885 -0.7382 -0.3547 0.003319 0.3617 0.7461 1.1986 1.8347 1.9215 2.0162 2.1210 2.2388 2.3744 2.5358 2.7379 3.0155 3.4861 3.9882 6.0516 8.1444 17.82 57.48
β = .2  -48.82 -15.17 -7.0400 -5.4102 -3.8506 -3.3992 -2.9577 -2.6920 -2.4967 -2.3400 -2.2077 -2.0926 -1.9900 -1.8970 -1.8117 -1.1835 -0.7342 -0.3513 0.006637 0.3651 0.7501 1.2036 1.8425 1.9298 2.0251 2.1305 2.2493 2.3860 2.5489 2.7534 3.0350 3.5157 4.0357 6.2677 8.4854 18.62 60.1
β = .3  -45.58 -14.19 -6.6417 -5.2106 -3.8063 -3.3708 -2.9386 -2.6768 -2.4838 -2.3286 -2.1975 -2.0832 -1.9813 -1.8889 -1.8041 -1.1785 -0.7303 -0.3478 0.009957 0.3686 0.7541 1.2087 1.8503 1.9381 2.0340 2.1402 2.2598 2.3977 2.5622 2.7690 3.0546 3.5455 4.0839 6.4812 8.8148 19.4 62.63
β = .4  -42.12 -13.13 -6.2316 -5.0243 -3.7628 -3.3427 -2.9198 -2.6618 -2.4710 -2.3173 -2.1873 -2.0739 -1.9726 -1.8808 -1.7965 -1.1736 -0.7264 -0.3444 0.01328 0.3721 0.7581 1.2138 1.8581 1.9464 2.0429 2.1498 2.2703 2.4094 2.5755 2.7846 3.0744 3.5756 4.1328 6.6911 9.1338 20.14 65.05
β = .5  -38.37 -12 -5.8229 -4.8534 -3.7202 -3.3149 -2.9010 -2.6468 -2.4582 -2.3060 -2.1772 -2.0646 -1.9641 -1.8728 -1.7890 -1.1687 -0.7225 -0.3409 0.0166 0.3756 0.7621 1.2190 1.8659 1.9547 2.0519 2.1595 2.2809 2.4211 2.5888 2.8004 3.0942 3.6059 4.1824 6.8970 9.4432 20.86 67.4
β = .6  -34.22 -10.75 -5.4436 -4.6986 -3.6783 -3.2874 -2.8824 -2.6319 -2.4455 -2.2948 -2.1671 -2.0553 -1.9555 -1.8648 -1.7815 -1.1637 -0.7186 -0.3375 0.01992 0.3791 0.7661 1.2241 1.8738 1.9631 2.0609 2.1692 2.2915 2.4330 2.6022 2.8162 3.1142 3.6365 4.2327 7.0988 9.7439 21.56 69.67
β = .7  -29.54 -9.3517 -5.1219 -4.5589 -3.6373 -3.2602 -2.8639 -2.6171 -2.4329 -2.2837 -2.1570 -2.0461 -1.9470 -1.8569 -1.7740 -1.1588 -0.7147 -0.3341 0.02324 0.3826 0.7701 1.2293 1.8817 1.9716 2.0699 2.1790 2.3022 2.4448 2.6157 2.8321 3.1343 3.6673 4.2836 7.2964 10.04 22.24 71.87
β = .8  -24.03 -7.7305 -4.8632 -4.4330 -3.5972 -3.2333 -2.8456 -2.6025 -2.4204 -2.2726 -2.1470 -2.0370 -1.9385 -1.8490 -1.7666 -1.1540 -0.7109 -0.3307 0.02656 0.3861 0.7742 1.2345 1.8897 1.9800 2.0790 2.1888 2.3130 2.4568 2.6293 2.8480 3.1545 3.6984 4.3351 7.4900 10.32 22.9 74.02
β = .9  -16.91 -5.9127 -4.6558 -4.3191 -3.5578 -3.2067 -2.8274 -2.5879 -2.4079 -2.2616 -2.1371 -2.0279 -1.9301 -1.8412 -1.7592 -1.1491 -0.707 -0.3272 0.02988 0.3896 0.7783 1.2397 1.8977 1.9886 2.0881 2.1987 2.3237 2.4687 2.6429 2.8641 3.1748 3.7297 4.3871 7.6797 10.6 23.54 76.1
β = 1.0 -5.7957 -5.0629 -4.4858 -4.2155 -3.5193 -3.1805 -2.8094 -2.5734 -2.3955 -2.2507 -2.1272 -2.0189 -1.9217 -1.8333 -1.7519 -1.1443 -0.7032 -0.3238 0.03321 0.3932 0.7823 1.2449 1.9057 1.9971 2.0973 2.2086 2.3346 2.4807 2.6565 2.8802 3.1952 3.7612 4.4397 7.8657 10.87 24.17 78.13
Stable quantiles zλ(α, β), α = 1.9

λ       0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0  -86.54 -25.88 -11.33 -8.0839 -4.3676 -3.6691 -3.1049 -2.7951 -2.5761 -2.4043 -2.2616 -2.1386 -2.0300 -1.9322 -1.8430 -1.1970 -0.7426 -0.358 0 0.358 0.7426 1.1970 1.8430 1.9322 2.0300 2.1386 2.2616 2.4043 2.5761 2.7951 3.1049 3.6691 4.3676 8.0839 11.33 25.88 86.54
β = .1  -81.86 -24.48 -10.74 -7.6801 -4.2549 -3.6039 -3.0638 -2.7630 -2.5491 -2.3807 -2.2404 -2.1193 -2.0122 -1.9157 -1.8276 -1.1871 -0.7348 -0.3513 0.006445 0.3648 0.7504 1.2070 1.8586 1.9489 2.0479 2.1580 2.2829 2.4281 2.6033 2.8275 3.1465 3.7353 4.4825 8.4725 11.9 27.21 91
β = .2  -76.93 -23.01 -10.11 -7.2591 -4.1452 -3.5399 -3.0232 -2.7313 -2.5224 -2.3573 -2.2195 -2.1003 -1.9947 -1.8994 -1.8123 -1.1772 -0.7272 -0.3446 0.01289 0.3716 0.7582 1.2171 1.8744 1.9657 2.0660 2.1777 2.3044 2.4521 2.6308 2.8602 3.1886 3.8025 4.5994 8.8473 12.45 28.49 95.28
β = .3  -71.7 -21.45 -9.4573 -6.8190 -4.0386 -3.4771 -2.9831 -2.7000 -2.4960 -2.3342 -2.1987 -2.0814 -1.9772 -1.8832 -1.7971 -1.1675 -0.7195 -0.3379 0.01934 0.3784 0.7661 1.2273 1.8903 1.9827 2.0842 2.1975 2.3262 2.4764 2.6585 2.8932 3.2311 3.8706 4.7177 9.2098 12.98 29.72 99.39
β = .4  -66.1 -19.79 -8.7589 -6.3588 -3.9357 -3.4156 -2.9436 -2.6690 -2.4699 -2.3113 -2.1782 -2.0627 -1.9600 -1.8672 -1.7821 -1.1578 -0.712 -0.3312 0.02579 0.3852 0.774 1.2376 1.9063 1.9998 2.1026 2.2174 2.3481 2.5009 2.6865 2.9265 3.2740 3.9394 4.8369 9.5611 13.49 30.91 103.4
β = .5  -60.04 -17.99 -8.0109 -5.8809 -3.8366 -3.3555 -2.9047 -2.6384 -2.4440 -2.2887 -2.1579 -2.0442 -1.9429 -1.8513 -1.7672 -1.1482 -0.7044 -0.3246 0.03225 0.3921 0.782 1.2479 1.9224 2.0170 2.1212 2.2375 2.3702 2.5255 2.7147 2.9601 3.3173 4.0090 4.9566 9.9022 13.98 32.06 107.2
β = .6  -53.38 -16.01 -7.2016 -5.4003 -3.7414 -3.2967 -2.8663 -2.6082 -2.4185 -2.2663 -2.1378 -2.0259 -1.9261 -1.8356 -1.7525 -1.1387 -0.6969 -0.318 0.03871 0.399 0.79 1.2584 1.9387 2.0344 2.1398 2.2578 2.3924 2.5504 2.7431 2.9940 3.3610 4.0791 5.0765 10.23 14.46 33.17 110.9
β = .7  -45.87 -13.8 -6.3191 -4.9554 -3.6504 -3.2394 -2.8286 -2.5784 -2.3933 -2.2442 -2.1180 -2.0078 -1.9094 -1.8200 -1.7379 -1.1293 -0.6895 -0.3114 0.04518 0.4059 0.7981 1.2689 1.9552 2.0519 2.1587 2.2782 2.4149 2.5754 2.7718 3.0281 3.4049 4.1498 5.1963 10.56 14.93 34.25 114.5
β = .8  -37.05 -11.21 -5.4119 -4.5858 -3.5635 -3.1836 -2.7915 -2.5490 -2.3684 -2.2223 -2.0984 -1.9899 -1.8928 -1.8047 -1.7235 -1.1199 -0.6821 -0.3048 0.05165 0.4129 0.8063 1.2795 1.9717 2.0696 2.1777 2.2988 2.4375 2.6006 2.8006 3.0624 3.4492 4.2208 5.3157 10.87 15.38 35.3 118
β = .9  -25.74 -7.9523 -4.7329 -4.2960 -3.4806 -3.1292 -2.7550 -2.5201 -2.3439 -2.2008 -2.0790 -1.9722 -1.8765 -1.7894 -1.7093 -1.1107 -0.6747 -0.2982 0.05814 0.4198 0.8145 1.2902 1.9884 2.0874 2.1968 2.3196 2.4602 2.6260 2.8296 3.0970 3.4937 4.2922 5.4345 11.18 15.83 36.33 121.4
β = 1.0 -5.5694 -4.8744 -4.3254 -4.0675 -3.4016 -3.0764 -2.7192 -2.4915 -2.3197 -2.1795 -2.0598 -1.9547 -1.8604 -1.7744 -1.6951 -1.1015 -0.6674 -0.2916 0.06463 0.4269 0.8228 1.3009 2.0051 2.1053 2.2160 2.3404 2.4831 2.6515 2.8588 3.1317 3.5385 4.3638 5.5526 11.48 16.26 37.33 124.8
Stable quantiles zλ(α, β), α = 1.8

λ       0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0  -158.9 -44.31 -18.29 -12.59 -5.6428 -4.2768 -3.3915 -2.9786 -2.7079 -2.5049 -2.3411 -2.2031 -2.0831 -1.9765 -1.8803 -1.2045 -0.7433 -0.3575 0 0.3575 0.7433 1.2045 1.8803 1.9765 2.0831 2.2031 2.3411 2.5049 2.7079 2.9786 3.3915 4.2768 5.6428 12.59 18.29 44.31 158.9
β = .1  -149.9 -41.77 -17.24 -11.87 -5.3689 -4.1151 -3.2964 -2.9071 -2.6492 -2.4543 -2.2964 -2.1627 -2.0462 -1.9424 -1.8485 -1.1850 -0.7284 -0.3447 0.0122 0.3704 0.7584 1.2245 1.9126 2.0113 2.1207 2.2442 2.3867 2.5563 2.7677 3.0512 3.4881 4.4394 5.9114 13.27 19.29 46.74 167.6
β = .2  -140.3 -39.11 -16.14 -11.13 -5.0903 -3.9555 -3.2030 -2.8370 -2.5916 -2.4047 -2.2524 -2.1231 -2.0099 -1.9090 -1.8174 -1.1657 -0.7137 -0.332 0.0244 0.3834 0.7737 1.2447 1.9455 2.0466 2.1589 2.2860 2.4330 2.6086 2.8283 3.1248 3.5859 4.6021 6.1744 13.94 20.26 49.08 175.9
β = .3  -130.3 -36.29 -14.99 -10.34 -4.8090 -3.7989 -3.1117 -2.7684 -2.5351 -2.3560 -2.2094 -2.0842 -1.9744 -1.8761 -1.7868 -1.1468 -0.6992 -0.3194 0.03662 0.3965 0.7892 1.2654 1.9789 2.0825 2.1978 2.3284 2.4800 2.6616 2.8897 3.1994 3.6846 4.7644 6.4318 14.58 21.2 51.33 183.9
β = .4  -119.5 -33.29 -13.76 -9.5105 -4.5285 -3.6468 -3.0228 -2.7014 -2.4799 -2.3085 -2.1672 -2.0462 -1.9396 -1.8440 -1.7569 -1.1283 -0.6849 -0.3069 0.04887 0.4097 0.8049 1.2863 2.0129 2.1189 2.2372 2.3715 2.5276 2.7153 2.9519 3.2747 3.7840 4.9258 6.6839 15.2 22.1 53.51 191.7
β = .5  -108 -30.06 -12.44 -8.6241 -4.2542 -3.5003 -2.9365 -2.6362 -2.4261 -2.2620 -2.1260 -2.0090 -1.9056 -1.8126 -1.7277 -1.1101 -0.6708 -0.2944 0.06115 0.4231 0.8209 1.3076 2.0473 2.1558 2.2771 2.4151 2.5758 2.7696 3.0147 3.3506 3.8838 5.0860 6.9309 15.8 22.98 55.62 199.2
β = .6  -95.37 -26.54 -11.01 -7.6677 -3.9936 -3.3606 -2.8530 -2.5728 -2.3736 -2.2167 -2.0858 -1.9727 -1.8724 -1.7819 -1.6990 -1.0922 -0.6569 -0.282 0.07348 0.4366 0.8371 1.3292 2.0822 2.1933 2.3175 2.4592 2.6246 2.8244 3.0781 3.4271 3.9839 5.2448 7.1731 16.38 23.83 57.68 206.5
β = .7  -81.24 -22.61 -9.4253 -6.6208 -3.7536 -3.2286 -2.7727 -2.5114 -2.3227 -2.1726 -2.0466 -1.9372 -1.8399 -1.7518 -1.6711 -1.0746 -0.6431 -0.2696 0.08586 0.4502 0.8535 1.3512 2.1176 2.2311 2.3584 2.5038 2.6739 2.8798 3.1420 3.5040 4.0842 5.4021 7.4107 16.95 24.66 59.67 213.6
β = .8  -64.8 -18.05 -7.6128 -5.4645 -3.5381 -3.1047 -2.6955 -2.4520 -2.2732 -2.1297 -2.0085 -1.9027 -1.8083 -1.7226 -1.6437 -1.0574 -0.6295 -0.2573 0.0983 0.464 0.8702 1.3735 2.1533 2.2695 2.3998 2.5488 2.7236 2.9356 3.2063 3.5812 4.1845 5.5578 7.6441 17.51 25.47 61.62 220.5
β = .9  -44.03 -12.32 -5.4483 -4.3907 -3.3477 -2.9891 -2.6217 -2.3947 -2.2254 -2.0881 -1.9714 -1.8691 -1.7775 -1.6940 -1.6171 -1.0404 -0.616 -0.245 0.1108 0.4779 0.8871 1.3961 2.1895 2.3082 2.4416 2.5943 2.7737 2.9918 3.2709 3.6587 4.2846 5.7119 7.8734 18.06 26.26 63.52 227.3
β = 1.0 -5.1410 -4.5182 -4.0226 -3.7887 -3.1807 -2.8816 -2.5512 -2.3395 -2.1791 -2.0477 -1.9354 -1.8364 -1.7474 -1.6662 -1.5911 -1.0238 -0.6026 -0.2327 0.1234 0.492 0.9042 1.4190 2.2261 2.3473 2.4837 2.6402 2.8242 3.0483 3.3358 3.7363 4.3845 5.8643 8.0989 18.59 27.04 65.37 233.9
Stable quantiles zλ(α, β), α = 1.7

λ       0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0  -264.6 -68.37 -26.67 -17.85 -7.2897 -5.1519 -3.8004 -3.2314 -2.8850 -2.6373 -2.4443 -2.2856 -2.1503 -2.0320 -1.9265 -1.2130 -0.7436 -0.3566 0 0.3566 0.7436 1.2130 1.9265 2.0320 2.1503 2.2856 2.4443 2.6373 2.8850 3.2314 3.8004 5.1519 7.2897 17.85 26.67 68.37 264.6
β = .1  -248.6 -64.22 -25.04 -16.76 -6.8630 -4.8788 -3.6359 -3.1104 -2.7873 -2.5544 -2.3718 -2.2208 -2.0916 -1.9782 -1.8768 -1.1837 -0.722 -0.3384 0.01737 0.3751 0.7658 1.2432 1.9776 2.0872 2.2105 2.3519 2.5185 2.7220 2.9845 3.3543 3.9658 5.4209 7.7040 18.91 28.23 72.35 279.9
β = .2  -232 -59.88 -23.33 -15.62 -6.4228 -4.6021 -3.4735 -2.9918 -2.6919 -2.4736 -2.3012 -2.1578 -2.0345 -1.9259 -1.8284 -1.1552 -0.7008 -0.3203 0.03476 0.3939 0.7884 1.2741 2.0299 2.1437 2.2720 2.4196 2.5941 2.8081 3.0856 3.4785 4.1316 5.6854 8.1073 19.92 29.75 76.19 294.6
β = .3  -214.4 -55.32 -21.54 -14.42 -5.9677 -4.3234 -3.3143 -2.8763 -2.5991 -2.3951 -2.2326 -2.0967 -1.9791 -1.8752 -1.7815 -1.1275 -0.6801 -0.3024 0.05222 0.413 0.8116 1.3058 2.0834 2.2014 2.3348 2.4887 2.6711 2.8956 3.1879 3.6035 4.2971 5.9456 8.5004 20.91 31.21 79.9 308.9
β = .4  -195.7 -50.48 -19.65 -13.16 -5.4967 -4.0454 -3.1596 -2.7644 -2.5094 -2.3192 -2.1663 -2.0375 -1.9256 -1.8261 -1.7362 -1.1006 -0.6598 -0.2846 0.06977 0.4323 0.8353 1.3383 2.1379 2.2602 2.3987 2.5588 2.7492 2.9842 3.2911 3.7292 4.4621 6.2014 8.8843 21.86 32.63 83.5 322.7
β = .5  -175.8 -45.3 -17.63 -11.82 -5.0100 -3.7728 -3.0109 -2.6569 -2.4230 -2.2461 -2.1024 -1.9805 -1.8739 -1.7787 -1.6924 -1.0745 -0.6398 -0.267 0.08743 0.452 0.8595 1.3715 2.1934 2.3199 2.4635 2.6300 2.8282 3.0736 3.3950 3.8551 4.6262 6.4531 9.2597 22.79 34.01 86.99 336.1
β = .6  -154.1 -39.67 -15.44 -10.37 -4.5132 -3.5124 -2.8698 -2.5543 -2.3404 -2.1760 -2.0410 -1.9256 -1.8241 -1.7331 -1.6502 -1.0491 -0.6203 -0.2494 0.1052 0.4721 0.8843 1.4055 2.2498 2.3806 2.5293 2.7020 2.9081 3.1638 3.4995 3.9811 4.7893 6.7008 9.6273 23.7 35.36 90.39 349.1
β = .7  -130 -33.45 -13.03 -8.7794 -4.0267 -3.2714 -2.7372 -2.4571 -2.2617 -2.1090 -1.9823 -1.8730 -1.7764 -1.6892 -1.6095 -1.0245 -0.601 -0.2319 0.1232 0.4925 0.9096 1.4402 2.3070 2.4421 2.5959 2.7747 2.9886 3.2546 3.6043 4.1070 4.9513 6.9448 9.9877 24.59 36.67 93.71 361.9
β = .8  -102.3 -26.3 -10.28 -6.9891 -3.5928 -3.0552 -2.6140 -2.3656 -2.1871 -2.0453 -1.9262 -1.8228 -1.7306 -1.6472 -1.5705 -1.0005 -0.5821 -0.2144 0.1414 0.5133 0.9354 1.4756 2.3649 2.5043 2.6631 2.8482 3.0698 3.3457 3.7093 4.2326 5.1121 7.1852 10.34 25.45 37.95 96.95 374.3
β = .9  -67.93 -17.45 -6.9509 -4.9023 -3.2441 -2.8657 -2.5005 -2.2799 -2.1166 -1.9848 -1.8728 -1.7748 -1.6869 -1.6069 -1.5331 -0.9772 -0.5634 -0.1969 0.1598 0.5344 0.9618 1.5116 2.4236 2.5671 2.7310 2.9221 3.1514 3.4373 3.8145 4.3580 5.2717 7.4222 10.69 26.3 39.21 100.1 386.4
β = 1.0 -4.7392 -4.1844 -3.7395 -3.5284 -2.9755 -2.7013 -2.3964 -2.1999 -2.0504 -1.9276 -1.8222 -1.7291 -1.6451 -1.5683 -1.4972 -0.9546 -0.5449 -0.1794 0.1784 0.556 0.9887 1.5483 2.4828 2.6306 2.7995 2.9966 3.2334 3.5291 3.9196 4.4829 5.4300 7.6561 11.03 27.13 40.44 103.2 398.3
Stable quantiles zλ(α, β), α = 1.6

λ       0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0  -436 -103.5 -37.97 -24.72 -9.3323 -6.2841 -4.3601 -3.5780 -3.1247 -2.8143 -2.5804 -2.3933 -2.2372 -2.1031 -1.9853 -1.2229 -0.7435 -0.3553 0 0.3553 0.7435 1.2229 1.9853 2.1031 2.2372 2.3933 2.5804 2.8143 3.1247 3.5780 4.3601 6.2841 9.3323 24.72 37.97 103.5 436
β = .1  -408.2 -96.82 -35.5 -23.1 -8.7248 -5.8903 -4.1142 -3.3961 -2.9790 -2.6919 -2.4744 -2.2994 -2.1528 -2.0263 -1.9148 -1.1834 -0.7153 -0.332 0.02202 0.379 0.7726 1.2639 2.0581 2.1823 2.3241 2.4897 2.6891 2.9393 3.2727 3.7612 4.6046 6.6695 9.9224 26.28 40.35 109.9 462.9
β = .2  -379.1 -89.89 -32.93 -21.42 -8.0976 -5.4876 -3.8679 -3.2166 -2.8364 -2.5727 -2.3715 -2.2086 -2.0713 -1.9523 -1.8469 -1.1454 -0.688 -0.3091 0.04411 0.4033 0.8027 1.3064 2.1331 2.2637 2.4132 2.5884 2.8000 3.0663 3.4222 3.9449 4.8471 7.0470 10.5 27.79 42.65 116.1 488.8
β = .3  -348.7 -82.63 -30.24 -19.67 -7.4482 -5.0757 -3.6228 -3.0408 -2.6979 -2.4575 -2.2723 -2.1212 -1.9930 -1.8812 -1.7817 -1.1090 -0.6616 -0.2865 0.06636 0.4281 0.8338 1.3503 2.2099 2.3470 2.5043 2.6890 2.9126 3.1947 3.5728 4.1286 5.0874 7.4173 11.06 29.26 44.89 122.1 513.9
β = .4  -316.6 -74.97 -27.41 -17.82 -6.7730 -4.6551 -3.3816 -2.8708 -2.5648 -2.3472 -2.1775 -2.0377 -1.9182 -1.8133 -1.7195 -1.0741 -0.6358 -0.264 0.08882 0.4535 0.8658 1.3956 2.2884 2.4319 2.5969 2.7911 3.0266 3.3243 3.7239 4.3118 5.3254 7.7808 11.61 30.69 47.06 127.9 538.4
β = .5  -282.4 -66.82 -24.41 -15.87 -6.0683 -4.2283 -3.1484 -2.7085 -2.4383 -2.2424 -2.0874 -1.9584 -1.8472 -1.7489 -1.6604 -1.0406 -0.6108 -0.2418 0.1116 0.4796 0.8989 1.4423 2.3684 2.5183 2.6909 2.8944 3.1416 3.4546 3.8752 4.4944 5.5610 8.1380 12.14 32.08 49.18 133.6 562.1
β = .6  -245.5 -58.04 -21.18 -13.77 -5.3300 -3.8031 -2.9282 -2.5563 -2.3195 -2.1437 -2.0026 -1.8836 -1.7801 -1.6879 -1.6045 -1.0085 -0.5863 -0.2196 0.1346 0.5064 0.9329 1.4902 2.4497 2.6059 2.7861 2.9988 3.2575 3.5855 4.0266 4.6762 5.7943 8.4894 12.67 33.44 51.25 139.2 585.3
β = .7  -205 -48.39 -17.64 -11.48 -4.5584 -3.3987 -2.7262 -2.4157 -2.2091 -2.0518 -1.9231 -1.8134 -1.7169 -1.6305 -1.5516 -0.9777 -0.5624 -0.1974 0.1581 0.5339 0.968 1.5393 2.5322 2.6946 2.8823 3.1040 3.3741 3.7167 4.1779 4.8570 6.0251 8.8353 13.18 34.77 53.27 144.6 608
β = .8  -159 -37.45 -13.65 -8.9172 -3.7900 -3.0441 -2.5459 -2.2878 -2.1076 -1.9666 -1.8492 -1.7478 -1.6578 -1.5765 -1.5019 -0.948 -0.5388 -0.1751 0.182 0.5621 1.0040 1.5895 2.6156 2.7843 2.9793 3.2099 3.4911 3.8481 4.3289 5.0369 6.2538 9.1761 13.69 36.08 55.25 149.9 630.1
β = .9  -102.9 -24.16 -8.8649 -5.8916 -3.1755 -2.7569 -2.3882 -2.1727 -2.0150 -1.8882 -1.7808 -1.6868 -1.6026 -1.5259 -1.4551 -0.9195 -0.5156 -0.1527 0.2064 0.591 1.0409 1.6407 2.6999 2.8747 3.0769 3.3164 3.6085 3.9796 4.4795 5.2157 6.4802 9.5121 14.19 37.35 57.19 155.1 651.9
β = 1.0 -4.3587 -3.8685 -3.4718 -3.2823 -2.7823 -2.5320 -2.2517 -2.0699 -1.9309 -1.8162 -1.7175 -1.6301 -1.5510 -1.4785 -1.4111 -0.8919 -0.4926 -0.1301 0.2312 0.6206 1.0787 1.6930 2.7849 2.9658 3.1752 3.4233 3.7261 4.1110 4.6297 5.3935 6.7045 9.8437 14.68 38.61 59.1 160.2 673.1
Stable quantiles zλ(α, β), α = 1.5

λ       0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0  -735.5 -158.5 -54.34 -34.32 -11.98 -7.7364 -5.0967 -4.0428 -3.4476 -3.0519 -2.7622 -2.5360 -2.3515 -2.1959 -2.0615 -1.2346 -0.7428 -0.3533 0 0.3533 0.7428 1.2346 2.0615 2.1959 2.3515 2.5360 2.7622 3.0519 3.4476 4.0428 5.0967 7.7364 11.98 34.32 54.34 158.5 735.5
β = .1  -685.6 -147.7 -50.58 -31.93 -11.14 -7.1945 -4.7564 -3.7886 -3.2436 -2.8811 -2.6150 -2.4066 -2.2360 -2.0915 -1.9662 -1.1841 -0.7082 -0.3255 0.02618 0.382 0.779 1.2877 2.1603 2.3040 2.4707 2.6690 2.9127 3.2256 3.6534 4.2968 5.4333 8.2668 12.81 36.64 57.97 169 783.9
β = .2  -633.7 -136.5 -46.69 -29.46 -10.26 -6.6399 -4.4127 -3.5354 -3.0426 -2.7142 -2.4722 -2.2816 -2.1248 -1.9914 -1.8751 -1.1362 -0.6751 -0.2982 0.05251 0.4115 0.8169 1.3432 2.2622 2.4151 2.5929 2.8048 3.0657 3.4011 3.8601 4.5500 5.7661 8.7867 13.61 38.89 61.51 179.2 830.8
β = .3  -579.6 -124.7 -42.64 -26.88 -9.3601 -6.0713 -4.0666 -3.2848 -2.8464 -2.5528 -2.3349 -2.1621 -2.0188 -1.8962 -1.7886 -1.0908 -0.6433 -0.2714 0.07914 0.442 0.8564 1.4010 2.3668 2.5288 2.7175 2.9428 3.2205 3.5779 4.0672 4.8020 6.0951 9.2970 14.4 41.08 64.94 189.1 876.4
β = .4  -522.9 -112.5 -38.39 -24.19 -8.4244 -5.4877 -3.7203 -3.0399 -2.6574 -2.3988 -2.2046 -2.0491 -1.9188 -1.8065 -1.7072 -1.0480 -0.6127 -0.2448 0.1062 0.4736 0.8975 1.4609 2.4736 2.6446 2.8441 3.0825 3.3767 3.7556 4.2742 5.0526 6.4203 9.7984 15.17 43.22 68.3 198.8 920.9
β = .5  -462.9 -99.48 -33.91 -21.35 -7.4488 -4.8883 -3.3782 -2.8051 -2.4788 -2.2542 -2.0827 -1.9434 -1.8255 -1.7227 -1.6313 -1.0076 -0.5832 -0.2184 0.1339 0.5064 0.9403 1.5229 2.5823 2.7622 2.9723 3.2236 3.5338 3.9337 4.4809 5.3018 6.7420 10.29 15.93 45.31 71.58 208.2 964.3
β = .6  -398.8 -85.61 -29.13 -18.33 -6.4256 -4.2748 -3.0497 -2.5861 -2.3138 -2.1208 -1.9703 -1.8459 -1.7392 -1.6453 -1.5609 -0.9694 -0.5544 -0.192 0.1621 0.5403 0.9847 1.5868 2.6926 2.8813 3.1017 3.3657 3.6917 4.1120 4.6871 5.5493 7.0603 10.78 16.67 47.36 74.78 217.4 1007
β = .7  -329 -70.53 -23.95 -15.06 -5.3454 -3.6593 -2.7495 -2.3890 -2.1648 -1.9999 -1.8679 -1.7568 -1.6602 -1.5740 -1.4960 -0.9332 -0.5264 -0.1655 0.1912 0.5753 1.0306 1.6524 2.8043 3.0016 3.2323 3.5086 3.8501 4.2903 4.8926 5.7952 7.3752 11.26 17.4 49.37 77.93 226.4 1048
β = .8  -250.9 -53.65 -18.18 -11.44 -4.2060 -3.0928 -2.4922 -2.2172 -2.0331 -1.8919 -1.7757 -1.6761 -1.5881 -1.5089 -1.4363 -0.8989 -0.4988 -0.1386 0.221 0.6116 1.0780 1.7196 2.9172 3.1229 3.3637 3.6521 4.0088 4.4685 5.0975 6.0395 7.6871 11.73 18.12 51.35 81.01 235.3 1089
β = .9  -157.7 -33.59 -11.37 -7.2156 -3.1605 -2.6624 -2.2825 -2.0708 -1.9181 -1.7962 -1.6932 -1.6033 -1.5227 -1.4493 -1.3816 -0.866 -0.4716 -0.1114 0.2517 0.6491 1.1268 1.7883 3.0310 3.2451 3.4957 3.7961 4.1676 4.6465 5.3016 6.2823 7.9960 12.19 18.83 53.28 84.04 244 1129
β = 1.0 -3.9955 -3.5665 -3.2160 -3.0473 -2.5983 -2.3711 -2.1148 -1.9472 -1.8184 -1.7117 -1.6196 -1.5376 -1.4633 -1.3949 -1.3312 -0.8344 -0.4445 -0.08365 0.2833 0.6878 1.1769 1.8583 3.1457 3.3681 3.6284 3.9405 4.3265 4.8242 5.5049 6.5234 8.3020 12.65 19.53 55.19 87.03 252.5 1169
λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-1299 -250.8 -79.56 -48.57 -15.59 -9.6588 -6.0628 -4.6571 -3.8778 -3.3699 -3.0055 -2.7269 -2.5039 -2.3192 -2.1622 -1.2491 -0.7415 -0.3506 0 0.3506 0.7415 1.2491 2.1622 2.3192 2.5039 2.7269 3.0055 3.3699 3.8778 4.6571 6.0628 9.6588 15.59 48.57 79.56 250.8 1299
β = .1
-1204 -232.5 -73.69 -44.96 -14.41 -8.9185 -5.6049 -4.3157 -3.6035 -3.1402 -2.8080 -2.5538 -2.3501 -2.1809 -2.0367 -1.1862 -0.7003 -0.3185 0.02982 0.3838 0.7851 1.3160 2.2926 2.4623 2.6623 2.9042 3.2066 3.6022 4.1533 4.9973 6.5153 10.39 16.76 52.09 85.26 268.6 1390
β = .2
-1107 -213.6 -67.64 -41.25 -13.19 -8.1625 -5.1419 -3.9739 -3.3315 -2.9145 -2.6156 -2.3864 -2.2021 -2.0486 -1.9172 -1.1273 -0.6615 -0.2874 0.05993 0.4185 0.8313 1.3866 2.4270 2.6093 2.8243 3.0846 3.4102 3.8362 4.4290 5.3357 6.9629 11.1 17.89 55.51 90.82 285.9 1480
-1006 -194.1 -61.38 -37.4 -11.93 -7.3892 -4.6740 -3.6330 -3.0638 -2.6950 -2.4302 -2.2264 -2.0616 -1.9235 -1.8046 -1.0725 -0.6248 -0.257 0.0906 0.4548 0.8801 1.4605 2.5647 2.7594 2.9892 3.2675 3.6157 4.0712 4.7047 5.6723 7.4055 11.8 19.01 58.86 96.26 302.8 1567
β = .3 -901.1 -173.7 -54.87 -33.4 -10.64 -6.5965 -4.2024 -3.2957 -2.8036 -2.4848 -2.2546 -2.0759 -1.9303 -1.8071 -1.7002 -1.0217 -0.5899 -0.2268 0.1221 0.4929 0.9315 1.5376 2.7052 2.9120 3.1563 3.4522 3.8226 4.3067 4.9799 6.0068 7.8435 12.49 20.1 62.14 101.6 319.4 1652
β = .4 -790.9 -152.3 -48.05 -29.21 -9.2912 -5.7824 -3.7299 -2.9675 -2.5564 -2.2881 -2.0919 -1.9374 -1.8097 -1.7005 -1.6047 -0.9747 -0.5565 -0.1968 0.1545 0.5328 0.9853 1.6175 2.8480 3.0667 3.3252 3.6383 4.0303 4.5426 5.2544 6.3393 8.2770 13.17 21.18 65.36 106.8 335.6 1735
β = .5 -674.2 -129.7 -40.83 -24.8 -7.8873 -4.9455 -3.2643 -2.6580 -2.3293 -2.1097 -1.9450 -1.8125 -1.7011 -1.6044 -1.5184 -0.9311 -0.5242 -0.1664 0.1882 0.5745 1.0414 1.7000 2.9927 3.2231 3.4955 3.8255 4.2386 4.7785 5.5281 6.6697 8.7064 13.84 22.24 68.52 111.9 351.6 1817
β = .6
Stable quantiles zλ (α, β), α = 1.4 β = .7 -548.7 -105.4 -33.1 -20.07 -6.4107 -4.0887 -2.8266 -2.3824 -2.1299 -1.9532 -1.8157 -1.7023 -1.6048 -1.5188 -1.4413 -0.8904 -0.4927 -0.1357 0.223 0.618 1.0997 1.7848 3.1390 3.3809 3.6669 4.0136 4.4474 5.0142 5.8008 6.9980 9.1318 14.51 23.28 71.63 116.9 367.2 1898
β = .8 -410.4 -78.66 -24.61 -14.91 -4.8457 -3.2432 -2.4572 -2.1534 -1.9621 -1.8195 -1.7042 -1.6061 -1.5201 -1.4430 -1.3725 -0.8522 -0.4616 -0.1043 0.2592 0.6633 1.1601 1.8717 3.2866 3.5399 3.8393 4.2023 4.6565 5.2497 6.0727 7.3244 9.5535 15.16 24.31 74.69 121.9 382.6 1977
-3.6464 -3.2758 -2.9694 -2.8208 -2.4212 -2.2166 -1.9836 -1.8300 -1.7113 -1.6124 -1.5267 -1.4501 -1.3804 -1.3161 -1.2561 -0.7811 -0.3996 -0.03905 0.3357 0.7592 1.2867 2.0511 3.5853 3.8606 4.1863 4.5810 5.0748 5.7195 6.6134 7.9714 10.39 16.45 26.33 80.67 131.6 412.7 2132
β = .9 β = 1.0 -249.7 -47.62 -14.82 -8.9947 -3.2476 -2.5847 -2.1815 -1.9721 -1.8242 -1.7071 -1.6086 -1.5227 -1.4459 -1.3759 -1.3112 -0.8159 -0.4306 -0.07211 0.2968 0.7104 1.2225 1.9605 3.4355 3.6998 4.0125 4.3914 4.8656 5.4848 6.3435 7.6488 9.9716 15.81 25.33 77.7 126.8 397.7 2055
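As a spot check, the symmetric (β = .0) column can be reproduced numerically. The following is a minimal sketch assuming SciPy's `scipy.stats.levy_stable` is available; for β = 0 the standardized stable law is symmetric about 0, so the parameterizations agree and the table entries can be compared directly.

```python
# Numerical cross-check of a symmetric entry, assuming scipy is installed.
# For beta = 0 the standardized stable distribution is symmetric about 0,
# so quantiles come in +/- pairs: z_{1-lam} = -z_{lam}.
from scipy.stats import levy_stable

alpha, beta = 1.4, 0.0
z90 = levy_stable.ppf(0.90, alpha, beta)  # table entry z_0.9(1.4, .0) is 2.1622
z10 = levy_stable.ppf(0.10, alpha, beta)  # symmetry: z_0.1 = -z_0.9
print(round(z90, 4), round(z10 + z90, 4))
```

The same check applies to any β = .0 row in these tables; for β ≠ 0 the comparison requires matching SciPy's parameterization to the one used here.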
Stable quantiles zλ(α, β), α = 1.3

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-2449 -416.8 -121 -71.04 -20.77 -12.31 -7.3590 -5.4757 -4.4512 -3.7947 -3.3314 -2.9828 -2.7084 -2.4846 -2.2971 -1.2678 -0.7392 -0.3468 0 0.3468 0.7392 1.2678 2.2971 2.4846 2.7084 2.9828 3.3314 3.7947 4.4512 5.4757 7.3590 12.31 20.77 71.04 121 416.8 2449
β = .1
-2258 -384.2 -111.4 -65.39 -19.07 -11.29 -6.7442 -5.0225 -4.0890 -3.4922 -3.0719 -2.7560 -2.5072 -2.3044 -2.1343 -1.1902 -0.6913 -0.311 0.03289 0.3844 0.791 1.3511 2.4662 2.6707 2.9149 3.2146 3.5949 4.0999 4.8146 5.9274 7.9681 13.32 22.44 76.57 130.3 448.6 2636
β = .2
-2063 -350.7 -101.6 -59.6 -17.34 -10.25 -6.1236 -4.5686 -3.7289 -3.1939 -2.8179 -2.5357 -2.3133 -2.1318 -1.9792 -1.1188 -0.6469 -0.2767 0.06628 0.4243 0.8466 1.4396 2.6403 2.8615 3.1258 3.4500 3.8612 4.4070 5.1783 6.3773 8.5716 14.31 24.09 81.99 139.4 479.8 2818
β = .3
-1861 -316.3 -91.55 -53.65 -15.56 -9.1880 -5.4973 -4.1150 -3.3728 -2.9021 -2.5721 -2.3244 -2.1290 -1.9689 -1.8339 -1.0536 -0.6057 -0.2432 0.1007 0.4666 0.906 1.5328 2.8184 3.0560 3.3399 3.6882 4.1297 4.7152 5.5419 6.8252 9.1700 15.29 25.7 87.31 148.4 510.4 2998
β = .4
-1653 -280.8 -81.15 -47.51 -13.73 -8.1061 -4.8661 -3.6640 -3.0241 -2.6205 -2.3381 -2.1257 -1.9573 -1.8185 -1.7005 -0.9946 -0.5671 -0.21 0.1364 0.5117 0.969 1.6303 2.9999 3.2536 3.5568 3.9285 4.3998 5.0242 5.9051 7.2711 9.7633 16.26 27.29 92.54 157.2 540.5 3174
β = .5
-1436 -243.8 -70.36 -41.14 -11.85 -6.9999 -4.2318 -3.2201 -2.6885 -2.3553 -2.1215 -1.9441 -1.8019 -1.6832 -1.5810 -0.9413 -0.5305 -0.1766 0.1738 0.5595 1.0356 1.7314 3.1841 3.4537 3.7758 4.1706 4.6710 5.3336 6.2677 7.7147 10.35 17.22 28.86 97.68 165.9 570.1 3347
β = .6
-1209 -205.1 -59.07 -34.49 -9.9020 -5.8673 -3.6000 -2.7934 -2.3774 -2.1158 -1.9291 -1.7843 -1.6657 -1.5648 -1.4765 -0.8929 -0.4953 -0.1427 0.213 0.6099 1.1053 1.8359 3.3707 3.6559 3.9965 4.4141 4.9431 5.6432 6.6295 8.1563 10.94 18.16 30.41 102.8 174.5 599.2 3517
β = .7
-969 -164.1 -47.13 -27.46 -7.8713 -4.7082 -2.9871 -2.4073 -2.1081 -1.9122 -1.7662 -1.6488 -1.5497 -1.4635 -1.3865 -0.8486 -0.4608 -0.1078 0.2542 0.663 1.1781 1.9434 3.5593 3.8598 4.2188 4.6587 5.2157 5.9529 6.9906 8.5958 11.52 19.1 31.95 107.8 182.9 628 3685
β = .8
-708.9 -119.8 -34.24 -19.9 -5.7378 -3.5372 -2.4515 -2.0975 -1.8938 -1.7482 -1.6330 -1.5365 -1.4525 -1.3775 -1.3094 -0.8074 -0.4266 -0.07175 0.2973 0.7186 1.2536 2.0536 3.7496 4.0652 4.4422 4.9041 5.4889 6.2624 7.3507 9.0333 12.09 20.03 33.46 112.7 191.2 656.3 3851
β = .9
-415.3 -69.81 -19.79 -11.48 -3.5067 -2.5322 -2.0836 -1.8748 -1.7315 -1.6193 -1.5254 -1.4438 -1.3708 -1.3043 -1.2428 -0.7685 -0.3921 -0.03436 0.3426 0.7766 1.3318 2.1662 3.9414 4.2719 4.6667 5.1503 5.7624 6.5717 7.7100 9.4687 12.66 20.95 34.96 117.6 199.4 684.3 4015
β = 1.0
-3.3090 -2.9939 -2.7299 -2.6007 -2.2490 -2.0665 -1.8565 -1.7168 -1.6080 -1.5169 -1.4375 -1.3663 -1.3013 -1.2411 -1.1846 -0.7311 -0.3572 0.004482 0.3898 0.837 1.4123 2.2810 4.1345 4.4797 4.8920 5.3970 6.0360 6.8807 8.0683 9.9022 13.23 21.86 36.44 122.4 207.6 711.9 4176
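For β values between grid points, a tabulated quantile can be approximated by linear interpolation in β. A small sketch (the helper `interp_beta` is illustrative, not from the book), using the α = 1.3, λ = 0.9 entries for β = .3 and β = .4:

```python
# Linear interpolation in beta between two tabulated stable quantiles.
# The values below are the alpha = 1.3, lambda = 0.9 entries for
# beta = .3 (2.8184) and beta = .4 (2.9999) from the table above.
def interp_beta(beta, b_lo, z_lo, b_hi, z_hi):
    """Linearly interpolate a tabulated quantile between beta grid points."""
    w = (beta - b_lo) / (b_hi - b_lo)
    return (1 - w) * z_lo + w * z_hi

z = interp_beta(0.35, 0.3, 2.8184, 0.4, 2.9999)
print(round(z, 5))  # midpoint of the two tabulated entries: 2.90915
```

Linear interpolation is adequate at moderate λ; in the extreme tails (λ near 0 or 1), where the entries change by orders of magnitude across β, interpolation in β should be used with caution.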
Stable quantiles zλ(α, β), α = 1.2

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-5050 -741.4 -194 -108.9 -28.63 -16.16 -9.1641 -6.5953 -5.2289 -4.3687 -3.7712 -3.3284 -2.9845 -2.7081 -2.4796 -1.2928 -0.736 -0.3416 0 0.3416 0.736 1.2928 2.4796 2.7081 2.9845 3.3284 3.7712 4.3687 5.2289 6.5953 9.1641 16.16 28.63 108.9 194 741.4 5050
β = .1
-4626 -678.8 -177.5 -99.62 -26.11 -14.71 -8.3257 -5.9895 -4.7502 -3.9719 -3.4326 -3.0335 -2.7240 -2.4754 -2.2700 -1.1975 -0.6807 -0.3026 0.03526 0.3833 0.7969 1.3961 2.6970 2.9481 3.2518 3.6293 4.1150 4.7693 5.7099 7.2009 9.9979 17.6 31.12 118.1 210.2 802.9 5468
β = .2
-4193 -615.2 -160.7 -90.14 -23.54 -13.24 -7.4829 -5.3842 -4.2749 -3.5806 -3.1008 -2.7465 -2.4722 -2.2520 -2.0701 -1.1109 -0.6307 -0.2657 0.0714 0.4286 0.8634 1.5065 2.9206 3.1940 3.5245 3.9350 4.4627 5.1729 6.1923 7.8058 10.83 19.02 33.58 127.2 226.2 863.4 5880
β = .3
-3751 -550.1 -143.6 -80.46 -20.94 -11.75 -6.6360 -4.7808 -3.8051 -3.1971 -2.7785 -2.4703 -2.2321 -2.0409 -1.8828 -1.0337 -0.5855 -0.2299 0.1092 0.4775 0.9353 1.6230 3.1494 3.4447 3.8016 4.2445 4.8135 5.5786 6.6756 8.4097 11.65 20.43 36.01 136.1 242 923.2 6285
β = .4
-3299 -483.5 -126.1 -70.56 -18.29 -10.24 -5.7860 -4.1815 -3.3437 -2.8252 -2.4700 -2.2094 -2.0080 -1.8462 -1.7118 -0.9659 -0.5439 -0.1942 0.1491 0.5304 1.0121 1.7451 3.3825 3.6993 4.0822 4.5571 5.1668 5.9860 7.1594 9.0124 12.47 21.83 38.42 144.9 257.5 982.2 6686
β = .5
-2833 -415.1 -108 -60.4 -15.58 -8.7119 -4.9350 -3.5905 -2.8967 -2.4716 -2.1825 -1.9706 -1.8063 -1.6731 -1.5613 -0.9065 -0.505 -0.158 0.1916 0.5871 1.0937 1.8719 3.6192 3.9573 4.3658 4.8723 5.5221 6.3946 7.6434 9.6137 13.29 23.22 40.8 153.6 272.9 1040 7082
β = .6
-2352 -344.3 -89.44 -49.92 -12.8 -7.1580 -4.0881 -3.0169 -2.4758 -2.1494 -1.9278 -1.7636 -1.6338 -1.5264 -1.4343 -0.8542 -0.4675 -0.1206 0.2369 0.6477 1.1796 2.0031 3.8591 4.2182 4.6519 5.1896 5.8790 6.8042 8.1275 10.21 14.1 24.59 43.17 162.2 288.1 1098 7473
β = .7
-1850 -270.5 -70.06 -39.01 -9.9482 -5.5812 -3.2587 -2.4836 -2.1072 -1.8796 -1.7195 -1.5957 -1.4940 -1.4070 -1.3303 -0.8072 -0.4306 -0.08159 0.2851 0.7118 1.2694 2.1381 4.1016 4.4816 4.9403 5.5087 6.2373 7.2145 8.6114 10.81 14.91 25.96 45.51 170.8 303.2 1155 7861
β = .8
-1319 -192.4 -49.6 -27.53 -6.9946 -3.9945 -2.4978 -2.0532 -1.8280 -1.6768 -1.5610 -1.4657 -1.3838 -1.3113 -1.2457 -0.764 -0.3935 -0.04063 0.3362 0.7793 1.3628 2.2766 4.3466 4.7471 5.2305 5.8294 6.5967 7.6254 9.0952 11.41 15.72 27.32 47.83 179.2 318.1 1212 8244
β = .9
-739.4 -107.3 -27.39 -15.11 -3.9475 -2.5287 -1.9880 -1.7772 -1.6383 -1.5313 -1.4424 -1.3652 -1.2964 -1.2336 -1.1756 -0.7232 -0.3558 0.002427 0.39 0.85 1.4596 2.4182 4.5937 5.0146 5.5224 6.1514 6.9570 8.0367 9.5786 12 16.52 28.67 50.14 187.6 332.9 1268 8624
β = 1.0
-2.9817 -2.7190 -2.4955 -2.3850 -2.0800 -1.9192 -1.7320 -1.6061 -1.5073 -1.4240 -1.3510 -1.2852 -1.2248 -1.1687 -1.1158 -0.6837 -0.3169 0.04766 0.4467 0.9238 1.5594 2.5626 4.8428 5.2837 5.8158 6.4745 7.3182 8.4483 10.06 12.6 17.32 30.01 52.43 195.9 347.6 1323 9001
Stable quantiles zλ(α, β), α = 1.1

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-1.172e+04 -1445 -334.6 -178.2 -41.35 -22.07 -11.8 -8.1887 -6.3181 -5.1646 -4.3773 -3.8026 -3.3628 -3.0139 -2.7293 -1.3274 -0.7317 -0.3345 0 0.3345 0.7317 1.3274 2.7293 3.0139 3.3628 3.8026 4.3773 5.1646 6.3181 8.1887 11.8 22.07 41.35 178.2 334.6 1445 1.172e+04
β = .1
-1.065e+04 -1313 -303.8 -161.7 -37.41 -19.92 -10.62 -7.3607 -5.6753 -4.6385 -3.9322 -3.4178 -3.0247 -2.7133 -2.4595 -1.2099 -0.6681 -0.2929 0.03678 0.3803 0.8034 1.4556 3.0090 3.3242 3.7102 4.1962 4.8303 5.6978 6.9666 9.0204 12.98 24.21 45.26 194.6 365.1 1576 1.278e+04
β = .2
-9567 -1179 -272.7 -145.1 -33.44 -17.77 -9.4467 -6.5376 -5.0398 -4.1211 -3.4970 -3.0437 -2.6980 -2.4247 -2.2023 -1.1045 -0.6124 -0.2543 0.075 0.4312 0.8829 1.5930 3.2971 3.6425 4.0651 4.5968 5.2897 6.2365 7.6195 9.8549 14.16 26.34 49.15 210.8 395.4 1706 1.383e+04
β = .3
-8473 -1044 -241.2 -128.2 -29.44 -15.6 -8.2719 -5.7214 -4.4138 -3.6151 -3.0747 -2.6835 -2.3861 -2.1515 -1.9609 -1.0127 -0.5635 -0.217 0.1158 0.4876 0.9698 1.7386 3.5923 3.9675 4.4265 5.0033 5.7545 6.7800 8.2761 10.69 15.33 28.46 53.02 227 425.5 1835 1.488e+04
β = .4
-7365 -907 -209.4 -111.2 -25.41 -13.42 -7.1013 -4.9150 -3.8011 -3.1248 -2.6699 -2.3422 -2.0942 -1.8990 -1.7406 -0.9348 -0.5199 -0.1797 0.1601 0.5495 1.0634 1.8912 3.8934 4.2983 4.7933 5.4150 6.2239 7.3273 8.9357 11.53 16.51 30.58 56.87 243 455.4 1963 1.591e+04
β = .5
-6239 -768 -177 -93.89 -21.33 -11.23 -5.9384 -4.1236 -3.2079 -2.6573 -2.2904 -2.0281 -1.8305 -1.6750 -1.5484 -0.8695 -0.4795 -0.1411 0.2082 0.6168 1.1631 2.0500 4.1998 4.6342 5.1648 5.8310 6.6973 7.8780 9.5978 12.37 17.69 32.69 60.7 258.9 485.1 2091 1.695e+04
β = .6
-5093 -626.5 -144.1 -76.31 -17.22 -9.0372 -4.7902 -3.3568 -2.6459 -2.2263 -1.9509 -1.7555 -1.6077 -1.4898 -1.3918 -0.8141 -0.4407 -0.1003 0.2603 0.6891 1.2684 2.2145 4.5109 4.9744 5.5405 6.2508 7.1740 8.4316 10.26 13.21 18.86 34.79 64.51 274.7 514.6 2217 1.797e+04
β = .7
-3920 -481.7 -110.5 -58.36 -13.05 -6.8411 -3.6718 -2.6359 -2.1427 -1.8615 -1.6775 -1.5430 -1.4367 -1.3481 -1.2715 -0.7658 -0.4019 -0.05693 0.3164 0.7662 1.3788 2.3839 4.8261 5.3186 5.9199 6.6741 7.6539 8.9878 10.93 14.05 20.03 36.88 68.31 290.5 543.9 2343 1.899e+04
β = .8
-2711 -332.5 -75.9 -39.92 -8.8318 -4.6652 -2.6278 -2.0303 -1.7655 -1.6041 -1.4865 -1.3925 -1.3129 -1.2431 -1.1803 -0.7216 -0.3623 -0.01062 0.3765 0.8478 1.4939 2.5580 5.1450 5.6663 6.3026 7.1004 8.1364 9.5462 11.6 14.89 21.2 38.97 72.09 306.1 573.1 2468 2e+04
β = .9
-1442 -176.1 -39.75 -20.74 -4.6025 -2.6226 -1.8944 -1.6775 -1.5430 -1.4416 -1.3581 -1.2859 -1.2215 -1.1629 -1.1085 -0.6796 -0.3212 0.03874 0.4404 0.9335 1.6133 2.7361 5.4672 6.0172 6.6883 7.5294 8.6213 10.11 12.27 15.73 22.37 41.05 75.85 321.7 602.2 2593 2.101e+04
β = 1.0
-2.6631 -2.4497 -2.2649 -2.1723 -1.9127 -1.7734 -1.6088 -1.4967 -1.4079 -1.3325 -1.2660 -1.2057 -1.1500 -1.0980 -1.0488 -0.6384 -0.2781 0.09117 0.508 1.0233 1.7367 2.9181 5.7925 6.3710 7.0766 7.9609 9.1084 10.67 12.94 16.58 23.54 43.12 79.6 337.2 631.1 2717 2.201e+04
Stable quantiles zλ(α, β), α = 1.0

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-3.183e+04 -3183 -636.6 -318.3 -63.66 -31.82 -15.89 -10.58 -7.9158 -6.3138 -5.2422 -4.4737 -3.8947 -3.4420 -3.0777 -1.3764 -0.7265 -0.3249 0 0.3249 0.7265 1.3764 3.0777 3.4420 3.8947 4.4737 5.2422 6.3138 7.9158 10.58 15.89 31.82 63.66 318.3 636.6 3183 3.183e+04
β = .1
-2.865e+04 -2865 -572.6 -286.1 -57.06 -28.45 -14.16 -9.4043 -7.0257 -5.5978 -4.6447 -3.9626 -3.4496 -3.0492 -2.7274 -1.2306 -0.653 -0.2814 0.03717 0.3748 0.8114 1.5362 3.4421 3.8489 4.3538 4.9987 5.8533 7.0433 8.8194 11.77 17.64 35.2 70.27 350.5 700.7 3502 3.501e+04
β = .2
-2.546e+04 -2546 -508.6 -254 -50.86 -25.1 -12.45 -8.2453 -6.1514 -4.8980 -4.0636 -3.4680 -3.0212 -2.6732 -2.3942 -1.1011 -0.5909 -0.2422 0.07671 0.4321 0.907 1.7081 3.8186 4.2679 4.8249 5.5355 6.4763 7.7844 9.7345 12.97 19.4 38.59 76.89 382.7 764.7 3821 3.82e+04
β = .3
-2.228e+04 -2227 -444.6 -221.9 -44.03 -21.76 -10.75 -7.1054 -5.2965 -4.2179 -3.5025 -2.9937 -2.6133 -2.3180 -2.0819 -0.9905 -0.539 -0.2047 0.1204 0.4972 1.0123 1.8906 4.2056 4.6972 5.3063 6.0827 7.1094 8.5358 10.66 14.18 21.16 42 83.52 414.9 828.8 4140 4.138e+04
β = .4
-1.91e+04 -1908 -373.8 -187.1 -37.38 -18.45 -9.0737 -5.9892 -4.4660 -3.5626 -2.9668 -2.5453 -2.2318 -1.9895 -1.7966 -0.9003 -0.4945 -0.1665 0.1692 0.5698 1.1264 2.0825 4.6019 5.1358 5.7969 6.6390 7.7517 9.2962 11.59 15.39 22.94 45.41 90.16 447.1 892.9 4458 4.457e+04
β = .5
-1.591e+04 -1589 -316.6 -157.9 -31.66 -15.17 -7.4295 -4.9043 -3.6676 -2.9405 -2.4652 -2.1319 -1.8860 -1.6973 -1.5478 -0.8291 -0.454 -0.1259 0.2235 0.6495 1.2485 2.2827 5.0064 5.5827 6.2957 7.2034 8.4021 10.06 12.54 16.62 24.72 48.83 96.8 479.3 957 4777 4.775e+04
β = .6
-1.273e+04 -1271 -252.7 -125.6 -24.39 -11.93 -5.8270 -3.8630 -2.9148 -2.3660 -2.0133 -1.7701 -1.5929 -1.4575 -1.3494 -0.772 -0.4147 -0.08178 0.2834 0.736 1.3779 2.4904 5.4184 6.0370 6.8019 7.7752 9.0598 10.84 13.49 17.85 26.51 52.25 103.5 511.5 1021 5096 5.093e+04
β = .7
-9548 -952.1 -188.8 -93.65 -17.98 -8.7428 -4.2884 -2.8898 -2.2354 -1.8710 -1.6454 -1.4917 -1.3771 -1.2856 -1.2087 -0.7236 -0.3747 -0.03373 0.3488 0.8288 1.5141 2.7050 5.8371 6.4980 7.3149 8.3538 9.7242 11.62 14.44 19.09 28.31 55.69 110.1 543.7 1085 5415 5.412e+04
β = .8
-6362 -633.5 -125.1 -61.76 -11.66 -5.6648 -2.8655 -2.0501 -1.7108 -1.5297 -1.4082 -1.3150 -1.2380 -1.1714 -1.1120 -0.6797 -0.3327 0.01852 0.4195 0.9274 1.6563 2.9258 6.2621 6.9652 7.8340 8.9384 10.39 12.41 15.4 20.34 30.11 59.13 116.8 576 1149 5734 5.73e+04
β = .9
-3179 -315.2 -61.53 -30.23 -5.5836 -2.8420 -1.8055 -1.5738 -1.4439 -1.3487 -1.2713 -1.2046 -1.1452 -1.0910 -1.0407 -0.6373 -0.2881 0.07502 0.4952 1.0314 1.8043 3.1524 6.6927 7.4381 8.3588 9.5288 11.07 13.21 16.37 21.59 31.92 62.57 123.5 608.2 1213 6053 6.048e+04
β = 1.0
-2.3525 -2.1849 -2.0367 -1.9613 -1.7458 -1.6275 -1.4855 -1.3873 -1.3087 -1.2413 -1.1814 -1.1267 -1.0759 -1.0282 -0.9828 -0.5948 -0.2405 0.1357 0.5756 1.1406 1.9575 3.3843 7.1287 7.9162 8.8888 10.12 11.75 14 17.34 22.85 33.73 66.02 130.1 640.5 1278 6372 6.367e+04
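The α = 1, β = .0 row is the standard Cauchy distribution, whose quantile function has the closed form z_λ = tan(π(λ − 1/2)). A short check against the symmetric row above:

```python
import math

# Closed-form Cauchy quantiles: for alpha = 1, beta = 0 the standardized
# stable law is standard Cauchy, with z_lambda = tan(pi * (lambda - 1/2)).
def cauchy_quantile(lam):
    return math.tan(math.pi * (lam - 0.5))

print(round(cauchy_quantile(0.90), 4))  # table gives 3.0777
print(round(cauchy_quantile(0.99), 2))  # table gives 31.82
```

This closed form is exact, so it also shows the accuracy of the tabulated entries at α = 1, β = 0.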
Stable quantiles zλ(α, β), α = 0.9

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-1.07e+05 -8281 -1385 -640.9 -107 -49.41 -22.75 -14.42 -10.41 -8.0679 -6.5404 -5.4677 -4.6743 -4.0641 -3.5805 -1.4470 -0.7209 -0.3121 0 0.3121 0.7209 1.4470 3.5805 4.0641 4.6743 5.4677 6.5404 8.0679 10.41 14.42 22.75 49.41 107 640.9 1385 8281 1.07e+05
β = .1
-9.514e+04 -7365 -1231 -569.6 -94.83 -43.68 -20.04 -12.66 -9.1150 -7.0521 -5.7084 -4.7667 -4.0714 -3.5378 -3.1156 -1.2644 -0.635 -0.2675 0.03608 0.3663 0.8224 1.6482 4.0672 4.6129 5.3004 6.1932 7.3983 9.1116 11.73 16.22 25.52 55.22 119.3 713.1 1540 9207 1.189e+05
β = .2
-8.347e+04 -6461 -1079 -499.1 -82.83 -38.04 -17.37 -10.94 -7.8570 -6.0681 -4.9060 -4.0936 -3.4954 -3.0374 -2.6759 -1.1034 -0.5655 -0.2292 0.07604 0.4311 0.9381 1.8656 4.5731 5.1814 5.9471 6.9402 8.2790 10.18 13.08 18.05 28.33 61.11 131.8 786 1697 1.014e+04 1.31e+05
β = .3
-7.196e+04 -5569 -929.9 -429.7 -71.03 -32.5 -14.76 -9.2636 -6.6398 -5.1211 -4.1381 -3.4534 -2.9510 -2.5677 -2.2662 -0.9679 -0.511 -0.1929 0.1224 0.5066 1.0666 2.0973 5.0960 5.7677 6.6122 7.7065 9.1805 11.27 14.46 19.91 31.18 67.07 144.4 859.6 1855 1.109e+04 1.432e+05
β = .4
-6.063e+04 -4691 -782.7 -361.4 -59.45 -27.08 -12.23 -7.6463 -5.4710 -4.2184 -3.4119 -2.8530 -2.4452 -2.1356 -1.8934 -0.8616 -0.4671 -0.1548 0.1761 0.5922 1.2068 2.3420 5.6345 6.3699 7.2940 8.4905 10.1 12.38 15.86 21.79 34.06 73.08 157.1 933.8 2015 1.204e+04 1.555e+05
β = .5
-4.951e+04 -3830 -638.3 -294.4 -48.12 -21.8 -9.7792 -6.0967 -4.3619 -3.3707 -2.7378 -2.3030 -1.9884 -1.7518 -1.5683 -0.7839 -0.4281 -0.1126 0.2376 0.6872 1.3576 2.5983 6.1870 6.9867 7.9910 9.2905 11.04 13.51 17.28 23.71 36.98 79.16 169.9 1009 2176 1.3e+04 1.678e+05
β = .6
-3.864e+04 -2987 -497.2 -228.9 -37.09 -16.69 -7.4358 -4.6336 -3.3300 -2.5954 -2.1335 -1.8214 -1.5995 -1.4355 -1.3099 -0.7268 -0.3895 -0.06518 0.3067 0.7911 1.5180 2.8654 6.7524 7.6170 8.7021 10.11 11.99 14.66 18.72 25.64 39.93 85.29 182.8 1084 2339 1.397e+04 1.803e+05
β = .7
-2.806e+04 -2168 -360 -165.4 -26.46 -11.8 -5.2348 -3.2897 -2.4078 -1.9262 -1.6348 -1.4464 -1.3159 -1.2183 -1.1401 -0.6803 -0.3488 -0.01201 0.383 0.9032 1.6875 3.1422 7.3299 8.2598 9.4263 10.93 12.96 15.82 20.18 27.6 42.91 91.48 195.9 1160 2502 1.494e+04 1.929e+05
β = .8
-1.788e+04 -1380 -228.1 -104.3 -16.35 -7.2253 -3.2499 -2.1376 -1.6766 -1.4549 -1.3243 -1.2313 -1.1573 -1.0946 -1.0393 -0.6379 -0.3047 0.04698 0.4663 1.0230 1.8652 3.4282 7.9185 8.9141 10.16 11.78 13.94 17 21.66 29.58 45.91 97.71 209 1237 2667 1.592e+04 2.055e+05
β = .9
-8273 -636.2 -104 -47.08 -7.1201 -3.2105 -1.7331 -1.4643 -1.3387 -1.2508 -1.1803 -1.1199 -1.0661 -1.0169 -0.9711 -0.596 -0.2565 0.1117 0.5563 1.1500 2.0506 3.7227 8.5175 9.5793 10.91 12.63 14.94 18.2 23.15 31.58 48.94 104 222.2 1313 2832 1.691e+04 2.183e+05
β = 1.0
-2.0500 -1.9242 -1.8100 -1.7508 -1.5777 -1.4802 -1.3607 -1.2766 -1.2083 -1.1492 -1.0961 -1.0473 -1.0016 -0.9583 -0.917 -0.5526 -0.2038 0.1821 0.6525 1.2838 2.2433 4.0251 9.1264 10.25 11.67 13.49 15.94 19.4 24.66 33.59 52 110.3 235.5 1391 2999 1.79e+04 2.311e+05
Stable quantiles zλ(α, β), α = 0.8

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-4.829e+05 -2.715e+04 -3631 -1526 -203.4 -85.14 -35.48 -21.17 -14.63 -10.96 -8.6301 -7.0381 -5.8868 -5.0191 -4.3439 -1.5508 -0.7157 -0.2952 0 0.2952 0.7157 1.5508 4.3439 5.0191 5.8868 7.0381 8.6301 10.96 14.63 21.17 35.48 85.14 203.4 1526 3631 2.715e+04 4.829e+05
β = .1
-4.233e+05 -2.38e+04 -3181 -1337 -177.7 -74.2 -30.79 -18.31 -12.61 -9.4206 -7.4036 -6.0259 -5.0316 -4.2837 -3.7029 -1.3187 -0.6139 -0.2501 0.03301 0.3542 0.8384 1.8086 5.0217 5.7938 6.7845 8.0970 9.9091 12.55 16.72 24.13 40.31 96.38 229.7 1720 4091 3.059e+04 5.44e+05
β = .2
-3.654e+05 -2.054e+04 -2744 -1152 -152.8 -63.6 -26.26 -15.55 -10.68 -7.9521 -6.2352 -5.0657 -4.2238 -3.5922 -3.1030 -1.1157 -0.5348 -0.2148 0.07235 0.4281 0.9801 2.0892 5.7327 6.6040 7.7208 9.1985 11.24 14.2 18.88 27.18 45.27 107.9 256.7 1919 4563 3.411e+04 6.065e+05
β = .3
-3.092e+05 -1.738e+04 -2321 -974.2 -128.7 -53.36 -21.9 -12.91 -8.8309 -6.5589 -5.1324 -4.1641 -3.4697 -2.9507 -2.5502 -0.9467 -0.4781 -0.1818 0.1212 0.5163 1.1390 2.3905 6.4741 7.4469 8.6925 10.34 12.61 15.91 21.1 30.31 50.36 119.7 284.3 2121 5044 3.77e+04 6.704e+05
β = .4
-2.55e+05 -1.433e+04 -1913 -802.2 -105.5 -43.53 -17.73 -10.4 -7.0877 -5.2517 -4.1049 -3.3305 -2.7780 -2.3675 -2.0524 -0.8178 -0.437 -0.145 0.1806 0.6181 1.3135 2.7107 7.2436 8.3200 9.6971 11.52 14.02 17.66 23.38 33.51 55.56 131.7 312.4 2328 5535 4.136e+04 7.354e+05
β = .5
-2.03e+05 -1.141e+04 -1521 -637.4 -83.27 -34.16 -13.79 -8.0419 -5.4647 -4.0459 -3.1666 -2.5778 -2.1615 -1.8550 -1.6220 -0.7322 -0.4014 -0.1017 0.2505 0.7323 1.5025 3.0483 8.0392 9.2213 10.73 12.73 15.47 19.45 25.71 36.79 60.87 144 341.1 2539 6034 4.509e+04 8.017e+05
β = .6
-1.536e+05 -8628 -1149 -480.8 -62.26 -25.33 -10.12 -5.8694 -3.9873 -2.9643 -2.3392 -1.9271 -1.6407 -1.4337 -1.2797 -0.6772 -0.3649 -0.05077 0.3305 0.8584 1.7050 3.4022 8.8593 10.15 11.8 13.97 16.96 21.29 28.1 40.13 66.29 156.5 370.2 2753 6542 4.888e+04 8.69e+05
β = .7
-1.072e+05 -6019 -799.9 -334 -42.66 -17.16 -6.7731 -3.9291 -2.6980 -2.0466 -1.6615 -1.4178 -1.2569 -1.1460 -1.0639 -0.6349 -0.3243 0.008115 0.4202 0.9954 1.9199 3.7713 9.7025 11.1 12.89 15.24 18.48 23.17 30.53 43.54 71.8 169.2 399.8 2970 7058 5.273e+04 9.374e+05
β = .8
-6.457e+04 -3621 -479.5 -199.3 -24.87 -9.8407 -3.8712 -2.3148 -1.6849 -1.3864 -1.2341 -1.1392 -1.0687 -1.0108 -0.9604 -0.5957 -0.2783 0.07489 0.5191 1.1427 2.1466 4.1546 10.57 12.08 14 16.54 20.03 25.08 33.01 47.01 77.4 182.1 429.9 3191 7582 5.663e+04 1.007e+06
β = .9
-2.714e+04 -1518 -198.8 -81.78 -9.7016 -3.8137 -1.7121 -1.3473 -1.2253 -1.1459 -1.0835 -1.0303 -0.9829 -0.9394 -0.8987 -0.5551 -0.2263 0.1493 0.6268 1.2998 2.3844 4.5514 11.45 13.07 15.14 17.87 21.61 27.03 35.53 50.54 83.1 195.1 460.4 3415 8113 6.059e+04 1.077e+06
β = 1.0
-1.7563 -1.6675 -1.5842 -1.5401 -1.4073 -1.3300 -1.2330 -1.1631 -1.1054 -1.0548 -1.0088 -0.9661 -0.9258 -0.8873 -0.8502 -0.5114 -0.1681 0.2312 0.7427 1.4661 2.6326 4.9610 12.36 14.09 16.31 19.22 23.22 29.02 38.1 54.12 88.87 208.4 491.3 3642 8651 6.461e+04 1.149e+06
Stable quantiles zλ(α, β), α = 0.7

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-3.333e+06 -1.242e+05 -1.246e+04 -4626 -461.9 -170.6 -62.57 -34.61 -22.64 -16.24 -12.34 -9.7538 -7.9390 -6.6056 -5.5918 -1.7082 -0.7121 -0.273 0 0.273 0.7121 1.7082 5.5918 6.6056 7.9390 9.7538 12.34 16.24 22.64 34.61 62.57 170.6 461.9 4626 1.246e+04 1.242e+05 3.333e+06
β = .1
-2.868e+06 -1.069e+05 -1.072e+04 -3977 -396.3 -146 -53.31 -29.37 -19.14 -13.68 -10.37 -8.1729 -6.6352 -5.5080 -4.6527 -1.4054 -0.5896 -0.2283 0.02741 0.3378 0.8628 2.0486 6.5996 7.7795 9.3287 11.43 14.42 18.93 26.32 40.11 72.28 196.3 530.5 5304 1.428e+04 1.424e+05 3.82e+06
β = .2
-2.423e+06 -9.032e+04 -9053 -3359 -333.8 -122.6 -44.53 -24.42 -15.85 -11.29 -8.5208 -6.6988 -5.4244 -4.4928 -3.7879 -1.1442 -0.4979 -0.1986 0.06497 0.4229 1.0393 2.4237 7.6713 9.0245 10.8 13.21 16.62 21.76 30.18 45.87 82.41 223.1 601.9 6008 1.618e+04 1.612e+05 4.325e+06
β = .3
-2.003e+06 -7.462e+04 -7477 -2773 -274.6 -100.5 -36.26 -19.77 -12.77 -9.0565 -6.8137 -5.3412 -4.3152 -3.5680 -3.0050 -0.9297 -0.4386 -0.1713 0.1164 0.5272 1.2396 2.8309 8.8030 10.34 12.34 15.07 18.92 24.72 34.21 51.87 92.94 250.9 676 6738 1.814e+04 1.808e+05 4.849e+06
β = .4
-1.607e+06 -5.987e+04 -5995 -2222 -219.1 -79.77 -28.55 -15.46 -9.9333 -7.0124 -5.2582 -4.1125 -3.3188 -2.7440 -2.3136 -0.7694 -0.403 -0.1373 0.1822 0.6495 1.4621 3.2683 9.9913 11.71 13.96 17.01 21.32 27.8 38.4 58.09 103.9 279.7 752.5 7493 2.017e+04 2.01e+05 5.391e+06
β = .5
-1.238e+06 -4.613e+04 -4616 -1710 -167.6 -60.61 -21.47 -11.53 -7.3610 -5.1756 -3.8733 -3.0299 -2.4507 -2.0353 -1.7273 -0.6717 -0.3736 -0.09347 0.2621 0.7889 1.7054 3.7340 11.23 13.15 15.65 19.03 23.82 31 42.73 64.53 115.1 309.4 831.5 8271 2.226e+04 2.218e+05 5.949e+06
β = .6
-9.003e+05 -3.353e+04 -3352 -1240 -120.5 -43.18 -15.08 -8.0203 -5.0932 -3.5775 -2.6864 -2.1180 -1.7341 -1.4640 -1.2679 -0.6216 -0.3408 -0.03895 0.3555 0.9443 1.9684 4.2267 12.53 14.64 17.4 21.13 26.4 34.31 47.22 71.17 126.8 340 912.8 9072 2.441e+04 2.432e+05 6.524e+06
β = .7
-5.968e+05 -2.222e+04 -2217 -818.5 -78.45 -27.74 -9.5029 -5.0112 -3.1882 -2.2675 -1.7422 -1.4189 -1.2104 -1.0723 -0.9793 -0.5865 -0.3013 0.02635 0.4619 1.1149 2.2500 4.7450 13.87 16.18 19.21 23.29 29.07 37.72 51.84 78.02 138.7 371.5 996.4 9895 2.662e+04 2.652e+05 7.114e+06
β = .8
-3.344e+05 -1.244e+04 -1237 -454.9 -42.51 -14.69 -4.9296 -2.6316 -1.7517 -1.3425 -1.1400 -1.0363 -0.9693 -0.9174 -0.8733 -0.5524 -0.2537 0.1022 0.5806 1.3000 2.5494 5.2878 15.26 17.78 21.08 25.53 31.82 41.24 56.59 85.05 151 403.8 1082 1.074e+04 2.889e+04 2.878e+05 7.719e+06
β = .9
-1.242e+05 -4609 -454 -165.2 -14.5 -4.8482 -1.7557 -1.2236 -1.1011 -1.0319 -0.9789 -0.934 -0.8939 -0.8568 -0.8219 -0.5144 -0.1978 0.1884 0.7113 1.4990 2.8659 5.8541 16.69 19.43 23.01 27.84 34.65 44.85 61.48 92.28 163.6 436.9 1170 1.16e+04 3.121e+04 3.109e+05 8.339e+06
β = 1.0
-1.4728 -1.4155 -1.3595 -1.3290 -1.2335 -1.1756 -1.1006 -1.0451 -0.9982 -0.9565 -0.918 -0.8818 -0.8473 -0.8139 -0.7814 -0.4709 -0.1334 0.2844 0.8533 1.7112 3.1986 6.4430 18.17 21.13 24.99 30.21 37.56 48.56 66.49 99.68 176.6 470.7 1260 1.249e+04 3.359e+04 3.345e+05 8.973e+06
Stable quantiles zλ(α, β), α = 0.6

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-4.361e+07 -9.394e+05 -6.422e+04 -2.021e+04 -1374 -429.2 -133 -66.55 -40.5 -27.44 -19.89 -15.1 -11.86 -9.5619 -7.8640 -1.9595 -0.712 -0.2446 0 0.2446 0.712 1.9595 7.8640 9.5619 11.86 15.1 19.89 27.44 40.5 66.55 133 429.2 1374 2.021e+04 6.422e+04 9.394e+05 4.361e+07
β = .1
-3.659e+07 -7.881e+05 -5.386e+04 -1.695e+04 -1150 -358.4 -110.6 -55.09 -33.39 -22.54 -16.27 -12.31 -9.6399 -7.7446 -6.3497 -1.5473 -0.5621 -0.2007 0.01872 0.3163 0.901 2.4325 9.5251 11.55 14.29 18.13 23.81 32.74 48.17 78.88 157.1 505.1 1613 2.37e+04 7.529e+04 1.101e+06 5.112e+07
β = .2
-3.007e+07 -6.476e+05 -4.425e+04 -1.392e+04 -941.9 -292.8 -89.83 -44.53 -26.86 -18.04 -12.97 -9.7776 -7.6254 -6.1045 -4.9888 -1.1995 -0.454 -0.1798 0.05321 0.4152 1.1269 2.9634 11.33 13.7 16.9 21.4 28.02 38.43 56.38 92.05 182.8 585.8 1868 2.741e+04 8.705e+04 1.273e+06 5.91e+07
β = .3
-2.407e+07 -5.183e+05 -3.54e+04 -1.113e+04 -751 -232.6 -70.88 -34.91 -20.94 -13.99 -10.01 -7.5064 -5.8303 -4.6510 -3.7896 -0.9207 -0.3905 -0.1613 0.1069 0.5402 1.3879 3.5500 13.27 16.01 19.71 24.89 32.51 44.48 65.1 106 210 671.3 2138 3.133e+04 9.949e+04 1.455e+06 6.753e+07
β = .4
-1.861e+07 -4.009e+05 -2.737e+04 -8601 -577.9 -178.1 -53.8 -26.28 -15.65 -10.39 -7.3878 -5.5156 -4.2679 -3.3955 -2.7624 -0.7174 -0.3638 -0.1318 0.18 0.69 1.6827 4.1901 15.34 18.47 22.69 28.6 37.28 50.9 74.34 120.8 238.7 761.4 2422 3.546e+04 1.126e+05 1.646e+06 7.641e+07
β = .5
-1.374e+07 -2.958e+05 -2.018e+04 -6338 -423.5 -129.7 -38.7 -18.71 -11.03 -7.2695 -5.1420 -3.8247 -2.9550 -2.3529 -1.9207 -0.6 -0.3437 -0.08837 0.272 0.8638 2.0099 4.8821 17.53 21.08 25.85 32.52 42.32 57.67 84.06 136.4 268.9 856 2720 3.978e+04 1.263e+05 1.847e+06 8.572e+07
β = .6
-9.47e+06 -2.039e+05 -1.39e+04 -4360 -288.9 -87.62 -25.71 -12.25 -7.1514 -4.6786 -3.3006 -2.4604 -1.9150 -1.5446 -1.2845 -0.5579 -0.3168 -0.03038 0.3823 1.0607 2.3685 5.6243 19.86 23.83 29.18 36.65 47.61 64.78 94.27 152.7 300.6 954.9 3032 4.431e+04 1.407e+05 2.057e+06 9.545e+07
β = .7
-5.862e+06 -1.262e+05 -8588 -2689 -175.8 -52.51 -15.02 -7.0337 -4.0697 -2.6686 -1.9101 -1.4633 -1.1852 -1.0064 -0.8898 -0.5337 -0.2795 0.04208 0.5104 1.2800 2.7575 6.4154 22.3 26.73 32.67 40.98 53.16 72.22 104.9 169.7 333.6 1058 3357 4.903e+04 1.556e+05 2.275e+06 1.056e+08
β = .8
-2.982e+06 -6.414e+04 -4353 -1358 -86.51 -25.13 -6.9164 -3.2080 -1.9062 -1.3342 -1.0559 -0.9218 -0.8562 -0.8116 -0.7754 -0.507 -0.2311 0.1288 0.6558 1.5211 3.1762 7.2542 24.87 29.76 36.33 45.51 58.96 79.99 116.1 187.4 368 1166 3695 5.394e+04 1.712e+05 2.503e+06 1.162e+08
β = .9
-9.392e+05 -2.016e+04 -1356 -418.8 -24.84 -6.7993 -1.9032 -1.1095 -0.9632 -0.9062 -0.8644 -0.829 -0.7971 -0.7673 -0.7389 -0.4732 -0.1713 0.2294 0.818 1.7832 3.6237 8.1395 27.54 32.93 40.15 50.24 65 88.08 127.7 205.9 403.7 1277 4046 5.903e+04 1.874e+05 2.739e+06 1.271e+08
β = 1.0
-1.2024 -1.1700 -1.1365 -1.1176 -1.0555 -1.0157 -0.962 -0.9207 -0.8849 -0.8523 -0.8217 -0.7925 -0.7641 -0.7363 -0.7089 -0.431 -0.1001 0.3436 0.9966 2.0660 4.0994 9.0704 30.34 36.22 44.13 55.15 71.28 96.49 139.7 225 440.8 1393 4410 6.431e+04 2.041e+05 2.983e+06 1.385e+08
Stable quantiles zλ(α, β), α = 0.5

λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-1.592e+09 -1.591e+07 -6.36e+05 -1.588e+05 -6303 -1560 -382 -166.2 -91.52 -57.3 -38.91 -27.94 -20.9 -16.12 -12.74 -2.3975 -0.7182 -0.2089 0 0.2089 0.7182 2.3975 12.74 16.12 20.9 27.94 38.91 57.3 91.52 166.2 382 1560 6303 1.588e+05 6.36e+05 1.591e+07 1.592e+09
β = .1
-1.289e+09 -1.289e+07 -5.15e+05 -1.286e+05 -5094 -1258 -306.7 -132.9 -72.84 -45.41 -30.7 -21.95 -16.35 -12.56 -9.8850 -1.7942 -0.5308 -0.1666 0.006833 0.2882 0.9633 3.1129 15.98 20.15 26.02 34.67 48.11 70.6 112.4 203.3 465.6 1894 7640 1.923e+05 7.697e+05 1.925e+07 1.926e+09
β = .2
-1.019e+09 -1.018e+07 -4.068e+05 -1.016e+05 -4014 -988.3 -239.6 -103.3 -56.3 -34.91 -23.48 -16.7 -12.38 -9.4605 -7.4111 -1.3033 -0.4019 -0.1575 0.0365 0.4043 1.2657 3.9403 19.6 24.64 31.73 42.15 58.31 85.3 135.3 244.2 557.4 2261 9106 2.289e+05 9.161e+05 2.292e+07 2.292e+09
β = .3
-7.798e+08 -7.796e+06 -3.114e+05 -7.77e+04 -3062 -751.3 -180.9 -77.41 -41.91 -25.82 -17.25 -12.19 -8.9813 -6.8267 -5.3198 -0.9255 -0.3327 -0.1511 0.0917 0.5568 1.6253 4.8795 23.6 29.59 38.01 50.36 69.49 101.4 160.5 288.8 657.6 2660 1.07e+04 2.687e+05 1.075e+06 2.689e+07 2.69e+09
β = .4
-5.729e+08 -5.727e+06 -2.286e+05 -5.703e+04 -2239 -546.8 -130.4 -55.3 -29.68 -18.12 -12.01 -8.4266 -6.1643 -4.6577 -3.6116 -0.6621 -0.3175 -0.1284 0.1727 0.7456 2.0419 5.9304 27.99 35.01 44.87 59.31 81.66 118.9 187.7 337.2 766 3092 1.242e+04 3.117e+05 1.247e+06 3.119e+07 3.119e+09
β = .5
-3.979e+08 -3.976e+06 -1.587e+05 -3.955e+04 -1544 -374.7 -88.22 -36.94 -19.59 -11.83 -7.7639 -5.4013 -3.9257 -2.9539 -2.2873 -0.5168 -0.3104 -0.08695 0.2794 0.9707 2.5155 7.0930 32.76 40.89 52.3 69.01 94.83 137.8 217.2 389.3 882.6 3556 1.428e+04 3.579e+05 1.432e+06 3.581e+07 3.581e+09
β = .6
-2.546e+08 -2.544e+06 -1.015e+05 -2.526e+04 -978.7 -235.1 -54.3 -22.33 -11.65 -6.9394 -4.5081 -3.1186 -2.2666 -1.7169 -1.3485 -0.4839 -0.2918 -0.02615 0.4119 1.2318 3.0459 8.3672 37.91 47.24 60.32 79.44 109 158.1 248.8 445.1 1008 4053 1.626e+04 4.072e+05 1.629e+06 4.074e+07 4.074e+09
β = .7
-1.432e+08 -1.431e+06 -5.697e+04 -1.416e+04 -541.4 -128 -28.65 -11.47 -5.8603 -3.4525 -2.2466 -1.5815 -1.1908 -0.9517 -0.8021 -0.4743 -0.2589 0.05409 0.57 1.5290 3.6331 9.7530 43.44 54.05 68.91 90.62 124.1 179.8 282.5 504.7 1141 4582 1.836e+04 4.598e+05 1.84e+06 4.599e+07 4.6e+09
β = .8
-6.365e+07 -6.355e+05 -2.524e+04 -6253 -232.8 -53.3 -11.29 -4.3603 -2.2272 -1.3755 -0.9889 -0.8053 -0.727 -0.691 -0.6643 -0.4577 -0.211 0.1538 0.7538 1.8622 4.2771 11.25 49.35 61.32 78.07 102.5 140.3 202.9 318.4 568.1 1282 5144 2.06e+04 5.155e+05 2.062e+06 5.157e+07 5.157e+09
β = .9
-1.591e+07 -1.586e+05 -6246 -1532 -52.8 -11.11 -2.2084 -1.0276 -0.81 -0.7671 -0.7379 -0.7129 -0.6901 -0.6684 -0.6473 -0.4304 -0.1477 0.2731 0.9632 2.2313 4.9778 12.86 55.65 69.06 87.82 115.2 157.4 227.4 356.4 635.2 1432 5738 2.297e+04 5.745e+05 2.298e+06 5.745e+07 5.745e+09
β = 1.0
-0.9487 -0.9339 -0.9175 -0.9076 -0.8731 -0.8493 -0.8152 -0.7877 -0.7629 -0.7397 -0.7173 -0.6954 -0.6737 -0.6521 -0.6304 -0.3911 -0.06907 0.4118 1.1981 2.6364 5.7353 14.58 62.33 77.26 98.14 128.6 175.5 253.3 396.6 706 1590 6365 2.546e+04 6.366e+05 2.546e+06 6.366e+07 6.366e+09
B Stable Quantiles 291
λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-3.509+11 -1.109+09 -1.982+07 -3.5+06 -6.197+04 -1.081+04 -1862 -657.8 -311.8 -173.6 -107 -70.67 -49.14 -35.51 -26.46 -3.2817 -0.7358 -0.1646 0 0.1646 0.7358 3.2817 26.46 35.51 49.14 70.67 107 173.6 311.8 657.8 1862 1.081+04 6.197+04 3.5+06 1.982+07 1.109+09 3.509+11
β = .1
-2.696+11 -8.524+08 -1.523+07 -2.688+06 -4.751+04 -8272 -1418 -498.7 -235.3 -130.3 -79.92 -52.54 -36.34 -26.13 -19.37 -2.2766 -0.4937 -0.1262 -0.007285 0.2508 1.0726 4.5404 34.93 46.69 64.33 92.14 138.9 224.5 401.6 844.1 2380 1.377+04 7.878+04 4.443+06 2.516+07 1.408+09 4.453+11
β = .2
-2.008+11 -6.349+08 -1.134+07 -2.001+06 -3.53+04 -6129 -1045 -365.2 -171.3 -94.3 -57.46 -37.54 -25.8 -18.43 -13.57 -1.5076 -0.3393 -0.1307 0.01473 0.388 1.5109 6.0692 44.87 59.75 82.04 117.1 175.9 283.4 505.4 1059 2976 1.717+04 9.807+04 5.524+06 3.128+07 1.75+09 5.535+11
Stable quantiles zλ (α, β), α = 0.4
β = .3 -1.438+11 -4.547+08 -8.12+06 -1.432+06 -2.519+04 -4358 -737.3 -255.8 -119 -65.04 -39.31 -25.48 -17.37 -12.31 -8.9916 -0.956 -0.2653 -0.139 0.06898 0.5797 2.0574 7.8836 56.34 74.81 102.4 145.8 218.3 350.8 623.9 1303 3654 2.103+04 1.199+05 6.75+06 3.821+07 2.138+09 6.761+11
β = .4 -9.784+10 -3.092+08 -5.521+06 -9.734+05 -1.705+04 -2936 -491.9 -168.9 -77.79 -42.05 -25.15 -16.13 -10.88 -7.6337 -5.5245 -0.6015 -0.2625 -0.1261 0.1576 0.8294 2.7181 9.9984 69.42 91.93 125.5 178.2 266.4 426.9 757.6 1579 4418 2.536+04 1.445+05 8.125+06 4.6+07 2.573+09 8.137+11
β = .5 -6.202+10 -1.96+08 -3.497+06 -6.163+05 -1.074+04 -1837 -303.4 -102.7 -46.6 -24.82 -14.63 -9.2525 -6.1639 -4.2770 -3.0686 -0.4233 -0.2711 -0.0896 0.2825 1.1403 3.4990 12.43 84.17 111.2 151.6 214.7 320.2 512.2 907.1 1887 5269 3.019+04 1.719+05 9.657+06 5.466+07 3.057+09 9.669+11
β = .6 -3.55+10 -1.122+08 -2+06 -3.521+05 -6087 -1030 -166.6 -55.18 -24.49 -12.77 -7.3848 -4.5950 -3.0269 -2.0915 -1.5084 -0.3973 -0.2638 -0.02802 0.4457 1.5154 4.4056 15.19 100.7 132.8 180.6 255.4 380.2 607 1073 2229 6212 3.554+04 2.021+05 1.135+07 6.424+07 3.593+09 1.136+12
β = .7 -1.729+10 -5.463+07 -9.727+05 -1.71+05 -2916 -485.1 -75.67 -24.17 -10.37 -5.2557 -2.9856 -1.8567 -1.2516 -0.9109 -0.7134 -0.4053 -0.2384 0.05998 0.6491 1.9577 5.4435 18.28 119 156.6 212.7 300.3 446.4 711.6 1256 2605 7250 4.141+04 2.354+05 1.321+07 7.476+07 4.181+09 1.322+12
β = .8 -6.276+09 -1.981+07 -3.518+05 -6.163+04 -1023 -164.3 -23.84 -7.1373 -2.9447 -1.5113 -0.9379 -0.6913 -0.5871 -0.5544 -0.5385 -0.4017 -0.1937 0.1757 0.8945 2.4700 6.6178 21.74 139.1 182.9 248 349.7 519 826.4 1457 3017 8385 4.783+04 2.717+05 1.524+07 8.624+07 4.823+09 1.525+12
β = .9 -1.109+09 -3.496+06 -6.159+04 -1.068+04 -163.1 -23.52 -2.9043 -0.9655 -0.6451 -0.6144 -0.5981 -0.5839 -0.5704 -0.5571 -0.5437 -0.3838 -0.1287 0.3202 1.1836 3.0552 7.9335 25.55 161.2 211.7 286.6 403.7 598.4 951.5 1675 3465 9621 5.482+04 3.112+05 1.745+07 9.873+07 5.521+09 1.746+12
β = 1.0 -0.7163 -0.7117 -0.7059 -0.7021 -0.6876 -0.6764 -0.659 -0.6438 -0.6295 -0.6154 -0.6013 -0.587 -0.5724 -0.5575 -0.5421 -0.3505 -0.04244 0.4949 1.5182 3.7159 9.3955 29.75 185.2 243 328.7 462.3 684.6 1087 1913 3952 1.096+04 6.239+04 3.54+05 1.984+07 1.122+08 6.276+09 1.985+12
λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-2.836+15 -1.316+12 -6.147+09 -6.088+08 -2.809+06 -2.739+05 -2.624+04 -6554 -2422 -1109 -581.7 -334.8 -206.3 -133.8 -90.39 -5.6199 -0.7774 -0.1109 0 0.1109 0.7774 5.6199 90.39 133.8 206.3 334.8 581.7 1109 2422 6554 2.624+04 2.739+05 2.809+06 6.088+08 6.147+09 1.316+12 2.836+15
β = .1
-1.996+15 -9.26+11 -4.326+09 -4.283+08 -1.973+06 -1.919+05 -1.829+04 -4544 -1670 -760.7 -396.6 -226.9 -138.9 -89.54 -60.09 -3.4571 -0.4459 -0.08219 -0.02038 0.1989 1.2939 8.6079 130 191.5 293.6 474.3 820.2 1557 3384 9116 3.635+04 3.779+05 3.868+06 8.368+08 8.448+09 1.808+12 3.896+15
β = .2
-1.348+15 -6.253+11 -2.92+09 -2.891+08 -1.328+06 -1.288+05 -1.22+04 -3010 -1099 -497 -257.2 -146 -88.68 -56.69 -37.72 -1.9770 -0.2618 -0.09914 -0.01059 0.3615 2.0350 12.57 180.4 264.6 404.1 650.1 1120 2117 4585 1.231+04 4.89+04 5.067+05 5.178+06 1.119+09 1.129+10 2.416+12 5.207+15
Stable quantiles zλ (α, β), α = 0.3
β = .3 -8.636+14 -4.006+11 -1.87+09 -1.851+08 -8.477+05 -8.187+04 -7690 -1882 -680.8 -305 -156.3 -87.82 -52.77 -33.37 -21.95 -1.0441 -0.19 -0.1212 0.0367 0.6148 3.0419 17.66 243.1 355.2 540.7 867 1488 2806 6058 1.621+04 6.422+04 6.635+05 6.77+06 1.461+09 1.475+10 3.155+12 6.799+15
β = .4 -5.166+14 -2.396+11 -1.118+09 -1.106+08 -5.045+05 -4.846+04 -4502 -1089 -389.1 -172 -86.96 -48.18 -28.53 -17.77 -11.52 -0.5305 -0.1985 -0.1216 0.1296 0.9758 4.3576 24.04 319.7 465.7 706.7 1130 1935 3637 7832 2.091+04 8.262+04 8.515+05 8.678+06 1.871+09 1.888+10 4.039+12 8.704+15
β = .5 -2.813+14 -1.305+11 -6.086+08 -6.014+07 -2.727+05 -2.6+04 -2377 -565.3 -198.4 -86.11 -42.68 -23.17 -13.44 -8.2013 -5.2124 -0.3177 -0.2217 -0.09561 0.2765 1.4620 6.0264 31.87 411.6 598.1 905.5 1444 2467 4627 9942 2.648+04 1.044+05 1.074+06 1.093+07 2.356+09 2.377+10 5.084+12 1.096+16
β = .6 -1.337+14 -6.2+10 -2.889+08 -2.852+07 -1.282+05 -1.209+04 -1078 -249.8 -85.25 -35.91 -17.27 -9.0954 -5.1312 -3.0612 -1.9199 -0.2987 -0.2283 -0.03847 0.4865 2.0915 8.0944 41.33 520.6 754.8 1140 1815 3094 5792 1.242+04 3.302+04 1.299+05 1.334+06 1.357+07 2.921+09 2.947+10 6.304+12 1.358+16
β = .7 -5.125+13 -2.376+10 -1.106+08 -1.089+07 -4.824+04 -4460 -381.5 -84.39 -27.42 -10.99 -5.0502 -2.5757 -1.4445 -0.8944 -0.6165 -0.3225 -0.2151 0.05495 0.7686 2.8832 10.61 52.58 648.3 938.2 1415 2248 3825 7148 1.53+04 4.061+04 1.596+05 1.635+06 1.662+07 3.576+09 3.608+10 7.716+12 1.663+16
β = .8 -1.326+13 -6.145+09 -2.851+07 -2.799+06 -1.203+04 -1068 -83.53 -16.76 -4.9695 -1.8924 -0.9118 -0.5669 -0.444 -0.4067 -0.4005 -0.3334 -0.179 0.19 1.1323 3.8565 13.62 65.82 796.4 1151 1732 2748 4669 8712 1.862+04 4.934+04 1.936+05 1.981+06 2.012+07 4.328+09 4.365+10 9.336+12 2.012+16
β = .9 -1.316+12 -6.083+08 -2.798+06 -2.717+05 -1063 -82.68 -4.8889 -0.9295 -0.4869 -0.4525 -0.447 -0.4418 -0.4363 -0.4306 -0.4244 -0.3288 -0.1167 0.372 1.5876 5.0310 17.17 81.21 966.7 1395 2097 3322 5636 1.05+04 2.241+04 5.932+04 2.324+05 2.376+06 2.411+07 5.183+09 5.228+10 1.118+13 2.409+16
β = 1.0 -0.5086 -0.5079 -0.5069 -0.5061 -0.5026 -0.4994 -0.4937 -0.488 -0.4822 -0.476 -0.4695 -0.4625 -0.455 -0.447 -0.4384 -0.3064 -0.02456 0.6068 2.1443 6.4273 21.32 98.95 1161 1673 2512 3974 6734 1.253+04 2.672+04 7.062+04 2.764+05 2.822+06 2.863+07 6.15+09 6.203+10 1.326+13 2.858+16
λ
0.00001 0.00010 0.00050 0.00100 0.00500 0.01000 0.02000 0.03000 0.04000 0.05000 0.06000 0.07000 0.08000 0.09000 0.10000 0.20000 0.30000 0.40000 0.50000 0.60000 0.70000 0.80000 0.90000 0.91000 0.92000 0.93000 0.94000 0.95000 0.96000 0.97000 0.98000 0.99000 0.99500 0.99900 0.99950 0.99990 0.99999
β = .0
-1.878+23 -1.877+18 -5.993+14 -1.868+13 -5.857+09 -1.784+08 -5.291+06 -6.607+05 -1.485+05 -4.606+04 -1.75+04 -7646 -3700 -1934 -1075 -16.84 -0.8856 -0.05039 0 0.05039 0.8856 16.84 1075 1934 3700 7646 1.75+04 4.606+04 1.485+05 6.607+05 5.291+06 1.784+08 5.857+09 1.868+13 5.993+14 1.877+18 1.878+23
β = .1
-1.109+23 -1.108+18 -3.538+14 -1.102+13 -3.448+09 -1.047+08 -3.084+06 -3.825+05 -8.538+04 -2.628+04 -9909 -4295 -2061 -1068 -588.3 -8.2079 -0.3737 -0.04172 -0.02548 0.1242 1.9033 31.52 1840 3287 6244 1.282+04 2.915+04 7.627+04 2.445+05 1.081+06 8.611+06 2.888+08 9.457+09 3.01+13 9.655+14 3.023+18 3.024+23
β = .2
-6.152+22 -6.149+17 -1.963+14 -6.113+12 -1.906+09 -5.762+07 -1.684+06 -2.07+05 -4.58+04 -1.397+04 -5214 -2237 -1062 -544.2 -296.1 -3.5433 -0.1634 -0.06488 -0.03235 0.3118 3.7038 54.92 2987 5307 1.003+04 2.047+04 4.632+04 1.206+05 3.846+05 1.693+06 1.342+07 4.481+08 1.464+10 4.653+13 1.492+15 4.67+18 4.672+23
Stable quantiles zλ (α, β), α = 0.2
β = .3 -3.156+22 -3.153+17 -1.006+14 -3.132+12 -9.725+08 -2.925+07 -8.456+05 -1.028+05 -2.247+04 -6768 -2494 -1055 -493.6 -249.1 -133.4 -1.3137 -0.1124 -0.09154 -0.004781 0.6799 6.6475 90.39 4646 8215 1.545+04 3.141+04 7.076+04 1.834+05 5.827+05 2.555+06 2.018+07 6.711+08 2.189+10 6.945+13 2.227+15 6.969+18 6.971+23
β = .4 -1.46+22 -1.459+17 -4.652+13 -1.447+12 -4.469+08 -1.335+07 -3.803+05 -4.553+04 -9793 -2899 -1049 -435.1 -199.3 -98.39 -51.45 -0.4351 -0.1298 -0.1058 0.07929 1.3157 11.19 142.1 6970 1.228+04 2.3+04 4.659+04 1.046+05 2.701+05 8.551+05 3.737+06 2.941+07 9.751+08 3.176+10 1.006+14 3.226+15 1.01+19 1.01+24
β = .5 -5.867+21 -5.862+16 -1.868+13 -5.805+11 -1.779+08 -5.262+06 -1.468+05 -1.72+04 -3612 -1042 -366.8 -147.7 -65.56 -31.28 -15.78 -0.1991 -0.1572 -0.09887 0.2489 2.3298 17.89 215 1.014+04 1.78+04 3.324+04 6.712+04 1.502+05 3.868+05 1.221+06 5.32+06 4.175+07 1.381+09 4.49+10 1.421+14 4.555+15 1.425+19 1.426+24
β = .6 -1.922+21 -1.92+16 -6.111+12 -1.897+11 -5.746+07 -1.674+06 -4.527+04 -5121 -1036 -286.7 -96.41 -36.97 -15.57 -7.0413 -3.3794 -0.1947 -0.1756 -0.05885 0.5412 3.8586 27.42 314.9 1.437+04 2.516+04 4.685+04 9.434+04 2.105+05 5.407+05 1.702+06 7.399+06 5.793+07 1.911+09 6.207+10 1.963+14 6.291+15 1.968+19 1.969+24
β = .7 -4.562+20 -4.554+15 -1.447+12 -4.48+10 -1.331+07 -3.782+05 -9676 -1029 -194.1 -49.7 -15.36 -5.4023 -2.1200 -0.9418 -0.4964 -0.2226 -0.1796 0.02939 1.0020 6.0672 40.59 448.8 1.991+04 3.477+04 6.458+04 1.297+05 2.888+05 7.401+05 2.325+06 1.008+07 7.878+07 2.593+09 8.414+10 2.659+14 8.52+15 2.665+19 2.666+24
β = .8 -6.007+19 -5.992+14 -1.896+11 -5.844+09 -1.67+06 -4.501+04 -1022 -94.43 -15.14 -3.2969 -0.9497 -0.4144 -0.2885 -0.2623 -0.2597 -0.2437 -0.1621 0.1847 1.6868 9.1519 58.33 624.6 2.704+04 4.711+04 8.731+04 1.75+05 3.889+05 9.945+05 3.118+06 1.35+07 1.052+08 3.457+09 1.121+11 3.539+14 1.134+16 3.547+19 3.548+24
β = .9 -1.877+18 -1.867+13 -5.843+09 -1.775+08 -4.487+04 -1016 -14.93 -0.9577 -0.3184 -0.2919 -0.2918 -0.2911 -0.2903 -0.2893 -0.2881 -0.2543 -0.1141 0.4301 2.6621 13.34 81.73 851.5 3.607+04 6.273+04 1.16+05 2.322+05 5.149+05 1.314+06 4.114+06 1.777+07 1.384+08 4.538+09 1.47+11 4.638+14 1.486+16 4.648+19 4.649+24
β = 1.0 -0.3249 -0.3249 -0.3249 -0.3248 -0.3246 -0.3244 -0.3237 -0.3229 -0.322 -0.3208 -0.3195 -0.3178 -0.3159 -0.3137 -0.311 -0.2498 -0.02485 0.7928 4.0055 18.91 112 1140 4.737+04 8.223+04 1.519+05 3.034+05 6.717+05 1.712+06 5.349+06 2.307+07 1.793+08 5.873+09 1.901+11 5.995+14 1.921+16 6.007+19 6.008+24
Appendix C
Stable Modes
Location of the mode m(α, β) of Z(α, β) ∼ S(α, β; 0). For β < 0, use the symmetry property m(α, −β) = −m(α, β).
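The symmetry property can be applied mechanically when looking values up in the table. The sketch below is illustrative, not part of the book's software: `MODE_TABLE` and `stable_mode` are hypothetical names, and only two grid points are included (both copied from the α = 1.00 rows of the table); a real application would load the full (α, β) grid.

```python
# Hypothetical lookup sketch for the Appendix C mode table.
# Only two tabulated grid points are included as examples.
MODE_TABLE = {
    (1.0, 0.0): 0.00000,
    (1.0, 0.5): -0.26577,
}

def stable_mode(alpha, beta):
    """Mode m(alpha, beta) of Z(alpha, beta) ~ S(alpha, beta; 0).

    For beta < 0, the symmetry property m(alpha, -beta) = -m(alpha, beta)
    reduces the lookup to the tabulated beta >= 0 half of the grid.
    """
    if beta < 0:
        return -MODE_TABLE[(alpha, -beta)]
    return MODE_TABLE[(alpha, beta)]

print(stable_mode(1.0, -0.5))  # → 0.26577
```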
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4
α 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00 1.05 1.10 1.15 1.20 1.25 1.30 1.35 1.40 1.45 1.50 1.55 1.60 1.65 1.70 1.75 1.80 1.85 1.90 1.95 2.00
β = 0.0 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
β = 0.1 -0.00787 -0.01584 -0.02401 -0.03249 -0.04142 -0.05095 -0.06128 -0.06765 -0.07454 -0.07955 -0.08267 -0.08399 -0.08372 -0.08213 -0.07949 -0.07606 -0.07208 -0.06772 -0.06314 -0.05847 -0.05380 -0.04921 -0.04474 -0.04043 -0.03632 -0.03242 -0.02873 -0.02527 -0.02204 -0.01903 -0.01624 -0.01366 -0.01130 -0.00913 -0.00716 -0.00538 -0.00377 -0.00235 -0.00109 0.00000
β = 0.2 -0.01574 -0.03168 -0.04802 -0.06498 -0.08284 -0.10191 -0.11756 -0.13452 -0.14754 -0.15704 -0.16288 -0.16527 -0.16463 -0.16147 -0.15631 -0.14964 -0.14188 -0.13339 -0.12448 -0.11537 -0.10625 -0.09725 -0.08850 -0.08005 -0.07197 -0.06429 -0.05703 -0.05020 -0.04381 -0.03786 -0.03233 -0.02722 -0.02252 -0.01822 -0.01429 -0.01074 -0.00754 -0.00469 -0.00218 0.00000
β = 0.3 -0.02361 -0.04752 -0.07202 -0.09748 -0.12426 -0.14786 -0.17705 -0.20004 -0.21852 -0.23174 -0.23963 -0.24259 -0.24127 -0.23642 -0.22878 -0.21904 -0.20777 -0.19549 -0.18259 -0.16940 -0.15619 -0.14315 -0.13042 -0.11812 -0.10633 -0.09511 -0.08448 -0.07446 -0.06507 -0.05629 -0.04813 -0.04058 -0.03361 -0.02721 -0.02137 -0.01607 -0.01130 -0.00703 -0.00327 0.00000
β = 0.4 -0.03148 -0.06335 -0.09603 -0.12997 -0.16569 -0.19881 -0.23479 -0.26418 -0.28730 -0.30340 -0.31258 -0.31549 -0.31307 -0.30630 -0.29613 -0.28341 -0.26886 -0.25308 -0.23656 -0.21970 -0.20281 -0.18613 -0.16983 -0.15406 -0.13891 -0.12444 -0.11072 -0.09775 -0.08556 -0.07415 -0.06350 -0.05362 -0.04448 -0.03607 -0.02837 -0.02136 -0.01503 -0.00937 -0.00436 0.00000
β = 0.5 -0.03935 -0.07919 -0.12004 -0.16246 -0.20711 -0.24976 -0.29177 -0.32686 -0.35380 -0.37193 -0.38163 -0.38388 -0.37990 -0.37092 -0.35810 -0.34243 -0.32474 -0.30572 -0.28592 -0.26577 -0.24561 -0.22571 -0.20626 -0.18742 -0.16929 -0.15195 -0.13545 -0.11983 -0.10510 -0.09126 -0.07832 -0.06627 -0.05508 -0.04475 -0.03526 -0.02660 -0.01874 -0.01170 -0.00545 0.00000
α 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00 1.05 1.10 1.15 1.20 1.25 1.30 1.35 1.40 1.45 1.50 1.55 1.60 1.65 1.70 1.75 1.80 1.85 1.90 1.95 2.00
β = 0.6 -0.04722 -0.09503 -0.14405 -0.19495 -0.24353 -0.29986 -0.34792 -0.38800 -0.41794 -0.43729 -0.44678 -0.44776 -0.44177 -0.43030 -0.41468 -0.39605 -0.37534 -0.35328 -0.33047 -0.30736 -0.28431 -0.26160 -0.23941 -0.21791 -0.19720 -0.17736 -0.15845 -0.14049 -0.12350 -0.10750 -0.09248 -0.07843 -0.06534 -0.05321 -0.04202 -0.03176 -0.02243 -0.01401 -0.00653 0.00000
β = 0.7 -0.05509 -0.11087 -0.16806 -0.22744 -0.28495 -0.34879 -0.40319 -0.44754 -0.47967 -0.49945 -0.50804 -0.50719 -0.49876 -0.48452 -0.46597 -0.44436 -0.42070 -0.39577 -0.37020 -0.34443 -0.31884 -0.29367 -0.26914 -0.24537 -0.22247 -0.20051 -0.17953 -0.15957 -0.14063 -0.12273 -0.10587 -0.09003 -0.07522 -0.06142 -0.04863 -0.03684 -0.02607 -0.01632 -0.00762 0.00000
β = 0.8 -0.06296 -0.12671 -0.19206 -0.25994 -0.32637 -0.39731 -0.45751 -0.50540 -0.53894 -0.55841 -0.56544 -0.56223 -0.55099 -0.53372 -0.51211 -0.48750 -0.46097 -0.43332 -0.40519 -0.37704 -0.34920 -0.32192 -0.29539 -0.26972 -0.24499 -0.22127 -0.19859 -0.17696 -0.15639 -0.13687 -0.11842 -0.10101 -0.08465 -0.06933 -0.05505 -0.04183 -0.02967 -0.01861 -0.00870 0.00000
β = 0.9 -0.07083 -0.14255 -0.21607 -0.28743 -0.36779 -0.44539 -0.51082 -0.56153 -0.59572 -0.61415 -0.61902 -0.61297 -0.59857 -0.57806 -0.55327 -0.52565 -0.49631 -0.46609 -0.43559 -0.40528 -0.37547 -0.34639 -0.31818 -0.29094 -0.26474 -0.23961 -0.21555 -0.19258 -0.17068 -0.14984 -0.13006 -0.11131 -0.09360 -0.07692 -0.06128 -0.04670 -0.03323 -0.02089 -0.00978 0.00000
β = 1.0 -0.07870 -0.15338 -0.23508 -0.31992 -0.40921 -0.49299 -0.56307 -0.61587 -0.64995 -0.66667 -0.66880 -0.65948 -0.64162 -0.61768 -0.58963 -0.55899 -0.52692 -0.49424 -0.46156 -0.42931 -0.39778 -0.36717 -0.33758 -0.30909 -0.28173 -0.25550 -0.23040 -0.20639 -0.18347 -0.16159 -0.14073 -0.12088 -0.10202 -0.08415 -0.06729 -0.05146 -0.03673 -0.02315 -0.01086 0.00000
Appendix D
Asymptotic standard deviations and correlation coefficients for ML estimators
These tables are used to obtain confidence intervals for the maximum likelihood estimators of the stable parameters; see Section 4.7.
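A minimal sketch of how the tabulated values are used, under the standard large-sample assumption that the ML estimator is approximately normal: a (1 − c) confidence interval is the estimate plus or minus z times the tabulated standard deviation divided by √n. The function name and sample inputs (α̂ = 0.5, n = 1000) are illustrative only; σα = 0.484 is the tabulated value for α = 0.500, β = 0.0.

```python
from math import sqrt

def ml_confidence_interval(theta_hat, sigma_tab, n, z=1.96):
    """Approximate interval theta_hat +/- z * sigma_tab / sqrt(n),
    where sigma_tab is the tabulated asymptotic standard deviation."""
    half_width = z * sigma_tab / sqrt(n)
    return (theta_hat - half_width, theta_hat + half_width)

# 95% interval for alpha with alpha_hat = 0.5, n = 1000, sigma_alpha = 0.484:
low, high = ml_confidence_interval(0.5, 0.484, n=1000)
print(round(low, 4), round(high, 4))  # → 0.47 0.53
```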
α β σα 0.500 0.0 0.484 0.1 0.483 0.2 0.482 0.3 0.479 0.4 0.475 0.5 0.470 0.6 0.463 0.7 0.455 0.8 0.437 0.9 0.420 1.0 0.412 0.600 0.0 0.604 0.1 0.604 0.2 0.602 0.3 0.598 0.4 0.594 0.5 0.588 0.6 0.580 0.7 0.570 0.8 0.551 0.9 0.539 1.0 0.500
σβ 1.110 1.106 1.092 1.068 1.033 0.985 0.919 0.829 0.707 0.519 0.000 1.203 1.198 1.183 1.156 1.117 1.063 0.991 0.893 0.759 0.561 0.000
σγ 2.390 2.386 2.376 2.360 2.336 2.305 2.267 2.222 2.136 2.025 1.949 2.075 2.071 2.061 2.045 2.022 1.992 1.954 1.909 1.853 1.775 1.673
σδ 0.769 0.788 0.841 0.920 1.018 1.127 1.241 1.356 1.449 1.533 1.606 0.922 0.936 0.977 1.041 1.122 1.214 1.313 1.413 1.505 1.581 1.637
ρα, β 0.000 -0.011 -0.022 -0.032 -0.041 -0.047 -0.051 -0.052 -0.078 -0.123 * 0.000 -0.007 -0.014 -0.020 -0.025 -0.028 -0.030 -0.028 -0.019 -0.076 *
ρα,γ -0.034 -0.036 -0.042 -0.053 -0.068 -0.088 -0.114 -0.148 -0.193 -0.247 -0.311 0.028 0.026 0.020 0.010 -0.004 -0.023 -0.049 -0.082 -0.107 -0.157 -0.239
ρα, δ 0.000 0.060 0.109 0.145 0.166 0.175 0.173 0.162 0.189 0.233 0.272 0.000 0.061 0.115 0.158 0.189 0.208 0.217 0.215 0.223 0.243 0.293
ρβ,γ 0.000 -0.023 -0.045 -0.065 -0.084 -0.099 -0.109 -0.120 -0.130 -0.140 * 0.000 -0.021 -0.042 -0.061 -0.079 -0.093 -0.103 -0.115 -0.125 -0.135 *
ρβ, δ 0.677 0.653 0.589 0.505 0.415 0.330 0.252 0.184 0.125 0.075 * 0.430 0.418 0.385 0.339 0.286 0.232 0.181 0.134 0.094 0.050 *
ργ, δ 0.000 0.218 0.408 0.556 0.664 0.741 0.797 0.837 0.900 0.930 0.960 0.000 0.179 0.341 0.477 0.585 0.669 0.732 0.781 0.825 0.875 0.910
α β σα 0.700 0.0 0.728 0.1 0.728 0.2 0.726 0.3 0.722 0.4 0.717 0.5 0.710 0.6 0.700 0.7 0.689 0.8 0.673 0.9 0.667 1.0 0.632 0.800 0.0 0.855 0.1 0.854 0.2 0.851 0.3 0.847 0.4 0.841 0.5 0.834 0.6 0.824 0.7 0.810 0.8 0.791 0.9 0.775 1.0 0.742 0.900 0.0 0.980 0.1 0.979 0.2 0.973 0.3 0.972 0.4 0.963 0.5 0.957 0.6 0.947 0.7 0.929 0.8 0.913 0.9 0.900 1.0 0.849 1.000 0.0 1.102 0.1 1.079 0.2 1.074 0.3 1.071 0.4 1.063 0.5 1.056 0.6 1.042 0.7 1.027 0.8 1.008 0.9 0.988 1.0 0.949 1.100 0.0 1.219 0.1 1.218 0.2 1.215 0.3 1.210 0.4 1.203 0.5 1.193 0.6 1.180 0.7 1.163
σβ 1.312 1.306 1.289 1.259 1.215 1.155 1.075 0.967 0.821 0.601 0.000 1.436 1.429 1.410 1.376 1.327 1.260 1.171 1.053 0.948 0.634 0.000 1.575 1.568 1.544 1.507 1.450 1.379 1.230 1.123 1.005 0.694 0.000 1.732 1.704 1.677 1.637 1.575 1.498 1.367 1.208 1.092 0.750 0.000 1.911 1.902 1.875 1.829 1.762 1.671 1.552 1.395
σγ 1.851 1.848 1.839 1.823 1.801 1.773 1.737 1.693 1.641 1.580 1.517 1.684 1.681 1.673 1.658 1.638 1.611 1.577 1.536 1.500 1.436 1.360 1.553 1.550 1.534 1.527 1.504 1.486 1.466 1.424 1.365 1.283 1.216 1.445 1.438 1.427 1.417 1.407 1.381 1.346 1.305 1.265 1.225 1.135 1.355 1.353 1.346 1.336 1.321 1.301 1.276 1.244
σδ 1.073 1.084 1.115 1.164 1.227 1.301 1.382 1.466 1.548 1.612 1.667 1.214 1.221 1.244 1.280 1.328 1.385 1.449 1.515 1.586 1.643 1.688 1.338 1.344 1.349 1.383 1.411 1.463 1.517 1.565 1.612 1.658 1.709 1.445 1.445 1.442 1.466 1.483 1.517 1.565 1.597 1.643 1.682 1.718 1.536 1.538 1.546 1.558 1.574 1.595 1.618 1.644
ρα, β 0.000 -0.003 -0.006 -0.008 -0.009 -0.009 -0.008 -0.004 0.004 0.000 * 0.000 0.001 0.002 0.004 0.006 0.009 0.013 0.020 0.009 0.040 * 0.000 0.005 0.014 0.015 0.026 0.026 0.024 0.052 0.075 0.106 * 0.000 0.007 0.019 0.023 0.034 0.037 0.042 0.056 0.082 0.142 * 0.000 0.011 0.022 0.033 0.045 0.057 0.070 0.084
ρα,γ 0.082 0.080 0.075 0.066 0.053 0.035 0.012 -0.020 -0.062 -0.104 -0.177 0.129 0.128 0.123 0.115 0.103 0.087 0.066 0.037 -0.004 -0.054 -0.129 0.170 0.169 0.156 0.156 0.140 0.133 0.123 0.091 0.053 0.022 -0.097 0.206 0.199 0.190 0.187 0.175 0.166 0.142 0.117 0.092 0.063 -0.056 0.237 0.236 0.233 0.227 0.219 0.207 0.191 0.169
ρα, δ 0.000 0.058 0.112 0.158 0.195 0.222 0.240 0.248 0.246 0.265 0.303 0.000 0.055 0.107 0.154 0.194 0.226 0.251 0.268 0.266 0.291 0.335 0.000 0.052 0.099 0.144 0.176 0.227 0.247 0.277 0.297 0.322 0.366 0.000 0.050 0.097 0.140 0.178 0.229 0.257 0.291 0.319 0.340 0.381 0.000 0.048 0.095 0.141 0.184 0.226 0.264 0.298
ρβ,γ 0.000 -0.018 -0.036 -0.053 -0.067 -0.079 -0.087 -0.100 -0.110 -0.120 * 0.000 -0.014 -0.028 -0.041 -0.053 -0.062 -0.070 -0.083 -0.090 -0.100 * 0.000 -0.010 -0.015 -0.026 -0.035 -0.041 -0.054 -0.067 -0.075 -0.085 * 0.000 -0.006 -0.007 -0.014 -0.016 -0.022 -0.033 -0.044 -0.055 -0.065 * 0.000 -0.001 -0.001 -0.001 0.001 0.003 0.008 0.016
ρβ, δ 0.212 0.207 0.193 0.172 0.147 0.120 0.094 0.069 0.049 0.025 * 0.037 0.036 0.032 0.026 0.019 0.012 0.006 0.002 -0.006 -0.010 * -0.100 -0.099 -0.085 -0.088 -0.067 -0.083 -0.060 -0.046 -0.025 -0.015 * -0.206 -0.179 -0.167 -0.160 -0.143 -0.144 -0.136 -0.105 -0.075 -0.050 * -0.290 -0.288 -0.281 -0.269 -0.253 -0.232 -0.205 -0.173
ργ, δ 0.000 0.146 0.284 0.406 0.509 0.594 0.663 0.718 0.762 0.825 0.850 0.000 0.121 0.238 0.345 0.440 0.523 0.593 0.652 0.706 0.750 0.800 0.000 0.102 0.172 0.289 0.372 0.446 0.495 0.583 0.650 0.700 0.750 0.000 0.086 0.155 0.245 0.322 0.404 0.449 0.508 0.595 0.635 0.680 0.000 0.074 0.147 0.218 0.286 0.351 0.412 0.470
α
1.200
1.300
1.400
1.500
1.600
β 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5
σα 1.139 1.103 1.019 1.325 1.324 1.321 1.316 1.308 1.298 1.284 1.265 1.240 1.201 1.111 1.418 1.417 1.413 1.408 1.399 1.388 1.373 1.354 1.326 1.284 1.174 1.490 1.489 1.485 1.479 1.471 1.459 1.443 1.422 1.393 1.347 1.225 1.535 1.534 1.531 1.524 1.515 1.503 1.487 1.464 1.433 1.384 1.271 1.544 1.543 1.539 1.533 1.524 1.511
σβ 1.185 0.878 0.000 2.117 2.108 2.078 2.027 1.954 1.854 1.723 1.552 1.321 0.984 0.000 2.361 2.350 2.318 2.263 2.182 2.073 1.930 1.742 1.488 1.116 0.000 2.658 2.647 2.611 2.551 2.463 2.344 2.187 1.981 1.701 1.287 0.000 3.037 3.024 2.986 2.920 2.825 2.695 2.523 2.295 1.984 1.517 0.000 3.551 3.537 3.496 3.425 3.321 3.179
σγ 1.205 1.155 1.062 1.276 1.274 1.269 1.259 1.246 1.229 1.206 1.178 1.144 1.098 1.011 1.206 1.204 1.199 1.191 1.180 1.165 1.145 1.121 1.090 1.049 0.969 1.141 1.140 1.136 1.129 1.119 1.106 1.090 1.069 1.042 1.006 0.932 1.081 1.080 1.076 1.071 1.063 1.052 1.038 1.020 0.997 0.966 0.900 1.023 1.022 1.019 1.014 1.008 0.999
σδ 1.671 1.698 1.726 1.610 1.612 1.616 1.624 1.634 1.647 1.662 1.679 1.696 1.713 1.741 1.670 1.671 1.674 1.678 1.683 1.690 1.699 1.708 1.718 1.728 1.745 1.716 1.716 1.717 1.719 1.721 1.725 1.728 1.732 1.737 1.741 1.752 1.747 1.747 1.748 1.748 1.748 1.749 1.749 1.750 1.751 1.751 1.752 1.764 1.764 1.764 1.763 1.763 1.762
ρα, β 0.099 0.155 * 0.000 0.014 0.028 0.043 0.057 0.073 0.088 0.105 0.122 0.186 * 0.000 0.017 0.035 0.052 0.070 0.088 0.107 0.126 0.146 0.216 * 0.000 0.021 0.041 0.062 0.084 0.105 0.127 0.149 0.170 0.254 * 0.000 0.025 0.049 0.074 0.099 0.124 0.149 0.173 0.197 0.300 * 0.000 0.029 0.058 0.087 0.116 0.145
ρα,γ 0.138 0.088 -0.033 0.264 0.263 0.260 0.255 0.247 0.236 0.222 0.201 0.172 0.125 0.006 0.287 0.286 0.284 0.279 0.272 0.261 0.248 0.228 0.201 0.155 0.025 0.307 0.306 0.303 0.299 0.292 0.282 0.269 0.251 0.224 0.180 0.047 0.323 0.322 0.320 0.315 0.309 0.299 0.287 0.269 0.243 0.200 0.080 0.336 0.335 0.332 0.328 0.322 0.313
ρα, δ 0.328 0.352 0.412 0.000 0.047 0.093 0.139 0.183 0.226 0.266 0.305 0.340 0.372 0.441 0.000 0.046 0.092 0.137 0.182 0.226 0.269 0.311 0.351 0.390 0.464 0.000 0.046 0.091 0.136 0.181 0.226 0.271 0.315 0.360 0.405 0.499 0.000 0.045 0.090 0.135 0.181 0.226 0.272 0.319 0.367 0.417 0.519 0.000 0.044 0.089 0.134 0.179 0.225
ρβ,γ 0.029 0.047 * 0.000 0.004 0.008 0.013 0.019 0.026 0.034 0.045 0.059 0.077 * 0.000 0.009 0.017 0.027 0.037 0.048 0.060 0.074 0.089 0.106 * 0.000 0.013 0.026 0.040 0.054 0.069 0.084 0.101 0.117 0.132 * 0.000 0.017 0.035 0.053 0.071 0.089 0.108 0.126 0.143 0.155 * 0.000 0.022 0.043 0.065 0.086 0.108
ρβ, δ -0.131 -0.077 * -0.357 -0.354 -0.346 -0.332 -0.313 -0.288 -0.256 -0.216 -0.166 -0.100 * -0.410 -0.407 -0.398 -0.383 -0.362 -0.333 -0.297 -0.252 -0.195 -0.120 * -0.452 -0.449 -0.439 -0.423 -0.400 -0.369 -0.330 -0.281 -0.219 -0.136 * -0.485 -0.481 -0.471 -0.454 -0.430 -0.397 -0.356 -0.304 -0.238 -0.149 * -0.509 -0.505 -0.495 -0.477 -0.451 -0.417
ργ, δ 0.523 0.571 0.615 0.000 0.063 0.126 0.188 0.248 0.307 0.363 0.417 0.468 0.517 0.562 0.000 0.055 0.109 0.163 0.216 0.268 0.319 0.368 0.417 0.464 0.506 0.000 0.047 0.094 0.140 0.187 0.233 0.278 0.323 0.368 0.412 0.453 0.000 0.040 0.080 0.120 0.160 0.200 0.240 0.281 0.321 0.362 0.411 0.000 0.034 0.068 0.102 0.136 0.170
α
1.700
1.800
1.900
1.950
1.990
β 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3
σα 1.493 1.470 1.437 1.387 1.269 1.502 1.501 1.497 1.490 1.481 1.468 1.450 1.426 1.393 1.341 1.220 1.383 1.382 1.378 1.372 1.362 1.349 1.332 1.308 1.276 1.226 1.107 1.127 1.126 1.123 1.117 1.109 1.097 1.082 1.062 1.034 0.993 0.902 0.884 0.883 0.881 0.876 0.870 0.860 0.849 0.833 0.812 0.781 0.694 0.477 0.477 0.476 0.474
σβ 2.989 2.736 2.384 1.844 0.000 4.318 4.304 4.259 4.183 4.070 3.913 3.701 3.411 3.001 2.353 0.000 5.671 5.655 5.607 5.523 5.397 5.218 4.970 4.620 4.108 3.265 0.000 9.060 9.041 8.983 8.879 8.717 8.479 8.132 7.623 6.840 5.490 0.000 14.550 14.523 14.437 14.282 14.039 13.673 13.132 12.323 11.062 8.864 0.000 43.060 42.951 42.614 42.018
σγ 0.988 0.973 0.954 0.928 0.870 0.965 0.964 0.962 0.958 0.953 0.946 0.937 0.926 0.911 0.889 0.841 0.904 0.903 0.902 0.899 0.896 0.891 0.884 0.876 0.864 0.848 0.810 0.835 0.834 0.834 0.832 0.830 0.827 0.823 0.817 0.810 0.800 0.776 0.791 0.791 0.790 0.789 0.788 0.786 0.783 0.780 0.775 0.769 0.743 0.738 0.738 0.738 0.738
σδ 1.761 1.760 1.758 1.757 1.761 1.764 1.764 1.764 1.763 1.762 1.761 1.759 1.757 1.755 1.753 1.758 1.742 1.742 1.742 1.741 1.740 1.739 1.738 1.736 1.735 1.733 1.737 1.682 1.682 1.682 1.681 1.681 1.680 1.679 1.678 1.677 1.675 1.672 1.619 1.619 1.618 1.618 1.618 1.617 1.616 1.615 1.613 1.611 1.606 1.504 1.504 1.504 1.504
ρα, β 0.173 0.200 0.225 0.329 * 0.000 0.034 0.069 0.102 0.124 0.169 0.200 0.229 0.254 0.364 * 0.000 0.040 0.080 0.120 0.141 0.194 0.229 0.259 0.281 0.405 * 0.000 0.046 0.091 0.135 0.166 0.215 0.250 0.278 0.296 0.440 * 0.000 0.046 0.091 0.134 0.175 0.212 0.244 0.269 0.282 0.412 * 0.000 0.035 0.069 0.102
ρα,γ 0.300 0.283 0.257 0.216 0.097 0.344 0.343 0.341 0.336 0.330 0.321 0.309 0.292 0.268 0.227 0.113 0.345 0.345 0.342 0.338 0.332 0.324 0.312 0.296 0.273 0.235 0.129 0.332 0.331 0.329 0.325 0.320 0.312 0.302 0.288 0.268 0.237 0.156 0.306 0.305 0.303 0.300 0.296 0.290 0.281 0.270 0.254 0.229 0.127 0.224 0.223 0.222 0.221
ρα, δ 0.272 0.320 0.371 0.426 0.536 0.000 0.043 0.087 0.131 0.175 0.221 0.268 0.317 0.369 0.429 0.542 0.000 0.041 0.083 0.125 0.167 0.211 0.257 0.306 0.360 0.422 0.527 0.000 0.037 0.074 0.112 0.151 0.191 0.233 0.280 0.331 0.394 0.497 0.000 0.033 0.065 0.099 0.134 0.170 0.208 0.251 0.298 0.357 0.432 0.000 0.024 0.048 0.073
ρβ,γ 0.128 0.148 0.165 0.174 * 0.000 0.025 0.050 0.075 0.100 0.123 0.146 0.166 0.182 0.186 * 0.000 0.028 0.056 0.083 0.109 0.133 0.156 0.174 0.187 0.186 * 0.000 0.027 0.054 0.080 0.105 0.127 0.147 0.162 0.170 0.164 * 0.000 0.023 0.046 0.068 0.088 0.107 0.122 0.134 0.139 0.131 * 0.000 0.012 0.023 0.034
ρβ, δ -0.374 -0.320 -0.251 -0.160 * -0.523 -0.520 -0.509 -0.490 -0.464 -0.429 -0.384 -0.329 -0.259 -0.167 * -0.526 -0.522 -0.511 -0.492 -0.465 -0.430 -0.385 -0.331 -0.263 -0.174 * -0.502 -0.499 -0.488 -0.470 -0.445 -0.412 -0.371 -0.321 -0.259 -0.178 * -0.460 -0.457 -0.448 -0.432 -0.409 -0.380 -0.345 -0.301 -0.246 -0.174 * -0.330 -0.328 -0.322 -0.312
ργ, δ 0.205 0.240 0.276 0.313 0.359 0.000 0.028 0.056 0.084 0.112 0.141 0.170 0.200 0.231 0.264 0.306 0.000 0.022 0.044 0.066 0.089 0.112 0.135 0.160 0.185 0.213 0.248 0.000 0.016 0.031 0.047 0.063 0.079 0.096 0.114 0.134 0.155 0.186 0.000 0.011 0.023 0.035 0.047 0.059 0.072 0.085 0.100 0.117 0.142 0.000 0.006 0.011 0.017
α
β 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.995 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 2.000 0.0
σα 0.471 0.466 0.461 0.454 0.445 0.433 0.377 0.362 0.361 0.361 0.359 0.357 0.354 0.351 0.346 0.341 0.333 0.294 0.000
σβ 41.107 39.786 37.905 35.199 31.153 24.433 0.000 67.412 67.214 66.607 65.545 63.943 61.663 58.478 53.994 47.444 36.833 0.000 ∞
σγ 0.737 0.736 0.736 0.735 0.733 0.731 0.723 0.727 0.727 0.727 0.726 0.726 0.726 0.725 0.725 0.724 0.723 0.719 0.707
σδ 1.503 1.503 1.502 1.501 1.500 1.498 1.493 1.473 1.473 1.473 1.472 1.472 1.472 1.471 1.470 1.469 1.468 1.457 1.414
ρα, β 0.133 0.160 0.183 0.200 0.206 0.312 * 0.000 0.029 0.058 0.085 0.111 0.134 0.154 0.168 0.172 0.273 * *
ρα,γ 0.218 0.215 0.210 0.204 0.197 0.186 0.134 0.187 0.187 0.186 0.185 0.183 0.181 0.177 0.173 0.168 0.160 0.129 *
ρα, δ 0.098 0.125 0.153 0.184 0.217 0.257 0.287 0.000 0.020 0.041 0.062 0.084 0.106 0.130 0.155 0.182 0.214 0.231 *
ρβ,γ 0.044 0.053 0.061 0.066 0.068 0.062 * 0.000 0.008 0.016 0.024 0.031 0.037 0.042 0.045 0.047 0.043 * *
ρβ, δ -0.298 -0.279 -0.255 -0.225 -0.188 -0.136 * -0.272 -0.270 -0.266 -0.258 -0.246 -0.231 -0.212 -0.188 -0.157 -0.114 * *
ργ, δ 0.023 0.029 0.036 0.043 0.050 0.058 0.069 0.000 0.004 0.008 0.012 0.016 0.021 0.025 0.030 0.035 0.040 0.048 0.000
References
Abdul-Hamid, H., and J.P. Nolan. 1998. Multivariate stable densities as functions of one dimensional projections. Journal of Multivariate Analysis 67: 80–89. Abramowitz, M., and I.A. Stegun. 1972. Handbook of Mathematical Functions (Tenth Printing ed.). U.S: Government Printing Office. Ahlfors, L. 1979. Complex Analysis (3rd ed.). McGraw-Hill Book Co. Anderson, C. 2006. The Long Tail: Why the Future of Business Is Selling Less of More. NY, NY: Hyperion. Anderssen, R.S., S.A. Husain, and R.J. Loy. 2004, August. The Kohlrausch function: properties and applications. Australian New Zealand Industrial and Applied Mathematics Journal 45 (E): C800–C816. Andrews, B., M. Calder, and R. Davis. 2009. Maximum likelihood estimation for α-stable autoregressive processes. Annals of Statistics 37 (4): 1946–1982. Antoniadis, A., A. Feuerverger, and P. Gonçalves. 2006. Wavelet-based estimation for univariate stable laws. Annals of the Institute of Statistical Mathematics 58 (4): 779–807. Arce, G.R. 2005. Nonlinear Signal Processing. NY: Wiley. Audus, D., J. Douglas, and J.P. Nolan. 2020. Approximation of α−capacity via stable random walks in R d . Preprint. Barker, A.W. 2015, July. Log quantile differences and the temporal aggregation of alpha-stable moving average processes. Ph.D. thesis, Macquarie University, Department of Statistics. Barthelemy, P., J. Bertolotti, and D.S. Wiersma. 2008. A Lévy flight for light. Nature 453: 495–498. Supplementary information available. Basterfield, D., T. Bundt, and G. Murphy. 2003. The stable Paretian hypothesis and the Asian currency crisis. Technical report, Oregon Graduate Institute. Basterfield, D., T. Bundt, and G. Murphy. 2005a. Risk management and the Asian currency crisis: backtesting stable Paretian value-at-risk. Technical report, Hillsdale College, Department of Economics and Business Administration.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4
305
306
References
Basterfield, D., T. Bundt, and G. Murphy. 2005b. Unconditional stable distributions vs. GARCH: backtesting value at risk during the Asian currency crisis. Technical report, Hillsdale College, Department of Economics and Business Administration.
Beirlant, J., U. Goegebeur, J. Segers, and J.L. Teugels. 2004. Statistics of Extremes: Theory and Applications. Wiley Series in Probability and Statistics. Wiley.
Benson, D.A., R. Schumer, M.M. Meerschaert, and S.W. Wheatcraft. 2001. Fractional dispersion, Lévy motion, and the MADE tracer tests. Transport in Porous Media 42 (1–2): 211–240.
Bergström, H. 1952. On some expansions of stable distributions. Arkiv för Matematik 2: 375–378.
Bingham, N., C. Goldie, and J. Teugels. 1987. Regular Variation. Cambridge University Press.
Blattberg, R., and T. Sargent. 1971. Regression with non-Gaussian disturbances: some sampling results. Econometrica 39: 501–510.
Boldyrev, S., and C. Gwinn. 2003. Scintillations and Lévy flights through the interstellar medium. Physical Review Letters 91: 131101.
Boto, J.P., and N. Stollenwerk. 2009. Fractional calculus and Lévy flights: modelling spatial epidemic spreading. In Proceedings of the International Conference on Computational and Mathematical Methods in Science and Engineering, CMMSE 2009, ed. J. Vigo-Aguiar, vol. 1. CMMSE.
Brant, R. 1984. Approximate likelihood and probability calculations based on transforms. Annals of Statistics 12: 989–1005.
Breiman, L. 1968. Probability. Reading, Mass.: Addison-Wesley.
Breiman, L. 1992. Probability. Philadelphia: SIAM. Reprint of original 1968 edition.
Bremmer, I., and P. Keat. 2009. The Fat Tail. Oxford, UK: Oxford University Press.
Brent, R.P. 1971. An algorithm with guaranteed convergence for finding a zero of a function. The Computer Journal 14: 422–425.
Brockmann, D., and L. Hufnagel. 2007, April. Front propagation in reaction-superdiffusion dynamics: taming Lévy flights with fluctuations. Physical Review Letters 98: 178301.
Brockmann, D., L. Hufnagel, and T. Geisel. 2006. The scaling laws of human travel. Nature 439: 462–465.
Broda, S.A., M. Haas, J. Krause, M.S. Paolella, and S.C. Steude. 2013. Stable mixture GARCH models. Journal of Econometrics 172 (2): 292–306.
Brorsen, B.W., and S.R. Yang. 1990. Maximum likelihood estimates of symmetric stable distribution parameters. Communications in Statistics - Simulation 19: 1459–1464.
Brothers, K.M., W.H. DuMouchel, and A.S. Paulson. 1983. Fractiles of the stable laws. Technical report, Rensselaer Polytechnic Institute, Troy, NY.
Brown, G.W., and J.W. Tukey. 1946. Some distributions of sample means. Annals of Mathematical Statistics 17: 1–12.
Brown, R.J. 2000. Return distributions of private real estate investment. Ph.D. thesis, University of Pennsylvania.
Brown, R.J. 2004. Risk and private real estate investments. Journal of Real Estate Portfolio Management 10 (2): 113–127.
Brown, R.J. 2005. Private Real Estate Investment: Data Analysis and Decision Making. Academic Press/Elsevier.
Brynjolfsson, E., Y.J. Hu, and M.D. Smith. 2006, Summer. From niches to riches: anatomy of the long tail. Sloan Management Review 47 (4): 67–71.
Buckle, D.J. 1993. Stable distributions and portfolio analysis: a Bayesian approach via MCMC. Ph.D. thesis, Imperial College, London.
Buckle, D.J. 1995. Bayesian inference for stable distributions. JASA 90: 605–613.
Carasso, A.S. 2002. The APEX method in image sharpening and the use of low exponent Lévy stable laws. SIAM Journal on Applied Mathematics 63: 593–618.
Caron, F., and E.B. Fox. 2017. Sparse graphs using exchangeable random measures. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79 (5): 1295–1366.
Casella, G., and R. Berger. 1990. Statistical Inference. Wadsworth and Brooks/Cole.
Chakraborty, P., M.M. Meerschaert, and C.Y. Lim. 2009. Parameter estimation for fractional transport: a particle tracking approach. Water Resources Research 45: W10415. https://doi.org/10.1029/2008WR007577.
Chambers, J., C. Mallows, and B. Stuck. 1976. A method for simulating stable random variables. Journal of the American Statistical Association 71 (354): 340–344. Correction in JASA 82: 704 (1987).
Chitre, M.A., J.R. Potter, and S. Ong. 2006. Optimal and near-optimal signal detection in snapping shrimp dominated ambient noise. IEEE Journal of Oceanic Engineering 31 (2): 497–503.
Christoph, G., and W. Wolf. 1992. Convergence Theorems with a Stable Limit Law. Berlin: Akademie Verlag.
Cormode, G. 2003. Stable distributions for stream computations: it's as easy as 0,1,2. In Workshop on Management and Processing of Massive Data Streams at FCRC.
Cormode, G., M. Datar, P. Indyk, and S. Muthukrishnan. 2002. Comparing data streams using Hamming norms (How to zero in). In Proceedings of the 28th VLDB Conference, Hong Kong.
Cormode, G., and P. Indyk. 2006. Stable distributions in streaming computations. In Data Stream Management: Processing High-Speed Data Streams, ed. M. Garofalakis, J. Gehrke, and R. Rastogi. Springer. (The book containing this chapter has yet to be published.)
Cormode, G., and S. Muthukrishnan. 2003. Estimating dominance norms of multiple data streams. In Proceedings of European Symposium on Algorithms.
Cramér, H. 1962. On the approximation to a stable probability distribution. In Studies in Mathematical Analysis and Related Topics, 70–76. Stanford, California: Stanford University Press.
Cramér, H. 1963. On asymptotic expansions for sums of independent random variables with a limiting stable distribution. Sankhyā Ser. A 25: 13–24; addendum, ibid. 25: 216.
Cressie, N. 1975. A note on the behavior of stable distributions for small index α. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 33: 61–64.
Crovella, M., M. Taqqu, and A. Bestavros. 1998. Heavy-tailed probability distributions in the World Wide Web. In A Practical Guide to Heavy Tails, 3–26. Boston: Birkhäuser.
Crovella, M.E., and L. Lipsky. 1997. Long-lasting transient conditions in simulations with heavy-tailed workloads. In Proceedings of the 1997 Winter Simulation Conference, ed. S. Andradottir, K.J. Healy, D.H. Withers, and B.L. Nelson. ACM.
Csörgő, M. 1981. Limit behavior of the empirical characteristic function. Annals of Probability 9: 130–144.
Csörgő, S. 1987. Testing for stability. In Goodness-of-fit (Debrecen, 1984). Colloquia Mathematica Societatis János Bolyai, vol. 45, 101–132. Amsterdam: North-Holland.
Csörgő, T., S. Hegyi, and W.A. Zajc. 2004. Bose-Einstein correlations for Lévy stable source distributions. European Physical Journal C 36: 67–78.
Cushman, J.H., and M. Moroni. 2001. Statistical mechanics with three-dimensional particle tracking velocimetry experiments in the study of anomalous dispersion. I. Theory. Physics of Fluids 13 (1): 75–80.
Cushman, J.H., M. Park, N. Kleinfelter, and M. Moroni. 2005. Super-diffusion via Lévy Lagrangian velocity processes. Geophysical Research Letters 32 (9).
D'Agostino, R.B., and M.A. Stephens. 1986. Goodness-of-Fit Techniques. Marcel Dekker.
Dance, C.R., and E.E. Kuruoğlu. 1999. Estimation of the parameters of skewed α-stable distributions. In Proceedings of the ASA-IMS Conference on Heavy Tailed Distributions, ed. J.P. Nolan and A. Swami.
Davis, R., and S. Resnick. 1985. More limit theory for the sample correlation function of moving averages. Stochastic Processes and Their Applications 20 (2): 257–279.
Davis, R., and S. Resnick. 1986. Limit theory for the sample covariance and correlation functions of moving averages. Annals of Statistics 14: 533–558.
Davydov, Y., I. Molchanov, and S. Zuyev. 2008. Strictly stable distributions on convex cones. Electronic Journal of Probability 13 (11): 259–321.
de Haan, L., and A.F. Ferreira. 2006. Extreme Value Theory. New York: Springer.
de Haan, L., and L. Peng. 1999. Exact rates of convergence to a stable law. The Journal of the London Mathematical Society 2 (59): 1134–1152.
De Vany, A. 2003. Hollywood Economics: How Extreme Uncertainty Shapes the Film Industry. Routledge.
De Vany, A., and W.D. Walls. 1999. Uncertainty in the movie industry: does star power reduce the terror of the box office? Journal of Cultural Economics 23 (4): 285–318.
Dharmadhikari, S.W., and M. Sreehari. 1976. A note on stable characteristic functions. Sankhyā Series A 38 (2): 179–185.
Ditlevsen, P.D. 2004. Turbulence and Climate Dynamics. Copenhagen: Frydenberg.
Doray, L.G., S.M. Jiang, and A. Luong. 2009. Some simple methods of estimation for the parameters of the discrete stable distribution with the probability generating function. Communications in Statistics - Simulation and Computation 38: 2004–2017.
Draper, N.R., and H. Smith. 1981. Applied Regression Analysis (2nd ed.). New York: Wiley.
DuMouchel, W.H. 1971. Stable distributions in statistical inference. Ph.D. thesis, Yale University. University Microfilms, Ann Arbor, Michigan.
DuMouchel, W.H. 1973a. On the asymptotic normality of the maximum-likelihood estimate when sampling from a stable distribution. Annals of Statistics 1: 948–957.
DuMouchel, W.H. 1973b. Stable distributions in statistical inference. 1: symmetric stable distributions compared to other symmetric long-tailed distributions. JASA 68: 469–477.
DuMouchel, W.H. 1975. Stable distributions in statistical inference. 2: information from stably distributed samples. JASA 70: 386–393.
DuMouchel, W.H. 1983. Estimating the stable index α in order to measure tail thickness: a critique. Annals of Statistics 11: 1019–1031.
El Barmi, H., and P.I. Nelson. 1997. Inference from stable distributions. In Selected Proceedings of the Symposium on Estimating Functions (Athens, GA, 1996). IMS Lecture Notes Monograph Series, vol. 32, 439–456. Hayward, CA: Institute of Mathematical Statistics.
Elton, D.C. 2018, August. Stretched Exponential Relaxation. arXiv e-prints, arXiv:1808.00881.
Embrechts, P., C. Klüppelberg, and T. Mikosch. 1997. Modelling Extremal Events for Insurance and Finance. Berlin: Springer.
Epps, T.W. 1993. Characteristic functions and their empirical counterparts: geometrical interpretations and applications to statistical inference. JASA 47: 33–48.
Epstein, B. 1948. Some applications of the Mellin transform in statistics. Annals of Mathematical Statistics 19: 370–379.
Fama, E., and R. Roll. 1968. Some properties of symmetric stable distributions. JASA 63: 817–836.
Fama, E.F. 1965. The behavior of stock market prices. Journal of Business 38: 34–105.
Fan, Z. 2006. Parameter estimation of stable distributions. Communications in Statistics - Theory and Methods 35: 245–255.
Fan, Z. 2009. Minimum-distance estimator for stable exponent. Communications in Statistics - Theory and Methods 38 (3–5): 511–528.
Farsad, N., W. Guo, C. Chae, and A. Eckford. 2015, December. Stable distributions as noise models for molecular communication. In 2015 IEEE Global Communications Conference (GLOBECOM), 1–6.
Feller, W. 1952. On a generalization of Marcel Riesz' potentials and the semi-groups generated by them. In Comm. Sém. Math. Univ. Lund, Tome Supplémentaire Dédié à Marcel Riesz, 74–81. Paris: Gauthier-Villars.
Feller, W. 1968. An Introduction to Probability Theory and Its Applications, vol. 1 (3rd ed.). New York: Wiley.
Feller, W. 1971. An Introduction to Probability Theory and Its Applications, vol. 2 (2nd ed.). New York: Wiley.
Ferguson, T. 1996. A Course in Large Sample Theory. London: Chapman and Hall.
Feuerverger, A., and P. McDunnough. 1981. On efficient inference in symmetric stable laws and processes. In Statistics and Related Topics, ed. M. Csörgő, D.A. Dawson, J.N.K. Rao, and A.K. Saleh, 109–122. Amsterdam: North-Holland.
Feuerverger, A., and R.A. Mureika. 1977. The empirical characteristic function and its applications. Annals of Statistics 5: 88–97.
Fisher, R.A., and L.H.C. Tippett. 1928. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society 24: 180–190.
Fofack, H., and J.P. Nolan. 1999. Tail behavior, modes and other characteristics of stable distributions. Extremes 2 (1): 39–58.
Fofack, H., and J.P. Nolan. 2001. Distribution of parallel exchange rates in African countries. Journal of International Money and Finance 20: 987–1001.
Fougères, A.-L., C. Mercadier, and J.P. Nolan. 2013. Dense classes of multivariate extreme value distributions. Journal of Multivariate Analysis 116: 109–129.
Fougères, A.-L., J.P. Nolan, and H. Rootzén. 2009. Models for dependent extremes using stable mixtures. Scandinavian Journal of Statistics 36: 42–59.
Fox, C. 1961. The G and H-functions as symmetrical Fourier kernels. Transactions of the American Mathematical Society 98: 395–429.
Freeman, M.P., and G. Chisham. 2005. On the probability distributions of SuperDARN Doppler spectral width measurements inside and outside the cusp. Geophysical Research Letters 31. https://doi.org/10.1029/2003GL019074.
Friedland, O., and O. Guédon. 2011. Random embedding of ℓ_p^n into ℓ_r^N. Mathematische Annalen 350 (4): 953–972.
Garcia, R., E. Renault, and D. Veredas. 2011. Estimation of stable distributions by indirect inference. Journal of Econometrics 161 (2): 325–337.
Gatto, R., and S.R. Jammalamadaka. 2003. Inference for wrapped symmetric alpha stable circular models. Sankhyā 65: 333–355.
Gaver, D.P., P.A. Jacobs, and E.A. Seglie. 2004. Reliability growth by "Test-Analyze-Fix-Test" with Bayesian and success-run stopping rules. Technical report, Naval Postgraduate School.
Gawronski, W. 1984. On the bell-shape of stable densities. The Annals of Probability 12: 230–242.
Gawronski, W., and M. Wiessner. 1992. Asymptotics and inequalities for the mode of stable laws. Statistics and Decisions 10: 183–197.
Gaynor, G., E.Y. Chang, S.L. Painter, and L. Paterson. 2000. Application of Lévy random fractal simulation techniques in modelling reservoir mechanisms in the Kuparuk River field, North Slope, Alaska. SPE Reservoir Evaluation & Engineering 3 (3): 263–271.
Geluk, J.L., and L. de Haan. 1987. Regular variation, extensions and Tauberian theorems. Amsterdam: CWI Tract 40.
Geluk, J.L., and L. de Haan. 2000. Stable probability distributions and their domains of attraction: a direct approach. Probability and Mathematical Statistics 20 (1): 169–188.
Geluk, J.L., and L. Peng. 2000. Second order regular variation and the domain of attraction of stable distributions. Analysis 20: 359–371.
Geluk, J.L., L. Peng, and C.G. de Vries. 2000. Convolutions of heavy-tailed random variables and applications to portfolio diversification and MA(1) time series. Advances in Applied Probability 32: 1011–1026.
Georgiadis, A. 2000. Adaptive Equalisation for Impulsive Noise Environments. Ph.D. thesis, University of Edinburgh, Electrical & Electronics Engineering Department.
Gnedenko, B.V. 1943. Sur la distribution limite du terme maximum d'une série aléatoire. Annals of Mathematics 44: 423–453.
Gnedenko, B.V., and A.N. Kolmogorov. 1954. Limit Distributions for Sums of Independent Random Variables. Addison-Wesley.
Gneiting, T. 1997. Normal scale mixtures and dual probability densities. Journal of Statistical Computation and Simulation 59: 375–384.
Godsill, S.J. 1999. MCMC and EM-based methods for inference in heavy-tailed processes with alpha-stable innovations. In IEEE Signal Processing Workshop on Higher-Order Statistics. IEEE.
Godsill, S.J. 2000. Inference in symmetric alpha-stable noise using MCMC and the slice sampler. In IEEE International Conference on Acoustics, Speech and Signal Processing, vol. VI, 3806–3809. IEEE.
Goldie, C.M. 1978. Subexponential distributions and dominated-variation tails. Journal of Applied Probability 15 (2): 440–442.
Gomes, C., and B. Selman. 1999. Heavy-tailed distributions in computational methods. In Proceedings of the ASA-IMS Conference on Heavy Tailed Distributions, ed. J.P. Nolan and A. Swami.
Gomi, C., and Y. Kuzuha. 2013. Simulation of a daily precipitation time series using a stochastic model with filtering. Open Journal of Modern Hydrology 3: 206–213.
Gonzalez, J. 1997. Robust Techniques for Wireless Communications in Non-Gaussian Environments. Ph.D. thesis, Electrical Engineering, University of Delaware, USA.
Gorenflo, R., and F. Mainardi. 1998. Fractional calculus and stable probability distributions. Archives of Mechanics 50: 377–388.
Górska, K., and K.A. Penson. 2011. Lévy stable two-sided distributions: exact and explicit densities for asymmetric case. Physical Review E 83 (6): 061125.
Gradshteyn, I., and I. Ryzhik. 2000. Table of Integrals, Series, and Products. Academic Press.
Graff, R.A., A. Harrington, and M. Young. 1997. The shape of Australian real estate return distributions and comparisons to the United States. Journal of Real Estate Research 14: 291–308.
Guadagnini, A., S.P. Neuman, T. Nan, M. Riva, and C.L. Winter. 2015. Scalable statistics of correlated random variables and extremes applied to deep borehole porosities. Hydrology and Earth System Sciences 19 (2): 729–745.
Guadagnini, A., S.P. Neuman, M.G. Schaap, and M. Riva. 2013. Anisotropic statistical scaling of vadose zone hydraulic property estimates near Maricopa, Arizona. Water Resources Research 49 (12): 8463–8479.
Guadagnini, A., S.P. Neuman, M.G. Schaap, and M. Riva. 2014. Anisotropic statistical scaling of soil and sediment texture in a stratified deep vadose zone near Maricopa, Arizona. Geoderma 214–215: 217–227.
Gunning, J. 2002. On the use of multivariate Lévy-stable random field models for geological heterogeneity. Mathematical Geology 34 (1): 43–62.
Gupta, V.K., and E. Waymire. 1990. Multiscaling properties of spatial rainfall and river flow distributions. Journal of Geophysical Research 95 (D3): 1999–2009.
Hall, P. 1981a. A comedy of errors: the canonical form for a stable characteristic function. Bulletin of the London Mathematical Society 13: 23–27.
Hall, P. 1981b. Two-sided bounds on the rate of convergence to a stable law. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 57: 349–364.
Hall, P. 1984. On unimodality and rates of convergence for stable laws. Journal of the London Mathematical Society 2 (30): 371–384.
Hallin, M., Y. Swan, T. Verdebout, and D. Veredas. 2011. Rank-based testing in linear models with stable errors. Journal of Nonparametric Statistics 23 (2): 305–320.
Hallin, M., Y. Swan, T. Verdebout, and D. Veredas. 2013. One-step R-estimation in linear models with stable errors. Journal of Econometrics 172: 195–204.
Hanagal, D.D. 2011. Modeling Survival Data Using Frailty Models. Boca Raton, FL: CRC Press.
Harchol-Balter, M. 2013. Performance Modeling and Design of Computer Systems: Queueing Theory in Action. Cambridge, UK: Cambridge University Press.
Hardin, C.D. 1984. Skewed stable variables and processes. Technical Report 79, Center for Stochastic Processes, University of North Carolina, Chapel Hill.
Heathcote, C.R. 1982. Linear regression by functional least squares. Journal of Applied Probability (Special Vol. 19A): 225–239. Essays in statistical science.
Heinrich, L., F. Pukelsheim, and U. Schwingenschlögl. 2004. Sainte-Laguë's chi-square divergence for rounding probabilities and its convergence to a stable law. Statistics and Decisions 22: 43–59.
Hill, B. 1975. A simple general approach to inference about the tail of a distribution. Annals of Statistics 3: 1163–1174.
Hoffmann-Jørgensen, J. 1994. Stable densities. Theory of Probability and Its Applications 38: 350–355.
Holt, D., and E. Crow. 1973. Tables and graphs of the stable probability density functions. Journal of Research of the National Bureau of Standards 77B: 143–198.
Houdré, C., and P. Marchal. 2004. On the concentration of measure phenomenon for stable and related random vectors. Annals of Probability 32 (2): 1496–1508.
Hougaard, P. 1986. A class of multivariate failure time distributions. Biometrika 73: 671–678.
Hougaard, P. 2000. Analysis of Multivariate Survival Data. New York: Springer.
Hughes, B.D. 1995. Random Walks and Random Environments, vol. 1. Oxford: Oxford University Press.
Ibragimov, I., and K. Chernin. 1959. On the unimodality of stable laws. Theory of Probability and Its Applications 4: 417–419.
Ibragimov, I.A., and Y.V. Linnik. 1971. Independent and Stationary Sequences of Random Variables. Groningen: Wolters-Noordhoff.
Ibragimov, R. 2005, November. On the robustness of economic models to heavy-tailedness assumptions. Preprint, Bundesbank November 2005 Conference.
Indyk, P. 2000. Stable distributions, pseudorandom generators, embeddings and data stream computation. In Proceedings of the 41st IEEE Symposium on Foundations of Computer Science. IEEE Computer Society.
Jammalamadaka, S.R., and A. SenGupta. 2001. Topics in Circular Statistics. Singapore: World Scientific.
Janicki, A., and A. Weron. 1994. Simulation and Chaotic Behavior of α-Stable Stochastic Processes. New York: Marcel Dekker.
Jaoua, N., E. Duflos, P. Vanheeghe, L. Clavier, and F. Septier. 2014. Joint estimation of state and noise parameters in a linear dynamic system with impulsive measurement noise: application to OFDM systems. Digital Signal Processing 35: 21–36.
Jin, H.J. 2005. Heavy tailed behavior of commodity price distribution and optimal hedging demand. Technical report, Department of Agribusiness and Applied Economics, North Dakota State University. To appear in Journal of Risk and Insurance.
Kanter, M. 1976. On the unimodality of stable densities. The Annals of Probability 4: 1006–1008.
Kaplan, P.D. 2012. Frontiers of Modern Asset Allocation. Hoboken, N.J.: Wiley.
Kapoor, B., A. Banerjee, G.A. Tsihrintzis, and N. Nandhakumar. 1999. UWB radar detection of targets in foliage using alpha-stable clutter models. IEEE Transactions on Aerospace and Electronic Systems 35 (3): 819–834.
Kesten, H., M.V. Kozlov, and F. Spitzer. 1975. A limit law for random walk in a random environment. Compositio Mathematica 30: 145–168.
Khindanova, I., and Z. Atakhanova. 2002. Stable modeling in energy risk management. Mathematical Methods of Operations Research 55 (2): 225–245.
Khindanova, I., S. Rachev, and E. Schwartz. 2001. Stable modeling of value at risk. Mathematical and Computer Modelling 34 (9–11): 1223–1259.
Kidmose, P. 2001. Independent components analysis using the spectral measure for alpha-stable distributions. In Proceedings of IEEE-EURASIP 2001 Workshop on Nonlinear Signal and Image Processing, CD-ROM NSIP-2001 (file cr1126.pdf).
King, D.A., and M.S. Young. 1994. Why diversification doesn't work. Real Estate Review 25 (2): 6–12.
King, G., and L. Zeng. 2001. Explaining rare events in international relations. International Organization 55: 693–715.
Kogon, S.M., and D.B. Williams. 1998. Characteristic function based estimation of stable parameters. In A Practical Guide to Heavy Tails: Statistical Techniques for Analyzing Heavy Tailed Distributions, ed. R. Adler, R. Feldman, and M. Taqqu, 311–338. Boston, MA: Birkhäuser.
Kohlrausch, R. 1847. Ueber das Dellmann'sche Elektrometer. Annalen der Physik (Leipzig) 12: 393.
Koponen, I. 1995, July. Analytic approach to the problem of convergence of truncated Lévy flights towards the Gaussian stochastic process. Physical Review E 52: 1197–1199.
Kosko, B., and S. Mitaim. 2001. Robust stochastic resonance: signal detection and adaptation in impulsive noise. Physical Review E 64 (051110): 1–11.
Kotulska, M. 2007. Natural fluctuations of an electropore show fractional Lévy stable motion. Biophysical Journal 92 (7): 2412–2421.
Kotz, S., T.J. Kozubowski, and K. Podgórski. 2001. The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance. Boston: Birkhäuser.
Kotz, S., and S. Nadarajah. 2000. Extreme Value Distributions: Theory and Applications. London: Imperial College Press.
Koutrouvelis, I.A. 1980. Regression type estimation of the parameters of stable laws. JASA 75: 918–928.
Koutrouvelis, I.A. 1981. An iterative procedure for the estimation of the parameters of stable laws. Communications in Statistics - Simulation 10: 17–28.
Kratz, M., and S.I. Resnick. 1996. The QQ-estimator and heavy tails. Communications in Statistics - Stochastic Models 12 (4): 699–724.
Krutto, A. 2018, December. Empirical cumulant function based parameter estimation in stable laws. Ph.D. thesis, University of Tartu, Tartu, Estonia.
Kuruoglu, E., and J. Zerubia. 2003. Skewed alpha-stable distributions for modelling textures. Pattern Recognition Letters 24 (1–3): 339–348.
Kuske, R., and J.B. Keller. 2001. Rate of convergence to a stable law. SIAM Journal on Applied Mathematics 61 (4): 1308–1323.
Kwaśnicki, M. 2020. A new class of bell-shaped functions. Transactions of the American Mathematical Society 373 (4): 2255–2280.
Kwaśnicki, M., and T. Simon. 2019, October. Characterisation of the class of bell-shaped functions. arXiv e-prints, arXiv:1910.07752. To appear in Transactions of the American Mathematical Society.
Lamantia, F., S. Ortobelli, and S. Rachev. 2006. An empirical comparison among VaR models and time rules with elliptical and stable distributed returns.
Landau, L. 1944. On the energy loss of fast particles by ionization. Journal of Physics (USSR) 8: 201.
Lange, K.L., R.J.A. Little, and J.M.G. Taylor. 1989. Robust statistical modeling using the t distribution. Journal of the American Statistical Association 84 (408): 881–896.
Lavallée, D., and R.J. Archuleta. 2003. Stochastic modeling of slip spatial complexities for the 1979 Imperial Valley, California, earthquake. Geophysical Research Letters 30. https://doi.org/10.1029/2002GL015839.
Lavallée, D., and H. Beltrami. 2004. Stochastic modeling of climatic variability in dendrochronology. Geophysical Research Letters 31.
Ledoux, M., and M. Talagrand. 1991. Probability in Banach Spaces: Isoperimetry and Processes. Springer.
Leo, W.R. 1994. Techniques for Nuclear and Particle Physics Experiments. New York: Springer.
Lévy, P. 1925. Calcul des Probabilités. Paris: Gauthier-Villars.
Li, L., and J. Mustard. 2000. Compositional gradients across mare-highland contacts: importance and geological implication of lateral transport. Journal of Geophysical Research 105 (E8): 20431–20450.
Li, L., and J. Mustard. 2005. Lateral mixing on the moon: remote sensing observations and modeling. Journal of Geophysical Research 110: E11002.
Li, R., Z. Zhao, Y. Zhong, C. Qi, and H. Zhang. 2019. The stochastic geometry analyses of cellular networks with α-stable self-similarity. IEEE Transactions on Communications 67 (3): 2487–2503.
Liechty, J.C., D.K.J. Lin, and J.P. McDermott. 2003. Single-pass low-storage arbitrary quantile estimation for massive data sets. Statistics and Computing 13: 91–100.
Linder, F., J. Tran-Gia, S.R. Dahmen, and H. Hinrichsen. 2008, April. Long-range epidemic spreading with immunization. Journal of Physics A: Mathematical and Theoretical 41 (18): 185005.
Lombardi, M.J., and G. Calzolari. 2005, November. Indirect estimation of α-stable stochastic volatility models. Preprint, Bundesbank November 2005 Conference.
Lovejoy, S. 1982. Area-perimeter relation for rain and cloud areas. Science 216: 185–187.
Lovejoy, S., and B. Mandelbrot. 1985. Fractal properties of rain, and a fractal model. Tellus 37 (A): 209–232.
Lovejoy, S., and D. Schertzer. 1986. Scale invariance, symmetries, fractals and stochastic simulations of atmospheric phenomena. Bulletin of the American Meteorological Society 67 (1): 21–32.
Machado, J.A.T., and A.M. Lopes. 2020. Rare and extreme events: the case of COVID-19 pandemic. Nonlinear Dynamics. https://doi.org/10.1007/s11071-020-05680-w.
Mallick, M., and N. Ravishanker. 2004, November. Bivariate positive stable frailty models. Technical report, Department of Statistics, University of Connecticut.
Mandelbrot, B. 1963. The variation of certain speculative prices. Journal of Business 36: 394–419.
Mandelbrot, B.B. 1982. The Fractal Geometry of Nature. San Francisco: W.H. Freeman and Co.
Mantegna, R.N. 1994. Fast, accurate algorithm for numerical simulation of Lévy stable stochastic processes. Physical Review E 49 (5): 4677–4683.
Marchal, P. 2005. Measure concentration for stable laws with index close to 2. Electronic Communications in Probability 10: 29–35 (electronic).
Marcus, A.H. 1970. Distribution and covariance function of elevations on a cratered planetary surface. The Moon 1: 297–337.
Marcus, M.B. 1981. Weak convergence of the empirical characteristic function. The Annals of Probability 9: 194–201.
Martin, R.D., S. Rachev, and F. Siboulet. 2006. Phi-Alpha Optimal Portfolios and Extreme Risk Management. Preprint, FinAnalytica Inc.
Mathai, A.M. 1993. A Handbook of Generalized Special Functions for Statistical and Physical Sciences. New York: Oxford University Press.
Matsui, M. 2005. Fisher information matrix of general stable distributions close to the normal distribution. Technical report, University of Tokyo. http://arxiv.org/abs/math/0502559.
Matsui, M., and Z. Pawlas. 2016, May. Fractional absolute moments of heavy tailed distributions. Brazilian Journal of Probability and Statistics 30 (2): 272–298.
Matsui, M., and A. Takemura. 2006. Some improvements in numerical evaluation of symmetric stable density and its derivatives. Communications in Statistics - Theory and Methods 35 (1): 149–172.
Mayer-Wolf, E., A. Roitershtein, and O. Zeitouni. 2004. Limit theorems for one-dimensional transient random walks in Markov environments. Annales de l'Institut Henri Poincaré Probabilités et Statistiques 40 (5): 635–659.
McCulloch, J.H. 1986. Simple consistent estimators of stable distribution parameters. Communications in Statistics - Simulation and Computation 15: 1109–1136.
McCulloch, J.H. 1996. Financial applications of stable distributions. In Handbook of Statistics, vol. 14, ed. G.S. Maddala and C.R. Rao. New York: North-Holland.
McCulloch, J.H. 1997. Measuring tail thickness to estimate the stable index alpha: a critique. Journal of Business & Economic Statistics 15: 74–81.
McCulloch, J.H. 1998a. Linear regression with stable disturbances. In A Practical Guide to Heavy Tails: Statistical Techniques for Analyzing Heavy Tailed Distributions, ed. R. Adler, R. Feldman, and M. Taqqu, 359–378. Boston: Birkhäuser.
McCulloch, J.H. 1998b. Maximum likelihood estimation of symmetric stable parameters. Technical report, Department of Economics, Ohio State University.
McCulloch, J.H. 1998c. Numerical approximation of the symmetric stable distribution and density. In A Practical Guide to Heavy Tails: Statistical Techniques for Analyzing Heavy Tailed Distributions, ed. R. Adler, R. Feldman, and M. Taqqu, 489–500. Boston: Birkhäuser.
McCulloch, J.H. 2003. The risk-neutral measure under log-stable uncertainty. Technical report, Department of Economics, Ohio State University.
McCulloch, J.H., and D. Panton. 1998. Tables of the maximally-skewed stable distributions. In A Practical Guide to Heavy Tails: Statistical Techniques for Analyzing Heavy Tailed Distributions, ed. R. Adler, R. Feldman, and M. Taqqu, 501–508. Boston: Birkhäuser.
McDermott, J.P., G.J. Babu, J.C. Liechty, and D.K. Lin. 2007. Data skeletons: simultaneous estimation of multiple quantiles for massive streaming datasets with applications to density estimation. Statistics and Computing 17 (4): 311–321. https://doi.org/10.1007/s11222-007-9021-3.
Meerschaert, M., T. Kozubowski, F. Molz, and S. Lu. 2004. Fractional Laplace model for hydraulic conductivity. Geophysical Research Letters 31: L08501.
Meerschaert, M.M., D.A. Benson, H.-P. Scheffler, and B. Baeumer. 2002. Stochastic solution of space-time fractional diffusion equations. Physical Review E 65 (4): 041103.
Meerschaert, M.M., and H.-P. Scheffler. 1999. Sample covariance matrix for random vectors with heavy tails. Journal of Theoretical Probability 12 (3): 821–838.
Meerschaert, M.M., and H.-P. Scheffler. 2001. Limit Distributions for Sums of Independent Random Vectors. New York: Wiley.
Meerschaert, M.M., and A. Sikorskii. 2012. Stochastic Models for Fractional Calculus. Berlin: De Gruyter.
Menabde, M., and M. Sivapalan. 2000. Modeling of rainfall time series and extremes, using bounded random cascades and Levy-stable distributions. Water Resources Research 36: 3293–3300.
Mihalas, D. 1978. Stellar Atmospheres, 2nd ed. San Francisco: Freeman.
Mikosch, T., T. Gadrich, C. Klüppelberg, and R.J. Adler. 1995. Parameter estimation for ARMA models with infinite variance innovations. Annals of Statistics 23 (1): 305–326.
Millán, H., J. Rodríguez, B. Ghanbarian-Alavijeh, R. Biondi, and G. Llerena. 2011. Temporal complexity of daily precipitation records from different atmospheric environments: chaotic and Lévy stable parameters. Atmospheric Research 101: 879–892.
Miller, K.S., and B. Ross. 1995. An Introduction to the Fractional Calculus and Fractional Differential Equations. New York: Wiley.
Mitra, S.S. 1981. Distribution of symmetric stable laws of index 2^{-n}. The Annals of Probability 9: 710–711.
Mitra, S.S. 1982. Stable laws of index 2^{-n}. The Annals of Probability 10: 857–859.
Mohammadi, M., and A. Mohammadpour. 2011. On the order statistics of α-stable distributions. Preprint.
Molz, F.J., H. Rajaram, and S.L. Lu. 2004. Stochastic fractal-based models of heterogeneity in subsurface hydrology: origins, applications, limitations, and future research questions. Reviews of Geophysics 42 (1): RG1002.
Nan, T. 2014. Scaling and extreme value statistics of sub-Gaussian fields with application to neutron porosity data. Ph.D. thesis, The University of Arizona.
Nan, T., S. Neuman, M. Riva, and A. Guadagnini. 2016. Analyzing randomly fluctuating hierarchical variables and extremes. In Handbook of Groundwater Engineering. CRC Press.
Nelsen, R.B. 1999. An Introduction to Copulas. Lecture Notes in Statistics, vol. 139. New York: Springer.
Nikias, C.L., and M. Shao. 1995. Signal Processing with Alpha-Stable Distributions and Applications. New York: Wiley.
Nolan, J.P. 1997. Numerical calculation of stable densities and distribution functions. Communications in Statistics - Stochastic Models 13: 759–774.
Nolan, J.P. 2001. Maximum likelihood estimation of stable parameters. In Lévy Processes: Theory and Applications, ed. O.E. Barndorff-Nielsen, T. Mikosch, and S.I. Resnick, 379–400. Boston: Birkhäuser.
Nolan, J.P. 2008, July. Advances in nonlinear signal processing for heavy tailed noise. International Workshop in Applied Probability 2008.
Nolan, J.P. 2010. Metrics for multivariate stable distributions. In Stability in Probability. Banach Center Publications, vol. 90, 83–102. Warsaw: Polish Academy of Sciences Institute of Mathematics.
Nolan, J.P. 2013. Multivariate elliptically contoured stable distributions: theory and estimation. Computational Statistics 28 (5): 2067–2089.
Nolan, J.P. 2014. Financial modeling with heavy tailed stable distributions. WIREs Computational Statistics 6 (1): 45–55. https://doi.org/10.1002/wics.1286.
Nolan, J.P. 2019a. Graphical diagnostics for heavy tailed data. Preprint.
References
Nolan, J.P. 2019b. Multivariate stable cumulative probabilities in polar form and related functions. In progress.
Nolan, J.P. 2019c. Zolotarev-type integrals for multivariate stable functions. In progress.
Nolan, J.P., J.G. Gonzalez, and R.C. Núñez. 2010. Stable filters: a robust signal processing framework for heavy-tailed noise. In Proceedings of the 2010 IEEE Radar Conference, 470–473.
Nolan, J.P., and D. Ojeda-Revah. 2013. Linear and nonlinear regression with stable errors. Journal of Econometrics 172: 186–194.
Núñez, R.C., J.G. Gonzalez, G.R. Arce, and J.P. Nolan. 2008. Fast and accurate computation of the myriad filter via branch-and-bound search. IEEE Transactions on Signal Processing 56: 3340–3346.
Ojeda-Revah, D. 2001. Comparative study of stable parameter estimators and regression with stably distributed errors. Ph.D. thesis, American University.
Olver, F.W.J., D.W. Lozier, R.F. Boisvert, and C.W. Clark. 2010. NIST Handbook of Mathematical Functions. Cambridge University Press.
Painter, S. 2001. Flexible scaling model for use in random field simulation of hydraulic conductivity. Water Resources Research 37 (5): 1155–1163.
Painter, S., G. Geresford, and L. Paterson. 1995. On the distribution of seismic reflection coefficients and seismic amplitudes. Geophysics 60 (4): 1187–1194.
Panton, D. 1992. Cumulative distribution function values for symmetric standardized stable distributions. Communications in Statistics - Simulation and Computation 21: 458–492.
Paolella, M.S. 2007. Intermediate Probability: A Computational Approach. Chichester: Wiley.
Paulson, A.S., and T.A. Delehanty. 1993. Tables of the fractiles of the stable law. Technical report, Rensselaer Polytechnic Institute, Troy, NY.
Paulson, A.S., E.W. Holcomb, and R. Leitch. 1975. The estimation of the parameters of the stable laws. Biometrika 62: 163–170.
Peach, G. 1981. Theory of the pressure broadening and shift of spectral lines. Advances in Physics 30 (3): 367–474.
Penson, K.A., and K. Górska. 2010.
Exact and explicit probability densities for one-sided Lévy stable distributions. Physical Review Letters 105 (21): 210604.
Peters, E.E. 1994. Fractal Market Analysis. NY: Wiley.
Peters, G., S. Sisson, and Y. Fan. 2012. Likelihood-free Bayesian inference for α-stable models. Computational Statistics and Data Analysis 56 (11): 3743–3756. 1st issue of the Annals of Computational and Financial Econometrics, Sixth Special Issue on Computational Econometrics.
Pewsey, A. 2008. The wrapped stable family of distributions as a flexible model for circular data. Computational Statistics and Data Analysis.
Piessens, R., E. de Doncker-Kapenga, C.W. Überhuber, and D.K. Kahaner. 1983. QUADPACK: A Subroutine Package for Automatic Integration. New York: Springer.
Pinelis, I. 2011. Positive-part moments via the Fourier-Laplace transform. Journal of Theoretical Probability 24 (2): 409–421.
Pitman, E.J.G., and J. Pitman. 2016. A direct approach to the stable distributions. Advances in Applied Probability 48 (A): 261–282.
Press, S.J. 1972. Estimation in univariate and multivariate stable distributions. JASA 67: 842–846.
Prudnikov, A.P., Y.A. Brychkov, and O.I. Marichev. 1990. Integrals and Series, vol. 3. New York: Gordon and Breach Science Publishers. Translation from the original 1986 Russian edition.
Qiou, Z., N. Ravishanker, and D. Dey. 1999. Multivariate survival analysis with positive stable frailties. Biometrics 55 (2): 637–644.
Rachev, S.T. 1991. Probability Metrics and the Stability of Stochastic Models. New York: Wiley.
Rachev, S.T. 2003. Handbook of Heavy Tailed Distributions in Finance. Amsterdam: Elsevier.
Rachev, S.T., and S. Mittnik. 2000. Stable Paretian Models in Finance. New York, NY: Wiley.
Ravishanker, N., and D. Dey. 2000. Multivariate survival models with a mixture of positive stable frailties. Methodology and Computing in Applied Probability 2 (3): 293–308.
Reiss, R.-D., and M. Thomas. 2001. Statistical Analysis of Extreme Values, 2nd ed. Basel: Birkhäuser.
Resnick, S. 1987. Extreme Values, Regular Variation and Point Processes. New York: Springer.
Resnick, S. 1997. Heavy tail modeling and teletraffic data. Annals of Statistics 25: 1805–1869.
Resnick, S., and C. Stărică. 1995. Consistency of Hill’s estimator for dependent data. Journal of Applied Probability 32: 139–167.
Resnick, S.I., and C. Stărică. 1997. Smoothing the Hill estimator. Advances in Applied Probability 29: 271–293.
Reynolds, A.M., and M.A. Frye. 2007, April. Free-flight odor tracking in Drosophila is consistent with an optimal intermittent scale-free search. PLOS ONE 2 (4): 1–9.
Rimmer, R. 2014, April. Notes on stable order statistics. Private correspondence.
Rishmawi, S. 2005. Fitting concentration data with stable distributions. Ph.D. thesis, American University, Washington, DC.
Robinson, G. 2003a, March.
High precision computation of density and tail areas for stable distributions. Technical report, CSIRO, Mathematical and Information Sciences.
Robinson, G. 2003b, March. Use of log maximally skew stable distribution for pricing and evaluating portfolio risk. Technical report, CSIRO, Mathematical and Information Sciences.
Robust Analysis Inc. 2009. User Manual for STABLE 5.1. Software and user manual available online at www.RobustAnalysis.com.
Rodriguez-Aguilar, R., J.A. Marmolejo-Saucedo, and B. Retana-Blanco. 2019. Prices of Mexican wholesale electricity market: an application of alpha-stable regression. Sustainability 11: 1–14. https://doi.org/10.3390/su11113185.
Roll, R. 1970. The Behavior of Interest Rates: An Application of the Efficient Market Model to U.S. Treasury Bills. New York: Basic Books.
Rosiński, J. 2007. Tempering stable processes. Stochastic Processes and Their Applications 117 (6): 677–707.
Rosiński, J., and W. Woyczynski. 1987. Multilinear forms in Pareto-like random variables and product random measures. Colloquium Mathematicum 51: 303–313.
Roughan, M., and C. Kalmanek. 2003. Pragmatic modeling of broadband access traffic. Computer Communications 26: 804–816.
Sahimi, M., and S.E. Tajer. 2005. Self-affine fractal distributions of the bulk density, elastic moduli, and seismic wave velocities of rock. Physical Review E 71 (4), Article Number: 046301.
Samorodnitsky, G., S.T. Rachev, J.-R. Kurz-Kim, and S. Stoyanov. 2007. Asymptotic distribution of unbiased linear estimators in the presence of heavy-tailed stochastic regressors and residuals. Probability and Mathematical Statistics 27 (2): 275–302.
Samorodnitsky, G., and M. Taqqu. 1994. Stable Non-Gaussian Random Processes. New York: Chapman and Hall.
Samuelson, P. 1967. Efficient portfolio selection for Pareto-Lévy investments. Journal of Financial and Quantitative Analysis 2: 107–117.
Sato, K. 1999. Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press.
Sato, K., and M. Yamazato. 1978. On distribution functions of class L. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 43: 273–308.
Schlesinger, M.F., G.M. Zaslavsky, and U. Frisch. 1995. Lévy Flights and Related Topics in Physics. Lecture Notes in Physics, No. 450. Springer.
Schneider, W.R. 1986. Stable distributions: Fox function representation and generalization. In Stochastic Processes in Classical and Quantum Systems, ed. S. Albeverio, G. Casati, and D. Merlini, 497–511. Berlin: Springer.
Sigman, K. 1999, December. Appendix: a primer on heavy-tailed distributions. Queueing Systems 33 (1): 261–275.
Simon, T. 2011a.
A multiplicative short proof for the unimodality of stable densities. Electronic Communications in Probability 16: 623–629.
Simon, T. 2011b. Multiplicative strong unimodality for positive stable laws. Proceedings of the American Mathematical Society 139 (7): 2587–2595.
Simon, T. 2015. Positive stable densities and the bell-shape. Proceedings of the American Mathematical Society 143 (2): 885–895.
Sivia, D.S. 1996. Data Analysis: A Bayesian Tutorial. New York: The Clarendon Press, Oxford University Press.
Sorace, J. 2012, November 13. Humans are not electrons: characterization of the Medicare disease probability distribution. Seminar, NIH.
Springer, M. 1979. The Algebra of Random Variables. Wiley.
Steutel, F., and K. Van Harn. 2004. Infinite Divisibility of Probability Distributions on the Real Line. NY, NY: Marcel Dekker.
Steutel, F.W., and K. van Harn. 1979. Discrete analogues of self-decomposability and stability. The Annals of Probability 7 (5): 893–899.
Stuck, B.W., and B. Kleiner. 1974. A statistical analysis of telephone noise. Bell System Technical Journal 53 (7): 1263–1320.
Tsihrintzis, G., and C. Nikias. 1996. Fast estimation of the parameters of alpha-stable impulsive interference. IEEE Transactions on Signal Processing 44 (6): 1492–1503.
Tsionas, E.G. 1999. Monte Carlo inference in econometric models with symmetric stable disturbances. Journal of Econometrics 88: 365–401.
Uchaikin, V.V., and R. Sibatov. 2013. Fractional Kinetics in Solids: Anomalous Charge Transport in Semiconductors, Dielectrics, and Nanosystems. Singapore: World Scientific.
Uchaikin, V.V., and V.M. Zolotarev. 1999. Chance and Stability. Utrecht: VSP Press.
Urbanik, K. 1964. Generalized convolutions. Studia Mathematica 23: 217–245.
Ushakov, N.G. 1999. Selected Topics in Characteristic Functions. Utrecht: VSP Press.
Van den Heuvel, F., F. Fiorini, N. Schreuder, and B. George. 2018, March. Using stable distributions to characterize proton pencil beams. Medical Physics 45.
Van den Heuvel, F., S. Hackett, F. Fiorini, C. Taylor, S. Darby, and K. Vallis. 2015. SU-F-BRD-04: Robustness analysis of proton breast treatments using an alpha-stable distribution parameterization. Medical Physics 42 (6 Part 25): 3526.
Vaupel, J.W., K.G. Manton, and E. Stallard. 1979. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography 16: 439–454.
Velis, D.R. 2003. Estimating the distribution of primary reflection coefficients. Geophysics 68 (4): 1417–1422.
Venzon, D.J., and S.H. Moolgavkar. 1988. A method for computing profile-likelihood-based confidence intervals. Journal of the Royal Statistical Society. Series C (Applied Statistics) 37 (1): 87–94.
Verdi, M. 2014, August. Calculating properties for shape classification using Lévy flights. Seminar slides. Presentation to SURF Program at NIST.
Viswanathan, G.M., S.V. Buldyrev, S. Havlin, M.G.E. da Luz, E.P. Raposo, and H.E. Stanley. 1999. Optimizing the success of random searches.
Nature 401: 911–914.
Volkovich, Z.V., D. Toledano-Kitai, and R. Avros. 2010. On analytical properties of generalized convolutions. In Stability in Probability. Banach Center Publications, vol. 90, 243–274. Polish Academy of Sciences Institute of Mathematics, Warsaw.
Walls, W.D., and J. McKenzie. 2019. Black swan models for the entertainment industry with an application to the movie business. Empirical Economics.
Wang, B., E.E. Kuruoglu, and J. Zhang. 2009. ICA by maximizing non-stability. In Independent Component Analysis and Signal Separation: 8th International Conference, 179–186. Springer.
Wassell, J.T., W.C. Wojciechowski, and D.D. Landen. 1999. Recurrent injury event-time analysis. Statistics in Medicine 18: 3355–3363.
Wells, R.J. 1999. Rapid approximation to the Voigt/Faddeeva function and its derivatives. JQSRT 62: 29–48.
Weron, R. 1996. On the Chambers-Mallows-Stuck method for simulating skewed stable random variables. Statistics and Probability Letters 28: 165–171.
Weron, R. 2005, November. Heavy tails and electricity prices. Preprint, Bundesbank November 2005 Conference.
West, B.J. 1999. Physiology, Promiscuity and Prophecy at the Millennium: A Tail of Tails. New Jersey: World Scientific.
West, B.J. 2016. Fractional Calculus View of Complexity: Tomorrow’s Science. Boca Raton: CRC Press.
Williams, E.J. 1977. Some representations of stable random variables as products. Biometrika 64: 167–169.
Williams, G., and D.C. Watts. 1970. Non-symmetrical dielectric relaxation behaviour arising from a simple empirical decay function. Transactions of the Faraday Society 66: 80.
Willinger, W., V. Paxson, and M. Taqqu. 1998. A Practical Guide to Heavy Tails, Chapter Self-similarity and heavy tails: structural modeling of network traffic, 27–54. Boston: Birkhäuser.
Wintner, A. 1936. On a class of Fourier transforms. American Journal of Mathematics 58: 45–90.
Wise, J. 1966. Linear estimators for linear regression systems having infinite residual variances. Unpublished manuscript, Economics Department, University of Hawaii.
Wolfe, S.J. 1973. On the local behavior of characteristic functions. Annals of Probability 1: 862–866.
Wolfe, S.J. 1975a. On derivatives of characteristic functions. Annals of Probability 3 (4): 737–738.
Wolfe, S.J. 1975b. On moments of probability distribution functions. In Fractional Calculus and Its Applications (Proceedings of the International Conference, University of New Haven, West Haven, Connecticut, 1974), 306–316. Lecture Notes in Mathematics, vol. 457. Berlin: Springer.
Wolpert, R.L., S.E. Ogburn, and E.S. Calder. 2016. The longevity of lava dome eruptions. Journal of Geophysical Research: Solid Earth 121 (2): 676–686.
Worsdale, G. 1975. Tables of cumulative distribution function for symmetric stable distributions. Applied Statistics 24: 123–131.
Xu, Q., and K. Liu. 2019, June. A new feature extraction method for bearing faults in impulsive noise using fractional lower-order statistics.
Shock and Vibration 2019: 1–13.
Yamazato, M. 1978. Unimodality of infinitely divisible distribution functions of class L. The Annals of Probability 6: 523–531.
Yang, X., and A.P. Petropulu. 2001. Long-range dependent alpha-stable impulsive noise in a Poisson field of interferers. In Proceedings of the 11th IEEE Signal Processing Workshop on Statistical Signal Processing, 54–57. IEEE.
Young, M.S. 2008. Revisiting non-normal real estate return distributions by property type in the U.S. Journal of Real Estate Finance and Economics 36: 233–248.
Young, M.S., and R.A. Graff. 1995. Real estate is not normal: a fresh look at real estate return distributions. Journal of Real Estate Finance and Economics 10: 225–259.
Young, M.S., S. Lee, and S. Devaney. 2006. Non-normal real estate return distributions by property type in the U.K. Journal of Property Research 23: 109–133.
Zaliapin, I.V., Y.Y. Kagan, and F. Schoenberg. 2005. Approximating the distribution of Pareto sums. Pure and Applied Geophysics 162: 1187–1228. http://moho.ess.ucla.edu/~kagan/tmoml.pdf.
Zha, D., and T. Qiu. 2006. A new blind source separation method based on fractional lower-order statistics. International Journal of Adaptive Control and Signal Processing 20 (5): 213–223.
Zhang, X. 2018, December. Statistical methods for stable distributions using the empirical characteristic function. M.S. thesis, American University, Washington, DC.
Zhang, Y., M.G. Schaap, and Y. Zha. 2018. A high-resolution global map of soil hydraulic properties produced by a hierarchical parameterization of a physically based water retention model. Water Resources Research.
Zhidkov, S.V. 2018. Statistical characterization and modeling of noise effects in near-ultrasound aerial acoustic communications. The Journal of the Acoustical Society of America 144 (4): 2605–2612.
Zieliński, R. 2000. A reparametrization of the symmetric α-stable distributions and their dispersive ordering. Teor. Veroyatnost. i Primenen. 45 (2): 410–411.
Zieliński, R. 2001. High accuracy evaluation of the cumulative distribution function of α-stable symmetric distributions. Journal of Mathematical Sciences 105 (6): 2631–2632.
Zolotarev, V.M. 1957. Mellin-Stieltjes transformations in probability theory. Teor. Veroyatnost. i Primenen. 2: 444–469.
Zolotarev, V.M. 1981. Integral transformations of distributions and estimates of parameters of multidimensional spherically symmetric stable laws. In Contributions to Probability, 283–305. New York: Academic Press.
Zolotarev, V.M. 1986. One-Dimensional Stable Distributions. Translations of Mathematical Monographs, vol. 65. American Mathematical Society. Translation from the original 1983 Russian edition.
Zolotarev, V.M. 1995. On representation of densities of stable laws by special functions. Theory of Probability and Its Applications 39: 354–362.
Index
A
amplitude, 132

B
beta function, 271
Box-Muller algorithm, 19

C
Carleman’s Theorem, 137
Cauchy distribution, 2
  d.f., 22
central limit theorem, 20
Chambers-Mallows-Stuck algorithm, 85
characteristic function, 268
characteristic transform, 119
convergence of type theorem, 141
convolution, 267
copula, 39
cumulant, 118

D
data streams, 46
differential equations, 149
discretized stable distribution, 135
dispersive ordering, 130
domain of attraction, 21, 141
  normal, 143
  partial, 145

E
entropy, 148
expected shortfall, 35
exponential power distributions, 138
exponential-stable, 131
extreme value distributions, 139, 260
F
FLOM, 15
Fréchet distribution, 139, 260
Fréchet limit, 120
fractional differential equations, 151
fractionally stable distribution, 129
frailty, 44

G
gamma function, 271
  incomplete, 271
generalized central limit theorem, 141
Gumbel distribution, 139, 260

H
hazard function, 43
Holtsmark distribution, 28

I
infinitely divisible, 64, 265

K
Kohlrausch-Williams-Watts function, 42

L
Lévy-Khintchine representation, 64
Landau distribution, 43
Laplace transform, 117, 269
Lévy distribution, 3
  d.f., 23
Lévy flights, 30
log-stable, 130
Lorentz distribution, 2, 264
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. P. Nolan, Univariate Stable Distributions, Springer Series in Operations Research and Financial Engineering, https://doi.org/10.1007/978-3-030-52915-4
M
max-stable, 260
Mellin transform, 118, 119, 270
metric, 119
min-stable, 260
modes, 295
moments, 13, 107, 268
  fractional, 15, 107, 268
  integer moments of totally skewed, 118
  logarithmic, 131
  negative, 107, 268
  signed, 107
multiplication stable, 262
multivariate stable
  definition, 22
  strictly, 22
  symmetric, 22
myriad filter, 242

N
nonlinear function, 183
normal distribution, 2

O
order statistics, 160

P
Pareto distribution, 14, 23, 255
probability of detection, 245

R
random walk, 29
regularly varying function, 144
  second order, 147

S
saturation probability, 135
scaling property, 19
score function, 182
semi-group
  convolution, 150
  operator, 150
  stable, 149
signal-to-noise ratio
  geometric, 251
  standard, 250
signed power, 113
sketches, 46
slowly varying function, 144
stable
  α ↓ 0 limit, 121
  characteristic function, 5, 6, 92
  definition, 2
  density, 10, 65
    series, 75
  discretized, 135
  distribution function, 10, 65
    series, 75
  Laplace transform, 117
  moments, 14, 107
  parameter estimation, 159
  parameterizations, 5, 91
  quantiles, 13, 273
  ratios of, 132
  scaling property, 19
  semi-group, 149
  simulation, 19
  strictly, 2, 19
  sums, 17
  tail probabilities, 13
  tempered, 265
  wrapped, 134
stable filter
  matched, 244
  unweighted, 239
  weighted, 244
stochastically ordered, 129
stretched exponential, 42
stretched exponential distribution, 138, 260
symmetrization, 267

T
tempered stable distributions, 265
trans-stable, 66, 149
type of a distribution, 2

V
VaR (Value at Risk), 35
Voigt distribution, 264

W
Weibull distribution, 139, 260
Author Index
A Abdul-Hamid, H., 87 Abramowitz, M., 257, 258 Adler, R., 133 Ahlfors, L., 73 Antoniadis, A., 185 Anderson, C., 46 Anderssen, R., 42 Andrews, B., 224 Arce, G., 39, 239, 242, 250 Archuleta, R., 42 Atakhanova, Z., 35 Audus, D., 31 Avros, R., 263 B Babu, G., 161 Baeumer, B., 215 Banerjee, A., 39, 239 Barker, A., 178 Barthelemy, P., 43 Basterfield, D., 36 Beirlant, J., 262 Beltrami, H., 42 Benson, D., 42, 215 Berger, R., 160 Bergström, H., 75 Bertolotti, J., 43 Bestavros, A., 41 Bingham, N., 144, 155 Biondi, R., 42 Blattberg, R., 224 Boisvert, R., 264 Boldyrev, S., 30 Boto, J., 31 Brant, R., 178
Breiman, L., 64, 144, 147, 154, 265 Bremer, I., 46 Brent, R., 84 Brockman, D., 31 Brockmann, R., 31 Broda, S., 109 Brorsen, B., 178 Brothers, K., 83 Brown, G. W., 51, 153 Brown, R., 36, 37 Brychkov, Yu., 256, 257 Brynjolfsson, E., 46 Buckle, D., 185 Buldyrev, S. V., 31 Bundt, T., 36 C Calder, E., 42 Calder, M., 224 Calzolari, G., 185 Carasso, A., 39, 40, 239 Caron, F., 46 Casella, G., 160 Chae, C., 45 Chakraborty, P., 218 Chambers, J., 7, 20, 84, 86, 163 Chang, E., 42 Chernin, K., 76 Chisham, G., 31 Chitre, M., 39, 206, 239 Christoph, G., 147 Clark, C., 264 Clavier, L., 239 Cormode, G., 46 Cramér, H., 147 Cressie, N., 120
Crovella, M., 41, 46 Crow, E., 83 Csörgö, M., 221 Csörgő, S., 204 Csörgő, T., 43 Cushman, J., 34
D da Luz, M. G. E., 31 Dagostino, R., 199 Dahmen, S., 31 Dance, C., 176 Darby, S., 45 Datar, M., 46 Davis, R., 49, 132, 224 Davydov, Y., 263 de Haan, L., 144, 147, 165 De Vany, A., 37 de Vries, C., 147 deDoncker-Kapenga, E., 84 Delehanty, T., 83 Devaney, S., 37 Dey, D., 45, 264 Dharmadhikari, S., 61 Ditlevsen, P., 34 Doray, L., 263 Douglas, J., 31 Draper, N., 227 Duflos, E., 239 DuMouchel, W., 83, 162, 177, 178, 181, 192, 226, 227
E Eckford, A., 45 Elbarmi, H., 224 Elton, D., 42 Embrechts, P., 37, 163, 259, 262 Epps, T. W., 221 Epstein, B., 119
F Fama, E., 35, 166 Fan, Y., 185 Fan, Z., 184, 185 Farsad, N., 45 Feller, W., 28, 65, 75, 94, 96, 99, 100, 144, 150, 268, 269 Ferguson, T., 160, 182 Ferreira, A., 165 Feuerverger, A., 170, 185 Fiorini, F., 45 Fisher, R., 260 Fofack, H., 36, 101 Fougères, A.-L., 47, 139
Fox, C., 65 Fox, E., 46 Freeman, M., 31 Friedland, O., 43 Frisch, U., 30, 42 Frye, M., 31
G Gadrich, T., 133 Garcia, R., 185 Gatto, R., 134 Gaver, D., 45 Gawronski, W., 76, 77, 79 Gaynor, G., 42 Geisel, T., 31 Geluk, J., 144, 147 George, B., 45 Georgiadis, A., 239 Geresford, G., 42 Ghanbarian-Alavijeh, B., 42 Gnedenko, B., vii, 143–145, 260 Gneiting, T., 138 Godsil, S., 185 Goegebeur, Y., 262 Goldie, C., 144, 155 Gomes, C., 46 Gomi, C., 42 Goncalves, P., 185 Gonzalez, J., 39, 239, 242, 250 Gorenflo, R., 152 Gorska, K., 65 Gradshteyn, I., 66, 88, 111, 258, 270 Graff, R., 36 Guédon, O., 43 Guadagnini, A., 42 Gunning, J., 42 Guo, W., 45 Gupta, V., 42 Gwinn, C., 30
H Haas, M., 109 Hackett, S., 45 Hall, P., 7, 76, 77, 147 Hallin, M., 224 Hanagal, D., 45 Harchol-Balter, M., 46 Hardin, C., 109 Harrington, A., 36 Havlin, S., 31 Heathcote, C., 224 Hegyi, S., 43 Heinrich, L., 45
Hill, B., 162 Hinrichsen, H., 31 Hoffman-Jørgensen, J., 65 Holcomb, E., 170 Holt, D., 83 Houdré, C., 107 Hougaard, P., 44, 45 Hu, Y., 46 Hufnagel, L., 31 Hughes, B. D., 31 Husain, S., 42 I Ibragimov, I., 76, 143 Ibragimov, R., 35 Ilerena, G., 42 Indyk, P., 46 J Jacobs, P., 45 Jammalamadaka, S., 134 Janicki, A., vii, 25, 156 Jaoua, N., 239 Jiang, S., 263 Jin, H., 37 K Kagan, Y., 42, 148 Kahaner, D., 84 Kalmanek, C., 41 Kanter, M., 76 Kaplan, P., 35 Kapoor, R., 39, 239 Keat, P., 46 Keller, J. B., 147 Kesten, H., 31 Khindanova, I., 35 Kidmose, P., 39, 239 King, D., 37 King, G., 46 Kleiner, B., 39, 239 Kleinfelter, N., 34 Klüppelberg, C., 37, 133, 163, 259, 262 Kogon, S., 170, 173, 174, 186 Kohlrausch, R., 42 Kolmogorov, A., vii, 143–145 Koponen, I., 265 Kosko, B., 41 Kotulska, M., 45 Kotz, S., 139, 262 Koutrouvelis, I., 170, 171 Kozlov, M., 31 Kozubowski, T., 42, 262 Kratz, M., 162
Krause, J., 109 Krutto, A., 174 Kuruoğlu, E., 148, 176, 247 Kurz-Kim, J., 224 Kuske, R., 147 Kuzuha, Y., 42 Kwaśnicki, M., 77 L Lamantia, F., 35 Landau, L., 43 Landen, D., 45 Lange, K., 223 Lavallee, D., 42 Ledoux, M., 43 Lee, S., 37 Leitch, R., 170 Leo, W., 43 Lévy, P., vii, 1, 96 Li, L., 42 Li, R., 239 Liechty, J., 161 Lim, C., 218 Lin, D., 161 Linger, F., 31 Linnik, Y., 143 Lipsky, L., 46 Little, R., 223 Liu, K., 39 Lombardi, M., 185 Lopes, A., 31 Lovejoy, S., 42 Loy, R., 42 Lozier, D., 264 Lu, S., 42 Luong, A., 263 M Machado, J., 31 Mainardi, F., 152 Mallick, M., 45 Mallows, C., 7, 20, 84, 86, 163 Mandelbrot, B., 30, 35, 42 Manton, K. G., 44 Marchal, P., 107 Marcus, A., 42 Marcus, M., 221 Marichev, O., 256, 257 Marmolejo-Saucedo, J.A., 234 Martin, R., 35 Mathai, A., 256
Matsui, M., 84, 110, 178 Mayer-Wolf, E., 31 McCulloch, J. H., 35, 83, 84, 162, 166, 178, 206, 224 McDermott, J., 161 McDunnough, P., 170 McKenzie, J., 234 Meerschaert, M., vii, 33, 39, 42, 49, 215, 218 Menabde, M., 42 Mercadier, C., 47, 139 Mihalas, D., 264 Mikosch, T., 37, 133, 163, 259, 262 Millan, H., 42 Mitaim, S., 41 Mitra, S. S., 153 Mittnik, S., vii, 35, 262 Mohammadi, M., 221 Mohammadpour, A., 221 Molchanov, I., 263 Molz, F., 42 Moolgavkar, S., 228 Moroni, M., 34 Mureika, R., 170 Murphy, G., 36 Mustard, J., 42 Muthukrishnan, S., 46 N Núñez, R., 39, 250 Nadarajah, S., 139, 262 Nan, T., 42 Nandhakumar, N., 39, 239 Nelsen, R., 39 Nelson, P., 224 Neuman, S., 42 Nikias, C., vii, 25, 39, 165, 175, 183, 239 Nolan, J., 31, 33, 35, 36, 39, 47, 83, 87, 101, 117, 120, 132, 139, 178, 210, 224, 227, 239, 250, 272 O Ogburn, S., 42 Ojeda-Revah, D., 168, 224, 226, 233, 234 Olver, F., 264 Ong, S., 39, 206, 239 Ortobelli, S., 35 P Painter, S., 42 Panton, D., 83 Paolella, M., 109, 110 Park, M., 34 Paterson, L., 42 Paulson, A., 83, 170
Pawlas, Z., 110 Paxson, V., 41 Peach, G., 43 Peng, L., 147 Penson, K., 65 Peters, E., 35 Peters, G., 185 Petropulu, A., 239 Pewsey, A., 134 Piessens, R., 84 Pinelis, I., 111, 112 Pitman, E., 61 Pitman, J., 61 Podgórski, K., 262 Potter, J., 39, 206, 239 Press, S., 170 Prudnikov, A., 256, 257 Pukelsheim, F., 45 Q Qi, C., 239 Qiou, Z., 45 Qiu, T., 39, 239 R Rachev, S., vii, 35, 147, 224, 262 Rajaram, H., 42 Raposo, E. P., 31 Ravishanker, N., 45, 264 Reiss, R., 259 Renault, E., 185 Resnick, S., 49, 132, 144, 162–164 Retana-Blanco, B., 234 Reynolds, A., 31 Rimmer, R., 153, 221 Rishmawi, S., 42, 215, 217 Riva, M., 42 Robinson, G., 35, 84 Robust Analysis Inc., 239, 249 Rodriguez, J., 42 Rodriguez-Aguilar, R., 234 Roitershtein, A., 31 Roll, R., 35, 166 Rootzén, H., 47, 139 Rosiński, J., 147, 265 Roughan, M., 41 Ryzhik, I., 66, 88, 111, 258, 270 S Sahimi, M., 42 Samorodnitsky, G., vii, 22, 25, 39, 140, 224 Samuelson, P., 35 Sargent, T., 224 Sato, K., 33, 76, 265
Schaap, M., 42 Scheffler, H.-P., vii, 39, 49, 215 Schertzer, D., 42 Schlesinger, M., 30, 42 Schneider, W., 65 Schoenberg, F., 42, 148 Schreuder, N., 45 Schumer, R., 42 Schwartz, E., 35 Schwingenschlögl, U., 45 Segers, J., 262 Seglie, E., 45 Selman, B., 46 SenGupta, A., 134 Septier, F., 239 Shao, M., vii, 25, 39, 175, 183, 239 Sibatov, R., 129 Siboulet, F., 35 Sigman, K., 41 Sikorskii, A., 33 Simon, T., 76, 77 Sisson, S., 185 Sivapalan, M., 42 Sivia, D. S., 26 Smith, H., 227 Smith, M., 46 Sorace, J., 46 Spitzer, F., 31 Springer, M., 119, 270 Sreehari, M., 61 Stallard, E., 44 Stanley, H. E., 31 Stărică, C., 163 Stegun, I., 257, 258 Stephens, M., 199 Steude, S., 109 Steutel, F., 263, 265 Stollenwerk, N., 31 Stoyanov, S., 224 Stuck, B., 7, 20, 39, 84, 86, 163, 239 Swan, Y., 224 T Tajer, S., 42 Takemura, A., 84, 178 Talagrand, M., 43 Taqqu, M., vii, 22, 25, 39, 41, 140 Taylor, C., 45 Taylor, J., 223 Teugels, J., 144, 155, 262 Thomas, M., 259 Tippett, L., 260 Toledano-Kitai, D., 263 Tran-Gia, J., 31
Tsihrintzis, G., 39, 165, 239 Tsionas, E., 185 Tukey, J. W., 51, 153 U Überhuber, C., 84 Uchaikin, V., vii, 25, 45, 65, 75, 107, 118, 122, 129, 150, 178 Ushakov, N., 170 V Vallis, K., 45 Van den Heuvel, F., 45 Van Harn, K., 263, 265 Vanheeghe, P., 239 Vaupel, J. W., 44 Velis, D., 42 Venzon, D., 228 Verdebout, T., 224 Verdi, M., 31 Veredas, D., 185, 224 Viswanathan, G. M., 31 Volkovich, Z., 263 W Walls, W., 37, 234 Wang, B., 148 Wassell, J., 45 Watts, D., 42 Waymire, E., 42 Wells, R., 264 Weron, A., vii, 25, 156 Weron, R., 37, 85 West, B., 42, 45 Wheatcraft, S., 42 Wiersma, D., 43 Wiessner, M., 76, 77, 79 Williams, D., 170, 173, 174, 186 Williams, E., 137 Williams, G., 42 Willinger, W., 41 Wintner, A., 76, 77 Wise, J., 224 Wojciechowski, W., 45 Wolf, W., 147 Wolfe, S., 269 Wolpert, R., 42 Worsdale, G., 83 Woyczynski, W., 147 X Xu, Q., 39
Y Yamazato, M., 76 Yang, S., 178 Yang, X., 239 Young, M., 36, 37 Z Zajc, W., 43 Zaliapin, I., 42, 148 Zaslavsky, G., 30, 42 Zeitouni, O., 31 Zeng, L., 46 Zerubia, J., 247 Zha, D., 39, 239
Zha, Y., 42 Zhang, H., 239 Zhang, J., 148 Zhang, X., 174, 203 Zhang, Y., 42 Zhao, Z., 239 Zhidkov, S., 239 Zhong, Y., 239 Zieliński, R., 84, 130 Zolotarev, V., vii, 4, 7, 25, 45, 65, 66, 75, 80, 93, 100, 107, 109, 116, 118, 119, 122, 147, 149, 150, 178, 264, 270 Zuyev, S., 263
Symbol Index
Symbols
B(p, q) (beta function), 277
DA(Z) (domain of attraction), 21
E(X − a)^p_+ (truncated moments), 110
[x], 135
Γ(x) (gamma function), 277
Γ(x, t) (incomplete), 277
γ(x, t) (incomplete), 277
ψ(x) (digamma), 277
η(u | α; k), 61
γ_Euler (Euler’s constant), 59
M_X(u) (Mellin transform), 119, 276
ω(u | α, β; k), 61
θ_0, 67
ζ, 65
f(x) ∼ g(x), 13
x^⟨p⟩ (signed power), 114
z_λ (stable quantile), 16
sign u (sign of u), 4
=ᵈ (equal in distribution), 1
→ᵈ (converges in distribution), 21
EVD (extreme value distribution), 139, 267
ExpS(α, μ, σ) (exponential stable), 132
FLOM (fractional lower order moment), 170
g_d(x | α, β), 87
ḡ_d(x | α, β), 87
h_d(x | α, β), 87
h̄_d(x | α, β), 87
SNR (signal-to-noise ratio), 256
S(α, β, γ, δ; 0) (0-parameterization), 6
S(α, β, γ, δ; 1) (1-parameterization), 6
S(α, β, γ, δ; 2), 92
S(α, β, γ, δ; 3), 93
S(α, β, γ, δ; 4), 93
S(α, β, γ, δ; 5), 93
S(α, β, γ, δ; 6), 93
S(α, β, γ; 7), 94
S(α, ρ, γ; 8), 95
S(α, β, τ; 9), 95
S(α, γ+, γ−, δ; 10), 96
S(0, β, γ, δ; k), 122
SαS(γ) (symmetric α-stable), 7